Recipes
This is a loose collection of code snippets for more or less typical tasks. If you would like to suggest improvements or new snippets,
Collect files for analysis
If you want your script to analyze a collection of files, you need to compile a list of all target files at the beginning of the script. The following procedure does just that. Download the procedure, include it in your Praat script, and call it with three arguments (see usage example below):
- the path to your corpus (string)
- an indicator as to whether you want to search recursively (1) or not (0)
- a vector of search patterns, e.g. {"*.wav", "*.WAV"} to include all files whose names end in either wav or WAV
Procedure
procedure fileList: .baseDir$, .recursive, .pattern$#
    ######
    ### collect files (incl. path) in Strings object
    ### (original selection is retained)
    ###
    ### arguments:
    ### .baseDir$ (string): where the search for files should start
    ### .recursive (integer): search recursively in all sub directories of baseDir (=1) or not (<>1, e.g. 0)
    ### .pattern$# (string vector): list of search patterns (>=1) as vector, e.g. { "pattern1", "pattern2", ... }
    ###
    ### creates:
    ### Strings object (name: fileList, id: fileList.id): list of collected files (incl. path starting at baseDir; path separator: "/")
    ######

    # get original selection
    .selection# = selected# ()

    # create empty lists (Strings objects) for files and directories
    .fid = Create Strings as tokens: "", " "
    .did = Create Strings as tokens: "", " "
    Insert string: 0, .baseDir$

    # repeat until folder list is empty
    repeat
        selectObject: .did
        .d$ = Get string: 1
        Remove string: 1

        # find files with specified pattern
        for .i to size (.pattern$#)
            .fl = Create Strings as file list: "fl", .d$ + "/" + .pattern$# [.i]
            .n = Get number of strings
            if .n > 0
                for .j to .n
                    selectObject: .fl
                    .f$ = Get string: .j
                    selectObject: .fid
                    # insert path + filename
                    Insert string: 0, .d$ + "/" + .f$
                endfor
            endif
            nocheck removeObject: .fl
        endfor

        # if recursive option is true collect folders and continue file search
        # if recursive option is false do nothing (folder list is empty - loop will terminate)
        if .recursive = 1
            .dl = Create Strings as directory list: "dl", .d$ + "/*"
            .n = Get number of strings
            if .n > 0
                for .i to .n
                    selectObject: .dl
                    .rd$ = Get string: .i
                    selectObject: .did
                    Insert string: 0, .d$ + "/" + .rd$
                endfor
            endif
            nocheck removeObject: .dl
        endif

        selectObject: .did
        .s = Get number of strings
    until .s = 0

    # replace Windows path separator ("\") with neutral separator ("/")
    selectObject: .fid
    .id = Replace all: "\", "/", 0, "literals"
    Rename: "fileList"
    nocheck removeObject: .did
    nocheck removeObject: .fid

    # restore original selection
    selectObject (.selection#)
endproc
Usage example
include /path/to/procedure_fileList.praat
@fileList: "/path/to/your/corpus", 1, {"*.wav", "*.WAV"}
selectObject: fileList.id
num_of_files = Get number of strings
for i to num_of_files
    selectObject: fileList.id
    file$ = Get string: i
    file_id = Read from file: file$
    ### do something with the file ###
    removeObject: file_id
endfor
removeObject: fileList.id
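If you drive Praat from Python (as in the next recipe), the same collection step can be sketched with pathlib. This is only an illustration of the logic of the fileList procedure, not a replacement for it: the function name file_list is hypothetical, it returns a plain Python list instead of a Strings object, and, unlike the procedure, it sorts the result and drops duplicates.

```python
from pathlib import Path


def file_list(base_dir, recursive, patterns):
    """Return sorted file paths below base_dir that match any of the patterns.

    base_dir  -- where the search starts
    recursive -- True: also search all sub directories; False: base_dir only
    patterns  -- list of search patterns, e.g. ["*.wav", "*.WAV"]
    """
    base = Path(base_dir)
    found = set()  # a set, so a file matched by two patterns is listed once
    for pattern in patterns:
        matches = base.rglob(pattern) if recursive else base.glob(pattern)
        # as_posix() yields "/" as path separator on every platform
        found.update(p.as_posix() for p in matches if p.is_file())
    return sorted(found)
```

A call such as file_list("/path/to/your/corpus", True, ["*.wav", "*.WAV"]) corresponds to the @fileList call in the usage example above.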
Parallel processing with Python wrapper
If you need to analyze a large collection of files with a computationally intensive Praat script, you can save a lot of time by processing several files in parallel. The only prerequisite is that your computer has a modern CPU with several processing cores.
One way to implement this is to use a Python wrapper script that handles parallel processing by launching multiple Praat instances to analyze multiple files simultaneously. The following code section presents a rudimentary implementation of a Python wrapper script, accompanied by a simple Praat script for illustration (calculating smoothed cepstral peak prominence, CPPS). Of course, the complexity of the Praat script can far exceed that of the example provided.
Python script
from multiprocessing import Process, Queue, cpu_count
import queue
from glob import glob
import subprocess

# customize to your system:
corpus_dir = '/path/to/wav-files/'
praat_script = 'cpps.praat'
praat_binary = '/path/to/praat-binary'

def analyze(files_to_analyze, analysis_results):
    while True:
        try:
            # try to get a file from the queue
            # get_nowait() function will raise queue.Empty exception if the queue is empty.
            file = files_to_analyze.get_nowait()
        except queue.Empty:
            # exit while loop if queue is empty (all files are analyzed)
            break
        else:
            # if no exception has been raised we have a file to analyze
            # run praat script and add result to analysis_results queue
            result = subprocess.run([praat_binary, '--run', '--no-pref-files', '--no-plugins', '--utf8', praat_script, file], check=True, capture_output=True, text=True)
            analysis_results.put(file + ',' + str(result.stdout))
    return True


def main():
    # limit CPU cores to fixed number (must be < cpu_count())
    #number_of_processes = 4
    # or use all available CPU cores
    number_of_processes = cpu_count()

    # initialize queues for files and results
    files_to_analyze = Queue()
    analysis_results = Queue()
    processes = []

    # add all wav files in corpus_dir to the files queue
    for file in glob(corpus_dir + '*.wav'):
        files_to_analyze.put(file)

    # create and start parallel processes
    for _ in range(number_of_processes):
        p = Process(target=analyze, args=(files_to_analyze, analysis_results))
        processes.append(p)
        p.start()

    # complete processes
    for p in processes:
        p.join()

    # print results
    while not analysis_results.empty():
        print(analysis_results.get())

    return True


if __name__ == '__main__':
    main()
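As an alternative sketch (not the script above): each worker spends nearly all of its time waiting for an external Praat process, so a thread pool from Python's concurrent.futures module should give comparable parallelism without explicit queues. The names run_one and run_all are hypothetical, and command_prefix is an assumption standing for the command list used above, i.e. [praat_binary, '--run', '--no-pref-files', '--no-plugins', '--utf8', praat_script].

```python
from concurrent.futures import ThreadPoolExecutor
import subprocess


def run_one(command_prefix, file):
    # append the file as the last command line argument and capture stdout
    result = subprocess.run(command_prefix + [file], check=True,
                            capture_output=True, text=True)
    return file, result.stdout.strip()


def run_all(command_prefix, files, max_workers=None):
    # max_workers=None lets the executor choose a default number of threads;
    # pass a number to limit how many Praat instances run at the same time
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns the results in the order of the input files
        return list(pool.map(lambda f: run_one(command_prefix, f), files))
```

Because check=True is set, a failing Praat run raises subprocess.CalledProcessError in the calling code instead of silently dropping that file.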
Praat script
# receive all arguments specified in the Python subprocess.run() command
# In the example above, only one argument is specified, namely the file to be analyzed
form command line arguments
    word file Empty
endform
# read the file and perform analysis
sound = Read from file: file$
pcep = To PowerCepstrogram: 60, 0.002, 5000, 50
cpps = Get CPPS: "yes", 0.02, 0.0005, 60, 330, 0.05, "parabolic", 0.001, 0.05, "Straight", "Robust"
# output results to stdout, which is evaluated by the Python script
writeInfoLine: cpps
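The Python script above prints each result as one line of the form file,value. If you prefer a table, those lines can be parsed and written to a CSV file, as in the following minimal sketch. The function names parse_result and write_results and the column names are hypothetical; treating non-numeric output (for example an undefined value) as None is an assumption, not something the scripts above prescribe.

```python
import csv


def parse_result(line):
    # split at the LAST comma, because the file path itself may contain commas
    path, _, value = line.strip().rpartition(',')
    try:
        return path, float(value)
    except ValueError:
        # assumption: output that is not a number is stored as a missing value
        return path, None


def write_results(lines, csv_path):
    with open(csv_path, 'w', newline='', encoding='utf-8') as handle:
        writer = csv.writer(handle)
        writer.writerow(['file', 'cpps'])
        for line in lines:
            writer.writerow(parse_result(line))
```

In main(), you could collect the strings taken from analysis_results in a list instead of printing them and pass that list to write_results.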
If you'd like to try parallel processing:
- Please verify that you have installed reasonably up-to-date versions of Python 3 (run python --version in a terminal) and Praat.
- Download both scripts (Python script, Praat script) to the same directory.
- Open the Python script in an editor and adapt the paths in lines 7 and 9 (corpus_dir and praat_binary). To reproduce this example, the corpus_dir should contain some wav files. Your praat_binary is probably located in C:\Program Files\Praat.exe (Windows), /Applications/Praat.app/Contents/MacOS/Praat (Mac), or /usr/local/bin/praat (Linux).
- If you wish, you can limit the number of processes running simultaneously by setting number_of_processes to a fixed value in line 30 (and removing or commenting out line 32). Default: number of processes = number of available CPU cores. (You can even run the script if you have only one CPU core available; in this case you get concurrency (virtual parallelism), which is much slower than simultaneous processing on multiple processor cores (real parallelism).)
- Run the Python script in a terminal (python praat_parallel_processing.py) and watch your CPU activity.
- To compare processing time with the conventional (consecutive) approach, download and adapt another Python script and run it on the same corpus.
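The consecutive baseline mentioned in the last step could look roughly like the following sketch. It is not the script referred to above: the function names time_sequential and analyze_with_praat are hypothetical, and the paths are the same placeholders as in the parallel script and must be adapted to your system.

```python
import subprocess
import time

# customize to your system (placeholders, as in the parallel script):
praat_script = 'cpps.praat'
praat_binary = '/path/to/praat-binary'


def analyze_with_praat(file):
    # one Praat run per file, with the same options as in the parallel script
    result = subprocess.run([praat_binary, '--run', '--no-pref-files',
                             '--no-plugins', '--utf8', praat_script, file],
                            check=True, capture_output=True, text=True)
    return file + ',' + result.stdout.strip()


def time_sequential(files, analyze_one):
    # process the files one after another and measure the elapsed wall time
    start = time.perf_counter()
    results = [analyze_one(file) for file in files]
    elapsed = time.perf_counter() - start
    return results, elapsed
```

Calling time_sequential(glob(corpus_dir + '*.wav'), analyze_with_praat), with glob imported and corpus_dir defined as in the parallel script, returns the results together with the number of seconds the run took, which you can compare with the run time of the parallel script.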