Recipes

This is a loose collection of code snippets for more or less typical tasks. If you would like to suggest improvements or new snippets,

Collect files for analysis

If you want your script to analyze a collection of files, you need to compile a list of all target files at the beginning of the script. The following code section presents a procedure that does just that. Download the procedure, include it in your Praat script, and call it with three arguments (see usage example below):

  1. the path to your corpus (string)
  2. an indicator as to whether you want to search recursively (1) or not (0)
  3. a vector of search patterns, e.g. {"*.wav", "*.WAV"} to include all files whose file name ends with either wav or WAV

Procedure

procedure fileList: .baseDir$, .recursive, .pattern$#
    ######
    ### collect files (incl. path) in Strings object
    ### (original selection is retained)
    ###
    ### arguments:
    ###   .baseDir$ (string): where the search for files should start
    ###   .recursive (integer): search recursively in all sub directories of baseDir (=1) or not (<>1, e.g. 0)
    ###   .pattern$# (string vector): list of search patterns (>=1) as vector, e.g. { "pattern1", "pattern2", ... }
    ###
    ### creates:
    ###   Strings object (name: fileList, id: fileList.id): list of collected files (incl. path starting at baseDir; path separator: "/")
    ######

    # get original selection
    .selection# = selected# ()

    # create empty lists (Strings objects) for files and directories
    .fid = Create Strings as tokens: "", " "
    .did = Create Strings as tokens: "", " "
    Insert string: 0, .baseDir$

    # repeat until folder list is empty
    repeat
        selectObject: .did
        .d$ = Get string: 1
        Remove string: 1

        # find files with specified pattern
        for .i to size (.pattern$#)
            .fl = Create Strings as file list: "fl", .d$ + "/" + .pattern$# [.i]
            .n = Get number of strings
            if .n > 0
                for .j to .n
                    selectObject: .fl
                    .f$ = Get string: .j
                    selectObject: .fid
                    # insert path + filename
                    Insert string: 0, .d$ + "/" + .f$
                endfor
            endif
            nocheck removeObject: .fl
        endfor

        # if recursive option is true collect folders and continue file search
        # if recursive option is false do nothing (folder list is empty - loop will terminate)
        if .recursive = 1
            .dl = Create Strings as directory list: "dl", .d$ + "/*"
            .n = Get number of strings
            if .n > 0
                for .i to .n
                    selectObject: .dl
                    .rd$ = Get string: .i
                    selectObject: .did
                    Insert string: 0, .d$ + "/" + .rd$
                endfor
            endif
            nocheck removeObject: .dl
        endif

        selectObject: .did
        .s = Get number of strings
    until .s = 0

    # replace Windows path separator ("\") with neutral separator ("/")
    selectObject: .fid
    .id = Replace all: "\", "/", 0, "literals"
    Rename: "fileList"

    nocheck removeObject: .did
    nocheck removeObject: .fid

    # restore original selection
    selectObject (.selection#)
endproc

Usage example

include /path/to/procedure_fileList.praat

@fileList: "/path/to/your/corpus", 1, {"*.wav", "*.WAV"}

selectObject: fileList.id
num_of_files = Get number of strings
for i to num_of_files
    selectObject: fileList.id
    file$ = Get string: i
    file_id = Read from file: file$
    ### do something with the file ###
    removeObject: file_id
endfor
removeObject: fileList.id
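If you drive Praat from Python (as in the wrapper recipe further down this page), the same collection logic can be approximated on the Python side. The following is only a minimal sketch, not a port of the procedure above: the function name file_list is hypothetical, it relies on Python's pathlib, and it returns a sorted list of paths instead of a Strings object. A set is used so that a file matching several patterns is listed only once.

```python
from pathlib import Path


def file_list(base_dir, recursive, patterns):
    """Collect files below base_dir that match any of the given patterns.

    base_dir  -- where the search for files should start
    recursive -- search all subdirectories (1/True) or only base_dir (0/False)
    patterns  -- list of search patterns, e.g. ["*.wav", "*.WAV"]

    Returns a sorted list of paths (strings, path separator "/").
    """
    base = Path(base_dir)
    found = set()
    for pattern in patterns:
        # rglob() descends into all subdirectories, glob() stays in base_dir
        matches = base.rglob(pattern) if recursive else base.glob(pattern)
        # keep regular files only and normalise the separator to "/"
        found.update(p.as_posix() for p in matches if p.is_file())
    return sorted(found)
```

A call mirroring the Praat usage example would be file_list("/path/to/your/corpus", 1, ["*.wav", "*.WAV"]).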

Parallel processing with Python wrapper

If you need to analyze a large collection of files with a computationally intensive Praat script, you can save a lot of time by processing several files in parallel. The only prerequisite is that your computer has a modern CPU with several processing cores.

One way to implement this is to use a Python wrapper script that handles parallel processing by launching multiple Praat instances to analyze multiple files simultaneously. The following code section presents a rudimentary implementation of a Python wrapper script, accompanied by a simple Praat script for illustration (calculating smoothed cepstral peak prominence, CPPS). Of course, the complexity of the Praat script can far exceed that of the example provided.

Python script

from multiprocessing import Process, Queue, cpu_count
import queue
from glob import glob
import subprocess

# customize to your system:
corpus_dir = '/path/to/wav-files/'
praat_script = 'cpps.praat'
praat_binary = '/path/to/praat-binary'

def analyze(files_to_analyze, analysis_results):
    while True:
        try:
            # try to get a file from the queue
            # get_nowait() function will raise queue.Empty exception if the queue is empty.
            file = files_to_analyze.get_nowait()
        except queue.Empty:
            # exit while loop if queue is empty (all files are analyzed)
            break
        else:
            # if no exception has been raised we have a file to analyze
            # run praat script and add result to analysis_results queue
            result = subprocess.run([praat_binary, '--run', '--no-pref-files', '--no-plugins', '--utf8', praat_script, file],
                                    check=True, capture_output=True, text=True)
            analysis_results.put(file + ',' + str(result.stdout))
    return True

def main():
    # limit CPU cores to fixed number (must be < cpu_count())
    #number_of_processes = 4
    # or use all available CPU cores
    number_of_processes = cpu_count()

    # initialize queues for files and results
    files_to_analyze = Queue()
    analysis_results = Queue()
    processes = []

    # add all wav files in corpus_dir to the files queue
    for file in glob(corpus_dir + '*.wav'):
        files_to_analyze.put(file)

    # create and start parallel processes
    for _ in range(number_of_processes):
        p = Process(target=analyze, args=(files_to_analyze, analysis_results))
        processes.append(p)
        p.start()

    # complete processes
    for p in processes:
        p.join()

    # print results
    while not analysis_results.empty():
        print(analysis_results.get())

    return True

if __name__ == '__main__':
    main()

Praat script

# receive all arguments specified in the Python subprocess.run() command
# In the example above, only one argument is specified, namely the file to be analyzed
form command line arguments
    word file Empty
endform

# read the file and perform analysis
sound = Read from file: file$
pcep = To PowerCepstrogram: 60, 0.002, 5000, 50
cpps = Get CPPS: "yes", 0.02, 0.0005, 60, 330, 0.05, "parabolic", 0.001, 0.05, "Straight", "Robust"

# output results to stdout, which is evaluated by the Python script
writeInfoLine: cpps
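Each result printed by the Python script has the form file,value, where the value is whatever the Praat script wrote to stdout, including its trailing newline. To process the values further, such lines need to be split again. The following is a minimal sketch under stated assumptions: the helper name parse_result is hypothetical, the Praat script is assumed to write exactly one number, and any non-numeric output (for example Praat's --undefined--) is mapped to NaN.

```python
import math


def parse_result(line):
    """Split one 'file,value' result line into (file, value as float)."""
    # split at the last comma only, so commas inside the file path survive
    file, _, value = line.rstrip().rpartition(',')
    try:
        return file, float(value)
    except ValueError:
        # assumption: non-numeric output (e.g. "--undefined--") becomes NaN
        return file, math.nan
```

In the wrapper script, this could replace the bare print call, e.g. file, cpps = parse_result(analysis_results.get()).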

If you'd like to try parallel processing:

  1. Please verify that you have installed Praat and Python 3.7 or later (run python --version in a terminal); the capture_output and text arguments of subprocess.run() used in the script are not available in older Python versions.
  2. Download both scripts (Python script, Praat script) to the same directory.
  3. Open the Python script in an editor and adapt the paths in lines 7 and 9 (corpus_dir and praat_binary). To reproduce this example, corpus_dir should contain some wav files. Your praat_binary is probably located at C:\Program Files\Praat.exe (Windows), /Applications/Praat.app/Contents/MacOS/Praat (Mac), or /usr/local/bin/praat (Linux).
  4. If you wish, you can limit the number of simultaneously running processes by uncommenting line 30 (number_of_processes = 4) and removing or commenting out line 32 (number_of_processes = cpu_count()). Default: number of processes = number of available CPU cores. (You can even run the script with only one CPU core available; in that case the processes merely run concurrently (virtual parallelism), which is much slower than simultaneous processing on multiple processor cores (real parallelism).)
  5. Run the Python script in a terminal: python praat_parallel_processing.py and watch your CPU activity.
  6. To compare processing time with the conventional (consecutive) approach, download and adapt another Python script and run it on the same corpus.
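The comparison in step 6 can also be approximated in a single file. The following is a minimal sketch, not the script referred to above, and it swaps in a different technique from the wrapper: instead of multiprocessing.Process and queues it uses a thread pool from concurrent.futures, which suffices here because the heavy work happens in the launched child processes. All function names (run_one, run_consecutive, run_parallel, timed) are hypothetical. The stand-in commands start the Python interpreter so that the sketch runs without Praat; for a real comparison, each command would be the Praat argument list from the wrapper script.

```python
import subprocess
import sys
import time
from concurrent.futures import ThreadPoolExecutor


def run_one(command):
    """Run one external command (a list of arguments) and return its stdout."""
    result = subprocess.run(command, check=True, capture_output=True, text=True)
    return result.stdout


def run_consecutive(commands):
    """Run the commands one after another."""
    return [run_one(command) for command in commands]


def run_parallel(commands, workers):
    """Run up to `workers` commands at the same time; results keep input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_one, commands))


def timed(func, *args):
    """Call func(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    output = func(*args)
    return output, time.perf_counter() - start


if __name__ == '__main__':
    # stand-in workload (assumption): each command sleeps briefly and prints its index
    commands = [[sys.executable, '-c', f'import time; time.sleep(0.2); print({i})']
                for i in range(4)]
    _, t_consecutive = timed(run_consecutive, commands)
    _, t_parallel = timed(run_parallel, commands, 4)
    print(f'consecutive: {t_consecutive:.2f} s, parallel: {t_parallel:.2f} s')
```

Because check=True is set, a failing command raises subprocess.CalledProcessError in both variants rather than being silently skipped.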