Recipes

This is a loose collection of code snippets for common tasks. Suggestions for improvements or new snippets are welcome.

Collect files for analysis

If you want your script to analyze a collection of files, it first needs to compile a list of all target files. The procedure in the following code section does just that. Download it, include it in your Praat script, and call it with three arguments (see the usage example below):

the path to your corpus (string)
an indicator as to whether you want to search recursively (1) or not (0)
a vector of search patterns, e.g. {"*.wav", "*.WAV"} to include all files whose file name ends with either wav or WAV

Code

Procedure

procedure fileList: .baseDir$, .recursive, .pattern$#
    ######
    ### collect files (incl. path) in Strings object
    ### (original selection is retained)
    ###
    ### arguments:
    ###   .baseDir$  (string):        where the search for files should start
    ###   .recursive (integer):       search recursively in all sub directories of baseDir (=1) or not (<>1, e.g. 0)
    ###   .pattern$# (string vector): list of search patterns (>=1) as vector, e.g. { "pattern1", "pattern2", ... }
    ###
    ### creates:
    ###   Strings object (name: fileList, id: fileList.id): list of collected files (incl. path starting at baseDir; path separator: "/")
    ######

    # get original selection
    .selection# = selected# ()
    # create empty lists (Strings objects) for files and directories
    .fid = Create Strings as tokens: "", " "
    .did = Create Strings as tokens: "", " "
    Insert string: 0, .baseDir$
    # repeat until folder list is empty
    repeat
        selectObject: .did
        .d$ = Get string: 1
        Remove string: 1
        # find files with specified pattern
        for .i to size (.pattern$#)
            .fl = Create Strings as file list: "fl", .d$ + "/" + .pattern$# [.i]
            .n = Get number of strings
            if .n > 0
                for .j to .n
                    selectObject: .fl
                    .f$ = Get string: .j
                    selectObject: .fid
                    # insert path + filename
                    Insert string: 0, .d$ + "/" + .f$
                endfor
            endif
            nocheck removeObject: .fl
        endfor
        # if recursive option is true collect folders and continue file search
        # if recursive option is false do nothing (folder list is empty - loop will terminate)
        if .recursive = 1
            .dl = Create Strings as directory list: "dl", .d$ + "/*"
            .n = Get number of strings
            if .n > 0
                for .i to .n
                    selectObject: .dl
                    .rd$ = Get string: .i
                    selectObject: .did
                    Insert string: 0, .d$ + "/" + .rd$
                endfor
            endif
            nocheck removeObject: .dl
        endif
        selectObject: .did
        .s = Get number of strings
    until .s = 0
    # replace Windows path separator ("\") with neutral separator ("/")
    selectObject: .fid
    .id = Replace all: "\", "/", 0, "literals"
    Rename: "fileList"
    nocheck removeObject: .did
    nocheck removeObject: .fid
    # restore original selection
    selectObject (.selection#)
endproc

Usage example

include /path/to/procedure_fileList.praat

@fileList: "/path/to/your/corpus", 1, {"*.wav", "*.WAV"}

selectObject: fileList.id
num_of_files = Get number of strings
for i to num_of_files
	selectObject: fileList.id
	file$ = Get string: i
	file_id = Read from file: file$

	### do something with the file ###

	removeObject: file_id
endfor
removeObject: fileList.id
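
If the file list is needed outside Praat, for example in a Python script that launches Praat, the same idea (base directory, recursion flag, list of patterns) can be sketched with the standard library. This is only an illustration, not a port of the procedure above; the function name file_list and the demo file names are assumptions.

```python
from pathlib import Path
import tempfile

def file_list(base_dir, recursive, patterns):
    """Return sorted paths (with "/" separators) that match any pattern."""
    base = Path(base_dir)
    found = set()  # a set avoids listing a file twice if several patterns match it
    for pattern in patterns:
        matches = base.rglob(pattern) if recursive else base.glob(pattern)
        found.update(p.as_posix() for p in matches if p.is_file())
    return sorted(found)

if __name__ == "__main__":
    # Build a small throwaway corpus so the sketch runs anywhere.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "sub").mkdir()
        for name in ("a.wav", "notes.txt", "sub/c.wav"):
            (Path(tmp) / name).touch()
        print(file_list(tmp, 1, ["*.wav", "*.WAV"]))  # includes sub/c.wav
        print(file_list(tmp, 0, ["*.wav", "*.WAV"]))  # top level only
```

As in the Praat procedure, the returned paths start at the base directory and use "/" as the separator on every platform.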

Parallel processing with Python wrapper

If you need to analyze a large collection of files with a computationally intensive Praat script, you can save a lot of time by processing several files in parallel. The only prerequisite is that your computer has a modern CPU with several processing cores.

One way to implement this is to use a Python wrapper script that handles parallel processing by launching multiple Praat instances to analyze multiple files simultaneously. The following code section presents a rudimentary implementation of a Python wrapper script, accompanied by a simple Praat script for illustration (calculating smoothed cepstral peak prominence, CPPS). Of course, the complexity of the Praat script can far exceed that of the example provided.

Code

Python script

from multiprocessing import Process, Queue, cpu_count
import queue
from glob import glob
import subprocess

# customize to your system:
corpus_dir = '/path/to/wav-files/'
praat_script = 'cpps.praat'
praat_binary = '/path/to/praat-binary'

def analyze(files_to_analyze, analysis_results):
    while True:
        try:
            # try to get a file from the queue
            # get_nowait() function will raise queue.Empty exception if the queue is empty.
            file = files_to_analyze.get_nowait()
        except queue.Empty:
            # exit while loop if queue is empty (all files are analyzed)
            break
        else:
            # if no exception has been raised we have a file to analyze
            # run praat script and add result to analysis_results queue
            result = subprocess.run([praat_binary, '--run', '--no-pref-files', '--no-plugins', '--utf8', praat_script, file], check=True, capture_output=True, text=True)
            analysis_results.put(file + ',' + result.stdout.strip())

    return True

def main():
    # limit CPU cores to fixed number (should not exceed cpu_count())
    #number_of_processes = 4
    # or use all available CPU cores
    number_of_processes = cpu_count()

    # initialize queues for files and results
    files_to_analyze = Queue()
    analysis_results = Queue()
    processes = []

    # add all wav files in corpus_dir to the files queue
    for file in glob(corpus_dir + '*.wav'):
        files_to_analyze.put(file)

    # create and start parallel processes
    for _ in range(number_of_processes):
        p = Process(target=analyze, args=(files_to_analyze, analysis_results))
        processes.append(p)
        p.start()

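    # print results while the workers are still running: a worker cannot exit
    # until its queued results have been read, so join() comes afterwards
    while any(p.is_alive() for p in processes) or not analysis_results.empty():
        try:
            print(analysis_results.get(timeout=0.1))
        except queue.Empty:
            pass

    # complete processes
    for p in processes:
        p.join()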

    return True

if __name__ == '__main__':
    main()
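
A more compact variant of the same wrapper idea can be sketched with the standard library's concurrent.futures. This is an alternative under stated assumptions, not the script above: the names analyze_all and demo_command are hypothetical, and the demo runs a trivial stand-in command instead of Praat so that it works on any machine.

```python
from concurrent.futures import ThreadPoolExecutor
import os
import subprocess
import sys

def analyze_all(files, make_command, max_workers=None):
    """Run one external command per file, at most max_workers at a time."""
    def analyze_one(file):
        done = subprocess.run(make_command(file), check=True,
                              capture_output=True, text=True)
        return file, done.stdout.strip()

    # Threads suffice here: each one only waits for its external process.
    with ThreadPoolExecutor(max_workers=max_workers or os.cpu_count()) as pool:
        # map() returns results in input order, regardless of finishing order
        return list(pool.map(analyze_one, files))

def demo_command(file):
    # Hypothetical stand-in so the sketch runs without Praat. For real use,
    # return [praat_binary, "--run", praat_script, file] instead.
    return [sys.executable, "-c", "import sys; print(len(sys.argv[1]))", file]

if __name__ == "__main__":
    for file, output in analyze_all(["a.wav", "bb.wav", "ccc.wav"], demo_command):
        print(file + "," + output)
```

Because the heavy work happens in the external processes, the pool only has to keep a fixed number of them running; max_workers plays the role of number_of_processes.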

Praat script

# receive all arguments specified in the Python subprocess.run() command
# In the example above, only one argument is specified, namely the file to be analyzed
form command line arguments
	word file Empty
endform

# read the file and perform analysis
sound = Read from file: file$
pcep = To PowerCepstrogram: 60, 0.002, 5000, 50
cpps = Get CPPS: "yes", 0.02, 0.0005, 60, 330, 0.05, "parabolic", 0.001, 0.05, "Straight", "Robust"

# output results to stdout, which is evaluated by the Python script
writeInfoLine: cpps
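
On the Python side, each result arrives as text of the form path,value. A minimal sketch of turning such lines into a CSV table follows; the function names, the column names, and the sample lines are assumptions for illustration. Output that is not a number (for instance a marker for an undefined value) is stored as NaN.

```python
import csv
import io

def parse_result(line):
    """Split 'path,value' into (path, float); the path may contain commas."""
    path, value = line.rsplit(",", 1)  # split at the LAST comma only
    try:
        return path, float(value.strip())
    except ValueError:
        return path, float("nan")  # non-numeric output

def write_csv(lines, out):
    """Write a header row plus one row per result line to a file-like object."""
    writer = csv.writer(out)
    writer.writerow(["file", "cpps"])
    for line in lines:
        writer.writerow(parse_result(line))

if __name__ == "__main__":
    # Made-up sample lines, not real measurements.
    sample = ["/corpus/a.wav,12.5\n", "/corpus/b,c.wav,--undefined--\n"]
    buffer = io.StringIO()
    write_csv(sample, buffer)
    print(buffer.getvalue())
```

To write to disk instead, pass a file opened with open(name, "w", newline="") as the second argument.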

If you'd like to try parallel processing:

Please verify that you have installed reasonably up-to-date versions of Python 3 (run python --version in a terminal; the script needs version 3.7 or later because it calls subprocess.run with capture_output) and Praat.
Download both scripts (Python script, Praat script) to the same directory.
Open the Python script in an editor and adapt the paths in lines 7 and 9. To reproduce this example, the corpus_dir should contain some wav files. Your praat_binary is probably located in C:\Program Files\Praat.exe (Windows), /Applications/Praat.app/Contents/MacOS/Praat (Mac), or /usr/local/bin/praat (Linux).
If you wish, you can limit the number of simultaneous processes in line 30 (and remove or comment out line 32). By default, the number of processes equals the number of available CPU cores. The script also runs with only one CPU core available; in that case the processes merely take turns (concurrency, or virtual parallelism), which is much slower than truly simultaneous processing on several cores (real parallelism).
Run the Python script in a terminal: python praat_parallel_processing.py and watch your CPU activity.
To compare processing time with the conventional (consecutive) approach, download and adapt another Python script and run it on the same corpus.
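
For orientation, a consecutive baseline with a simple timer could look like the following sketch. It is not the script referred to in the last step; the names run_sequentially, timed, and stand_in_command are hypothetical, and the demo runs a trivial stand-in command instead of Praat.

```python
import subprocess
import sys
import time

def run_sequentially(files, make_command):
    """Run one external command per file, one after another."""
    results = []
    for file in files:
        done = subprocess.run(make_command(file), check=True,
                              capture_output=True, text=True)
        results.append((file, done.stdout.strip()))
    return results

def timed(func, *args):
    """Call func(*args) and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    value = func(*args)
    return value, time.perf_counter() - start

def stand_in_command(file):
    # Hypothetical stand-in so the sketch runs without Praat. For real use,
    # return [praat_binary, "--run", praat_script, file] instead.
    return [sys.executable, "-c", "import sys; print(len(sys.argv[1]))", file]

if __name__ == "__main__":
    results, seconds = timed(run_sequentially, ["a.wav", "bb.wav"], stand_in_command)
    print(results)
    print(f"{seconds:.2f} s")
```

Wrapping the parallel run in the same timed helper and using the same corpus gives directly comparable wall-clock times.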
