Session 13
Advanced Concepts
Arrays & Vectors
Arrays (and Dictionaries)
Most of the time, simple variables as introduced in session 5 are sufficient. But on top of these simple variables Praat provides more complex data structures for special occasions, namely arrays and vectors (and dictionaries and matrices).
Arrays (and dictionaries) are indexed variables which can store an arbitrary amount of values. What does that mean and what's the benefit? Imagine for example you want to collect the IDs of all selected sound objects to process them later in the script. If you know there are exactly 3 selected sound objects you would implement something like this, using simple variables:
sound1_id = selected ("Sound", 1)
sound2_id = selected ("Sound", 2)
sound3_id = selected ("Sound", 3)
That's perfectly sufficient. But what if you don't know the number of selected sound objects and you still want to collect all of them? Let's start with a loop:
nos = numberOfSelected ("Sound")
for i to nos
    sound_id = selected ("Sound", i)
endfor
The loop takes all selected sound objects into account and assigns their ID to a variable. But it's the same variable at each pass of the loop. That means the value of sound_id gets overwritten at each pass. When the loop terminates, you have only collected the ID of the last selected sound object; the other IDs are gone.
Now remember what I wrote above: Arrays are indexed variables which can store an arbitrary amount of values. That's exactly what we need here. The IDs of an unknown number of selected sound objects qualify as "an arbitrary amount of values". And we already have a suitable index for the "indexed variable": The counter i. We use the index to identify the sound object (selected ("Sound", i)), so why not use it as an index of the variable? To transform a simple variable into an array we just add an index enclosed in square brackets:
nos = numberOfSelected ("Sound")
for i to nos
    sound_id [i] = selected ("Sound", i)
endfor
When the loop terminates, each of sound_id [1], sound_id [2], …, sound_id [nos] contains the ID of one selected sound object. While a simple variable is a container for one value, an array is a structured container with an arbitrary number of sub-containers, each holding one value. The elements of an array (the sub-containers) are accessed by index. So, how would you select the second selected sound object in our example?
nos = numberOfSelected ("Sound")
for i to nos
    sound_id [i] = selected ("Sound", i)
endfor
# select second selected sound object
selectObject: sound_id [2]
Apart from indexing, arrays behave like simple variables: Names must be unique and comply with the naming conventions, values are assigned using the assignment operator (=), and array elements are substituted with their value at runtime.
The distinction between numerical and string variables also applies to arrays. If we want to collect the names of all selected sound objects in addition to their IDs, we can create a new string array:
nos = numberOfSelected ("Sound")
for i to nos
    sound_id [i] = selected ("Sound", i)
    sound_name$ [i] = selected$ ("Sound", i)
endfor
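To check what ended up in the two arrays, you can read them back in a second loop. The following is only a minimal sketch, assuming at least one sound object is selected when the script starts; the wording of the Info window lines is made up for illustration:
nos = numberOfSelected ("Sound")
for i to nos
    sound_id [i] = selected ("Sound", i)
    sound_name$ [i] = selected$ ("Sound", i)
endfor
# list what was collected, one line per sound object
writeInfoLine: "collected ", nos, " sound objects:"
for i to nos
    appendInfoLine: i, ": ", sound_name$ [i], " (ID ", sound_id [i], ")"
endfor
Note that the second loop relies on nos as the highest index: an array does not know its own size, so the script has to keep track of it.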
Arrays can be quite useful when you want to accumulate an unknown number of similar values (unknown when you write the script; at runtime your script should be capable of figuring out the number of values, i.e. the highest index). Less useful, in my opinion, are dictionaries (I have never used them myself). Dictionaries are like arrays with a string index instead of a numerical index. They can contain numbers as well as strings:
# dictionary for numbers:
my_sound ["SamplingRate"] = 44100
my_sound ["Channels"] = 1
my_sound ["Duration"] = 5.27
# dictionary for strings:
speaker$ ["Name"] = "Jane Doe"
speaker$ ["Sex"] = "female"
speaker$ ["Age"] = "32"
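Dictionary elements are read back the same way they are assigned, by putting the string index in square brackets. A short sketch, reusing some of the entries from above (the sentence printed to the Info window is just an illustration):
speaker$ ["Name"] = "Jane Doe"
speaker$ ["Age"] = "32"
my_sound ["Duration"] = 5.27
# read the values back by their string index
writeInfoLine: speaker$ ["Name"], " is ", speaker$ ["Age"], " years old."
appendInfoLine: "duration of the recording: ", my_sound ["Duration"], " s"
As with arrays, the dollar sign in the name decides whether an element holds a string (speaker$) or a number (my_sound).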
Vectors (and Matrices)
Vectors are relatively new to the Praat scripting language and they are very similar to arrays. Like arrays, you can think of vectors as containers with many sub-containers. In the Praat Help numeric vectors are described as "array of numbers, regarded as a single object". That means you can target the whole vector with functions or operations, whereas with arrays you can only target single array elements. To illustrate this, we first need to know how to create a vector:
my_vector# = { 12, 3, 9, 6 }
A vector variable needs a name like every other variable, and the name must end in a hash symbol. Literal values are assigned enclosed in curly brackets, separated by commas. After the assignment above the vector my_vector# has 4 dimensions (i.e. contains 4 elements).
After assignment, we can target the whole vector with functions like e.g. size (my_vector#) or sum (my_vector#) or with operations like element-by-element division:
my_vector# = { 12, 3, 9, 6 }
size = size (my_vector#)
sum = sum (my_vector#)
# creating a new vector by dividing each element by 3:
new_vector# = my_vector# / 3
writeInfoLine: "my vector: ", my_vector#
appendInfoLine: "number of dimensions: ", size
appendInfoLine: "sum of all values: ", sum
appendInfoLine: "new_vector: ", new_vector#
my vector: 12 3 9 6
number of dimensions: 4
sum of all values: 30
new_vector: 4 1 3 2
Note that none of this is possible with arrays: You can't assign multiple array values in one go, you can't pass whole arrays as arguments to functions, and you can't target whole arrays with arithmetic operations.
While vectors behave like a "single object" in contexts like above, it's also possible to access and manipulate individual vector elements similar to array elements:
my_vector# = { 12, 3, 9, 6 }
writeInfoLine: "my vector: ", my_vector#
appendInfoLine: "second element: ", my_vector# [2]
my_vector# [2] = 15
appendInfoLine: "my vector after manipulation: ", my_vector#
my vector: 12 3 9 6
second element: 3
my vector after manipulation: 12 15 9 6
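Element access also makes it possible to fill a vector step by step inside a loop and then hand the result to a function as a whole. The following is a sketch, not taken from the original session: the variable name squares# is made up, and zero# () and mean () are functions I take from the function list in the Praat Help, so check that your Praat version provides them:
# start with a vector of 5 zeros
squares# = zero# (5)
# overwrite each element with the square of its index
for i to size (squares#)
    squares# [i] = i * i
endfor
# use the vector as a whole again
writeInfoLine: "squares: ", squares#
appendInfoLine: "mean: ", mean (squares#)
The vector has to exist with its final size before the loop starts; unlike an array, it does not grow just because you assign to a new index.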
I'm going to conclude this session with a practical example that illustrates some differences between arrays and vectors. But before that, I'd like to add one level of complexity by introducing matrices. A numeric matrix is a collection of numeric vectors, all of the same size. Sounds crazy? Try this: A numeric matrix is a table with rows and columns, containing numbers. The name of a matrix must end in two hash symbols:
my_matrix## = {{ 12, 3, 9, 6 }, { 8, 2, 10, 4 }}
writeInfoLine: "my matrix:"
appendInfoLine: my_matrix##
my matrix:
12 3 9 6
8 2 10 4
You see? Each vector corresponds to a row, vector dimensions correspond to columns. To access individual matrix elements two indices are required: [row, column]:
my_matrix## = {{ 12, 3, 9, 6 }, { 8, 2, 10, 4 }}
number = my_matrix## [1, 3]
writeInfoLine: "first row, third column: ", number
first row, third column: 9
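With two indices, walking through a whole matrix takes two nested loops. Here is a sketch that adds up each row of the matrix from above. The counter names irow and icol and the variable total are made up for this example, and numberOfRows () and numberOfColumns () are taken from the function list in the Praat Help, so treat their availability in your Praat version as an assumption:
my_matrix## = {{ 12, 3, 9, 6 }, { 8, 2, 10, 4 }}
writeInfoLine: "row sums:"
for irow to numberOfRows (my_matrix##)
    # add up all elements of the current row
    total = 0
    for icol to numberOfColumns (my_matrix##)
        total = total + my_matrix## [irow, icol]
    endfor
    appendInfoLine: "row ", irow, ": ", total
endfor
For the matrix above the two totals are 30 and 24.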
Vectors and matrices are important instruments for advanced signal processing. To learn more about these data structures visit Praat Help to find an evolving set of tools to deal with vectors and matrices.
Practical example
When teaching transcription/annotation I provide the students with a sound and a TextGrid containing one tier called IPA and let them work for some time. When they are finished they save the TextGrid with a unique filename and upload it to the e-learning platform. I download all TextGrids, merge them and discuss the annotations with the students. To make it easier for the students to identify their own annotation in the merged TextGrid I want to replace the name of the interval tier (IPA) with the unique filename of their uploaded TextGrid.
Using arrays, this requires 3 loops. The first loop maintains the original selection and assigns IDs to an array. The second loop selects individual TextGrids (dissolving the original selection) in order to manipulate their tier names. The third loop restores the original selection to be ready for merging:
# get the number of selected TextGrids
nos = numberOfSelected ("TextGrid")
# assign TextGrid IDs to an array
for i to nos
    grid [i] = selected ("TextGrid", i)
endfor
# select and manipulate each TextGrid
for i to nos
    selectObject: grid [i]
    # name of the selected TextGrid, without the object type
    grid_name$ = selected$ ("TextGrid")
    new_tier_name$ = replace$ (grid_name$, "_", "-", 0)
    Set tier name: 1, new_tier_name$
endfor
# restore selection and merge
selectObject: grid [1]
for i from 2 to nos
    plusObject: grid [i]
endfor
Merge
Using a vector for TextGrid IDs and the vector function selected# () (introduced in Praat 6.0.40) we are able to implement this much more efficiently with only one loop:
# assign TextGrid IDs to a vector
grids# = selected# ("TextGrid")
# select and manipulate each TextGrid
for i to size (grids#)
    selectObject: grids# [i]
    grid_name$ = selected$ ("TextGrid")
    new_tier_name$ = replace$ (grid_name$, "_", "-", 0)
    Set tier name: 1, new_tier_name$
endfor
# restore selection and merge
selectObject: grids#
Merge
selected# ("TextGrid") collects the IDs of all selected TextGrids as a vector; this vector is assigned to grids#. Then we specify the loop counter to run from 1 to the number of vector elements (size (grids#)). Since we collected all relevant IDs in grids#, we can safely dissolve the original selection by selecting individual TextGrids inside the loop. Individual TextGrid IDs are addressed by index (grids# [i]). At the end, the original selection is restored to be ready for merging. This is as simple as feeding the vector containing the relevant IDs to the familiar selectObject: function.