Praat notes
How Praat thinks
The list
Recording sound, loading files, and making objects from others will each add an entry to the list.
You can also save each item in the list to a file.
It's a scratch space, not a project
You would be forgiven for thinking this is a project you can save. It is not - you cannot save the list itself, and it will not be remembered between runs of Praat.
The list itself is intended as a temporary scratch area, "the things I need for what I am currently working on", the intermediate steps you need to do a nontrivial task.
Which means that, whether you are working interactively or writing scripts, you have to keep track of what you left in the list, by name or ID.
You can select one or more objects from the list, to then use one of the (applicable) buttons, to do something useful with this object, or combination of objects.
Aside from doing that by hand, you can do the same by code (by name or by id), which makes it easier to create scripts (but more on scripting later).
Suggestion: Think like a database
Objects and object combinations
Many objects have a View & Edit button, which opens a window to, well, view and possibly edit the object.
- e.g. try View & Edit on a Sound
Selecting specific combinations of object( type)s brings up some specific new buttons (and removes others that do not apply to the combination)
- e.g. try View & Edit on a Sound + TextGrid; this combines those two objects' basic views, and is meant for annotation
- ...which only makes sense when they relate to each other.
- Say, you probably want to create such a TextGrid from the Sound (select, then click Annotate >)
- rather than create from scratch (New → Create TextGrid...)
Notes:
- Sometimes nothing makes sense for the combination you have selected
- There are also hints -- e.g. when you select only a TextGrid, there is a button named View & Edit with Sound? which, if clicked, just tells you to select both.
- there are many possible combinations not hinted at -- just some of the most common ones
Object types
There are quite a few object types, many of which you may never use.
Roughly in order of how quickly you will probably see or need it:
Some of the more commonly used object types
Sound[1]
- waveform/PCM data
- often mono, but can be multi-channel
- View & Edit shows waveform and spectrogram
- as the legend hints:
- blue trace is pitch (note it is on a different Y axis from the spectrogram shown, because it would otherwise sit at the very bottom)
- red dots are formant places
- green/yellow is intensity
- pulses are shown in the waveform
LongSound[2]
- for things that won't necessarily fit in memory
- (which is less of a concern these days in terms of RAM)
- more restricted than Sound (verify)
Spectrogram
- spectrum over time (STFT), defaulting to a 5 ms (0.005 s) window length
- if you only care to see this visual, you may not need to create a Spectrogram object, in that things that include sound data (e.g. Sound object, Manipulation object) tend to show a spectrogram in their View & Edit
Manipulation
- LPC/PSOLA style speech analysis of Sound object
- View & Edit shows:
- pulses (if extracted becomes a PointProcess)
- estimated pitch (extracted as PitchTier)
- duration (extracted as DurationTier)
- contains a little more
- in particular the original Sound, e.g. for comparison's sake
Pitch
- periodicity candidates over time
- in equally sized/spaced frames (note that the evidence for each frame may come from a different number of pulses, and PitchTiers show them that way)
PitchTier[3]
- basically a set of (timestamp, pitch_in_hz)
- probably extracted via a Manipulation object
- Praat itself, if asked about pitch at a point, interpolates between these points (and extends outwards before the first and after the last value)
- can also be altered (and drawn) to resynthesize LPC/PSOLA type things with different vocal pitch
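The interpolation behaviour described above can be sketched in a few lines. This illustrates the idea (linear interpolation between points, holding the nearest value outside them) and is not Praat's own code. The function name and the sample points are made up for the example.

```python
from bisect import bisect_right

def pitch_at(points, t):
    """Return pitch in Hz at time t from sorted (time, pitch_hz) points.

    Linear interpolation between neighbouring points; before the first and
    after the last point the nearest value is held constant.
    """
    if not points:
        raise ValueError("need at least one point")
    times = [p[0] for p in points]
    if t <= times[0]:
        return points[0][1]
    if t >= times[-1]:
        return points[-1][1]
    i = bisect_right(times, t)            # index of the first point after t
    (t0, f0), (t1, f1) = points[i - 1], points[i]
    return f0 + (f1 - f0) * (t - t0) / (t1 - t0)

# Hypothetical tier with three points
tier = [(0.0, 100.0), (1.0, 200.0), (2.0, 150.0)]
print(pitch_at(tier, 0.5))    # halfway between the first two points
print(pitch_at(tier, -1.0))   # before the first point
print(pitch_at(tier, 3.0))    # after the last point
```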
Intensity[4]
- intensity values at regular intervals
IntensityTier[5] - amplitude envelope
- (timestamp, intensity)
PointProcess[6]
- a sequence of points in time, e.g. marking vocal pulses
- mostly related to the LPC/PSOLA pitch stuff
Strings[7]
- ordered list of strings
Table
Matrix
More specific and/or lesser-used object types
ExperimentMFC - Multiple Forced Choice[8] style listening experiment
The Praat picture window
Mainly used for making plots from data. ...which can then be saved as raster (PNG) or vector (EPS, PDF).
You won't need this until you do, so can close it. ...I've seen people making startup scripts to specifically close it
Automation and scripting
As hinted at above, you can automate Praat.
At the end of the day,
a script is a bunch of text lines mentioning actions,
equivalent to doing those actions manually.
And you can write it yourself,
but almost everything you do in the Praat GUI is recorded into a history, which can be pasted into a script,
and this is often the easiest way to create a useful script -- or at least do most of it, to then maybe edit a little.
The easiest way to at least reduce the amount of clicking when you want to do something to a large set of sound files is just to use that recorded history.
No coding required.
But coding makes things more powerful - it adds, among other things,
- a way to ask for user input (primarily for the parameters you then hand to an existing action)
- a basic scripting language that lets you do conditions and loops and other basics you will probably need to express your wishes.
https://www.fon.hum.uva.nl/praat/manual/History_mechanism.html
New script, paste history
https://www.fon.hum.uva.nl/praat/manual/Scripting.html
Common tasks
Recording sound
New → Record Mono Sound
Can record one or more fragments, and Save each to List. To do just one, you may like Save to List & Close
Try to avoid very quiet recordings
Praat will scale up whatever you already recorded, to make it visible.
- This is great in that you always see a waveform,
- and it is fine when you meant to do exactly what you did - say, experiment with noise.
- Yet when you did not intend to make a quiet-and-probably-noisy recording, this actually creates and hides an issue,
- namely that quiet recordings are almost always also quite noisy.
"Why does that make it noisy?"
Because there is always some noise in the recording, both from the room you are in and from the devices you use, and the quieter the sound you record, the closer to that noise it is.
Yes, you can amplify it later, but you will also amplify the noise.
The less you have to amplify, the more you avoid this.
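A small numeric sketch of that argument. The amplitudes are invented, and it assumes the noise floor stays the same while the recorded level changes (e.g. by speaking closer to the microphone). The point is only that gain applied afterwards scales signal and noise together.

```python
import math

def snr_db(signal_rms, noise_rms):
    """Signal-to-noise ratio in dB from two RMS amplitudes."""
    return 20.0 * math.log10(signal_rms / noise_rms)

noise = 0.001          # assumed noise floor (arbitrary linear units)
quiet_speech = 0.01    # a quiet recording
loud_speech = 0.1      # the same voice recorded at a better level

print(snr_db(quiet_speech, noise))   # SNR of the quiet recording
print(snr_db(loud_speech, noise))    # SNR of the better-levelled recording

# Amplifying afterwards multiplies signal and noise by the same factor,
# so the ratio between them (the SNR) does not improve.
gain = 10.0
print(snr_db(quiet_speech * gain, noise * gain))
```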
To illustrate that effect to yourself
- record some gentle taps on the microphone but otherwise be quiet.
- View & Edit the sound
- you should see only the taps.
- remove the taps (Select, Sound → Set selection to zero)
- the rest will suddenly look loud
- undo that, then go to Sound → Sound Scaling, and in particular compare 'by whole' (the default) and 'fixed range'
- 'by whole' - looks for the loudest sample in the whole sound
- this is the default
- 'by window' - looks for the loudest part within the currently zoomed area (if stereo: max of both channels)
- 'by window and channel' - by maximum used range (if stereo: individually)
- 'fixed height' - seems to be "amount around calculated average"
- so if you say 2 you basically see the whole thing (but might hide a DC offset - not that you generally care about that)
- 'fixed range' - seems to be "give min and max (average implied)"
- so if you say -1 and 1, you see the whole thing
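To make the difference between these strategies concrete, here is a rough sketch of how a display range could be chosen for three of them. It follows the guesses above, not Praat's actual drawing code, and the function names are invented. It assumes samples normalised to -1..1 and a range symmetric around zero.

```python
def y_range_by_whole(samples):
    """Scale to the largest absolute sample in the whole sound."""
    peak = max(abs(s) for s in samples)
    return (-peak, peak)

def y_range_by_window(samples, start, end):
    """Scale to the largest absolute sample in the visible slice only."""
    peak = max(abs(s) for s in samples[start:end])
    return (-peak, peak)

def y_range_fixed(minimum=-1.0, maximum=1.0):
    """Ignore the data and always show the same range."""
    return (minimum, maximum)

# Quiet background with one loud tap in the middle
sound = [0.01, -0.01, 0.01, 0.8, -0.7, 0.01, -0.01, 0.01]

print(y_range_by_whole(sound))         # dominated by the tap
print(y_range_by_window(sound, 0, 3))  # zoomed in on the quiet start
print(y_range_fixed())                 # same regardless of content
```

With 'by whole', the tap decides the scale, so the quiet background looks flat. Without the tap (or when scaling by window on a quiet stretch), the same background fills the view.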
To set up reasonable recording levels
Tell your subject to talk at reasonably loud levels, and look at the green/yellow/red indicator
- if it barely registers, increase the input/mic sensitivity
- if it's not visible at all, increase it further
- if it goes into the red, lower it
- if some of your hardware/software indicates in dB: 10 to 20 dB should be enough
- (if not, 'a noticeable amount' should do)
- if it registers a moderate amount when you're entirely quiet, consider turning it down
- ...because that's just device noise amplified a lot (which suggests something amplifies so much that louder things will distort).
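For a sense of what those decibel figures mean in terms of amplitude, a small sketch using the usual 20·log10 amplitude convention (the function name is made up for the example):

```python
def db_to_amplitude_ratio(db):
    """Convert a level difference in dB to an amplitude (not power) ratio."""
    return 10.0 ** (db / 20.0)

for db in (10, 20):
    print(f"{db} dB is about {db_to_amplitude_ratio(db):.2f}x the amplitude")
```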
This isn't always perfect advice - hardware varies, and there may be other reasons for e.g. things to still be quiet, or for there to be distortion in other hardware, before it even got into the PC.
Sometimes speakers are quiet because they keep their distance from an insensitive microphone. Sometimes they distort because they smush their face into a sensitive microphone. Sometimes things are weird because someone else used the hardware and changed something. Etc.
So when doing serious recording, try to start with a sanity check: make a quick recording and listen to it (headphones are often better than speakers), and maybe check the waveform for flat, clipped tops.
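Checking for flat, clipped tops can also be done on the numbers rather than by eye. The following is a minimal sketch, assuming you already have the samples as floats normalised to -1..1. The function name, tolerance and minimum run length are arbitrary choices for illustration.

```python
def clipped_runs(samples, full_scale=1.0, tolerance=0.001, min_run=3):
    """Return (start_index, length) for runs of samples stuck at full scale.

    A sample counts as 'at the limit' when its absolute value is within
    `tolerance` of `full_scale`; only runs of at least `min_run` samples
    are reported.
    """
    runs = []
    run_start = None
    for i, s in enumerate(samples):
        at_limit = abs(s) >= full_scale - tolerance
        if at_limit and run_start is None:
            run_start = i
        elif not at_limit and run_start is not None:
            if i - run_start >= min_run:
                runs.append((run_start, i - run_start))
            run_start = None
    # A run may still be open at the end of the sound
    if run_start is not None and len(samples) - run_start >= min_run:
        runs.append((run_start, len(samples) - run_start))
    return runs

wave = [0.2, 0.7, 1.0, 1.0, 1.0, 1.0, 0.6, -0.3, -1.0, -0.9]
print(clipped_runs(wave))
```

A single sample touching full scale (like the -1.0 near the end) is not reported, since a lone peak is not necessarily clipping.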
Microphones are usually mono.
Also, it is good science to only vary the things you intend to study, and stereo effects are not easy to control in recording or playback, so mono makes your life simpler.
That said,
- when you want to be really precise, consider that on some hardware, 'record mono' means 'mix stereo to mono', while on others, it means 'record the first channel', and you won't really know until you test
- when recording for only transcription and not experimental playback,
know that stereo recordings can be more useful, because people are good at separating multiple voices in any sound recorded in a vaguely binaural way. You may like a portable recorder with two mics mounted on it.
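The two meanings of 'record mono' mentioned above can give quite different results. A deliberately extreme sketch with made-up sample values, in which the right channel is the left channel inverted:

```python
def mix_to_mono(left, right):
    """'Mix stereo to mono': average the two channels sample by sample."""
    return [(l + r) / 2.0 for l, r in zip(left, right)]

def first_channel_only(left, right):
    """'Record the first channel': ignore the second channel entirely."""
    return list(left)

left = [0.5, -0.5, 0.5, -0.5]
right = [-0.5, 0.5, -0.5, 0.5]   # same signal, inverted

print(mix_to_mono(left, right))         # the channels cancel each other
print(first_channel_only(left, right))  # the left channel, untouched
```

Real recordings are rarely this extreme, but partial cancellation between channels is one reason the two approaches can sound different.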
Viewing your recording
Select the Sound, press the View & Edit button.
More basic things you may want to do are zooming in and out, scrolling around, and cutting pieces off the edges - mainly see the Edit and Time menus (and maybe learn the keyboard shortcuts).
Spectrogram → Spectrogram settings
View Range - Normally 0Hz to 5kHz (even though we recorded more) - there's almost nothing interesting above, and zooming down means we can see the pitch movement better. This is a good default, though you could lower this further to focus on vowel formant curves more.
Window length is about the (STFT) tradeoff between frequency resolution and time resolution. The default (0.005) is a good tradeoff for many tasks. Higher (0.015) may sometimes make e.g. separate formants more visible, yet makes them harder to place in time precisely. If over time you learn to recognize things visually, you may not want to touch this, just because it will look different.
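To put rough numbers on that tradeoff: for a Gaussian analysis window, the Praat manual's description of the spectrogram analysis gives the -3 dB bandwidth as about 1.298 divided by the window length. The sketch below assumes that window shape and that formula. Treat the results as approximate.

```python
import math

# -3 dB bandwidth factor for a Gaussian analysis window:
# 2 * sqrt(6 * ln 2) / pi, roughly 1.298
GAUSSIAN_FACTOR = 2.0 * math.sqrt(6.0 * math.log(2.0)) / math.pi

def bandwidth_hz(window_length_s):
    """Approximate frequency resolution for a given window length in seconds."""
    return GAUSSIAN_FACTOR / window_length_s

for window in (0.005, 0.015):
    print(f"{window * 1000:.0f} ms window: about {bandwidth_hz(window):.0f} Hz wide")
```

So a three times longer window gives three times finer frequency resolution, at the cost of smearing events over three times as much time.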
The concept of dynamic range relates loudest to softest levels. Here it controls the softest level that is still drawn, relative to the maximum (see below). The default of 70 dB throws away little to nothing, so it often shows the device's noise as well. Lowering this is sort of like lowering a volume knob - it lowers signal and noise equally, though the first 20 dB or so of lowering may throw away mostly quiet noise, so things may look a little cleaner.
Spectrogram → Spectrogram advanced settings
Maximum is the energy level (you can ignore the units) to treat as the loudest to show (black). By default this field is ignored, because autoscaling handles this (...within a zoom level, so scrolling will make it vary - if you want to inspect in detail, you might care to turn off autoscaling).
Pre-emphasis considers that the loudness of speech's components (vowels mostly) falls off by approximately 6 dB per octave, so it tries to make the overtones more visible, basically by amplifying higher frequencies. The default is +6 dB per octave. Higher values put more focus on higher sounds. (identity-gain point at 1000 Hz?)
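What 'dB per octave' amounts to can be sketched as below. The 1000 Hz reference, where the gain is taken to be zero, is only the unverified guess from the text above, so it is an explicit parameter here. The function name is made up.

```python
import math

def pre_emphasis_gain_db(frequency_hz, slope_db_per_octave=6.0, reference_hz=1000.0):
    """Gain in dB at frequency_hz, relative to reference_hz.

    reference_hz is an assumption (the point of zero gain), not a value
    confirmed from Praat.
    """
    octaves_above_reference = math.log2(frequency_hz / reference_hz)
    return slope_db_per_octave * octaves_above_reference

for f in (500.0, 1000.0, 2000.0, 4000.0):
    print(f"{f:.0f} Hz: {pre_emphasis_gain_db(f):+.1f} dB")
```

Each doubling of frequency adds the slope once, which roughly cancels a spectrum that falls off by the same amount per octave.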
Dynamic compression amounts to bringing up the volume at the times the signal is quieter.
- this has use if there is significant variation in how loud individual responses are.
- ...but if you have strong recordings, it can only bring up noise so has little value
- This is a fraction, how much to amplify it towards the level of everything else. You rarely want to make this higher than 0.5 or so (because that's often around 20dB).
https://www.fon.hum.uva.nl/praat/manual/Advanced_spectrogram_settings___.html
Annotating your recording
Create a TextGrid
To create an empty TextGrid of the same length as a given Sound:
- Select Sound
- Find the Button: Annotate → To TextGrid
The form dialog that gives you is:
All tier names: Mary John bell
Which of these are point tiers: bell
The first time you see that, it introduces two or three new things, so this bears some comment.
You can have multiple, independent 'tracks' of information, called tiers. This is useful e.g. when there are distinct things worth noting, e.g.
- multiple speakers
- a speaker and a bell to mark the start of experiment response
- annotation at sentence, phrase, word, phoneme level
- aligned translations
- ...and/or whatever else you can think of
Also, sometimes we want to segment things, and sometimes we just want to mark things in time.
So each tier can be either an
- interval tier -
- consists of segments that always cover the whole recording
- inserting something at a time will split the segment that is currently there into two
- you can select the segments
- you can optionally label each segment (you might e.g. start by marking the silences)
- point tier (sometimes 'text tier')
- inserting a point adds a specific point in time
- you can optionally label each point
- you can select the labels
So now we can grasp that
- Tier names - space-separated list. Settles the number of tiers and their names at the same time
- Which of these are point tiers? - repeat the names of tiers you want to be point tiers. Any not mentioned will become interval tiers
So:
All tier names: Mary John bell
Which of these are point tiers: bell
means:
- create Mary as an interval tier
- create John as an interval tier
- create bell as a point tier
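The rule those two form fields express is simple enough to write down. This is only a model of the logic described above (the function name is invented), not something Praat exposes.

```python
def tier_types(all_tier_names, point_tier_names):
    """Map each tier name to 'interval' or 'point', keeping the given order."""
    point_names = set(point_tier_names.split())
    return {
        name: ("point" if name in point_names else "interval")
        for name in all_tier_names.split()
    }

print(tier_types("Mary John bell", "bell"))
# {'Mary': 'interval', 'John': 'interval', 'bell': 'point'}
```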
Actually using that TextGrid
If you did the above, you have a Sound and a TextGrid of the same time length.
Actual annotations
Editing
A mix of keyboard and mouse seems to be most convenient.
Remember zooming (Ctrl-I, Ctrl-O), scrolling (PgUp, PgDn)
Click in waveform or spectrogram: choose a point in time
Some of the most useful keyboard shortcuts are:
Enter - add segment/point in currently selected tier
Ctrl-1, Ctrl-2 - add point in specific tier
Mouse-drag a segment-point to move it
Alt-backspace - remove point / merge segment with previous
Alt-arrows - to move around the segments (useful when cleaning up)
Annotating
Other views/editors
Pitch editor
https://www.fon.hum.uva.nl/praat/manual/PitchEditor.html
Text files, short text files, binary files
Many data-style objects (including some cases you may never use, like Sound objects) have a structured representation that can be saved as
- a text file
- which contains a little more than necessary but is human-readable.
- a short text file, which basically omits the variable names,
- but is stable enough that parsers should have no trouble (no idea if there were breaking changes over time)
- a binary format, which is a little more compact.
Some things have further forms, e.g.
- PitchTier and DurationTier have
- PitchTier/DurationTier Spreadsheet file (not unlike their short text form)
- headerless spreadsheet file (basically TSV)
While the text formats look like you could parse them yourself, try to avoid that where you easily can, because the format has changed over time. Praat will know how to handle that, but your own or others' libraries may break over time.
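If you do end up reading data outside Praat, the headerless spreadsheet form mentioned above is probably the simplest to handle. A minimal sketch, assuming each line holds exactly two whitespace-separated numbers (time, then value) and nothing else. That layout is an assumption to check against your own files.

```python
def read_headerless_tier(text):
    """Parse 'time<TAB>value' lines into a list of (time, value) floats."""
    points = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue                      # skip blank lines
        if len(fields) != 2:
            raise ValueError(f"line {line_number}: expected 2 columns, got {len(fields)}")
        points.append((float(fields[0]), float(fields[1])))
    return points

# Made-up example content
example = "0.10\t120.5\n0.25\t131.0\n0.40\t118.2\n"
print(read_headerless_tier(example))
```

Failing loudly on an unexpected column count is deliberate: if the layout is not what you assumed, you want to know rather than silently read wrong numbers.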
Praat setup
Praat preferences folder
Mostly contains
- Preferences file[9], mostly contains a whole bunch of defaults
- Buttons file[10], mostly contains registration/show/hide of interface elements (menu items and buttons)
- possible plugins, in directories
Location:
Windows: %USERPROFILE%\Praat
OSX: ~/Library/Preferences/Praat Prefs/
Linux: ~/.praat-dir/
https://www.fon.hum.uva.nl/praat/manual/preferences_folder.html
Interacting with Praat
Calling the Praat executable should typically be done with --open, --run, or --send
Opening files with Praat can be done like
Praat.exe --open data\hello.wav data\hello.TextGrid
Praat.exe --open script.Praat
You can run a script like
Praat.exe --run testCommandLineCalls.praat "argument"
...which does so without a GUI; any Info-window output goes to the console that runs it.
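If you drive Praat from another program, the --run call above is easy to wrap. A sketch in Python, with these assumptions: the executable is reachable as "praat" (or whatever you pass in), the script path and argument are the placeholders from the example above, and the console output decodes with your platform's default text encoding.

```python
import subprocess

def praat_run_command(script_path, *script_args, executable="praat"):
    """Build the argument list for a GUI-less 'praat --run' call."""
    return [executable, "--run", script_path, *[str(a) for a in script_args]]

def run_praat_script(script_path, *script_args, executable="praat"):
    """Run the script and return what it printed to the console.

    Raises subprocess.CalledProcessError if Praat exits with an error.
    """
    command = praat_run_command(script_path, *script_args, executable=executable)
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout

# Only builds and shows the command; nothing is executed here
print(praat_run_command("testCommandLineCalls.praat", "argument", executable="Praat.exe"))
```

Passing the arguments as a list rather than one long string avoids most quoting trouble with spaces in file names.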
To command a running Praat GUI (or a new one if one wasn't running), you want sending
Praat.exe --send "argument"
sendpraat is roughly the same, e.g.
sendpraat 1000 praat "Read from file... hello.wav" "Play reverse" "Remove"
https://www.fon.hum.uva.nl/praat/manual/Scripting_6_9__Calling_from_the_command_line.html
"The phonetic font is not available"
Praat wants a font that covers all IPA characters in Unicode
Praat comes from a time when you likely needed to install SIL Doulos and/or SIL Charis to guarantee the IPA characters would show up.
These days there are other fonts that cover them, but Praat still plays safe and still complains if you don't have those fonts installed. If it works, you can ignore the warning. If you want it to be quiet, you could install those fonts.
See also: how to install fonts (in general)
Consider tweaking Windows
Showing file extensions
If you have made
Recording1.wav
Recording1.Manipulation
Recording1.PitchTier
then it is fairly clear what belongs together.
However, Windows typically hides extensions (that it knows about). This is nice for an uncluttered overview where you have a nice icon instead -- but less precise when using a computer as a tool. In particular, when Explorer shows you just
Recording1
Recording1
Recording1
that's less great.
If you want to see extensions:
- Win7: Tools : Folder options : View tab : uncheck "Hide extensions for known file types"
- Win10: Explorer window : View : check "File Name Extensions"
- Win11: Explorer window : View : Show : File Name Extensions
- OSX:
- Linux / Gnome:
Where is my configuration
Windows: %USERPROFILE%\Praat
Linux: ~/.praat-dir
OSX: ~/Library/Preferences/Praat Prefs/
...each of which is a shorthand that resolves to a directory for the user currently logged in
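If a script needs to find that folder, the shorthands can be resolved like this. The sketch uses only the locations listed above. The function name is invented, and the platform names are the ones Python's platform.system() reports.

```python
from pathlib import PurePosixPath, PureWindowsPath

def praat_prefs_dir(system, home):
    """Return the preferences folder for a platform name and home directory.

    system: "Windows", "Darwin" (macOS) or "Linux", as platform.system() reports.
    home:   the user's home directory (what %USERPROFILE% or ~ resolves to).
    """
    if system == "Windows":
        return str(PureWindowsPath(home) / "Praat")
    if system == "Darwin":
        return str(PurePosixPath(home) / "Library" / "Preferences" / "Praat Prefs")
    if system == "Linux":
        return str(PurePosixPath(home) / ".praat-dir")
    raise ValueError(f"no known location for {system!r}")

print(praat_prefs_dir("Linux", "/home/alice"))
print(praat_prefs_dir("Darwin", "/Users/alice"))
print(praat_prefs_dir("Windows", r"C:\Users\alice"))
```

On a real machine you would probably call it as praat_prefs_dir(platform.system(), os.path.expanduser("~")).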
Consider: Open with
Why?
Say you often find yourself opening one file at a time with Praat's Open → Read from File to add a bunch of files.
There are faster options.
"Open with" is a general Windows feature that lets you tell it what extensions to associate with what program.
How?
- Windows: Right-click a file (with an extension you want to associate with a program - you would have to do this once for each praat file extension)
- "Open With..."
- may require some extra clicking (varies with windows version) like "More apps", "look for more apps", scrolling, etc
- The first time you do this you also need to browse to where precisely the application is located on disk. If you've done that once it should be in the list of apps