Praat notes
How Praat thinks
The list
Things are given an entry in the list when you
- record sound
- load files
- make list items based on other list items
You can also save each item in the list to a file.
Loading and saving have to be done explicitly - it's a scratch space, not a project.
You would be forgiven for thinking this is a project, where you can save everything you see.
It is not - you cannot save the list itself, and it will not be remembered between runs of Praat.
The list is itself intended as a temporary scratch area, "the things I need for what I am currently working on", the intermediate steps you need to do a nontrivial task.
Which means that, whether you are working interactively or writing scripts, you have to keep track of what you left in the list, by name or ID.
You can select one or more objects from the list and then use one of the (applicable) buttons to do something useful with that object, or combination of objects.
Aside from doing that by hand, you can do the same in code (selecting by name or by ID), which is what makes scripting possible (but more on scripting later).
Suggestion: Think like a database
Objects and object combinations
Many objects have a View & Edit button, which opens a window to, well, view and possibly edit them.
- e.g. try View & Edit on a Sound
Selecting specific combinations of objects (or object types) brings up some specific new buttons (and removes others that do not apply to the combination)
- e.g. try View & Edit on a Sound + TextGrid; this combines the basic views of those two objects, and is meant for annotation
- ...which only makes sense when they relate to each other.
- Say, you probably want to create such a TextGrid from the Sound (select, then click Annotate >)
- rather than create from scratch (New → Create TextGrid...)
Notes:
- Sometimes nothing makes sense for the combination you have selected
- There are also hints -- e.g. when you select only a TextGrid, there is a button named View & Edit with Sound? which, if clicked, just tells you to select both.
- many possible combinations are not hinted at -- only some of the most common ones are
Object types
There are quite a few object types, many of which you may never use.
Roughly in order of how quickly you will probably see or need it:
Some of the more commonly used object types
Sound[1]
- waveform/PCM data
- often mono, but can be multi-channel
- View & Edit shows waveform and spectrogram
- as the legend hints:
- blue trace is pitch (note it's on a different Y axis from the shown spectrogram, because it'd otherwise be at the bottom)
- red dots are formant places
- green/yellow is intensity
- pulses are shown in the waveform
LongSound[2]
- for things that won't necessarily fit in memory
- (which is less of a concern these days in terms of RAM)
- more restricted than Sound (verify)
Spectrogram
- spectrum over time (STFT), defaulting to a 5 ms (0.005 s) window length
- if you only care to see this visual, you may not need to create a Spectrogram object, in that things that include sound data (e.g. Sound object, Manipulation object) tend to show a spectrogram in their View & Edit
Manipulation
- LPC/PSOLA style speech analysis of Sound object
- View & Edit shows
- pulses (if extracted becomes a PointProcess)
- estimated pitch (extracted as PitchTier)
- duration (extracted as DurationTier)
- contains a little more
- in particular the original Sound, e.g. for comparison's sake
Pitch
- periodicity candidates over time
- in equally sized/spaced frames, and shown that way (note that the evidence for each frame may come from a different number of pulses)
PitchTier[3]
- basically a set of (timestamp, pitch_in_hz)
- probably extracted via a Manipulation object
- Praat itself, if asked about pitch at a point, interpolates between these points (and extends outwards before the first and after the last value)
- can also be altered (and drawn) to resynthesize LPC/PSOLA type things with different vocal pitch
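The interpolation behaviour described for PitchTier can be sketched in a few lines of Python. This is an illustration, not Praat's own code: the name pitch_at is made up here, and it assumes that 'extends outwards' means holding the first and last values constant.

```python
from bisect import bisect_right


def pitch_at(points, t):
    """Estimate pitch (Hz) at time t from (time_s, pitch_hz) points sorted by time.

    Assumption: linear interpolation between neighbouring points,
    and a constant value before the first and after the last point.
    """
    if not points:
        raise ValueError("need at least one point")
    times = [time for time, _ in points]
    if t <= times[0]:
        return points[0][1]
    if t >= times[-1]:
        return points[-1][1]
    i = bisect_right(times, t)  # first index whose time is later than t
    (t0, f0), (t1, f1) = points[i - 1], points[i]
    return f0 + (f1 - f0) * (t - t0) / (t1 - t0)


# Hypothetical tier with two points: 100 Hz at 1 s, 200 Hz at 3 s
tier = [(1.0, 100.0), (3.0, 200.0)]
print(pitch_at(tier, 2.0), pitch_at(tier, 0.0), pitch_at(tier, 5.0))
```

Asking halfway between the two points gives the halfway pitch; asking outside the points gives the nearest point's value.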
Intensity[4]
- intensity values at regular intervals
IntensityTier[5] - amplitude envelope
- (timestamp, intensity)
PointProcess[6]
- a sequence of points in time, e.g. marking vocal pulses
- mostly related to the LPC/PSOLA pitch stuff
Strings[7]
- ordered list of strings
Table
Matrix
More specific and/or lesser-used object types
ExperimentMFC - Multiple Forced Choice[8] style listening experiment
The praat picture window
Mainly used for making plots from data. ...which can then be saved as raster (PNG) or vector (EPS, PDF).
You won't need this until you do, so you can close it. ...I've seen people make startup scripts specifically to close it.
Automation and scripting
As hinted at above, you can automate Praat.
At the end of the day, a script is a bunch of text lines naming actions, equivalent to doing those actions manually.
You can write one yourself,
but almost everything you do in the Praat GUI is recorded into history, and that history can be pasted into a script.
This is often the easiest way to create a useful script -- or at least to do most of it, to then maybe edit a little.
If you want to do the same thing to a large set of sound files, just using that recorded history is the easiest way to at least reduce the amount of clicking.
No coding required.
But coding makes things more powerful - it adds, among other things,
- a way to ask for user input (primarily for the parameters you then hand into an existing action)
- a basic scripting language that lets you do conditions and loops and other basics you will probably need to express your wishes.
https://www.fon.hum.uva.nl/praat/manual/History_mechanism.html
New script, paste history
https://www.fon.hum.uva.nl/praat/manual/Scripting.html
Common tasks
Recording sound
New → Record Mono Sound
Can record one or more fragments, and Save each to List. To do just one, you may like Save to List & Close
Try to avoid very quiet recordings
When showing audio, Praat will scale up whatever you already recorded.
This is great in that you always see a waveform and spectrogram with whatever is the strongest signal, regardless of exactly how strong it is.
It is not great for the same reason -- in that it will not show you whenever your recording was low-level and really noisy.
"Why do low levels make it noisy?"
Because there is always some noise in the recording, both from the room you are in, and from the imperfection of even the fanciest of devices. So the quieter the sound you record, the closer to that noise it is.
Yes, you can amplify it later, but you can only ever amplify signal and noise equally. So the better you do at recording time, the better.
"Wait, why is recording time any different?"
It isn't always - there are ways to make it loud and still noisy, but it's harder. If it's just amplification after the "noise already went through the microphone and recording device", it isn't any different.
But just the fact that you are now probably paying attention to the level indicators of your recording device means that you are doing the most important part of the adjustments, and are probably getting something decent (out of whatever mic and input device you have). There are certainly further important parts of thinking like a sound engineer, but that is a topic in itself, and this was the first thing to get right.
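The point that later amplification cannot improve things can be checked numerically. A minimal sketch, with made-up stand-ins for a quiet signal and device noise (the helper names rms and snr_db are just for this example):

```python
import math


def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def snr_db(signal, noise):
    """Signal-to-noise ratio in dB, from separate signal and noise sequences."""
    return 20 * math.log10(rms(signal) / rms(noise))


# Hypothetical quiet recording: a low-level sine plus even lower-level noise
signal = [0.01 * math.sin(2 * math.pi * 5 * n / 100) for n in range(100)]
noise = [0.001 * (-1) ** n for n in range(100)]

gain = 10.0  # amplify "later", which applies to signal and noise alike
before = snr_db(signal, noise)
after = snr_db([gain * x for x in signal], [gain * x for x in noise])
print(round(before, 2), round(after, 2))  # the two values are the same
```

Both get louder, but the distance between them stays put. Only a stronger signal at the microphone changes that ratio.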
To illustrate the "praat shows you something regardless of how loud it actually is" effect to yourself
- record some gentle taps on the microphone but otherwise be quiet.
- View & Edit the sound
- you should see only the taps.
- remove the taps (Select, Sound → Set selection to zero)
- it will suddenly show loud everywhere else
- undo that, then go to Sound → Sound Scaling, and in particular compare 'by whole' (the default) and 'fixed range'
- 'by whole' - looks for the loudest sample in the whole sound
- this is the default
- 'by window' - looks for the loudest part within the currently zoomed area (if stereo: max of both channels)
- 'by window and channel' - by maximum used range (if stereo: individually)
- 'fixed height' - seems to be "amount around calculated average"
- so if you say 2 you basically see the whole thing (but might hide a DC offset - not that you generally care about that)
- 'fixed range' - seems to be "give min and max (average implied)"
- so if you say -1 and 1, you see the whole thing
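To make the difference between these strategies concrete, here is a rough Python sketch of how a viewer could pick its vertical range. It is a simplification for illustration, not Praat's implementation: the function display_range and its parameters are made up, and it assumes a range that is symmetric around zero.

```python
def display_range(samples, strategy, window=None, fixed=(-1.0, 1.0)):
    """Return the (low, high) vertical range a viewer might use.

    strategy: 'by whole', 'by window' (needs window=(start, end) sample
    indices), or 'fixed range' (uses fixed as-is).
    """
    if strategy == "fixed range":
        return fixed
    if strategy == "by whole":
        visible = samples
    elif strategy == "by window":
        start, end = window
        visible = samples[start:end]
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    peak = max(abs(x) for x in visible) or 1.0  # avoid a zero-height range
    return (-peak, peak)


# Quiet background with one loud tap in the middle
quiet = [0.01, -0.01] * 50
samples = quiet + [0.8, -0.8] + quiet

print(display_range(samples, "by whole"))
print(display_range(samples, "by window", window=(0, 100)))
print(display_range(samples, "fixed range"))
```

With 'by whole' the tap decides the scale and the background looks flat; with 'by window' on a stretch without the tap, the same background fills the view. Only the fixed range tells you how loud things actually are.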
To set up reasonable recording levels
Tell your subject to talk at reasonably loud levels, and look at the green/yellow/red indicator
- if it barely seems to move, increase the input/mic sensitivity
- if it's not visible at all, increase it further
- if it goes into the red, lower it
- if some of your hardware/software indicates in dB: lowering by 10 to 20 dB should be enough
- if not, 'a noticeable amount' should do
- if it registers a moderate amount when you're entirely quiet, consider turning it down
- ...because that's just device noise amplified a lot (which suggests something amplifies so much that louder things will distort).
This isn't always perfect advice - hardware varies, and there may be other reasons for e.g. things to still be quiet, or for there to be distortion in other hardware, before it even got into the PC.
Sometimes things are quiet because people keep their distance from an insensitive microphone. Sometimes things distort because people smush their face into a sensitive microphone. Sometimes things are weird because someone else used the hardware and changed something. Etc.
So when doing serious recording, try to start with a sanity check: make a quick recording and listen to it (headphones are often better than speakers), and maybe check the waveform for flat, clipped tops.
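Checking for flat, clipped tops can also be done on the sample values. A small sketch, assuming samples are floats scaled to the range -1 to 1; the function name clipped_runs and its thresholds are choices made for this example:

```python
def clipped_runs(samples, full_scale=1.0, min_run=3, tolerance=1e-4):
    """Find runs of consecutive samples stuck at (or very near) full scale.

    Returns a list of (start, end) index pairs, end exclusive.
    Short runs are ignored, since a single peak may legitimately touch the limit.
    """
    runs = []
    start = None
    for i, x in enumerate(samples):
        at_limit = abs(x) >= full_scale - tolerance
        if at_limit and start is None:
            start = i
        elif not at_limit and start is not None:
            if i - start >= min_run:
                runs.append((start, i))
            start = None
    # A run may continue up to the very end of the recording
    if start is not None and len(samples) - start >= min_run:
        runs.append((start, len(samples)))
    return runs


# Hypothetical fragment: four samples pinned at the top, one lone negative peak
fragment = [0.0, 0.5, 1.0, 1.0, 1.0, 1.0, 0.5, -1.0, 0.2]
print(clipped_runs(fragment))
```

An empty result does not prove the recording is clean (distortion can happen before the signal reaches the PC), so listening is still worth doing.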
Microphones are usually mono.
Also, it is good science to only vary the things you intend to study, and mixing multiple microphones gives a few dB of effects that are not easy to control in recording, or in playback (and headphones are different from speakers when it comes to stereo effects), so mono frankly makes your life simpler.
That said, when you want to be really precise, consider that on some input devices, 'record mono' means 'mix stereo to mono', while on others, it means 'record the first channel', and this is not something they list in spec sheets, so you won't really know which one it is until you test.
A separate point: if you are recording less for sound analysis, and more for someone to transcribe to text, know that a lifetime of practice has made people better at separating multiple voices in a sound recorded in a vaguely binaural way than in mono. You may like a portable recorder with two mics mounted on it.
Viewing your recording
Select the Sound, press the View & Edit button.
Other basic things you may want to do are zooming in and out, scrolling around, and cutting pieces off the edges - mainly see the Edit and Time menus (and maybe learn the keyboard shortcuts)
Spectrogram → Spectrogram settings
View Range - Normally 0 Hz to 5 kHz (even when you recorded more). There is almost nothing interesting to show above that, and zooming in to this range makes pitch movement more visible. This is a good default, and you could arguably make it even narrower to see pitch movement even better.
Window length is about the (STFT) tradeoff between frequency resolution and time resolution. The default (0.005 s) is a good tradeoff for many tasks. Longer (0.015 s) may sometimes make e.g. separate formants more visible -- yet makes them harder to place in time precisely.
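The tradeoff can be put in numbers. The sketch below assumes a Gaussian analysis window and uses the usual -3 dB bandwidth expression for it, 2·sqrt(6·ln 2) / (π · window length); treat the exact constant as an assumption, the point is that bandwidth is inversely proportional to window length.

```python
import math


def gaussian_bandwidth_hz(window_length_s):
    """Approximate -3 dB bandwidth (Hz) of a Gaussian analysis window.

    Assumption: bandwidth = 2 * sqrt(6 * ln 2) / (pi * window length),
    i.e. roughly 1.298 / window length.
    """
    return 2 * math.sqrt(6 * math.log(2)) / (math.pi * window_length_s)


for length in (0.005, 0.015):
    print(f"{length * 1000:.0f} ms window -> about {gaussian_bandwidth_hz(length):.0f} Hz wide")
```

A window three times as long is three times as precise in frequency, and correspondingly three times as smeared in time.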
The concept of dynamic range relates the loudest to the softest levels. Praat determines the maximum in a recording, this setting determines how much lower is considered quiet enough and not worth showing. The default 70dB shows almost everything (including background noise), lowering to 50 will remove quieter noise and signal, 30 does so more aggressively. It will look cleaner but hide subtler detail.
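As a sketch of what the dynamic range setting does to the picture: everything more than that many dB below the maximum ends up white. The linear mapping from dB to darkness and the function name grey_level are assumptions for illustration, not a description of Praat's drawing code.

```python
def grey_level(level_db, maximum_db, dynamic_range_db=70.0):
    """Map a level in dB to darkness: 0.0 is white (not shown), 1.0 is black."""
    floor_db = maximum_db - dynamic_range_db
    fraction = (level_db - floor_db) / dynamic_range_db
    return min(1.0, max(0.0, fraction))


# A component 55 dB below a maximum of 100 dB
print(grey_level(45.0, 100.0, dynamic_range_db=70.0))  # still faintly visible
print(grey_level(45.0, 100.0, dynamic_range_db=50.0))  # falls below the floor
```

Lowering the range moves the floor up, which is why quiet noise (and quiet signal) disappears together.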
Spectrogram → Spectrogram advanced settings
Maximum is the energy level (you can ignore the units) to treat as the loudest to show (black). By default this field is ignored, because autoscaling handles this (...within a zoom level, so scrolling will make it vary - if you want to inspect in detail, you might care to turn off autoscaling).
Pre-emphasis considers that the loudness of speech's components (vowels mostly) falls off by approximately 6 dB per octave, so we amplify higher frequencies just for the visualisation, so that it shows them roughly equally. The default is +6 dB per octave. Higher than that puts more focus on higher sounds. (identity-gain point at 1000 Hz?)
You can think of dynamic compression as someone turning up a volume knob wherever the recording is quieter.
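What "dB per octave" means for the display can be written as a small formula. The reference frequency (where the gain is 0 dB) is a parameter here because these notes are not sure about it; 1000 Hz is only a guess, and preemphasis_gain_db is a name made up for the example.

```python
import math


def preemphasis_gain_db(frequency_hz, slope_db_per_octave=6.0, reference_hz=1000.0):
    """Display gain in dB for a frequency, given a slope in dB per octave.

    Assumption: reference_hz is the frequency that is left unchanged.
    """
    octaves_above_reference = math.log2(frequency_hz / reference_hz)
    return slope_db_per_octave * octaves_above_reference


for f in (500.0, 1000.0, 2000.0, 4000.0):
    print(f, preemphasis_gain_db(f))
```

Each doubling of frequency adds the slope once more, which roughly cancels the natural fall-off of speech.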
- this has use if there is significant variation in how loud individual responses are.
- ...but if you have strong recordings, all this really does is take the near-silent parts between words and turn up what's there -- i.e. the background noise, which is the least interesting part.
- The setting is a fraction, how much to amplify any part towards the level of everything else. You rarely want to make this higher than 0.5 or so (because that's often around 20dB).
https://www.fon.hum.uva.nl/praat/manual/Advanced_spectrogram_settings___.html
Annotating your recording
Create a TextGrid
To create an empty TextGrid of the same length as a given Sound:
- Select Sound
- Find the Button: Annotate → To TextGrid
The form dialog that gives you is:
All tier names: Mary John bell
Which of these are point tiers: bell
The first time you see that, it introduces two or three new things, so this bears some comment.
You can have multiple, independent 'tracks' of information, called tiers. This is useful e.g. when there are distinct things worth noting, e.g.
- multiple speakers
- a speaker and a bell to mark the start of experiment response
- annotation at sentence, phrase, word, phoneme level
- aligned translations
- ...and/or whatever else you can think of
Also, sometimes we want to segment things, and sometimes we just want to mark things in time.
So each tier can be either an
- interval tier
- consists of segments that always cover the whole recording
- inserting something at a time will split the segment that is currently there into two
- you can select the segments
- you can optionally label each segment (you might e.g. start by marking the silences)
- point tier (sometimes 'text tier')
- inserting a point adds a specific point in time
- you can optionally label each point
- you can select the labels
So now we can grasp that
- All tier names - space-separated list. Settles the number of tiers and their names at the same time
- Which of these are point tiers? - repeat the names of tiers you want to be point tiers. Any not mentioned will become interval tiers
So:
All tier names: Mary John bell
Which of these are point tiers: bell
means:
- create Mary as an interval tier
- create John as an interval tier
- create bell as a point tier
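The rule behind that form is simple enough to write down. A small Python sketch of the logic (parse_tier_form is a made-up helper; it only mimics how the two fields combine, it does not talk to Praat):

```python
def parse_tier_form(all_tier_names, point_tier_names):
    """Combine the two form fields into an ordered list of (name, kind).

    Any tier not repeated in point_tier_names becomes an interval tier.
    """
    names = all_tier_names.split()
    point_names = set(point_tier_names.split())
    unknown = point_names - set(names)
    if unknown:
        raise ValueError(f"point tiers not in the tier list: {sorted(unknown)}")
    return [(name, "point" if name in point_names else "interval") for name in names]


for name, kind in parse_tier_form("Mary John bell", "bell"):
    print(f"create {name} as {kind} tier")
```

Leaving the second field empty gives you interval tiers only.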
Actually using that TextGrid
If you did the above, you have a Sound and a TextGrid of the same time length.
Actual annotations
Editing
A mix of keyboard and mouse seems to be most convenient.
Remember zooming (Ctrl-I, Ctrl-O), scrolling (PgUp, PgDn)
Click in waveform or spectrogram: choose a point in time
Some of the most useful keyboard shortcuts are:
Enter - add segment/point in currently selected tier
Ctrl-1, Ctrl-2 - add point in specific tier
Mouse-drag a segment-point to move it
Alt-backspace - remove point / merge segment with previous
Alt-arrows - to move around the segments (useful when cleaning up)
Annotating
Other views/editors
Pitch editor
https://www.fon.hum.uva.nl/praat/manual/PitchEditor.html
Text files, short text files, binary files
Many data-style objects (including some cases you may never use this for, like Sound objects) have a structured representation that can be saved as
- a text file
- which contains a little more than necessary but is human-readable.
- a short text file, which basically omits the variable names,
- but is stable enough that parsers should have no trouble (no idea if there were breaking changes over time)
- a binary format, which is a little more compact.
Some things have further forms, e.g.
- PitchTier and DurationTier have
- PitchTier/DurationTier Spreadsheet file (not unlike their short text form)
- headerless spreadsheet file (basically TSV)
While the text formats look easy to parse yourself, try to avoid doing so where you easily can, because the format has changed over time. Praat will know how to handle that, but your own or others' libraries may break over time.
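If you do end up reading one of these outside Praat, the headerless spreadsheet form is probably the least fragile target. A minimal sketch, assuming each line holds a time and a value separated by whitespace (check an actual file from your Praat version before relying on this; parse_headerless_tier is a name made up here):

```python
def parse_headerless_tier(text):
    """Parse 'time<TAB>value' lines into a list of (time_s, value) tuples.

    Assumption: two numeric columns per line, no header; blank lines are skipped.
    """
    points = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if len(fields) < 2:
            raise ValueError(f"expected two columns, got: {line!r}")
        points.append((float(fields[0]), float(fields[1])))
    return points


# Hypothetical file contents
example = "0.25\t110.5\n0.50\t123.0\n\n0.75\t118.2\n"
print(parse_headerless_tier(example))
```

Failing loudly on a line that does not fit is deliberate: a silent misparse is worse than an error when a format shifts under you.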
Praat setup
Praat preferences folder
Mostly contains
- Preferences file[9]
- mostly contains a whole bunch of defaults
- Buttons file[10]
- mostly contains adds, shows, and hides, of menu items and buttons
- possible plugins, in directories
- specifically, directories with names that start with plugin_ and that contain a file called setup.praat will have that file executed
Location:
Windows: %USERPROFILE%\Praat
OSX: ~/Library/Preferences/Praat Prefs/
Linux: ~/.praat-dir/
https://www.fon.hum.uva.nl/praat/manual/preferences_folder.html
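To see which plugins Praat would pick up, you can apply the plugin_ / setup.praat rule to the preferences folder yourself. A sketch (find_plugins is a made-up helper; pass it whichever of the locations above applies to you):

```python
from pathlib import Path


def find_plugins(prefs_dir):
    """List directory names that follow the plugin rule described above:
    name starts with 'plugin_' and the directory contains setup.praat."""
    prefs = Path(prefs_dir)
    return sorted(
        entry.name
        for entry in prefs.iterdir()
        if entry.is_dir()
        and entry.name.startswith("plugin_")
        and (entry / "setup.praat").is_file()
    )


# Example use on Linux (path taken from the list above):
# print(find_plugins(Path.home() / ".praat-dir"))
```

A directory that has the right name but no setup.praat is not listed, which is a common reason for a plugin seeming to do nothing.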
Interacting with Praat program
Calling the Praat executable should typically be done with --open, --run, or --send
Opening files with Praat can be done like
Praat --open data\hello.wav data\hello.TextGrid
Praat --open script.Praat
You can run a script with
Praat --run testCommandLineCalls.praat "argument"
...which does so without a GUI; any Info-window output goes to the console that runs it.
To command a running Praat GUI (or a new one if one wasn't running), use --send
Praat --send "command"
sendpraat is roughly the same, e.g.
sendpraat 1000 praat "Read from file... hello.wav" "Play reverse" "Remove"
https://www.fon.hum.uva.nl/praat/manual/Scripting_6_9__Calling_from_the_command_line.html
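When calling Praat from another program, building the argument list explicitly avoids quoting trouble with spaces in paths and arguments. A sketch in Python; praat_command is a made-up helper, and the executable name "praat" assumes it is on your PATH (use the full path to Praat.exe otherwise):

```python
def praat_command(mode, *args, executable="praat"):
    """Build an argument list for one of the three modes mentioned above."""
    if mode not in ("open", "run", "send"):
        raise ValueError(f"unsupported mode: {mode}")
    return [executable, f"--{mode}", *args]


command = praat_command("run", "testCommandLineCalls.praat", "argument")
print(command)

# To actually run it (not done here, since it needs Praat installed):
# import subprocess
# result = subprocess.run(command, capture_output=True, text=True, check=True)
# print(result.stdout)  # with --run, Info-window output ends up here
```

Passing a list rather than one long string means each argument arrives in Praat as written, without a shell reinterpreting it.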
"The phonetic font is not available"
Praat wants a font that covers all IPA characters in Unicode
Praat comes from a time when you likely needed to install Doulos SIL and/or Charis SIL to guarantee the IPA characters would show up.
These days there are other fonts that cover them, but Praat still plays it safe and complains if you don't have those fonts installed. If it works, you can ignore the warning. If you want it to be quiet, you could install those fonts.
See also: how to install fonts (in general)
Consider tweaking Windows
Showing file extensions
If you have made
Recording1.wav
Recording1.Manipulation
Recording1.PitchTier
then it is fairly clear what belongs together.
However, Windows typically hides extensions (that it knows about). This is nice for an uncluttered overview where you have a nice icon instead -- but less precise when using a computer as a tool. In particular, when Explorer shows you just
Recording1
Recording1
Recording1
that's less great.
If you want to see extensions:
- Win7: Tools : Folder options : View tab : uncheck "Hide extensions for known file types"
- Win10: Explorer window : View : check "File Name Extensions"
- Win11: Explorer window : View : Show : File Name Extensions
- OSX:
- Linux / Gnome:
Where is my configuration
Windows: %USERPROFILE%\Praat
Linux: ~/.praat-dir
OSX: ~/Library/Preferences/Praat Prefs/
...each of which is shorthand that resolves to a directory for the user currently logged in
Consider: Open with
Why?
Say you often find yourself opening one file at a time with Praat's Open → Read from File to add a bunch of files.
There are faster options.
Windows' "Open with" associates a file extension with a program, after which opening such a file runs Praat.exe with that file -- each of which ends up working as an "add to list".
How?
- Windows: Right-click a file (with an extension you want to associate with a program - you would have to do this once for each praat file extension)
- "Open With..."
- may require some extra clicking (varies with windows version) like "More apps", "look for more apps", scrolling, etc
- The first time you do this you also need to browse to where precisely the application is located on disk. If you've done that once it should be in the list of apps
Praat plugins and toolkit notes
Praat plugins
Praat is extendible, in a few different ways
Run commands that happen to alter buttons file permanently
If you run Add action command: and Add menu command: yourself or from a regular script, it will permanently alter the buttons file.
You can also put that in a script, which will be a "run once to install this" -- but consider the plugin way instead.
Upsides:
- nice way to incrementally make Praat do all the things you want
Downsides:
- over enough time you won't remember what you did
Put things in your Praat profile (Plugins)
Alternatively, place that exact same file in a new directory under your Praat preferences folder.
This will now be run at Praat startup, but not alter your buttons file.
Upsides:
- picking up what's there at each run is easier for development
Downsides:
- a little more work
Site installs
Not really there.
Permanent installation in a "regardless of who runs it" way requires at least a one-time run (per user/install).
The closest you can get is to have such a one-time install set up a script that picks up something from a shared folder.
https://www.fon.hum.uva.nl/praat/manual/plug-ins.html
Plugin examples
Praat vocal toolkit
Praat Vocal Toolkit [11]
Plugin manager
ProsodyPro notes
http://www.homepages.ucl.ac.uk/~uclyyix/ProsodyPro/
Not a plugin in the sense of 'adds buttons to the interface', more of a script that, when run, initiates a semi-automated annotation.