Praat notes



How Praat thinks

The list

Things are given an entry in the list when you

  • record sound
  • load files
  • make list items based on other list items

You can also save each item in the list to a file.


Loading and saving have to be done explicitly - it's a scratch space, not a project.

You would be forgiven for thinking this is a project, where you can save everything you see.

It is not - you can not save the list itself, and it will not be remembered between runs of Praat.

The list is itself intended as a temporary scratch area, "the things I need for what I am currently working on", the intermediate steps you need to do a nontrivial task.

Which means that, whether you are working interactively or writing scripts, you have to keep track of what you left in the list, by name or ID.


You can select one or more objects from the list, to then use one of the (applicable) buttons, to do something useful with this object, or combination of objects.

Aside from doing that by hand, you can do the same by code (by name or by id), which makes it easier to create scripts (but more on scripting later).


Suggestion: Think like a database

Objects and object combinations

Many objects will have a View & Edit button that opens a window to, well, view and possibly edit them.

e.g. try View & Edit on a Sound


Selecting specific combinations of object( type)s brings up some specific new buttons (and removes others that do not apply to the combination)

e.g. try View & Edit on a Sound + TextGrid; this combines the basic views of those two objects, and is meant for annotation
...which only makes sense when they relate to each other.
Say, you probably want to create such a TextGrid from the Sound (select it, then click Annotate → To TextGrid...)
rather than create one from scratch (New → Create TextGrid...)


Notes:

  • Sometimes nothing makes sense for the combination you have selected
  • There are also hints -- e.g. when you select only a TextGrid, there is a button named View & Edit with Sound? which, if clicked, just tells you to select both.
only some of the most common combinations are hinted at this way -- many possible combinations are not

Object types

There are quite a few object types, many of which you may never use.


Roughly in order of how quickly you will probably see or need it:


Some of the more commonly used object types

Sound[1]

waveform/PCM data
often mono, but can be multi-channel
View & Edit shows waveform and spectrogram
as the legend hints:
blue trace is pitch (note it's on a different Y axis from the shown spectrogram, because it'd otherwise be at the bottom)
red dots are formant places
green/yellow is intensity
pulses are shown in the waveform

LongSound[2]

for things that won't necessarily fit in memory
(which is less of a concern these days in terms of RAM)
more restricted than Sound (verify)


Spectrogram

spectrum over time (STFT), defaulting to a 5ms (0.005 s) window length
if you only care to see this visual, you may not need to create a Spectrogram object, in that things that include sound data (e.g. Sound object, Manipulation object) tend to show a spectrogram in their View & Edit


Manipulation

LPC/PSOLA style speech analysis of Sound object
View & Edit shows:
pulses (if extracted becomes a PointProcess)
estimated pitch (extracted as PitchTier)
duration (extracted as DurationTier)
contains a little more
in particular the original Sound, e.g. for comparison's sake


Pitch

periodicity candidates over time
in equally sized/spaced frames (note that the evidence for these may come from different amount of pulses, and pitchtiers) show them that way

PitchTier[3]

basically a set of (timestamp, pitch_in_hz)
probably extracted via a Manipulation object
Praat itself, if asked about pitch at a point, interpolates between these points (and extends outwards before the first and after the last value)
can also be altered (and drawn) to resynthesize LPC/PSOLA type things with different vocal pitch
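
A minimal sketch of that lookup, in Python rather than in Praat itself. It assumes linear interpolation in Hz between neighbouring points, and holding the first/last value outside them; the function name and the example points are made up for illustration.

```python
from bisect import bisect_right

def pitch_at(points, t):
    """Estimate pitch at time t from (time_s, pitch_hz) points sorted by time."""
    if not points:
        raise ValueError("tier has no points")
    times = [p[0] for p in points]
    # Outside the covered range: extend the first / last value outwards.
    if t <= times[0]:
        return points[0][1]
    if t >= times[-1]:
        return points[-1][1]
    # Between two points: straight line from the left neighbour to the right one.
    i = bisect_right(times, t)
    (t0, f0), (t1, f1) = points[i - 1], points[i]
    return f0 + (f1 - f0) * (t - t0) / (t1 - t0)

example_tier = [(0.10, 100.0), (0.30, 200.0), (0.50, 150.0)]
print(pitch_at(example_tier, 0.05))  # before the first point
print(pitch_at(example_tier, 0.20))  # between the first two points
```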


Intensity[4]

Intensity at regular intervals

IntensityTier[5] - amplitude envelope

(timestamp, intensity)


PointProcess[6]

a sequence of points in time, e.g. marking vocal pulses
mostly related to LPC/PSOLA pitch stuff



Strings[7]

ordered list of strings

Table


Matrix

More specific and/or lesser-used object types

ExperimentMFC - Multiple Forced Choice[8] style listening experiment

The Praat picture window

Mainly used for making plots from data. ...which can then be saved as raster (PNG) or vector (EPS, PDF).

You won't need this until you do, so you can close it. ...I've seen people making startup scripts to specifically close it

Automation and scripting

As hinted at above, you can automate Praat.


At the end of the day, a script is a bunch of text lines naming actions, each equivalent to doing that action manually.


You can write it yourself, but almost everything you do in the Praat GUI is recorded into history, and that history can be pasted into a script. This is often the easiest way to create a useful script -- or at least do most of it, to then maybe edit a little.


The easiest way to at least reduce the amount of clicking you do when you want to do something to a large set of sound files, is just to use that recording.

No coding required.


But coding makes things more powerful - it adds, among other things,

  • a way to ask for user input (primarily for the parameters you then hand into an existing action)
  • a basic scripting language that lets you do conditions and loops and other basics you will probably need to express your wishes.


💤 (This recordability of actions is also part of why Praat seems a little clunky at first. There is no fundamental difference between functionality already in Praat, and what you add later - it's all just buttons triggering specific actions. This action nature is also why some actions have old (and sometimes stupid) defaults: to not break what older scripts did when they relied on those old defaults)



https://www.fon.hum.uva.nl/praat/manual/History_mechanism.html New script, paste history


https://www.fon.hum.uva.nl/praat/manual/Scripting.html

Common tasks

Recording sound

New → Record Mono Sound


Can record one or more fragments, and Save each to List. To do just one, you may like Save to List & Close


🛈 Hint: Good control of levels before recording makes for clean recordings

Try to avoid very quiet recordings

When showing audio, Praat will scale up whatever you already recorded.

This is great in that you always see a waveform and spectrogram with whatever is the strongest signal, regardless of exactly how strong it is.

It is not great for the same reason -- in that it will not show you whenever your recording was low-level and really noisy.
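
A toy model of that display scaling, as a sketch. It assumes the view simply stretches the loudest sample to full height, which is a simplification and not Praat's actual drawing code; the sample values are made up.

```python
def autoscale_gain(samples):
    """Gain a 'scale to the loudest sample' display would apply."""
    peak = max(abs(s) for s in samples)
    return 1.0 / peak if peak > 0 else 1.0

# Made-up recording: very quiet background noise, plus two loud taps.
quiet_noise = [0.001, -0.002, 0.0015, -0.001]
with_taps = quiet_noise + [0.8, -0.7] + quiet_noise
taps_removed = quiet_noise + [0.0, 0.0] + quiet_noise

# With the taps present the noise is drawn almost flat;
# without them the same noise is blown up to fill the view.
print(autoscale_gain(with_taps), autoscale_gain(taps_removed))
```

The noise samples are identical in both cases; only the gain the display picks is different.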


"Why do low levels make it noisy?"

Because there is always some noise in the recording, both from the room you are in, and from the imperfection of even the fanciest of devices. So the quieter the sound you record, the closer to that noise it is.

Yes, you can amplify it later, but you can only ever amplify signal and noise equally. So the better you do at recording time, the better the result.
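
To see why amplifying afterwards does not help, here is a small sketch that computes a signal-to-noise ratio before and after applying the same gain to both. The "signal" and "noise" are synthetic stand-ins, not real recordings.

```python
import math

def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def snr_db(signal, noise):
    """Signal-to-noise ratio in dB, from the two RMS levels."""
    return 20 * math.log10(rms(signal) / rms(noise))

# Synthetic stand-ins: a quiet tone, and an even quieter alternating 'hiss'.
signal = [0.02 * math.sin(2 * math.pi * 5 * n / 100) for n in range(100)]
noise = [0.005 * (-1) ** n for n in range(100)]

gain = 10.0  # amplify everything after the fact
before = snr_db(signal, noise)
after = snr_db([gain * s for s in signal], [gain * x for x in noise])
print(round(before, 2), round(after, 2))  # the two values are the same
```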


"Wait, why is recording time any different?"

It isn't always - there are ways to make it loud and still noisy, but it's harder. If it's just amplification after the "noise already went through the microphone and recording device", it isn't any different.

But just the fact that you are now probably paying attention to the level indicators of your recording device means that you are doing the most important part of adjustments, and are probably getting something decent (out of whatever mic and input device you have). There are certainly further important parts of thinking like a sound engineer, but this is a topic in itself, and that was the first thing to get right.



To illustrate the "praat shows you something regardless of how loud it actually is" effect to yourself

  • record some gentle taps on the microphone but otherwise be quiet.
  • View & Edit the sound
you should see only the taps.
  • remove the taps (select them, then Sound → Set selection to zero)
it will suddenly show loud everywhere else
  • undo that, then go to Sound → Sound Scaling, and in particular compare 'by whole' (the default) and 'fixed range'
  • 'by whole' - looks for the loudest sample in the whole sound
this is the default
  • 'by window' - looks for the loudest part within the currently zoomed area (if stereo: max of both channels)
  • 'by window and channel' - by maximum used range (if stereo: individually)
  • 'fixed height' - seems to be "amount around calculated average"
so if you say 2 you basically see the whole thing (but might hide a DC offset - not that you generally care about that)
  • 'fixed range' - seems to be "give min and max (average implied)"
so if you say -1 and 1, you see the whole thing


To set up reasonable recording levels

Tell your subject to talk at reasonably loud levels, and look at the green/yellow/red indicator

  • if it barely seems to move, increase the input/mic sensitivity
if it's not visible at all, increase
if it goes into the red, lower it
if some of your hardware/software indicates in dB: 10 to 20dB should be enough
(if not, 'a noticeable amount' should do)
if it registers a moderate amount when you're entirely quiet, consider turning it down
...because that's just device noise amplified a lot (which suggests something amplifies so much that louder things will distort).


This isn't always perfect advice - hardware varies, and there may be other reasons for e.g. things to still be quiet, or for there to be distortion in other hardware, before it even got into the PC.

Sometimes things are quiet because they keep their distance from an insensitive microphone. Sometimes things distort because they smush their face into a sensitive microphone. Sometimes things are weird because someone else used the hardware and changed something. etc.

So when doing serious recording, try to start with a sanity check: make a quick recording and listen to it (headphones are often better than speakers), and maybe check the waveform for flat, clipped tops.
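
The "flat, clipped tops" check can also be roughed out in code. This is only a sketch: it assumes samples are floats scaled so that full scale is 1.0, and the run length of 3 is an arbitrary threshold picked for illustration.

```python
def longest_clipped_run(samples, full_scale=1.0, tolerance=1e-4):
    """Length of the longest run of consecutive samples sitting at full scale."""
    longest = current = 0
    for s in samples:
        if abs(s) >= full_scale - tolerance:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest

def looks_clipped(samples, min_run=3, **kwargs):
    """Heuristic: several full-scale samples in a row suggest a flattened peak."""
    return longest_clipped_run(samples, **kwargs) >= min_run

print(looks_clipped([0.2, 0.9, 1.0, 1.0, 1.0, 0.8]))  # a flattened peak
print(looks_clipped([0.0, 0.5, 0.99, 0.5]))           # peaks, but no flat top
```

Listening is still the better test; this only catches the most obvious case.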

💤 Why Mono?

Microphones are usually mono.

Also, it is good science to only vary the things you intend to study, and mixing multiple microphones gives a few dB of effects that are not easy to control in recording, or in playback (and headphones are different from speakers when it comes to stereo effects), so mono frankly makes your life simpler.

That said, when you want to be really precise, consider that on some input devices, 'record mono' means 'mix stereo to mono', while on others, it means 'record the first channel', and this is not something they list in spec sheets, so you won't really know which one it is until you test.


A separate point: if you are recording less for sound analysis, and more for someone to transcribe this to text, know that a lifetime of practice made people better at separating multiple voices from any sound recorded in a vaguely binaural way than from mono. You may like a portable recorder with two mics mounted on it.

Viewing your recording

Select the Sound, press the View & Edit button.

More basic things you may want to do are zooming in and out, scrolling around, cutting pieces off the edge - mainly see the Edit and Time menus (and maybe learn the keyboard shortcuts)



💤 The spectrogram is tuned for voice, and this can be tweaked a little further

Spectrogram → Spectrogram settings

View Range - Normally 0Hz to 5kHz (even when you recorded more) - there's almost nothing interesting to show above that, and zooming in to this range makes pitch movement more visible. This is a good default, and you arguably could make this even narrower to see pitch movement even better.

Window length is about the (STFT) tradeoff between frequency resolution and time resolution. The default (0.005) is a good tradeoff for many tasks. Higher (0.015) may sometimes make e.g. separate formants more visible -- yet makes them harder to place in time precisely.
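
As a rough feel for that tradeoff: frequency resolution is on the order of 1 / window length. The sketch below uses exactly that rule of thumb, which is an assumption; the real constant depends on the window shape, so the figures Praat itself reports will differ somewhat.

```python
def rough_frequency_resolution_hz(window_length_s):
    """Rule-of-thumb frequency resolution: about 1 / window length."""
    return 1.0 / window_length_s

for window_length_s in (0.005, 0.015):
    resolution = rough_frequency_resolution_hz(window_length_s)
    print(f"{window_length_s * 1000:.0f} ms window: roughly {resolution:.0f} Hz wide bands")
```

So the longer window separates frequencies better, but each analysis slice also smears over three times as much time.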

The concept of dynamic range relates the loudest to the softest levels. Praat determines the maximum in a recording, this setting determines how much lower is considered quiet enough and not worth showing. The default 70dB shows almost everything (including background noise), lowering to 50 will remove quieter noise and signal, 30 does so more aggressively. It will look cleaner but hide subtler detail.
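
A sketch of what that setting does, assuming it is a simple cutoff below the maximum (as described above). The levels are made-up dB values, not measurements.

```python
def visible(levels_db, dynamic_range_db=70.0):
    """Per level: is it within dynamic_range_db of the loudest level?"""
    floor = max(levels_db) - dynamic_range_db
    return [level >= floor for level in levels_db]

# Made-up levels: a vowel, a fricative, breath, room noise.
levels = [80.0, 55.0, 35.0, 15.0]
for dynamic_range in (70.0, 50.0, 30.0):
    print(dynamic_range, visible(levels, dynamic_range))
```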


Spectrogram → Spectrogram advanced settings

Maximum is the energy level (you can ignore the units) to treat as the loudest to show (black). By default this field is ignored, because autoscaling handles this (...within a zoom level, so scrolling will make it vary - if you want to inspect in detail, you might care to turn off autoscaling).

Pre-emphasis considers that the loudness of speech's components (vowels mostly) falls off by approximately 6 dB per octave, so we amplify higher frequencies just for the visualisation, so that it shows them roughly equally. The default is +6dB per octave. Higher than that puts more focus on higher sounds. (identity-gain point at 1000 Hz?)

Dynamic compression you can think of as someone turning up a volume knob wherever the recording is quieter.

this has use if there is significant variation in how loud individual responses are.
...but if you have strong recordings, all this really does is take the near-silent parts between words and turns up what's there -- i.e. the background noise, which is the least interesting part.
The setting is a fraction, how much to amplify any part towards the level of everything else. You rarely want to make this higher than 0.5 or so (because that's often around 20dB).


https://www.fon.hum.uva.nl/praat/manual/Advanced_spectrogram_settings___.html

Annotating your recording

Create a TextGrid

To create an empty TextGrid of the same length as a given Sound:

  • Select Sound
  • Find the button: Annotate → To TextGrid...


The form dialog that gives you is:

All tier names:                  Mary John bell
Which of these are point tiers:  bell

The first time you see that, it introduces two or three new things, so this bears some comment.

💤 For context:

You can have multiple, independent 'tracks' of information, called tiers. This is useful e.g. when there are distinct things worth noting, e.g.

multiple speakers
a speaker and a bell to mark the start of experiment response
annotation at sentence, phrase, word, phoneme level
aligned translations
...and/or whatever else you can think of


Also, sometimes we want to segment things, and sometimes we just want to mark things in time.

So each tier can be either an

  • interval tier -
consists of segments that always cover the whole recording
inserting something at a time will split the segment that is currently there into two
you can select the segments
you can optionally label each segment (you might e.g. start by marking the silences)
  • point tier (sometimes 'text tier')
inserting a point adds a specific point in time
you can optionally label each point
you can select the labels
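
The difference between the two tier types is easy to see as data structures. This is a sketch with made-up class names, not how Praat stores things; in particular, which half keeps the label after a split is a choice made here for illustration.

```python
from bisect import bisect_left, insort

class IntervalTier:
    """Segments that always cover the whole time range."""
    def __init__(self, start, end):
        self.boundaries = [start, end]
        self.labels = [""]  # one label per segment

    def insert_boundary(self, t):
        if not (self.boundaries[0] < t < self.boundaries[-1]):
            raise ValueError("boundary must fall strictly inside the tier")
        if t in self.boundaries:
            raise ValueError("there is already a boundary at that time")
        i = bisect_left(self.boundaries, t)
        self.boundaries.insert(i, t)
        # The segment that was there is split in two: the left part keeps
        # its label, the new right part starts out unlabeled.
        self.labels.insert(i, "")

    def intervals(self):
        return [(self.boundaries[k], self.boundaries[k + 1], self.labels[k])
                for k in range(len(self.labels))]

class PointTier:
    """Just marks in time, each optionally labeled."""
    def __init__(self):
        self.points = []

    def insert_point(self, t, label=""):
        insort(self.points, (t, label))  # keep points ordered by time

words = IntervalTier(0.0, 2.0)
words.insert_boundary(0.5)
words.labels[0] = "sil"
print(words.intervals())
```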


So now we can grasp that

  • Tier names - space-separated list. Settles the number of tiers and their names at the same time
  • Which of these are point tiers? - repeat the names of tiers you want to be point tiers. Any not mentioned will become interval tiers

So:

All tier names:                  Mary John bell
Which of these are point tiers:  bell

means:

create Mary as an interval tier
create John as an interval tier
create bell as a point tier
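
The same reading of those two form fields, as a sketch. The function name is made up, and raising an error for a point tier name that is not in the first list is a choice of this sketch, not necessarily what Praat does.

```python
def plan_tiers(all_tier_names, point_tier_names):
    """Turn the two space-separated form fields into (name, kind) pairs."""
    names = all_tier_names.split()
    point_names = set(point_tier_names.split())
    unknown = point_names - set(names)
    if unknown:
        raise ValueError(f"point tiers not in the tier list: {sorted(unknown)}")
    # Anything not mentioned as a point tier becomes an interval tier.
    return [(name, "point" if name in point_names else "interval")
            for name in names]

for name, kind in plan_tiers("Mary John bell", "bell"):
    print(f"create {name} as a {kind} tier")
```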

Actually using that TextGrid

This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

If you did the above, you have a Sound and a TextGrid of the same time length.


Actual annotations

Editing

A mix of keyboard and mouse seems to be most convenient.


Remember zooming (Ctrl-I, Ctrl-O), scrolling (PgUp, PgDn)


Click in waveform or spectrogram: choose a point in time


Some of the most useful keyboard shortcuts are:

Enter - add segment/point in currently selected tier

Ctrl-1, Ctrl-2 - add point in specific tier

Mouse-drag a segment-point to move it

Alt-backspace - remove point / merge segment with previous

Alt-arrows - to move around the segments (useful when cleaning up)

Annotating

Other views/editors

Pitch editor

https://www.fon.hum.uva.nl/praat/manual/PitchEditor.html


Text files, short text files, binary files

Many data-style objects (including some for which you may never use this, like Sound objects) have a structured representation that can be saved as

  • a text file
which contains a little more than necessary but is human-readable.
  • a short text file, which basically omits the variable names,
but is stable enough that parsers should have no trouble (no idea if there were breaking changes over time)
  • a binary format, which is a little more compact.


Some things have further forms, e.g.

  • PitchTier and DurationTier have
    • PitchTier/DurationTier Spreadsheet file (not unlike their short text form)
    • headerless spreadsheet file (basically TSV)


While the text formats look like something you could parse yourself, try to avoid that when you easily can, because the format has changed over time. Praat will know how to handle that, but your own or other people's libraries may break over time.
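
If you do end up reading one of these yourself, the headerless spreadsheet form is probably the least fragile. A sketch, assuming each line is a time and a value separated by a tab (check a file you actually saved before relying on that); the function name and sample text are made up.

```python
import csv
import io

def read_headerless_tier(text):
    """Parse 'time<TAB>value' lines into a list of (time, value) floats."""
    points = []
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        if not row:
            continue  # skip blank lines
        points.append((float(row[0]), float(row[1])))
    return points

sample = "0.10\t100\n0.30\t200\n"
print(read_headerless_tier(sample))
```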

Praat setup

Praat preferences folder

Mostly contains

  • Preferences file[9]
mostly contains a whole bunch of defaults
  • Buttons file
mostly contains adds, shows, and hides, of menu items and buttons
  • possible plugins, in directories
specifically, directories with names that start with plugin_ and that contain a file called setup.praat will have that file executed
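
That plugin rule is simple enough to check for yourself. A sketch that lists which directories in a given folder would count as plugins under the rule above; it builds a throwaway folder with made-up names to demonstrate, rather than touching your real preferences folder.

```python
import tempfile
from pathlib import Path

def find_plugins(prefs_dir):
    """Names of plugin_* directories that contain a setup.praat file."""
    return sorted(
        d.name for d in Path(prefs_dir).iterdir()
        if d.is_dir()
        and d.name.startswith("plugin_")
        and (d / "setup.praat").is_file()
    )

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "plugin_demo").mkdir()
    (root / "plugin_demo" / "setup.praat").write_text("# hypothetical setup script\n")
    (root / "plugin_empty").mkdir()           # right name, but no setup.praat
    (root / "notes").mkdir()
    (root / "notes" / "setup.praat").write_text("")  # setup.praat, wrong name
    found = find_plugins(root)

print(found)
```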


Location:

Windows:   %USERPROFILE%\Praat
OSX:       ~/Library/Preferences/Praat Prefs/
Linux:     ~/.praat-dir/
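
If a script of your own needs to find that folder, the table above translates fairly directly. A sketch: the function name is made up, and it only knows the three locations listed here.

```python
import os
import sys
from pathlib import Path

def praat_prefs_dir(platform=sys.platform, home=None, userprofile=None):
    """Guess the Praat preferences folder from the locations listed above."""
    home = Path(home) if home is not None else Path.home()
    if platform.startswith("win"):
        base = userprofile or os.environ.get("USERPROFILE") or str(home)
        return Path(base) / "Praat"
    if platform == "darwin":
        return home / "Library" / "Preferences" / "Praat Prefs"
    # Anything else is treated like Linux here.
    return home / ".praat-dir"

print(praat_prefs_dir())
```

The extra parameters exist only so you can ask "where would this be on another system" without being on it.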

https://www.fon.hum.uva.nl/praat/manual/preferences_folder.html


Interacting with Praat program

Calling the Praat executable should typically be done with --open, --run, or --send


Opening files with Praat can be done like

Praat --open data\hello.wav data\hello.TextGrid
Praat --open script.Praat



You can run a script like

Praat --run testCommandLineCalls.praat "argument"

...which does so without a GUI; any Info-window output goes to the console that runs it.
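
Because the output ends up on the console, another program can capture it. A sketch of doing that from Python: it assumes the executable can be found as praat on your PATH (adjust the praat argument if not), and the helper names are made up.

```python
import subprocess

def praat_run_command(script, *args, praat="praat"):
    """Build the argument list for: praat --run script arg1 arg2 ..."""
    return [praat, "--run", str(script), *[str(a) for a in args]]

def run_praat_script(script, *args, praat="praat"):
    """Run the script without a GUI and return what it wrote to the console."""
    result = subprocess.run(
        praat_run_command(script, *args, praat=praat),
        capture_output=True, text=True, check=True,
    )
    return result.stdout

print(praat_run_command("testCommandLineCalls.praat", "argument"))
```

Passing the arguments as a list (rather than one long string) avoids having to think about quoting arguments that contain spaces.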


To command a running Praat GUI (or a new one if one wasn't running), you want sending

Praat --send "command"

sendpraat is roughly the same, e.g.

sendpraat 1000 praat "Read from file... hello.wav" "Play reverse" "Remove"


https://www.fon.hum.uva.nl/praat/manual/Scripting_6_9__Calling_from_the_command_line.html


"The phonetic font is not available"

Praat wants a font that covers all IPA characters in Unicode


Praat comes from a time where you likely needed to install SIL Doulos and/or SIL Charis to guarantee the IPA characters would show up.

These days there are other fonts that cover them, but Praat still plays safe and still complains if you don't have those fonts installed. If it works, you can ignore the warning. If you want it to be quiet, you could install those fonts


See also: how to install fonts (in general)

Consider tweaking Windows

Showing file extensions

If you have made

Recording1.wav
Recording1.Manipulation
Recording1.PitchTier

then it is fairly clear what belongs together.

However, Windows typically hides extensions (that it knows about). This is nice for an uncluttered overview where you have a nice icon instead -- but less precise when using a computer as a tool. In particular, when Explorer shows you just

Recording1
Recording1
Recording1

that's less great.


If you want to see extensions:

Win7: Tools : Folder options : View tab : uncheck "Hide extensions for known file types"
Win10: Explorer window : View : check "File Name Extensions"
Win11: Explorer window : View : Show : File Name Extensions
OSX:
Linux / Gnome:

Where is my configuration

Windows: %USERPROFILE%\Praat
Linux: ~/.praat-dir
OSX: ~/Library/Preferences/Praat Prefs/

...each of which is shorthand that resolves to a directory for the user currently logged in


Consider: Open with

Why?

Say you often find yourself opening one file at a time with Praat's Open → Read from File to add a bunch of files


There are faster options.

Windows's "Open with" associates a file extension with a program, after which opening such a file will run Praat.exe with that file -- which ends up working as an "add to list" for each.


How?

  • Windows: Right-click a file (with an extension you want to associate with a program - you would have to do this once for each praat file extension)
"Open With..."
may require some extra clicking (varies with windows version) like "More apps", "look for more apps", scrolling, etc
The first time you do this you also need to browse to where precisely the application is located on disk. If you've done that once it should be in the list of apps

Praat plugins and toolkit notes

Praat plugins

Praat is extendible, in a few different ways


Run commands that happen to alter buttons file permanently

If you run Add action command: and Add menu command: yourself or from a regular script, it will permanently alter the buttons file.

You can also put that in a script, which will be a "run once to install this" -- but consider the plugin way instead.

Upsides:

  • nice way to incrementally make Praat do all the things you want

Downsides:

  • over enough time you won't remember what you did


Put things in your Praat profile (Plugins)

Alternatively, place that exact same file in a new directory under your Praat preferences folder.

This will now be run at Praat startup, but not alter your buttons file.

Upsides:

  • picking up what's there at each run is easier for development

Downsides:

  • a little more work



Site installs

Not really there.

Permanent installation in a "regardless of who runs it" sense requires at least a one-time run (per user/install).


The closest you can do is have such a one-time install install a script that picks up something from a shared folder.


https://www.fon.hum.uva.nl/praat/manual/plug-ins.html

Plugin examples

Praat vocal toolkit

Praat Vocal Toolkit [11]



Plugin manager

http://cpran.net/


ProsodyPro notes

http://www.homepages.ucl.ac.uk/~uclyyix/ProsodyPro/

Not a plugin in the sense of 'adds buttons to the interface', more of a script that when run, initiates a semi-automated annotation.

How do I...

How do I shorten the length of a textgrid?

Unsorted