Blog Post05

1 minute read

Published:

Audio and Text Alignment

In Neurimaging language experiments it is often required that an orthographic transcript of the text is well aligned with the audio. It is important to capture the onset timings of a particular word in order to measure brain activation related to it. This is where forced alignment comes in handy.

In this blog post I will highlight the usage of the Montreal Forced Aligner.

Input to the Aligner:

  1. Audio Files: The audio usually works best in a .wav format. Other required properties are Sampling rate of 16 kHz (Mono), conversion to this type of file format can be done using existing tools, the script is present at <–to be inserted–>
  2. Transcriptions in TextGrid format: It can be generated using the .exe form of PRAAT. However, in order to make results more automated a PRAAT version without a GUI is also present. The no gui version is for Linux is available her https://www.fon.hum.uva.nl/praat/download_linux.html A TextGrid can be generated automatically using the script available at https://www.eleanorchodroff.com/tutorial/scripts/create_textgrid_mfa_simple.praat .txt format has not worked for me till now.

Output: Montreal Forced Alignment Tool