Audio files must be in MP3 format. If your audio files are in another format, use this online tool to convert them, then import them into Audacity (see below) for post-processing.
Guidelines for Good Recordings
Use a quality microphone
Pick a quiet place with little background noise and few disturbances. Both the recorder and the "talent" should remove any accessories that may interfere with the recording (e.g. jewelry, keys, phones).
Have the "talent" speak clearly and slowly (slower than feels natural), with good enunciation, discernible breaks between words, and plenty of pauses. Have them speak slightly louder than usual ("project"), but not so much that it sounds unnatural.
Record a short test take before starting to ensure equipment is functional and audio quality is good
Have all text to be recorded printed out and numbered in large font.
Begin recording. Have them read each phrase in order with a short pause after each (~3 seconds). Have them read the number (in English, if possible) before each phrase, with a short pause (~1 second) between the number and the phrase. The numbers will aid greatly in identifying which phrase is which, especially if they were recorded in a language other than your own.
Try to record all the phrases in one take (one audio file). Don't use a separate file for each phrase. If the recording is interrupted by background noise or the speaker makes a mistake, let the recorder keep running and continue when possible, starting again with that same phrase.
Do several takes.
Depending on the pronunciation of the "talent", you may need to adjust the position of the recorder. Generally, holding it at a 45-degree angle downwards from the mouth works well, with a gap of about 1 inch between the mouth and the recorder.
If using Zoom H1 Microphone
Carry extra batteries.
Turn On: Slide power switch down for 1 second
Turn Off: Slide power switch down for 1 second
Reducing Noise: On the back of the recorder, set Low Cut to On to reduce wind and background noise.
Input Level: Can be set automatically by switching Auto Level to On (back of recorder), or adjusted manually (recommended settings to come).
Output Level: Volume that will come through your headphones. Manually adjusted near headphone input.
Recording Format: WAV format gives higher-quality sound than MP3. Change the bit rate using the arrows (this will only change if an SD card is inserted). Recommended settings are 48/16 or 48/24 (48 kHz sample rate at 16-bit or 24-bit depth). A higher bit rate results in higher-quality recordings but decreases the recording time available on the card, so adjust it to suit the expected length of the recording.
Listening to Recordings: After recording, you can use the play button on the side to listen to any recording. The side arrows will allow you to choose which recording you would like to listen to. As the recording plays, the remaining time of the recording will be shown. The playback can be heard using your headphones or simply through the recorder.
Deleting Recordings: While the playback is running or when it is complete, you can press the trash key to delete. Press the Record button for confirmation of the deletion. A message ‘Done’ should appear once the deletion is complete.
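The trade-off between recording format and recording time can be estimated with a little arithmetic. The sketch below is only an illustration: it assumes uncompressed stereo WAV, ignores file header overhead, and uses a hypothetical 2 GB card size that you should replace with your own.

```python
def wav_bytes_per_second(sample_rate_hz, bit_depth, channels=2):
    """Data rate of uncompressed WAV audio (assumes stereo by default)."""
    return sample_rate_hz * (bit_depth // 8) * channels


def recording_hours(card_bytes, sample_rate_hz, bit_depth, channels=2):
    """Approximate hours of audio that fit on a card, ignoring file overhead."""
    seconds = card_bytes / wav_bytes_per_second(sample_rate_hz, bit_depth, channels)
    return seconds / 3600


CARD_BYTES = 2 * 10**9  # hypothetical 2 GB card; substitute your card's capacity

for label, rate, depth in [("48/16", 48_000, 16), ("48/24", 48_000, 24)]:
    rate_bps = wav_bytes_per_second(rate, depth)
    hours = recording_hours(CARD_BYTES, rate, depth)
    print(f"{label}: {rate_bps} bytes/s, roughly {hours:.1f} hours on the card")
```

Under these assumptions, moving from 48/16 to 48/24 increases the data rate by half, so the same card holds about two thirds as much audio.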
Extracting and Splitting
- First, download and install Audacity, following the installation instructions. You will also need the LAME MP3 encoder to export MP3 files from Audacity. You can download it for free by searching for lame_enc.dll.
- Configure the MP3 encoding settings in Audacity (Edit -> Preferences -> File Formats -> Bit Rate; see the Note below for newer versions of Audacity). For speech, 64 kbit mono encoding should be adequate. If the audio contains other noises or music, consider 96 kbit mono. For very high quality applications (where, at a minimum, the CommCare user will be using headphones), use 128 kbit stereo. 64 kbit mono requires about 8 KB per second of audio at a constant bit rate. We use variable bit-rate encoding (better quality for a given file size).
Note: In Audacity 2.0 and higher, the MP3 encoding options have moved to the export step: File -> Export -> Save as type -> MP3 -> Options button.
- Extract the recordings from the recording device.
- Open the recording file in Audacity (an audio editing program).
- If the file contains multiple speech recordings (as shown in the pic below), select the portion of audio (highlighted) you wish to process, then copy and paste it into a new project. Otherwise, continue to work in the same project.
- Listen to the selected piece once. If there is a lot of background noise, which can be the case with recordings made in the field, it's best to remove it to ensure good-quality recordings. Here is a detailed video (watch from 6:54 to 10:48) on how to do this. NOTE: This may not be the best way to do it, and it can be time consuming, so the better approach is to be careful while recording to ensure there is no background noise.
- Once you have finished removing background noise, check for any silences or unnecessary pauses in the audio and delete them. The objective is to keep the recording crisp and precise. To remove a silence or pause, drag to select just that portion of the file and press Delete.
- The next step is to amplify the volume of the audio file. The audio files need to be relatively loud when played by the CHWs on their phones, since field environments are noisy. Select the entire audio file (Ctrl+A), go to the Effect menu, and choose Amplify.
- In the pop-up box, enter 10 in the "Amplification" field, check the "Allow clipping" box below it (see pic below), and click OK.
- The waveform will now appear vertically stretched, indicating that the volume has been amplified.
- To save the audio file as .mp3, click File, then Export, and save the file with the desired file name.
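When choosing among the encoding bit rates discussed earlier in this section, it can help to estimate the size of the exported files. The following is a rough sketch that assumes a constant bit rate; variable bit-rate files will differ, and the 10-second clip length is just an illustrative value.

```python
def mp3_size_bytes(duration_s, bitrate_kbps):
    """Approximate MP3 size at a constant bit rate (kbit/s -> bytes)."""
    return duration_s * bitrate_kbps * 1000 / 8


CLIP_SECONDS = 10  # illustrative clip length, not taken from any real recording

for kbps in (64, 96, 128):
    size_kb = mp3_size_bytes(CLIP_SECONDS, kbps) / 1000
    print(f"{kbps} kbit/s: about {size_kb:.0f} KB for a {CLIP_SECONDS}-second clip")
```

Multiplying the per-clip estimate by the number of phrases gives a quick sense of how much space the audio will take on the phones.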
This step makes all the recordings approximately the same volume. First download and install MP3Gain.
Open MP3Gain and choose "open file/folder" and open all the clips you want to use. Then do Gain --> Apply Constant Gain. Configure and tweak as needed.
Command Line Instructions
mp3gain -r -c -d 10 *.mp3 (assuming all the mp3s are in the current directory)
The -d 10 is a volume boost (here, 10 dB) applied to all files after they have been normalized to the same volume. This is because the default volume level tends to sound quiet on the phones. Tailor the amount of boost to your deployment and the devices you will use. Each 10 dB of boost approximately doubles perceived loudness.
Don't boost too much or clipping will occur (the strength of the signal is boosted beyond the maximum the sound file can represent, and the rest is "clipped" off). Excessive clipping sounds harsh and severely degrades sound quality. You can view the amount of clipping in Audacity.
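To build intuition for how a boost in dB relates to clipping, the sketch below converts a gain in dB to a linear amplitude factor and counts how many samples of a signal would exceed full scale. It is an illustration only: the test tone is synthetic, samples are assumed to be floats in the range -1.0 to 1.0, and neither function is part of MP3Gain or Audacity.

```python
import math


def db_to_amplitude_ratio(gain_db):
    """Linear amplitude factor for a gain in dB (20 dB is a factor of 10)."""
    return 10 ** (gain_db / 20)


def clipped_fraction(samples, gain_db, full_scale=1.0):
    """Fraction of samples that would exceed full scale after the boost."""
    factor = db_to_amplitude_ratio(gain_db)
    clipped = sum(1 for s in samples if abs(s * factor) > full_scale)
    return clipped / len(samples)


# Synthetic one-second 440 Hz tone at 8 kHz, peaking at a quarter of full scale
samples = [0.25 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(8000)]

for gain_db in (6, 10, 15):
    factor = db_to_amplitude_ratio(gain_db)
    share = clipped_fraction(samples, gain_db)
    print(f"+{gain_db} dB: amplitude x{factor:.2f}, {share:.1%} of samples clipped")
```

For this quiet test tone, a 10 dB boost stays within range while 15 dB pushes the peaks past full scale. Real recordings with louder peaks will clip at smaller boosts, so check the result by ear and in Audacity.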