Audio in CommCare can help both the beneficiary and the phone's user connect with the messages on screen. It can guide an interaction for data collection or deliver counseling messages. Audio messages can often be longer than on-screen prompts, and can serve as a "3rd party" expert for the people present. Audio can be recorded locally by people familiar with the program and incorporated directly into your application via CommCare HQ.
Audio files must be in mp3 format. If your audio files are in another format, use this online tool to convert them at 128 kbit/s, then import them into Audacity to complete processing.
Guidelines for Good Recordings
Before deciding to include multimedia in your application, think carefully about what its goal will be. Some applications may not need multimedia (for example, pure data collection, or advanced users who don't need support).
We recommend that the person recording the audio files (the one holding the device, not the person whose voice is being recorded) use headphones attached to the device to listen to the voice as it is recorded. We have found this helpful in judging the clarity of the recording, and it will tell you whether any background noise or interference was also captured. Be careful not to use headphones that have a microphone attached, as this sometimes creates a disturbance when two microphones are working at the same time.
Depending on the pronunciation of the speaker, you may need to adjust the position of the recorder. Generally, a 45-degree angle downwards from the mouth works well. Additionally, a 1-inch gap between the mouth and the recorder is recommended. Common problems with positioning of the equipment:
If using Zoom H1 Microphone
- Carry extra batteries.
- Turn On: Slide power switch down for 1 second
- Turn Off: Slide power switch down for 1 second
- Reducing Noise: Back of Recorder – Low Cut On will reduce any wind or background noise.
- Input Level: Can be automatic by switching Auto level – On (back of recorder). Or can be done manually (recommended settings to come)
- Output Level: Volume that will come through your headphones. Manually adjusted near headphone input.
- Recording Format: WAV format gives higher quality sound than MP3. Change the bit rate using the arrows (this will only change if an SD card is inserted). Recommended settings are 48/16 or 48/24 (that is, a 48 kHz sample rate at 16-bit or 24-bit depth). A higher bit rate results in higher quality audio recordings but decreases the recording time available on the card. Adjust the bit rate to suit the length of the audio recording.
- Listening to Recordings: After recording, you can use the play button on the side to listen to any recording. The side arrows will allow you to choose which recording you would like to listen to. As the recording plays, the remaining time of the recording will be shown. The playback can be heard using your headphones or simply through the recorder.
- Deleting Recordings: While the playback is running or when it is complete, you can press the trash key to delete. Press the Record button for confirmation of the deletion. A message ‘Done’ should appear once the deletion is complete.
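The trade-off between recording quality and recording time mentioned under Recording Format can be estimated with simple arithmetic. The sketch below is only illustrative: it assumes uncompressed stereo WAV audio, ignores file header overhead, and uses a hypothetical 2 GB card; the function names are made up for this example.

```python
def wav_bytes_per_second(sample_rate_hz, bit_depth, channels=2):
    """Bytes written per second of uncompressed WAV audio (stereo assumed)."""
    return sample_rate_hz * (bit_depth // 8) * channels


def recording_minutes(card_bytes, sample_rate_hz, bit_depth, channels=2):
    """Approximate minutes of audio that fit in card_bytes of free space."""
    seconds = card_bytes / wav_bytes_per_second(sample_rate_hz, bit_depth, channels)
    return seconds / 60


# Hypothetical card size: 2 GB of free space (assumption for illustration).
CARD_BYTES = 2 * 1000**3

for rate_hz, depth in [(48000, 16), (48000, 24)]:
    minutes = recording_minutes(CARD_BYTES, rate_hz, depth)
    print(f"{rate_hz // 1000}/{depth}: about {minutes:.0f} minutes")
```

Under these assumptions, moving from 48/16 to 48/24 cuts the available recording time by about a third, which matters mostly for long recording sessions.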
Processing Audio Using Audacity
Processing audio for CommCare involves five steps: (1) Splicing; (2) Background Noise Removal; (3) Blank Noise Removal; (4) Pauses; and (5) Amplification. Steps 1-3 are done for each clip individually, and steps 4 and 5 can be done in bulk. Please also see our video demonstration on how to use Audacity for processing to help you along the way!
Audio Processing Tutorial
- First, download and install Audacity. You will need the LAME MP3 encoder to export MP3 files from Audacity. You can download it for free by searching for lame_enc.dll.
- Configure the MP3 encoding settings in Audacity (Edit -> Preferences -> File Formats -> Bit Rate) (please read the Note below for newer versions of Audacity). For speech, 56-64 kbit/s mono ABR encoding at 22,050 Hz should give excellent results. If the audio contains other noises or music, 96 kbit/s mono could be considered. For very high quality applications (at a minimum, ones where the CommCare user will be using headphones), use 128 kbit/s stereo. 64 kbit/s mono requires roughly 8 KB per second of audio. We use average bit rate (ABR) encoding, which gives better quality for a given file size and is preferred over variable bit rate (VBR) at low bit rates. Set the project's frequency to 22,050 Hz in the bottom left corner of Audacity. See a discussion on encoding choices, including for voice.
Note: In Audacity 2.0 and higher, the MP3 encoding options have moved to the export step: File -> Export -> Save as type -> MP3 -> Options button.
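To get a feel for how the bit rate choices above affect file size, the following sketch converts a bit rate into bytes per second and an approximate clip size. It assumes 1 kbit = 1,000 bits and treats the bit rate as a constant average; the function names and the 30-second clip length are illustrative only.

```python
def mp3_bytes_per_second(bitrate_kbps):
    """Average bytes per second for an MP3 at the given bit rate (kbit/s)."""
    return bitrate_kbps * 1000 / 8


def clip_size_kb(bitrate_kbps, seconds):
    """Approximate clip size in KB (1 KB = 1,000 bytes) for a clip of this length."""
    return mp3_bytes_per_second(bitrate_kbps) * seconds / 1000


# Compare the bit rates discussed above for a hypothetical 30-second message.
for kbps in (56, 64, 96, 128):
    print(f"{kbps} kbit/s: about {clip_size_kb(kbps, 30):.0f} KB for 30 seconds")
```

Multiplying the per-clip size by the number of messages in an application gives a rough idea of how much multimedia the phones will need to download and store.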
- Copy the files from the memory card in the recording device and save on your computer.
- Open the recording file in Audacity.
- Splicing: If you recorded a string of audio messages in one audio file, your sound peaks may look like the image below. If you decided to use unique sound effects like a clap or a tap on the table as a way to denote the start and stop of audio messages, then you will be able to visually decipher the different message segments. If you recorded one audio message in one file, you will hopefully have a shorter file.
- Play the file and find the best recordings for each audio message.
- Select the portion of audio (highlighted below) that you wish to process.
- If your file contains all the audio messages, then copy and paste this best audio recording for one message into a "new project" and complete your processing there.
- If you recorded separate clips for each message, then you may find it easier to process within the same file. Delete the recording segments that you do not want to keep.
- Background Noise Removal: Listen to the audio message. It is best to remove background noise so that the messages played from the application are of good quality. See this video from 6:54 to 10:48 to learn how to remove background noise. NOTE: This may not be the best way to do it and it can be time consuming, so the better approach is to take care while recording to avoid background noise in the first place.
- Blank Noises: Now check for any blank noises in the audio clip and delete them. The objective is to keep the audio recording crisp and precise. To do this, drag the cursor and select that portion of the clip and press delete.
Save/Export Individually: Save this audio file as .mp3, click on File, Export as mp3. Save the file name as the questionID in the application. You may have to reference the definition file. (You can also select the processed portion of the audio clip in a larger file and select Export Selection as mp3.) Save the processed clips in a folder called "audio".
*This is the end of processing clips individually. You can complete the last two processing steps in bulk to save time.* To begin, select all the mp3 files in your "audio" folder and drag them into Audacity. You can work in smaller sets of 10-15 files instead of copying hundreds of audio clips into Audacity.
- Pauses: We recommend adding a short delay of 0.5 to 1 second at the beginning of each audio clip. Don't make the delay too long; otherwise FLWs will be inclined to try to replay the message by hitting the hash button twice, which will actually pause the audio clip! To add an intro pause, place the cursor at the very beginning of your audio clip, click on Generate > Silence, and enter the preferred number of seconds.
- Amplify: The next step is to amplify the volume of the audio file, because audio played in the field by FLWs needs to be audible to beneficiaries in an environment that is prone to a lot of noise (e.g. the farm, a health center, a home with a crying baby, goats and chickens).
- Select the sound peaks of all of your audio files in Audacity
- Go to Effects and choose Amplify
- In the pop-up box, input 10 in the "amplification" field and below, check the box "allow clipping" (see pic below) and click Ok.
You will notice the audio file will have vertically stretched out bars indicating that the volume has been amplified.
If the audio is going to be played in a loud environment, you may need to increase the audio volume slightly more. More amplification is better than less. The FLW can reduce the volume on the device if it's too loud, but cannot increase it beyond the device's maximum if the audio was not amplified enough to begin with.
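As a rough illustration of what a 10 dB amplification with clipping allowed does to the signal, the sketch below converts decibels to a linear multiplier and counts how many samples would be pushed past full scale. It assumes samples are floats in the range -1.0 to 1.0; the sample values and function names are invented for this example and are not taken from Audacity itself.

```python
def db_to_factor(gain_db):
    """Convert a gain in decibels to a linear amplitude multiplier."""
    return 10 ** (gain_db / 20)


def amplify(samples, gain_db):
    """Scale samples by gain_db and clip anything beyond full scale (+/-1.0)."""
    factor = db_to_factor(gain_db)
    return [max(-1.0, min(1.0, s * factor)) for s in samples]


def clipped_fraction(samples, gain_db):
    """Fraction of samples that would exceed full scale after the gain."""
    factor = db_to_factor(gain_db)
    clipped = sum(1 for s in samples if abs(s * factor) > 1.0)
    return clipped / len(samples)


# Made-up sample values standing in for a quiet recording.
quiet_clip = [0.05, -0.10, 0.20, -0.35, 0.15]

print(round(db_to_factor(10), 2))        # 3.16: +10 dB roughly triples the amplitude
print(clipped_fraction(quiet_clip, 10))  # 0.2: one of the five samples would clip
```

A small fraction of clipped peaks is usually a tolerable price for audibility in a noisy setting, but it is worth listening to the result before exporting.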
Save/Export in Bulk: Now that you have processed audio in bulk, you can export in bulk too. Select File > Export Multiple.
The following steps are optional.
Making the volume equal using MP3 Gain
These steps make all the recordings approximately the same volume. First download and install MP3Gain.
- Open MP3Gain and choose "open file/folder" and open all the clips you want to use.
- Then do Gain --> Apply Constant Gain. Configure and tweak as needed.
Command Line Instructions
mp3gain -r -c -d 10 *.mp3 (assuming all the mp3s are in the current directory)
The -d 10 is a volume boost (here, 10 dB) applied to all files after they have been normalized to the same volume. This is because the default volume level tends to sound quiet on the phones. Tailor the amount of boost to your deployment and the devices you will use. Each 10 dB of boost approximately doubles perceived loudness.
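If you prefer to script this step, the same command can be assembled from Python. This is a minimal sketch, assuming the mp3gain command-line tool is installed and on your PATH; the helper names and the "audio" folder are illustrative. It also includes the rule of thumb from above for estimating perceived loudness.

```python
import glob
import os
import shutil
import subprocess


def build_mp3gain_command(files, boost_db=10):
    """Build the argument list: -r track gain, -c ignore clipping warnings, -d boost in dB."""
    return ["mp3gain", "-r", "-c", "-d", str(boost_db), *files]


def perceived_loudness_ratio(boost_db):
    """Rule of thumb: each 10 dB of boost roughly doubles perceived loudness."""
    return 2 ** (boost_db / 10)


def normalize_folder(folder, boost_db=10):
    """Run mp3gain on every .mp3 in folder. Note that this modifies the files in place."""
    files = sorted(glob.glob(os.path.join(folder, "*.mp3")))
    if not files:
        raise FileNotFoundError(f"no .mp3 files found in {folder}")
    if shutil.which("mp3gain") is None:
        raise RuntimeError("mp3gain was not found on PATH")
    subprocess.run(build_mp3gain_command(files, boost_db), check=True)
```

For example, normalize_folder("audio") would process the clips exported earlier. Because the files are changed in place, keep a backup copy of the folder before trying different boost values.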
Clipping Audio files in Audacity
You can also clip audio files in Audacity to control the loudness of the file. See below an example of good and bad clipping. If you recorded audio that is really loud or really soft, you might want to manually use the clipping feature.
Don't boost too much or clipping will occur (the strength of the signal is boosted beyond the maximum of what the sound file can represent; the rest is 'clipped' off). Excessive clipping will sound harsh and severely degrade sound quality. You can view the amount of clipping in Audacity.
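One way to judge how much boost is "too much" is to compare it with the recording's headroom, meaning the gain that brings the loudest peak exactly to full scale. The sketch below assumes the peak is expressed as a fraction of full scale (0 to 1); the function name and example peak values are illustrative.

```python
import math


def headroom_db(peak):
    """Gain in dB that raises a peak (0 < peak <= 1.0) exactly to full scale."""
    if not 0 < peak <= 1.0:
        raise ValueError("peak must be in the range (0, 1]")
    return -20 * math.log10(peak)


# A clip peaking at half of full scale has about 6 dB of headroom,
# so a 10 dB boost would clip its loudest peaks.
for peak in (0.5, 0.25, 0.1):
    print(f"peak {peak}: about {headroom_db(peak):.1f} dB of headroom")
```

Any boost above the headroom clips at least the loudest peaks, so the further the boost exceeds it, the harsher the result is likely to sound.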