Advanced audio processing at Zencoder

We're rolling out some awesome new audio processing features this week at Zencoder, and we'll be releasing more soon. We know these features aren't for everyone; most of our customers won't have to worry about audio gain, highpass filters, or equalization. But others will, and we hope to power some cool things with features like this. Here is a quick rundown of the new audio settings.

Audio Level Adjustment

audio_gain: Apply a gain amount to the audio. This increases or decreases the volume of the audio by the specified amount after applying other effects. This is mainly useful for adjusting the levels after applying other effects, or making adjustments to files for which you already know the level. Sounds that go just a bit over the max volume will be gently limited to preserve fidelity, but increasing significantly over the maximum volume will result in distorted sound. See audio_normalize for automatically increasing volume without distortion. Specified in dB, either positive or negative, up to 60 dB.

audio_normalize: Normalize audio to 0 dB. This increases the volume of the audio as much as possible without causing distortion. To help ensure consistent adjustment by the expansion and compression effects, we normalize the volume both before and after those effects. See audio_pre_normalize and audio_post_normalize to only normalize at one of those points. audio_normalize is equivalent to specifying both audio_pre_normalize and audio_post_normalize.

audio_pre_normalize: Normalize the audio before applying expansion or compression effects.

audio_post_normalize: Normalize the audio after applying expansion or compression effects.
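To get a feel for what a given audio_gain value means, it helps to convert decibels into a plain amplitude multiplier. The sketch below uses the standard dB-to-amplitude relationship (factor = 10^(dB/20)) and the 60 dB limit mentioned above. The function name is made up for illustration, and this is not Zencoder's internal code.

```python
MAX_GAIN_DB = 60.0  # limit stated above for audio_gain, in either direction


def gain_db_to_amplitude_factor(gain_db):
    """Convert a gain in dB to a linear amplitude multiplier.

    Hypothetical helper for illustration; uses the standard relation
    factor = 10 ** (dB / 20).
    """
    if abs(gain_db) > MAX_GAIN_DB:
        raise ValueError("audio_gain must be between -60 and 60 dB")
    return 10.0 ** (gain_db / 20.0)


if __name__ == "__main__":
    for db in (-20.0, -6.0, 0.0, 6.0, 20.0):
        factor = gain_db_to_amplitude_factor(db)
        print(f"{db:+.1f} dB -> amplitude x{factor:.3f}")
```

Roughly, every 6 dB doubles (or halves) the amplitude and every 20 dB multiplies (or divides) it by ten, which is why large positive gains on already-loud audio run into distortion quickly.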

Frequency Control

audio_bass: Increase or decrease the amount of bass in the audio, similar to a stereo's tone controls. Specified in dB, positive or negative amounts up to 10.

audio_treble: Increase or decrease the amount of treble in the audio, similar to a stereo's tone controls. Specified in dB, positive or negative amounts up to 10.

audio_highpass: Apply a high-pass filter to the audio at the specified frequency in Hz. This will prevent audio frequencies below the specified threshold from being included in the output. A very common use of this setting is to reduce rumble and noise in audio, such as the sounds of trucks driving by outside during the recording, or sounds caused by wind or by hitting a microphone. DC shift can also be reduced or removed by specifying a high-pass filter at a low frequency, such as 10 Hz. Valid frequencies are 5-24000 Hz.

audio_lowpass: Apply a low-pass filter to the audio at the specified frequency in Hz. This will prevent audio frequencies above the specified threshold from being included in the output. This can be used to cut out very high-pitched whine sounds from audio, or to create special effects, such as a "telephone" setting (using audio_highpass of 300 and audio_lowpass of 3000).
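Here is a rough sketch of what the "telephone" settings might look like when assembled for an encoding job. Only the audio_highpass and audio_lowpass keys and the 5-24000 Hz range come from the descriptions above. The helper name, the input URL, the surrounding job layout, and the check that the low-pass cutoff sits above the high-pass cutoff are all assumptions made for this example.

```python
import json

HIGHPASS_RANGE_HZ = (5, 24000)  # range stated above for audio_highpass


def telephone_output(highpass_hz=300, lowpass_hz=3000):
    """Build output settings for a band-limited "telephone" sound.

    Hypothetical helper; only the two audio_* keys come from the post.
    """
    low, high = HIGHPASS_RANGE_HZ
    if not low <= highpass_hz <= high:
        raise ValueError("audio_highpass must be between 5 and 24000 Hz")
    # Sanity check added by this sketch, not a documented rule: the
    # low-pass cutoff should sit above the high-pass cutoff.
    if lowpass_hz <= highpass_hz:
        raise ValueError("audio_lowpass should be above audio_highpass")
    return {"audio_highpass": highpass_hz, "audio_lowpass": lowpass_hz}


# The job layout and input URL below are placeholders for illustration.
job = {
    "input": "s3://example-bucket/interview.mp4",
    "outputs": [telephone_output()],
}
print(json.dumps(job, indent=2))
```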

Dynamics Control

audio_compression_ratio: Compress the dynamic range of the audio by reducing volume at a specified ratio of N:1, above the specified threshold. Ratios between 1.0 and 30.0 are valid. Compression is useful for making the audio volume more consistent between quiet points and loud points. This can be used to make loud spots (yelling, slammed doors, etc.) less jarring, or, used in conjunction with gain or normalization, it can make quiet parts of a speech more audible.

audio_compression_threshold: Set the threshold above which audio compression is applied, when audio_compression_ratio is specified. Values are specified in dB as a negative number down to -120. Default is -20.

audio_expansion_ratio: Expand the dynamic range of the audio by reducing volume at a specified ratio of N:1, below the specified threshold. Ratios between 1.0 and 30.0 are valid. Expansion is useful for reducing unwanted sounds during quiet points, or for making the difference between loud and quiet sounds more significant.

audio_expansion_threshold: Set the threshold below which audio expansion is applied, when audio_expansion_ratio is specified. Values are specified in dB as a negative number down to -120. Default is -35.
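The N:1 ratios are easier to reason about with a little arithmetic. The sketch below uses the textbook hard-knee curves for a compressor and a downward expander, with the default thresholds and ratio limits listed above. It ignores attack, release, and knee smoothing, so treat it as a way to build intuition rather than a description of how Zencoder computes levels. Both function names are made up for this example.

```python
def _check_ratio(ratio):
    # Ratio limits stated above for both compression and expansion.
    if not 1.0 <= ratio <= 30.0:
        raise ValueError("ratio must be between 1.0 and 30.0")


def compress_level(level_db, ratio, threshold_db=-20.0):
    """Textbook hard-knee compressor: levels above the threshold are
    pulled toward it, so N dB over the threshold becomes N / ratio dB over."""
    _check_ratio(ratio)
    if level_db <= threshold_db:
        return level_db
    return threshold_db + (level_db - threshold_db) / ratio


def expand_level(level_db, ratio, threshold_db=-35.0):
    """Textbook downward expander: levels below the threshold are
    pushed further down, so N dB under becomes N * ratio dB under."""
    _check_ratio(ratio)
    if level_db >= threshold_db:
        return level_db
    return threshold_db + (level_db - threshold_db) * ratio


if __name__ == "__main__":
    # A shout at -8 dB with 4:1 compression and the default -20 dB threshold.
    print(compress_level(-8.0, ratio=4.0))   # 12 dB over becomes 3 dB over
    # Room noise at -45 dB with 2:1 expansion and the default -35 dB threshold.
    print(expand_level(-45.0, ratio=2.0))    # 10 dB under becomes 20 dB under
```

With these curves, the shout lands at -17 dB instead of -8 dB, and the room noise drops from -45 dB to -55 dB, while anything between the two thresholds passes through unchanged.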

Fade in/out

audio_fade: Apply fade-in and fade-out effects to the audio with the specified duration in seconds. Equivalent to specifying both audio_fade_in and audio_fade_out with this duration. Positive values up to 30.0 seconds.

audio_fade_in: Apply a fade-in effect to the audio with the specified duration in seconds. Positive values up to 30.0 seconds.

audio_fade_out: Apply a fade-out effect to the audio with the specified duration in seconds. Positive values up to 30.0 seconds.

Misc Features

audio_karaoke_mode: Apply a "karaoke" effect to the audio by subtracting the right audio channel from the left, and keeping only the difference. Only applies to stereo outputs. Be careful using this feature: it (1) sometimes produces distortion, (2) sometimes does nothing at all, and (3) sometimes negatively affects the background noise or music.
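The subtraction trick is simple enough to show with a few made-up sample values. In this toy sketch, a "vocal" is mixed identically into both channels (centered) and a "guitar" appears only in the left channel. Subtracting right from left cancels whatever the two channels share. The function name and the sample data are invented for illustration and say nothing about how Zencoder implements the effect.

```python
def karaoke(left, right):
    """Return the left-minus-right difference signal, sample by sample."""
    if len(left) != len(right):
        raise ValueError("channels must have the same number of samples")
    return [l - r for l, r in zip(left, right)]


# Toy samples: the vocal is identical in both channels (centered),
# the guitar is present only in the left channel.
vocal = [0.5, -0.25, 0.125]
guitar = [0.25, 0.25, -0.5]

left = [v + g for v, g in zip(vocal, guitar)]
right = list(vocal)

print(karaoke(left, right))  # [0.25, 0.25, -0.5], the guitar alone
```

This also hints at why results vary: a vocal that isn't perfectly centered won't cancel, and anything else mixed equally into both channels (often bass or drums) disappears along with the vocal.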

General Notes

Zencoder applies the applicable effects/adjustments in a consistent order, as follows:
  1. highpass
  2. lowpass
  3. bass
  4. treble
  5. pre_normalization
  6. expansion
  7. compression
  8. gain
  9. fade_in
  10. fade_out
  11. post_normalization
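The ordering above, together with the shorthand settings (audio_normalize and audio_fade), can be sketched as a small lookup. Given a dictionary of settings, the hypothetical helper below lists which steps would run and in what order. The step names mirror the list above, and the mapping follows the setting descriptions in this post, but the function itself is only an illustration of the ordering, not part of any Zencoder library.

```python
# Processing order as listed above.
EFFECT_ORDER = [
    "highpass", "lowpass", "bass", "treble", "pre_normalization",
    "expansion", "compression", "gain", "fade_in", "fade_out",
    "post_normalization",
]

# Which step(s) each setting switches on. Threshold settings are left out
# because they only tune an effect that a ratio setting has enabled.
SETTING_TO_STEPS = {
    "audio_highpass": ["highpass"],
    "audio_lowpass": ["lowpass"],
    "audio_bass": ["bass"],
    "audio_treble": ["treble"],
    "audio_normalize": ["pre_normalization", "post_normalization"],
    "audio_pre_normalize": ["pre_normalization"],
    "audio_post_normalize": ["post_normalization"],
    "audio_expansion_ratio": ["expansion"],
    "audio_compression_ratio": ["compression"],
    "audio_gain": ["gain"],
    "audio_fade": ["fade_in", "fade_out"],
    "audio_fade_in": ["fade_in"],
    "audio_fade_out": ["fade_out"],
}


def processing_steps(settings):
    """Return the steps implied by `settings`, in processing order."""
    active = set()
    for name, value in settings.items():
        if value is None or value is False:
            continue  # setting present but switched off
        active.update(SETTING_TO_STEPS.get(name, []))
    return [step for step in EFFECT_ORDER if step in active]


if __name__ == "__main__":
    print(processing_steps({
        "audio_gain": 3,
        "audio_normalize": True,
        "audio_highpass": 80,
        "audio_fade": 2.0,
    }))
```

For the settings in the example, the result is highpass, pre_normalization, gain, fade_in, fade_out, post_normalization, no matter what order the settings were written in.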

What else?

Look for more audio processing features soon. Are there any audio features you'd like to see? Any ways that you think we can improve these features? If so, let us know!