A podcast editor submitted a 40-minute interview recorded in a kitchen — refrigerator hum at 60 Hz, HVAC rumble at 120 Hz, and a guest who occasionally drifted 30 cm from the microphone. Manual cleanup in Adobe Audition took 3 hours. After AI enhancement, the same cleanup took 11 minutes, reducing noise by 28 dB, boosting voice presence at 2–4 kHz, and applying automatic gain control to smooth the proximity variation. The refrigerator hum was undetectable in the output. The HVAC, 90% gone.
Understanding what the model does explains when to trust the output and when to fix it manually.
Three Distinct Processes Running in Sequence
| Stage | What it does | Works best on |
|---|---|---|
| Noise suppression | Identifies stationary noise floor (hum, hiss, fan) and subtracts it using spectral gating | Consistent background noise — not music |
| Voice enhancement | Boosts 2–5 kHz presence region, applies de-essing at 6–10 kHz, narrows room reverb | Speech recorded in rooms with hard surfaces |
| Loudness normalization | Applies LUFS-R target (typically -16 LUFS for podcast, -23 for broadcast) with true-peak limiting | Any recording that needs consistent volume |
When Enhancement Hurts Rather Than Helps
- Music with vocals:The noise suppressor cannot distinguish instrumental backing from "noise" — it will artifact the music while trying to clean it. Use only on speech-only recordings.
- Overlapping speech:When two people talk simultaneously, the voice isolation model picks the dominant speaker and suppresses the other. You will lose the quieter speaker's words.
- Recordings below 8 kHz sample rate: Enhancement cannot recover frequency content that was never captured. Telephone audio (8 kHz) processed at 16 kHz settings sounds hollow and artificial.
- Clipped audio (over 0 dBFS): Clipping is distortion in the waveform itself, not noise on top of it. No enhancement removes clipping; it only makes the distortion more audible by boosting surrounding frequencies.
Format and Quality Reference
| Output format | File size (1 min) | Best for |
|---|---|---|
| WAV 16-bit 44.1 kHz | ~5 MB | Further editing, archiving |
| MP3 320 kbps | ~2.4 MB | Podcast distribution |
| MP3 128 kbps | ~960 KB | Web embedding, bandwidth-limited |
| OGG Vorbis q6 | ~1.1 MB | Web audio, open format |
