OpenFLAM

Sun, Jan 25, 2026 One-minute read

FLAM is now Open Source!

We’re releasing OpenFLAM, the companion code to our ICML 2025 paper Frame-wise Language-Audio Modeling.

So what does FLAM actually do?

  • Zero-shot Sound Event Detection: describe any sound in plain text, and FLAM will tell you when it happens in your audio
  • Text-based Audio Retrieval: search massive audio libraries using natural language queries

The key insight: while most audio-language models only give you clip-level understanding, FLAM localizes events at the frame level. Ask “where’s the dog barking?” and get precise timestamps, not just “yes, there’s a dog somewhere in this 10-minute file.”
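To make "precise timestamps" concrete, here is a minimal sketch of how frame-level scores can be turned into time segments. It does not use the OpenFLAM API: the function name `frames_to_segments`, the 0.5 threshold, the 2 frames-per-second rate, and the scores themselves are all assumptions made up for illustration.

```python
from typing import List, Tuple


def frames_to_segments(
    probs: List[float], frame_rate_hz: float, threshold: float = 0.5
) -> List[Tuple[float, float]]:
    """Group consecutive frames scoring >= threshold into (start_s, end_s) segments.

    Hypothetical helper for illustration only; not part of OpenFLAM.
    """
    segments = []
    start = None  # index of the first frame of the currently open segment
    for i, p in enumerate(probs):
        active = p >= threshold
        if active and start is None:
            start = i  # a new event begins at this frame
        elif not active and start is not None:
            # the event ended at the previous frame; close the segment
            segments.append((start / frame_rate_hz, i / frame_rate_hz))
            start = None
    if start is not None:
        # the event runs until the end of the clip
        segments.append((start / frame_rate_hz, len(probs) / frame_rate_hz))
    return segments


# Made-up per-frame scores for a query like "dog barking", at 2 frames per second.
scores = [0.1, 0.2, 0.9, 0.8, 0.7, 0.1, 0.05, 0.6, 0.9]
segments = frames_to_segments(scores, frame_rate_hz=2.0)
print(segments)  # [(1.0, 2.5), (3.5, 4.5)]
```

A clip-level model would collapse those nine scores into a single yes/no answer; keeping one score per frame is what lets you recover the two separate barks and when each one happens.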

It’s also highly efficient, lightweight… and available on PyPI (pip install openflam) and Hugging Face! 🪶🤗

Huge thanks to my co-authors Yusong Wu, Christos Tsirigotis, Ke Chen, Anna Huang, Aaron Courville, Prem Seetharaman, and Justin Salamon for making this happen.

Here are the links: