Judge Large Audio Models on Talk Arena!

Ask Anything: From general conversation to specialized audio processing, use the models however you want!
Get State-of-the-Art Responses: We currently serve GPT-4o, Gemini, Qwen2-Audio, DiVA-Llama 3, and Typhoon-Audio, all in an audio-in, text-out format.
Pick the Winner: Vote for the response that is the most natural, coherent, harmless, and helpful!

Acknowledgements

This platform is made possible by computing support from SCB 10X, SCBX Group through Stanford HAI, and by credit support from Google Gemini. We would also like to thank the Qwen team for their help with building a model API endpoint. Finally, thanks to the many early testers who helped us refine Talk Arena!

Cite us

If you find this work useful, please cite us:

  @misc{talkarena2024,
      title={Talk Arena: Interactive Evaluation of Large Audio Models},
      author={Minzhi Li and Will Held and Michael J. Ryan and Kunat Pipatanakul and Potsawee Manakul and Hao Zhu and Diyi Yang},
      year={2024}
  }
