@awacke1 on Hugging Face: "I just completed getting all four aspects of the new OpenAI GPT-4-o Omni model…"

This looks great, thanks for sharing. Are you using GPT-4o's native audio capabilities, or first converting audio to text and then using its text capabilities? I saw in their announcement that audio capabilities are not yet publicly available to everyone through the API, so I wanted to check whether I am misunderstanding something.

Developers can also now access GPT-4o in the API as a text and vision model. We plan to launch support for GPT-4o's new audio and video capabilities to a small group of trusted partners in the API in the coming weeks.
