awacke1 posted an update about 10 hours ago

This looks great, thanks for sharing. Are you using the audio capabilities of GPT-4o, or first converting audio to text and then using its text capabilities? I saw in their announcement that the audio capabilities are not yet publicly available to everyone through the API, so I wanted to check whether I am misunderstanding something.

Developers can also now access GPT-4o in the API as a text and vision model. We plan to launch support for GPT-4o's new audio and video capabilities to a small group of trusted partners in the API in the coming weeks.