← back
Multi model multimodal and multi agent innovations in Azure AI: Cedric Vidal
Takeaway
Azure AI Studio centralizes a vast multi-modal multi-vendor model catalog with end-to-end multimodal demos like menu reasoning and voice-preserving video translation.
Summary
- Azure AI Studio hosts 1,600+ models across vendors with serverless (per-token) or BYO-infra deployment options.
- GPT-4o demo extracts text from printed and handwritten restaurant menus, reasons about vegan options, and translates French to English simultaneously — natively multimodal.
- Use cases shown: insurance damage assessment from photos, energy infrastructure monitoring from camera feeds with JSON output for dashboards.
- New Azure video translation service translates speaker videos into other languages with original voice and emotional tone preserved (whispering, yelling).
- Build announcements: Azure AI Studio GA, GPT-4o, Phi-3 small language model, GPT-4 Turbo Vision, DALL-E 3, Whisper, Assistants API, fine-tuning for GPT-4.
azuremultimodalmodels
Original description
Explore GPT-4, multi-modality, and demos integrating sight and language with Dall-E and Whisper. Learn about developer tools, AI assistants, scalable applications, and customization. Focus on responsible AI, data privacy, and security with Azure. Featuring interactive demos and stories, this session is perfect for developers and innovators. Recorded live in San Francisco at the AI Engineer World's Fair. See the full schedule of talks at https://www.ai.engineer/worldsfair/2024/schedule & join us at the AI Engineer World's Fair in 2025! Get your tickets today at https://ai.engineer/2025 About Cedric Cedric Vidal is a Principal AI Advocate at Microsoft, specializing in Generative AI , and the startup and research ecosystems. He is dedicated to promoting AI in startups and facilitating the transition of research and startup products to the market. Before his current role, Cedric spent 4 years as an Engineering Manager in the AI data labeling space for the self-driving industry at Argo AI (now re-spawned as Latitude AI). He also served as the CTO of the Fintech AI SAAS startup Quicksign and worked as a software engineering services consultant for major Fintech enterprises for 10 years.