Microsoft's MAI-Transcribe-1 Handles 25 Languages at 36 Cents/Hour

April 3, 2026

Published: April 3, 2026 at 12:38 AM

Updated: April 3, 2026 at 12:38 AM

What happened

Microsoft just launched MAI-Transcribe-1, a speech-to-text model that handles 25 languages at a starting price of 36 cents per hour. It runs 2.5x faster than Microsoft's previous Azure Fast transcription. Microsoft claims top accuracy on the FLEURS benchmark across all supported languages, though key features such as speaker identification and streaming remain "coming soon." Translation: call center recordings and international Zoom meetings can now be turned into searchable text without routing through OpenAI or Google.
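To put the 36-cents-per-hour figure in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the starting price is a flat rate billed per hour of audio, which the announcement does not spell out, and the function name and workload are hypothetical, chosen only for illustration.

```python
from decimal import Decimal

# Assumption: the quoted "starting price" of 36 cents per hour is a flat
# rate billed per hour of audio. Actual pricing tiers may differ.
PRICE_PER_AUDIO_HOUR = Decimal("0.36")


def estimate_cost(audio_seconds: list[int]) -> Decimal:
    """Estimate the cost in dollars for recordings given as lengths in seconds."""
    total_hours = Decimal(sum(audio_seconds)) / Decimal(3600)
    # Round to whole cents for display.
    return (total_hours * PRICE_PER_AUDIO_HOUR).quantize(Decimal("0.01"))


# Hypothetical workload: 1,000 call recordings of 6 minutes each (100 hours).
recordings = [6 * 60] * 1000
print(estimate_cost(recordings))  # prints 36.00
```

Under those assumptions, 100 hours of call audio would come to about $36 before any volume discounts, minimums, or add-on charges.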

Why it matters

Microsoft used half the GPUs competitors typically need, suggesting the real battle is no longer just about what AI can do, but about who can do it more cheaply.
