Microsoft has launched a suite of its own artificial intelligence (AI) models as the company goes head-to-head with offerings from OpenAI, Google and Anthropic. The company unveiled three models under its MAI brand on Thursday (April 2): MAI-Transcribe-1, MAI-Voice-1 and MAI-Image-2 – all available immediately through Microsoft Foundry and the MAI Playground.
Message from Microsoft AI CEO Mustafa Suleyman
The message from Microsoft is straightforward: better quality, faster performance, and lower prices than anyone else in the market. Microsoft AI CEO Mustafa Suleyman claimed that all three models have produced top-tier results:
"Three models. Three top-tier results. All shipped within just a few months by the @MicrosoftAI team.
- MAI-Transcribe-1 dropped today, the most accurate transcription model in the world across 25 languages according to FLEURS WER benchmark.
- MAI-Voice-1 sets a new standard for natural speech.
- MAI-Image-2 lands as a top 3 model family on @arena.
We've been building with them - now you can too. All 3 available now on Microsoft Foundry."
What are Microsoft’s new AI models?
MAI-Transcribe-1 is Microsoft's speech-to-text transcription model, built to convert spoken audio into text across the 25 most-used languages in the world. According to Microsoft, it transcribes audio 2.5 times faster than the company’s existing Azure Fast offering. It is priced at $0.36 per hour, a deliberately competitive entry point.
MAI-Voice-1 flips the process around, turning text into natural, realistic speech. The company says the model captures emotional nuance, expression and speaker identity even across long-form audio content. It can generate 60 seconds of audio in just one second. It is priced at $22 per one million characters.
MAI-Image-2 is Microsoft's image generation model. The model debuted as a top three model on the Arena.ai leaderboard and is now being rolled out across Copilot, Bing and PowerPoint.
Microsoft says users can expect at least twice the image generation speed compared to before, with no drop in quality. Pricing starts at $5 per one million tokens for text input and $33 per one million tokens for image output.
For years, Microsoft's AI story has largely been told through its relationship with OpenAI, the company behind ChatGPT, in which Microsoft has invested over $13 billion. Now, Microsoft is signalling that it intends to build and own its own AI capabilities too.
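For readers who want to see what the quoted prices mean in practice, the arithmetic can be sketched in a few lines of Python. This is a rough illustration only: the function and constant names are hypothetical, the figures are the starting prices cited above, and the sketch assumes simple linear billing with no tiers, minimums or rounding rules, none of which the article confirms.

```python
# Starting prices quoted in the article, in US dollars.
# Assumption: billing scales linearly with usage (no tiers or minimums).
TRANSCRIBE_USD_PER_HOUR = 0.36              # MAI-Transcribe-1
VOICE_USD_PER_MILLION_CHARS = 22.0          # MAI-Voice-1
IMAGE_USD_PER_MILLION_INPUT_TOKENS = 5.0    # MAI-Image-2, text input
IMAGE_USD_PER_MILLION_OUTPUT_TOKENS = 33.0  # MAI-Image-2, image output


def transcription_cost(hours: float) -> float:
    """Estimated cost of a transcription job billed at the per-hour rate."""
    return hours * TRANSCRIBE_USD_PER_HOUR


def voice_cost(characters: int) -> float:
    """Estimated cost of turning the given number of characters into speech."""
    return characters / 1_000_000 * VOICE_USD_PER_MILLION_CHARS


def image_cost(text_input_tokens: int, image_output_tokens: int) -> float:
    """Estimated cost of image generation from input and output token counts."""
    input_cost = text_input_tokens / 1_000_000 * IMAGE_USD_PER_MILLION_INPUT_TOKENS
    output_cost = image_output_tokens / 1_000_000 * IMAGE_USD_PER_MILLION_OUTPUT_TOKENS
    return input_cost + output_cost


if __name__ == "__main__":
    print(f"100 hours of transcription: ${transcription_cost(100):.2f}")
    print(f"500,000 characters of speech: ${voice_cost(500_000):.2f}")
    print(f"1M input + 2M output image tokens: ${image_cost(1_000_000, 2_000_000):.2f}")
```

At these rates, 100 hours of transcription would come to $36, half a million characters of generated speech to $11, and an image workload of one million input tokens plus two million output tokens to $71.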


