Microsoft unveils MAI-Voice-1 and MAI-1-preview to reduce OpenAI reliance

Microsoft is taking another stride toward independence from OpenAI, unveiling two homegrown AI systems: MAI-Voice-1 and MAI-1-preview. It’s the first major output from the company’s internal AI unit, and the debut looks confident.

The headline act is MAI-Voice-1, a speech model that can generate a one-minute audio clip in under a second while running on a single GPU. The synthesized voice is convincing enough that distinguishing it from a real human speaker is difficult. The model is already at work in Copilot Daily, where an AI host reads the news and delivers podcast-style explainers on complex topics. It’s also available to try in Copilot Labs, where users can enter text, switch between voices, and adjust the speaking style. The combination of speed and modest hardware requirements points to careful engineering.

The second system, MAI-1-preview, is a text model trained on roughly 15,000 Nvidia H100 GPUs. It’s built to follow instructions and generate ChatGPT-style answers. Microsoft plans to integrate it into Copilot soon to reduce its reliance on OpenAI, and the model is already being evaluated publicly on the LMArena benchmarking platform — a sign the team is confident and keen to gather broad feedback.

With these models, Microsoft steps into a dual role: still a partner to OpenAI, but also a rival. While OpenAI develops GPT-5 and Google DeepMind advances its Gemini models, Microsoft is clearly shoring up its own position by offering alternatives that already stand out for their speed and quality.