Microsoft just declared war on OpenAI and Google with its own in-house models. Plus, Google's open-weight Gemma 4 dropped, and AI coding tools are going full agent mode. This week moved fast.
Top 3 AI Stories
1. Microsoft Launches MAI-Transcribe-1, MAI-Voice-1, and MAI-Image-2
Microsoft AI released three foundational models on April 2nd — speech-to-text, text-to-speech, and image generation. MAI-Transcribe-1 hits 3.8% word error rate across 25 languages. MAI-Voice-1 generates 60 seconds of natural audio in one second. MAI-Image-2 landed top-3 on the Arena.ai leaderboard with 2x faster generation. All available through Microsoft Foundry. This is Microsoft building its own AI stack, reducing OpenAI dependency. For builders, Foundry access means cheaper multimodal pipelines.
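Not sure what a 3.8% word error rate means in practice? WER is the number of word-level substitutions, deletions, and insertions needed to turn the transcript into the correct text, divided by the number of words in the correct text. Here is a minimal sketch of the standard calculation. The function name and sample sentences are illustrative, not part of any Microsoft tooling, and real benchmarks also normalize punctuation and numbers before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from transcript
            insertion = curr[j - 1] + 1  # extra word in transcript
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One wrong word out of four reference words.
print(word_error_rate("turn the lights off", "turn the light off"))  # 0.25
```

At 3.8%, that works out to roughly 4 wrong words per 100 spoken. Run a check like this on your own sample audio before quoting accuracy to a client.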
2. Google Drops Gemma 4 Open-Weight Models
Google released Gemma 4, a family of open-weight models that run from edge devices to data centers. They support reasoning, multimodal inputs, and agentic workflows out of the box. If you're building AI tools on a budget, open-weight models like these cut your per-call API costs to near zero, though you still pay for the hardware you run them on. Run them locally, fine-tune for your niche, keep the margins.
3. Cursor 3 Goes Agent-First
Cursor launched version 3 with a completely redesigned agent-first interface. Instead of writing code line by line, you assign tasks to AI agents that plan, write, and test code for you. This is where coding is heading: less typing, more directing. If you're selling automation services, tools like this can cut your delivery time substantially.
Tool of the Week
Microsoft MAI Playground just launched alongside the new models. It lets you test MAI-Voice-1, MAI-Transcribe-1, and MAI-Image-2 for free in-browser. If you're doing voiceover work, podcast editing, or image generation for clients, test these before committing to any API. The voice cloning from a few seconds of audio alone is worth checking out.
Quick Tip
Google's TurboQuant algorithm shrinks LLM memory usage by 6x. If you're self-hosting open models, this means running bigger models on cheaper hardware. Check the Google AI blog for implementation details.
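To see what a 6x reduction could mean for your hardware budget, here is a back-of-the-envelope sketch. It assumes the 6x figure applies to the whole memory footprint you measure, which is a simplification: the real savings depend on which part of memory the algorithm compresses. The function names and the 48 GB and 12 GB figures are made up for illustration.

```python
def reduced_footprint_gb(baseline_gb: float, reduction_factor: float = 6.0) -> float:
    """Memory needed after applying a given reduction factor."""
    if baseline_gb <= 0 or reduction_factor <= 0:
        raise ValueError("baseline_gb and reduction_factor must be positive")
    return baseline_gb / reduction_factor


def fits_on_device(baseline_gb: float, device_gb: float,
                   reduction_factor: float = 6.0) -> bool:
    """True if the reduced footprint fits in the device's memory."""
    return reduced_footprint_gb(baseline_gb, reduction_factor) <= device_gb


# Hypothetical workload: 48 GB today, target card has 12 GB.
print(reduced_footprint_gb(48))   # 8.0
print(fits_on_device(48, 12))     # True
```

Treat the result as a ceiling on what to expect, then measure actual usage on your own setup.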
Side Hustle Spotlight
AI Transcription Service: MAI-Transcribe-1 handles noisy audio (call centers, meetings) at a 3.8% word error rate across 25 languages. Package this as a transcription service for podcasters, consultants, and legal firms. Charge $50-100 per hour of audio. Your cost per hour on Foundry? Under $2. That's a 25-50x markup, or roughly a 96-98% gross margin before you count your own time.
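Want to sanity-check the numbers? Price divided by cost is a markup multiple, while gross margin is profit as a share of price. The sketch below computes both, assuming the worst-case $2 per hour cost from above and ignoring your own editing and QA time. The function name is illustrative.

```python
def unit_economics(price_per_hour: float, cost_per_hour: float) -> dict:
    """Markup multiple and gross margin for one hour of transcribed audio."""
    if price_per_hour <= 0 or cost_per_hour <= 0:
        raise ValueError("price and cost must be positive")
    profit = price_per_hour - cost_per_hour
    return {
        "markup_multiple": price_per_hour / cost_per_hour,    # price relative to cost
        "gross_margin_pct": profit * 100 / price_per_hour,    # profit as % of price
    }


# Low and high ends of the suggested price range, at an assumed $2/hour cost.
for price in (50, 100):
    print(price, unit_economics(price, 2))
# 50 {'markup_multiple': 25.0, 'gross_margin_pct': 96.0}
# 100 {'markup_multiple': 50.0, 'gross_margin_pct': 98.0}
```

Plug in your real Foundry bill and the hours you spend cleaning up transcripts to get a number you can actually plan around.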
That's a wrap! If this newsletter helped you stay ahead of the AI curve, share it with a friend who needs to level up.
See you next week,
YB