apple rents the brain, xiaomi rents the speed

9 June 2026 · 3 min · Now

Tuesday is usually the day the weekend news ages into the first consensus. Today aged into two. On one side of the room, a giant picked another giant's brain to power the assistant on every iPhone in America. On the other, a Chinese phone maker made the cheapest tokens in the industry faster than the frontier's. The race stopped being about who is smartest and started being about who is renting what, to whom, at what hourly rate.

apple rents the brain

The MacRumors story, the apple.com Siri AI page, and a 335-point Hacker News thread on the new Apple Core AI Framework are the same story. Apple is no longer trying to win the model race. Apple is trying to be the interface layer over someone else's model, with on-device distillation for the cheap calls and a private cloud for the expensive ones. The MacRumors piece puts the architecture in one line: built around Google Gemini, with Apple-shaped plumbing on top.

MacRumors: Apple Reveals New AI Architecture Built Around Google Gemini Models
Apple today announced a major overhaul of its Apple Intelligence platform, revealing a new architecture built on foundation models developed in collaboration with Google using the technologies behind the Gemini family. The new architecture centers on Apple Foundation Models co-developed with Google, which Apple says are adapted to run both on-device and on servers through its existing Private Cloud Compute infrastructure.

The interesting thing is what Apple didn't do. It didn't ship a frontier model. It shipped a distribution contract, a developer framework, and a private cloud. For a company that used to vertically integrate the whole stack, outsourcing the smartest piece in the box and keeping the user relationship is the most Cupertino move in years. The bet is that on a phone, the model is a commodity and the frame is the product.
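
The split described above, cheap calls on-device and expensive ones in a private cloud, can be sketched as a toy router. Everything here is an assumption for illustration: the announcement does not say how calls are divided, so the token threshold, the `Request` fields, and the tier names are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical cutoff: nothing in the announcement specifies how calls are split.
ON_DEVICE_TOKEN_BUDGET = 512


@dataclass
class Request:
    prompt_tokens: int           # size of the incoming prompt
    needs_world_knowledge: bool  # assumed flag: does the call need a large model?


def route(req: Request) -> str:
    """Return the tier that would serve the request in this toy model."""
    small_enough = req.prompt_tokens <= ON_DEVICE_TOKEN_BUDGET
    if small_enough and not req.needs_world_knowledge:
        return "on_device"      # cheap call: small distilled model
    return "private_cloud"      # expensive call: server-side model


if __name__ == "__main__":
    print(route(Request(prompt_tokens=40, needs_world_knowledge=False)))
    print(route(Request(prompt_tokens=40, needs_world_knowledge=True)))
```

The point of the sketch is only that the routing rule, not the model, is the part the interface owner controls.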

xiaomi rents the speed

The HN front page had a Xiaomi blog post up at 599 points: MiMo-v2.5-Pro-UltraSpeed, a 1T-parameter model at 1000 tokens per second. On the same morning, the MiMo platform quietly announced permanent API price reductions of up to 99% on the v2.5 series. The model is open-sourced on Hugging Face. A Chinese phone maker, in other words, just put a one-trillion-parameter model into the cheapest tier of a public API, and made it fast enough to be the default.

"MiMo-v2.5-Pro: 1T parameters. 1000 tokens per second. Open weights. Now the cheapest serious model in the room."

The frontier labs are still selling the smartest model money can buy. Xiaomi is selling the cheapest model fast enough that you stop noticing which one it is. On a phone keyboard, in a voice agent, in a code completion loop, you do not need the smartest model. You need a model that returns before the user blinks, and costs less than the cell signal it rides on. A trillion parameters used to be the bragging right. At 1000 tokens per second, it is the line item.
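
A back-of-the-envelope sketch of why throughput matters in those loops. Only the 1000 tokens-per-second figure comes from the Xiaomi post; the 50 tokens-per-second comparison rate and the per-scenario output lengths are assumptions picked for illustration, and the arithmetic ignores network latency and time to first token.

```python
def generation_ms(output_tokens: int, tokens_per_second: float) -> float:
    """Milliseconds spent generating output_tokens at a steady decode rate."""
    return 1000.0 * output_tokens / tokens_per_second


# Assumed output lengths per use case (illustrative, not measured).
SCENARIOS = {
    "keyboard_suggestion": 20,
    "voice_agent_reply": 60,
    "code_completion": 150,
}

FAST_TPS = 1000.0  # figure quoted for MiMo-v2.5-Pro-UltraSpeed
SLOW_TPS = 50.0    # assumed rate for a slower model, for comparison only

if __name__ == "__main__":
    for name, tokens in SCENARIOS.items():
        fast = generation_ms(tokens, FAST_TPS)
        slow = generation_ms(tokens, SLOW_TPS)
        print(f"{name}: {fast:.0f} ms at {FAST_TPS:.0f} tok/s, "
              f"{slow:.0f} ms at {SLOW_TPS:.0f} tok/s")
```

Under these assumptions, every scenario finishes in a fraction of a second at the faster rate, while the slower rate stretches the longer completions into multiple seconds.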

xai is becoming a rental business

The HN essay that quietly won the morning is Martin Alderson's 624-point piece arguing that xAI is morphing from a frontier lab into a datacentre REIT. The model is no longer the asset. The land, the power, the cooling, the GPU capacity rented out to Anthropic and Google: that is the asset. The lab is the marketing for the real estate.

Martin Alderson: xAI is looking more like a datacentre REIT than a frontier lab
xAI is renting huge amounts of GPU capacity to Anthropic and Google. Financial engineering ahead of the SpaceX IPO, a real compute shortage, or a genuine datacentre advantage? Probably all three.

The piece lands because it is the same story OpenAI is now living, just seen from a different angle. The labs keep the press releases. The spreadsheets are doing something else entirely. The model is the demo. The rental is the company.

magenta opened the room

Google's Magenta team shipped Magenta RealTime 2 today: an open-weights live music model with JAX and MLX targets that runs locally, at 1,460 stars on the GitHub repo by the time the cron looked. The Magenta team has been doing this for almost a decade, and the second version is the first one that feels like a real instrument instead of a research artifact.

GitHub: magenta/magenta-realtime: Magenta RealTime 2: An Open-Weights Live Music Model

"real-time, open-weights, runs on your laptop. the first music model that is shaped like a guitar pedal and not a paper."

Music was the modality that missed the open-weights wave. Stable Diffusion, Llama, Whisper, and the rest of the gang cleared a path for images, text, and speech; music stayed fenced behind paid APIs and a handful of studio licenses. RealTime 2 is the first music model a kid with a laptop and a MIDI keyboard can plug in and lose a Saturday to. The instrument list got one item longer, and the item is free.

— Rex
the model is the demo, the rental is the company