Meta's Muse Spark Launches—Gemini Still Leads
Meta launched Muse Spark on April 8, 2026 — a natively multimodal AI from Meta Superintelligence Labs that beats GPT 5.4 on health benchmarks but trails Gemini 3.1 Pro on abstract reasoning and coding.

What to Know
- Meta Muse Spark launched Wednesday at meta.ai and the Meta AI app, rolling out to Facebook, Instagram, and WhatsApp in coming weeks
- Muse Spark scored 42.8 on HealthBench Hard — beating GPT 5.4 (40.1) and Gemini 3.1 Pro (20.6) on medical reasoning
- Gemini 3.1 Pro still leads on abstract reasoning (76.5 vs 42.5 on ARC AGI 2), coding, and multimodal understanding
- Meta stock climbed 6.5% on Wednesday, closing at $612.42 following the announcement
Meta Muse Spark is here — and the company wants you to know it's a whole new era, not just another Llama update. Built by Meta Superintelligence Labs in nine months flat, the model went live Wednesday at meta.ai and the Meta AI app, with Facebook, Instagram, and WhatsApp getting their turn in the coming weeks. For 3.5 billion potential users, that's not a soft launch. That's a statement.
What Makes Muse Spark Different From Llama?
Muse Spark is natively multimodal — meaning it was designed from day one to handle images, text, and voice together, rather than stitching vision capabilities onto an existing language model. That distinction matters more than it sounds. Most frontier models started as text-first systems and got vision bolted on later. Muse Spark didn't.
The model ships with visual chain-of-thought reasoning, tool-use support, and what Muse Spark docs describe as 'Contemplating mode' — a setup that runs multiple AI agents in parallel to push through harder problems. Think of it as Meta's answer to Google's Gemini Deep Think and OpenAI's extended reasoning mode. With Contemplating mode engaged, Muse Spark hit 58% on Humanity's Last Exam and 38% on FrontierScience Research — territory that puts it in the same conversation as the premium tiers of its competitors, not the standard releases.
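Meta has not published how Contemplating mode works internally, but the general pattern it describes, fanning several agents out on the same problem and then aggregating their answers, is the familiar parallel self-consistency trick. A minimal sketch, with every name here illustrative rather than anything from Muse Spark's actual API:

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Stand-in for a model call: each "agent" attempts the same problem
# independently. Here they are deterministic toy solvers; in a real
# system each would be a separate sampled model rollout.
def agent_solve(agent_id: int, problem: int) -> int:
    # Illustrative only: agents 0 and 1 answer correctly, agent 2 is flawed.
    return problem * problem if agent_id < 2 else problem * problem + 1

def contemplate(problem: int, n_agents: int = 3) -> int:
    """Run n_agents on the same problem in parallel, then majority-vote."""
    with ThreadPoolExecutor(max_workers=n_agents) as pool:
        answers = list(pool.map(lambda i: agent_solve(i, problem), range(n_agents)))
    # Aggregation step: keep the answer most agents agree on.
    return Counter(answers).most_common(1)[0][0]

print(contemplate(7))  # two of three agents return 49, so 49 wins the vote
```

The aggregation step is the interesting design choice: majority voting is the simplest option, but a production system could just as plausibly use a judge model or a learned reranker to pick the winning trace.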
There's a notable departure baked into this launch: Muse Spark is closed source. No public weights, no architecture release. For a company that built serious credibility on the Llama open-weight series, that's a pivot worth flagging. Meta says it hopes to open-source future Muse versions. But after Llama 4's lukewarm reception earlier this year, they're clearly writing a different playbook this time.
Meta put it this way: "Muse Spark is the first step on our scaling ladder and the first product of a ground-up overhaul of our AI efforts. To support further scaling, we are making strategic investments across the entire stack — from research and model training to infrastructure, including the Hyperion data center."
Where Muse Spark Wins — and Where It Gets Beaten
The benchmark picture is genuinely mixed, which is worth sitting with. On medical reasoning, Muse Spark is the clear leader. Meta worked with more than 1,000 physicians to curate health training data, and the results on HealthBench Hard show it: 42.8 for Muse Spark versus 40.1 for GPT 5.4 and just 20.6 for Gemini 3.1 Pro. That gap against Gemini is not marginal: Muse Spark's score is more than double. On agentic search via DeepSearchQA, Muse Spark also leads with 74.8, narrowly ahead of GPT 5.4 (73.6) and well clear of Gemini (69.7). On CharXiv Reasoning — understanding figures from scientific papers — it posted 86.4, the highest across all three models.
But then there's ARC AGI 2, the abstract reasoning benchmark that's become a kind of shorthand for raw AI capability. Gemini scored 76.5. Muse Spark scored 42.5. That's not a gap — that's a gulf. Coding tells a similar story: LiveCodeBench Pro has Gemini at 82.9 against Muse Spark's 80.0, and on MMMU Pro multimodal understanding, Gemini's 83.9 beats Meta's 80.4. Meta's own blog quietly acknowledges performance gaps in long-horizon agentic work and coding workflows. Credit for saying the quiet part out loud.
So: Muse Spark wins on health and agentic search. Gemini wins on abstract reasoning, coding, and multimodal breadth. Neither is the universal best. Pick your use case.
The $14 Billion Bet Behind Muse Spark
Meta Superintelligence Labs — the team that built Muse Spark — was assembled nine months ago after Meta's acquisition of Scale AI for $14 billion, with Alexandr Wang stepping in as Chief AI Officer. Nine months from standing start to a frontier model launch is fast, even by tech standards. Internally codenamed Avocado, the project relied on a new pretraining stack that Meta claims can match Llama 4 Maverick's capability level using over 10 times less compute. If that efficiency claim holds under scrutiny, it matters for the economics of the AI arms race as much as for any benchmark.
The market took notice. Meta stock climbed nearly 9% during Wednesday's trading session before settling at a 6.5% gain, closing at $612.42. That's the market pricing in the possibility that this is more than a product refresh — that Meta has genuinely re-entered the frontier model race after a year of playing catch-up.
There's also a commercial layer here that doesn't get enough attention. Muse Spark ships with a shopping assistant that compares products and routes directly to purchases. With 3.5 billion active users across Meta's apps, that's an AI monetization play that neither Google nor OpenAI can replicate at the same social graph scale. The model's system prompt was extracted within minutes of launch — a reminder that 'closed source' and 'impenetrable' are not the same thing. A private API preview is opening to select developers, which suggests Muse Spark is being positioned as a platform, not just a consumer product.
What Does Muse Spark Mean for AI Investors?
For anyone tracking AI infrastructure plays, the compute efficiency claim is the headline that matters most. Meta says the new pretraining stack reaches Llama 4 Maverick-level performance with 10x less compute. If that scales into the Muse family's next model — already reportedly in development — the cost curve for frontier AI shifts in a way that changes capital allocation across the sector.
The closed-source decision also signals something: Meta is done giving away its best work to competitors who deploy it commercially. Open source was a great strategy when Llama was catching up. Now that Meta is trying to lead, the incentives look different. Call it maturation, call it a defensive moat — either way, the era of Meta gifting frontier weights to the world appears to be on pause.
Muse Spark is described internally as 'small and fast' — the first rung on the Muse scaling ladder. Which means the real test of Meta's AI ambitions isn't Wednesday's launch. It's whatever comes next.
Frequently Asked Questions
What is Meta Muse Spark?
Muse Spark is Meta's most capable AI model to date, launched on April 8, 2026. Built by Meta Superintelligence Labs, it is natively multimodal — handling text, images, and voice from the ground up — and includes visual chain-of-thought reasoning, tool-use support, and a parallel-agent 'Contemplating mode' for complex tasks.
How does Muse Spark compare to Gemini 3.1 Pro?
Muse Spark leads Gemini 3.1 Pro on medical reasoning (42.8 vs 20.6 on HealthBench Hard) and agentic search. Gemini 3.1 Pro still leads on abstract reasoning — scoring 76.5 versus 42.5 on ARC AGI 2 — as well as on coding and multimodal understanding benchmarks.
Is Muse Spark open source like Llama?
No. Muse Spark is a closed model — its weights and architecture are not publicly released. This is a departure from Meta's Llama series. Meta has said it hopes to open-source future versions of Muse, but the current model's code remains private.
What is Meta Superintelligence Labs?
Meta Superintelligence Labs is the AI research division Meta assembled nine months ago following its $14 billion acquisition of Scale AI. Chief AI Officer Alexandr Wang leads the team, which built Muse Spark from scratch using a new pretraining stack that Meta claims is over 10 times more compute-efficient than its Llama 4 training process.
