Sakana AI turns research into 8-hour “Virtual CSO” reports, not instant chatbot answers
Marlin runs autonomous, multi-model reasoning loops to produce 100-page, cited strategy reports for enterprises.

Tokyo-based Sakana AI has launched its first commercial product, Sakana Marlin, billed as a “Virtual CSO” for B2B strategy research. Instead of seconds-long text replies, Marlin runs continuous self-governing reasoning loops for up to eight hours to deliver deeply researched, well-cited strategy reports and executive slides.
Sakana AI just shipped a new kind of enterprise AI tool: Sakana Marlin, a “Virtual CSO” that can run up to eight hours of autonomous reasoning to produce deeply researched, well-cited strategy outputs. The big reversal here is speed. Where most generative AI tools still optimize for instant responses, Marlin is explicitly built for long-running jobs, delivering 100-page strategy reports plus executive slides at the end of a multi-hour run.
For decision-makers, that shift matters because strategy is usually the bottleneck, not drafting. Marlin is designed for enterprise use, targeting corporations, financial institutions, and think tanks, and it’s positioned as an alternative to “prompt engineering back-and-forth.” The workflow starts with a core research topic, then a brief initial exchange to sharpen scope, and then the user steps away while the system operates as a self-contained digital strategy team. Over the next several hours, it formulates hypotheses, navigates the web, cross-references sources to verify findings, and maps causal dynamics in complex business environments.
If that sounds like a search engine with better manners, Sakana is asking you to think bigger. The company’s framing is less “find information quickly” and more “generate strategy options with professional-grade structure.” In Marlin’s case, the output is not treated like a generic text blob. It includes executive summary slides, appendices, references, and a deeply researched report. That structure is exactly what enterprise teams need when they are preparing internal decisions, board materials, investment memos, policy briefs, or scenario planning.
Sakana also gave examples meant to prove the system can handle messy, multi-factor questions. The sample use cases include generating detailed resolution scenarios for a theoretical blockade of the Strait of Hormuz, mapping the fragmented global AI regulation patchwork, and analyzing macroeconomic trends like the return of “bond vigilantes.” Whether or not you work on geopolitics or rates, the underlying point is the same: these are not single-answer prompts. They require synthesis across domains and an attempt at causal reasoning, not just summarization.
Under the hood, Marlin is powered by what Sakana calls an exploration engine that builds on its earlier laboratory breakthroughs. The product relies on Sakana’s Adaptive Branching Monte Carlo Tree Search (AB-MCTS) and leverages frameworks derived from “The AI Scientist,” Sakana AI’s earlier research project featured in the journal Nature that automated parts of the scientific discovery process from ideation to peer review. Sakana also says Marlin relies on multiple AI models, but the reporting on the launch does not identify specific models or providers.
The technical core is the shift from repeated sampling to a branching, feedback-driven approach. Traditionally, people try to improve large language model outputs by running the model dozens of times and hoping one answer lands. That blind parallelism cannot evaluate intermediate steps or pivot based on feedback. AB-MCTS is designed to treat the research process as a branching tree of possibilities. Along the way, it dynamically balances “Going Wider” (exploration) by spawning entirely new alternative hypotheses when progress stalls, and “Going Deeper” (exploitation) by refining, auditing, and building on a candidate that shows strategic promise.
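The wider-versus-deeper trade-off described above can be illustrated with a toy sketch. This is not Sakana's implementation or the TreeQuest API; it is a minimal illustration that assumes Thompson sampling over Beta posteriors to decide, at each node, whether to branch (wider) or descend into the most promising child (deeper). The names `Node`, `step`, `make_toy_generator`, and `run_search` are hypothetical, and the "research task" is replaced by guessing a hidden number so the example runs without any model calls.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

# Hypothetical generator signature: takes a parent candidate (None means
# "start from scratch") and returns (new_candidate, score in [0, 1]).
Generator = Callable[[Optional[float]], Tuple[float, float]]


@dataclass
class Node:
    state: Optional[float]            # candidate answer; None for the root
    score: float = 0.0                # evaluator score for this candidate
    children: List["Node"] = field(default_factory=list)
    gen_a: float = 1.0                # Beta posterior for "go wider" here
    gen_b: float = 1.0
    sub_a: float = 1.0                # Beta posterior for "go deeper" into this subtree
    sub_b: float = 1.0


def step(root: Node, generate: Generator, rng: random.Random) -> Node:
    """Run one search step and return the newly created node."""
    node, path = root, [root]
    while node.children:
        wider = rng.betavariate(node.gen_a, node.gen_b)
        sampled = [(rng.betavariate(c.sub_a, c.sub_b), c) for c in node.children]
        deeper, best_child = max(sampled, key=lambda pair: pair[0])
        if wider >= deeper:
            break                     # go wider: add a sibling branch at this node
        node = best_child             # go deeper: follow the promising child
        path.append(node)

    state, score = generate(node.state)
    score = min(1.0, max(0.0, score))  # keep posterior parameters valid
    child = Node(state=state, score=score)
    child.sub_a += score
    child.sub_b += 1.0 - score
    node.children.append(child)

    # Feedback: the expanded node learns how well branching worked, and every
    # node on the path learns how promising its subtree is.
    node.gen_a += score
    node.gen_b += 1.0 - score
    for visited in path:
        visited.sub_a += score
        visited.sub_b += 1.0 - score
    return child


def make_toy_generator(target: float, rng: random.Random) -> Generator:
    """Toy stand-in for an LLM call plus an evaluator (an assumption)."""
    def generate(parent: Optional[float]) -> Tuple[float, float]:
        if parent is None:
            guess = rng.random()                    # brand-new hypothesis
        else:
            guess = parent + rng.gauss(0.0, 0.05)   # refinement of the parent
        return guess, 1.0 - min(1.0, abs(guess - target))
    return generate


def run_search(budget: int = 60, seed: int = 0) -> Tuple[Node, List[Node], Node]:
    rng = random.Random(seed)
    root = Node(state=None)
    generate = make_toy_generator(0.7, rng)
    created = [step(root, generate, rng) for _ in range(budget)]
    best = max(created, key=lambda n: n.score)
    return root, created, best


if __name__ == "__main__":
    root, created, best = run_search()
    print(f"root branches: {len(root.children)}, best score: {best.score:.3f}")
```

Unlike blind repeated sampling, every step here is conditioned on the scores observed so far: weak branches stop attracting budget, and new root-level hypotheses are spawned when existing ones stall.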
What makes this more than a lab demo is the extension to Multi-LLM AB-MCTS. The system can coordinate heterogeneous models and delegate different sub-tasks to different models, including an orchestration model for initial ideation and a reasoning-heavy model for auditing and verification. In effect, it’s aiming to harness strengths across frontier models over thousands of automated cycles, with AB-MCTS providing “mathematical guardrails” so the outputs are not just long-winded generations but the product of systematic trial and error.
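One plausible way to route work across several models is a bandit that learns from scores which model tends to perform well. The sketch below is an assumption for illustration only: the reporting does not describe how Marlin assigns sub-tasks, and `ModelSelector`, `simulate`, and the model names `model_a`, `model_b`, and `model_c` are hypothetical. Real model calls are replaced by noisy scores drawn around a made-up quality level.

```python
import random
from typing import Dict, List


class ModelSelector:
    """Thompson-sampling selector over a fixed set of model names."""

    def __init__(self, model_names: List[str], rng: random.Random) -> None:
        self.rng = rng
        # One Beta(alpha, beta) posterior per model, starting uniform.
        self.posterior: Dict[str, List[float]] = {
            name: [1.0, 1.0] for name in model_names
        }

    def choose(self) -> str:
        # Draw one sample per model and pick the highest draw.
        draws = {
            name: self.rng.betavariate(a, b)
            for name, (a, b) in self.posterior.items()
        }
        return max(draws, key=lambda name: draws[name])

    def update(self, name: str, score: float) -> None:
        score = min(1.0, max(0.0, score))  # clamp so parameters stay valid
        self.posterior[name][0] += score
        self.posterior[name][1] += 1.0 - score


def simulate(rounds: int = 200, seed: int = 0) -> Dict[str, int]:
    """Count how often each hypothetical model is picked."""
    rng = random.Random(seed)
    # Made-up average quality per model; stands in for real evaluations.
    quality = {"model_a": 0.45, "model_b": 0.60, "model_c": 0.75}
    selector = ModelSelector(list(quality), rng)
    picks = {name: 0 for name in quality}
    for _ in range(rounds):
        name = selector.choose()
        score = rng.gauss(quality[name], 0.1)  # noisy stand-in for a scored output
        selector.update(name, score)
        picks[name] += 1
    return picks


if __name__ == "__main__":
    print(simulate())
```

A natural extension, again an assumption, would keep one selector per sub-task type (for example, ideation versus auditing), so that a model can be favored for one role without being favored for the other.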
There is a second enterprise requirement here that is easy to underestimate: data handling and licensing. Sakana Marlin is not a general consumer tool. It’s a commercial SaaS offering restricted to corporate entities, organizations, and sole proprietors. Sakana says neither it nor its external AI service providers will use customer data or inputs for model training or fine-tuning unless the client provides explicit opt-in consent. Even then, data is heavily processed to remove personally identifiable information. For teams handling sensitive M&A research, unreleased product strategy, or proprietary market analysis, that closed-loop security is not a nice-to-have. It is often the difference between pilot and procurement.
Sakana also provided historical context that hints at why this launch is credible, not just marketing. The AB-MCTS framework was publicly introduced in June 2025 alongside the research paper “Wider or Deeper? Scaling LLM Inference-Time Compute with Adaptive Branching Tree Search.” The company released the underlying algorithm as an open-source library called TreeQuest under the permissive Apache 2.0 license to encourage experimentation. That open-source release laid the foundation for the proprietary, enterprise-grade Marlin product that followed a year later.
The market context is straightforward: the generative AI hype cycle has been speed-first. For the past two years, demos built around near-instant responses have set the bar. But enterprises are now staring at the cost of shallow outputs and the risk of confident wrongness. A tool that can run up to eight hours of self-governing reasoning to deliver vetted, structured reports is not just a feature upgrade. It changes how strategy work can be scheduled, delegated, and audited. If Marlin lands with customers, it pressures the rest of the category to stop competing only on instant text and start competing on long-horizon thinking, verification, and enterprise-grade governance.