Every AI product you use today has a hidden ceiling. The more context you give a model (a longer document, more conversation history, a bigger codebase), the more expensive it becomes to run, and the worse it gets at picking out what actually matters. That's why AI tools forget things mid-conversation, why long documents get chopped into pieces before going in, and why inference bills are so hard to predict.
A company called Subquadratic claims to have solved that ceiling at the architecture level.
What they announced
SubQ 1M-Preview is their first model. Today's AI models have a math problem. When you double the amount of information you feed them, the cost of running the model roughly quadruples. SubQ's design changes that. Doubling the input only doubles the cost, which means the model can read far more in a single pass without the bill spiraling and without losing track of what matters.
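The scaling difference above can be sketched with a toy cost model. The constants here are invented and the functions are purely illustrative; only the growth rates matter:

```python
# Toy cost models for the scaling behavior described above.
# Standard designs: cost grows with the square of the input size.
# Subquadratic designs: cost grows in direct proportion to it.
# (Illustrative only; real inference costs involve many more factors.)

def quadratic_cost(tokens: int) -> int:
    """Cost under a design where work scales with tokens squared."""
    return tokens * tokens

def linear_cost(tokens: int) -> int:
    """Cost under a design where work scales with tokens."""
    return tokens

n = 10_000
print(quadratic_cost(2 * n) / quadratic_cost(n))  # doubling input -> 4.0x cost
print(linear_cost(2 * n) / linear_cost(n))        # doubling input -> 2.0x cost
```

The gap compounds: at 10x the input, the quadratic design pays 100x while the linear one pays 10x, which is why long inputs are where the economics diverge most sharply.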
The headline numbers they're sharing:
- Matches or beats today's leading models like Claude Opus 4.6 on standard accuracy tests
- Outperforms most top models on tasks that involve finding information buried inside very long inputs
- Scores ahead of several leading models on the standard test for AI coding ability
- A research version reportedly reads the equivalent of dozens of full-length books in a single pass, at a fraction of the computing power today's models would need
If those numbers hold up under independent testing, this is top-tier performance at radically different economics.
Why it matters for products being built today
Most AI applications shipping right now are built around the limits of today's models. Teams build elaborate systems to fetch only the most relevant snippets before sending them to the AI, because feeding in the full source is too expensive. Long documents get sliced into pieces and reassembled. Complex tasks get split across multiple AI agents because a single model cannot hold the full picture.
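As a rough illustration, the "fetch only the most relevant snippets" workaround often reduces to a pipeline like this minimal sketch. The function names, chunk size, and word-overlap scoring are hypothetical, not any specific product's code:

```python
# Hypothetical sketch of the chunk-and-retrieve scaffolding described
# above: slice a long source into pieces, then send the model only the
# few pieces that look relevant to the user's question.

def chunk(document: str, size: int = 500) -> list[str]:
    """Slice a long document into fixed-size pieces."""
    return [document[i:i + size] for i in range(0, len(document), size)]

def retrieve(chunks: list[str], query: str, k: int = 3) -> list[str]:
    """Crude relevance ranking: count words each chunk shares with the query."""
    query_words = set(query.lower().split())
    ranked = sorted(
        chunks,
        key=lambda c: -len(query_words & set(c.lower().split())),
    )
    return ranked[:k]
```

With a model whose cost grows only linearly with input, much of this scaffolding can collapse to "send the whole document", which is the shift the article is describing.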
If SubQ's approach delivers in real-world use, much of that workaround machinery becomes optional. You could feed an entire codebase, a full library of documents, or a year of customer conversations into one model in one go, without the cost making the product unviable.
That changes what founders can actually build. Customer support agents that remember every past conversation. Legal review tools that read full case files in one pass. Financial assistants that ingest entire quarterly reports without being chopped up first. Applications that died on a spreadsheet because the math didn't work start to look viable.
What to actually watch
Worth being honest here. SubQ is currently in private beta. Most of the bigger numbers come from Subquadratic itself, with only one independently verified test in the mix. The AI industry has seen plenty of launch-day claims quietly underperform once developers got their hands on the product. The real test starts now.
A few things worth tracking over the next quarter:
- Independent test results from third parties
- Real pricing once the model leaves beta
- How it performs on messy real-world workloads, not curated tests
- Whether the design holds up at the much larger input sizes the company is hinting at
The bigger pattern
Every major constraint in computing has eventually broken, and when it does, the products built on top reorganize around the new ground rules. If SubQ's approach reaches production, the next generation of AI products will look meaningfully different. Fewer workarounds. More direct reasoning over full datasets. AI assistants that hold their thread over weeks of work instead of minutes.
For anyone building or deploying AI right now, this is worth watching closely. Even if SubQ specifically does not pan out, the direction is real and will reshape what gets built over the next two years.
The companies that move first when constraints break are usually the ones that define the next category.
