SubQ 1.1 Small

Subquadratic has released SubQ 1.1 Small, a language model that processes very long documents using sparse attention rather than traditional dense attention. The cost of dense attention grows quadratically with context length; Subquadratic says SubQ scales linearly, requiring 64 times less compute than dense attention and running 56 times faster than optimized baselines at 1 million tokens. The company reports near-perfect retrieval accuracy across context windows from 1 million to 12 million tokens, even though the model was trained primarily at 1 million tokens, one-twelfth of the longest window tested. On standard benchmarks, the reported scores combine long-context capability with reasoning performance: 99% on multi-task retrieval, 85% on graduate-level science, and nearly 90% on competitive programming. The model is designed for financial analysis, legal contracts, and large codebases, and is rolling out to design partners this summer, with general availability expected by year's end. According to Subquadratic, these efficiency gains make long-context AI practical for enterprise workloads that previously had to be split into fragments or handled with workarounds.
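
To make the quadratic-versus-linear contrast concrete, the sketch below counts how many query-key pairs each approach would score. It is an illustration only: the summary does not describe how SubQ's sparse attention works, so the fixed per-token budget (`BUDGET`) and all function names are hypothetical, and the budget value is back-calculated so that the ratio comes out to the reported 64x at 1 million tokens.

```python
def dense_attention_pairs(n_tokens: int) -> int:
    """Pairs scored by dense attention: every token attends to every token."""
    return n_tokens * n_tokens


def sparse_attention_pairs(n_tokens: int, budget: int) -> int:
    """Pairs scored if each token attends to at most `budget` positions.

    Assumption: a fixed per-token budget, which gives linear growth in n_tokens.
    This is not taken from the SubQ report.
    """
    return n_tokens * min(budget, n_tokens)


def savings_ratio(n_tokens: int, budget: int) -> float:
    """How many times fewer pairs the sparse scheme scores than dense attention."""
    return dense_attention_pairs(n_tokens) / sparse_attention_pairs(n_tokens, budget)


# Hypothetical budget, chosen only so the ratio equals 64 at 1 million tokens.
BUDGET = 15_625

for n in (1_000_000, 12_000_000):
    print(f"{n:>12,} tokens: dense / sparse = {savings_ratio(n, BUDGET):.0f}x")
```

Under this assumption the gap widens with context length, since the dense count grows with the square of the token count while the sparse count grows in proportion to it. Real compute savings also depend on parts of the model other than attention, which this count ignores.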

Source: https://subq.ai/subq-1-1-small-technical-report
