
Search as Code: Perplexity's New Agent Search

Perplexity's Search as Code lets agents write Python to orchestrate search primitives directly. Why the codegen pattern beats serial tool calling for builders.


For most of the agent era, search has behaved like a vending machine. You put in a query, the machine runs its fixed internal pipeline, and out comes a tray of ranked results.

That contract was fine when AI mostly answered single questions. It breaks the moment an agent has to complete a task that spans hundreds of retrieval steps.

Perplexity's new Search as Code architecture is a direct response to that break, and it signals where agentic systems are heading.

What actually changed

Instead of calling a search endpoint and consuming the output, the model now writes Python that reaches into the search stack itself.

Perplexity rearchitected their search into atomic primitives exposed through an Agentic Search SDK: retrieval, fanout, ranking, filtering, deduplication, rendering. The model composes those primitives into a pipeline built for the task, then runs it inside a secure sandbox.

The shift fixes real failure modes. Traditional tool calling forces every search operation through its own round trip to the model. That means serial control flow when the work is naturally parallel, ballooning cost when one rigid pipeline gets reused for jobs it was never tuned for, and context pollution as noisy intermediate state piles into the window.

Codegen collapses many of those turns into a single program. Fanout queries run concurrently. Filtering and joining happen in deterministic code. The model only sees the output it actually needs.
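To make the shape of such a program concrete, here is a minimal sketch of what a generated pipeline might look like. It is not the Agentic Search SDK: the function names (`retrieve`, `fanout`, `dedupe`, `rank`, `render`), the in-memory `CORPUS`, and the scores are all hypothetical stand-ins chosen for illustration. The point is only the pattern: concurrent fanout, deterministic filtering, and a compact final output.

```python
import asyncio

# Hypothetical in-memory corpus standing in for a real search backend.
CORPUS = {
    "openssl cve": [
        {"url": "https://example.com/a", "title": "OpenSSL advisory", "score": 0.9},
        {"url": "https://example.com/b", "title": "OpenSSL changelog", "score": 0.4},
    ],
    "openssl fixed version": [
        {"url": "https://example.com/a", "title": "OpenSSL advisory", "score": 0.8},
        {"url": "https://example.com/c", "title": "Vendor patch notes", "score": 0.7},
    ],
}

async def retrieve(query):
    """Hypothetical retrieval primitive: one query in, raw hits out."""
    await asyncio.sleep(0.01)  # simulate network latency
    return CORPUS.get(query, [])

async def fanout(queries):
    """Run every query concurrently instead of one model turn per query."""
    batches = await asyncio.gather(*(retrieve(q) for q in queries))
    return [doc for batch in batches for doc in batch]

def dedupe(docs):
    """Keep the highest-scoring hit per URL."""
    best = {}
    for doc in docs:
        if doc["url"] not in best or doc["score"] > best[doc["url"]]["score"]:
            best[doc["url"]] = doc
    return list(best.values())

def rank(docs, min_score=0.5):
    """Drop weak hits, then sort the rest by descending score."""
    kept = [d for d in docs if d["score"] >= min_score]
    return sorted(kept, key=lambda d: d["score"], reverse=True)

def render(docs):
    """Produce the compact text that would be handed back to the model."""
    return "\n".join(f"{d['title']} <{d['url']}>" for d in docs)

async def pipeline(queries):
    return render(rank(dedupe(await fanout(queries))))

result = asyncio.run(pipeline(["openssl cve", "openssl fixed version"]))
print(result)
```

In this toy run, four raw hits go in and two rendered lines come out. The duplicate and the low-scoring hit are dropped in code, so they never reach the model's context.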

The numbers that matter

Perplexity benchmarked the approach against systems from OpenAI, Anthropic, Exa, and Parallel across five suites. Search as Code led four of the five and tied for first on the last.

On their internal CVE advisory task, which involves identifying over two hundred high-severity vulnerabilities and binding each to a vendor-confirmed fix version, Search as Code hit perfect accuracy while cutting token usage by roughly eighty-five percent against a non-codegen baseline. Every competing system scored under twenty-five percent.

The cost story is the part builders should sit with. Even the low-reasoning configuration beat several full-price competitors while running cheaper. The expensive frontier is no longer the only way to get frontier results.

Why a builder should care

The headline is not really about Perplexity. It is about a pattern.

For years the assumption was that a model orchestrates capabilities through a thin serial interface, one function call at a time. Search as Code argues the right boundary sits lower in the stack, where the model writes code that orchestrates primitives directly and fills any gap with more code on the fly.

That principle generalizes beyond search. Any system where an agent loops through dozens of identical tool calls is a candidate: expose composable primitives, hand the model a sandbox, let it write the orchestration.
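As a rough sketch of that recipe outside search, the example below exposes three primitives for a made-up ticket-triage domain and runs a short program of the kind a model might generate. Everything here is an assumption for illustration: the primitives (`list_tickets`, `where`, `count_by`), the `TICKETS` data, and the `run_generated` helper are hypothetical. The `exec` call with an emptied builtins table only limits which names the generated code can see; it is not a security boundary, and a production system would need a properly isolated sandbox.

```python
# Hypothetical primitives for a non-search domain (ticket triage).
TICKETS = [
    {"id": 1, "team": "billing", "open": True},
    {"id": 2, "team": "billing", "open": False},
    {"id": 3, "team": "infra", "open": True},
]

def list_tickets():
    """Return a copy of all tickets."""
    return list(TICKETS)

def where(rows, **conditions):
    """Keep rows whose fields equal every given condition."""
    return [r for r in rows if all(r.get(k) == v for k, v in conditions.items())]

def count_by(rows, key):
    """Count rows per distinct value of `key`."""
    counts = {}
    for r in rows:
        counts[r[key]] = counts.get(r[key], 0) + 1
    return counts

PRIMITIVES = {"list_tickets": list_tickets, "where": where, "count_by": count_by}

# Stand-in for model-generated orchestration code: one program, not many tool calls.
GENERATED = """
open_tickets = where(list_tickets(), open=True)
output = count_by(open_tickets, "team")
"""

def run_generated(source, primitives):
    """Run generated code with only the primitives in scope.

    NOTE: emptying __builtins__ narrows the namespace but is NOT a secure
    sandbox; real isolation needs a separate process, container, or VM.
    """
    namespace = {"__builtins__": {}, **primitives}
    exec(source, namespace)
    return namespace.get("output")

summary = run_generated(GENERATED, PRIMITIVES)
print(summary)
```

The intermediate `open_tickets` list stays inside the namespace. Only `output`, the small summary, would be returned to the model.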

The result is fewer turns, cleaner context, lower cost, and more control over how work gets done.

If you are building agentic features, the takeaway is to stop designing every integration as a sequence of one-shot tool calls. Start asking which workloads would run better as generated code over a set of primitives.

Build it with Axentia

At Axentia we build AI products and integrations on this exact principle: composable primitives, the right amount of orchestration, and boring proven infrastructure underneath.

If you have an agentic workflow buckling under serial tool calls, or a research-heavy feature you want done right, we can help you design and ship it. Reach out at axentia.in and let us scope what your product needs.
