Agno is now Generally Available

Ashpreet Bedi
April 1, 2025

The simplest, fastest, full-featured library for building Agentic Systems is now production-ready.

  • 2 years, 2500 PRs, 22K ⭐️
  • 1M+ new Agents created every week
  • Built-in Memory, Knowledge, Reasoning & Tools

Here's my journey building one of the most popular Agent Libraries:

In 2022, I left Airbnb after 5 years, took some time off and started helping companies build AI products.

When GPT-3.5 came out, RAG was all the rage, so that's what we built for our customers. We tried everything: routing, query construction, evals-driven development. Nothing seemed to work.

It always felt like we were fighting the "framework of the month". The concepts of chains and prompt templates just didn't make sense - wasn't that what Python was for? Why did I need a framework for this?

I was never a merchant of complexity, so I reverted to doing everything by hand: making API calls directly, chunking, scraping, loading vector databases. It was a lot of fun, and as we built more AI products, we started encapsulating these utilities into classes. These were healthy abstractions, at most 1 layer deep, and the goal was to expose the LLM APIs through a unified interface, not wrap them. We called these classes "Assistants" and they worked really, really well.

When function calling came out (mid 2023 I think), our Assistants became ridiculously powerful. They managed "conversation state" (storage & memory), "vector data" (knowledge) and tool calls. This was a huge win for us and we decided to open-source Agno (then called Phidata), but with 2 strict conditions:

  1. We are not merchants of complexity. The Assistants class is the base class; click on it, and you see the entire code. I had serious PTSD from deeply nested classes.
  2. We refused to claim "production-ready" unless genuinely battle-tested. Many frameworks label themselves as production-ready from day one, which feels misleading - monitoring alone doesn't equate to being production-ready.

We released Auto-RAG (now Agentic RAG) and the LLM OS, which became incredibly popular. These projects proved you could build powerful AI products without succumbing to the "complexity matrix".

But a problem remained: true production systems demand extreme performance. Unless you've built agentic systems handling 20-30k rpm and millions of users, you don't truly know what "production-ready" means. Trust me - monitoring traces or evals isn't enough. They’re valuable and you do need them, but they aren't the silver bullets they're sold as.

The real challenge is systems engineering. After all, Agents are asynchronous, long-running and non-deterministic. To put things into perspective, AWS API Gateway has a 29-second timeout. If the endpoint doesn't respond within this period, API Gateway returns a 504 Gateway Timeout error.

One of the most used infrastructure components in the world (the API Gateway) is not built for Agentic Systems -- imagine that. I know you can request increases and they'll be granted, but that's not the point. Even Cloudflare has a timeout on the Free, Pro, and Business plans. Many, many applications we put into production took 30 minutes to 3 hours to run, spawning thousands of Agents.

Another point is performance. Your Agents need to start up incredibly quickly because you'll create many (sometimes hundreds) for every request. This ensures the Agents maintain a state tailored to that specific request and there is no memory leak. You can't have a single Agent serve multiple users because you must limit the data each Agent accesses, and that only happens if the Agent is instantiated (dynamically) "for that user and request."
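To make the "for that user and request" idea concrete, here is a minimal sketch. It does not use Agno's API: `RequestAgent`, `DOCUMENTS_BY_USER` and `handle_request` are hypothetical names invented for illustration. The only point is that the agent is built inside the request handler, so its state and its data access are scoped to one user and one request.

```python
from dataclasses import dataclass, field


@dataclass
class RequestAgent:
    """Hypothetical stand-in for an agent; this is NOT Agno's Agent class."""

    user_id: str
    request_id: str
    allowed_documents: tuple[str, ...]
    history: list[str] = field(default_factory=list)

    def run(self, message: str) -> str:
        # State accumulates on this instance only, never on a shared object.
        self.history.append(message)
        return f"[{self.user_id}/{self.request_id}] saw {len(self.history)} message(s)"


# Hypothetical access table: which documents each user may read.
DOCUMENTS_BY_USER = {
    "alice": ("alice-notes",),
    "bob": ("bob-notes",),
}


def handle_request(user_id: str, request_id: str, message: str) -> str:
    # A fresh agent per request: nothing is cached or reused across calls,
    # and the agent only ever sees the documents allowed for this user.
    agent = RequestAgent(
        user_id=user_id,
        request_id=request_id,
        allowed_documents=DOCUMENTS_BY_USER.get(user_id, ()),
    )
    return agent.run(message)
```

Two calls for the same user still get two separate agents, so the second request starts with an empty history. That is also why instantiation cost matters: it is paid on every request.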

Next, your database and vector DB implementations must be lightning-fast, as you'll read state and knowledge data on every request.

Because of this, we obsessed over performance. So much so that we made Agno Agents start up in ~2μs and made our memory and knowledge drivers 70% faster.

(I'm waiting for Python 3.10 to be the default so we can use `slots=True` on dataclasses)

Our performance-first approach resonated deeply with developers, making Phidata the framework of choice for thousands of engineers at top companies worldwide.

Earlier this year, Phidata rebranded to Agno - the fastest library for building Multimodal Agents.

Agno is now battle-tested, quietly handling millions of requests across hundreds of agentic systems in production. I don't use these words lightly, but after years of relentless effort, we are truly production-ready.

Today, Agno is Generally Available.

If you're building Agents, give Agno a try. I promise you won't be disappointed.