
The Age of Vertical AI Agents

AI systems built to automate workflows for specialized use cases in specific industries, or verticals, are called Vertical AI Agents.

Recent advancements in LLMs have opened an exciting door in workflow automation: understanding context from natural language. Additionally, thanks to vector databases and MCP, we can now extend the knowledge and capabilities of agents, making them near autonomous to fully autonomous within their vertical. Suddenly, a long list of redundant, repetitive tasks that have until now been done by people can be fully automated with Vertical AI Agents.

Let’s go over an interesting example before we dive deep. 

Imagine we’re at a drive-thru at our favorite quick-service restaurant in 2023. We place the order through the microphone at the entrance; an employee of the restaurant notes it down and types the order into their point-of-sale software. This is an example of a redundant, repetitive task done by a human being. Lilac Labs (YC S’24) automates this with voice AI, cutting labor costs by at least $100,000 annually.

Types of AI Agents

Based on the functionality, AI agents can be classified into the following categories: 

  • Reactive Agents – These agents operate based on pre-defined rules or patterns. Think of chatbots that respond based on keywords, such as voice-response chatbots that replace traditional IVR systems. Reach out to us to learn more about the IVR automation solution we built for a FinTech client using the Abilytics Agentic AI Studio.
  • Proactive Agents – They anticipate user needs and act accordingly. Smart assistants and recommendation engines come under this category. 
  • Autonomous Agents – These can function without human intervention, making decisions based on learned data, such as autonomous trading bots. 
  • Collaborative Agents – These work alongside humans, providing suggestions but requiring human approval for final actions. For example, coding agents and AI copilots. 
  • Multi-Agent Systems – A collection of AI agents working together to solve complex problems, like swarm intelligence in logistics. 
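To make the first category concrete, a reactive agent can be as simple as a keyword-to-response lookup. This is a hypothetical minimal sketch, not a production chatbot; the rules and responses are invented for illustration:

```python
# A minimal reactive agent: it responds purely on keyword rules,
# with no memory, planning, or learned behavior.
RULES = {
    "balance": "Your current balance is available in the app under Accounts.",
    "hours":   "Our support desk is open 9am-5pm, Monday to Friday.",
}

def reactive_agent(query: str) -> str:
    q = query.lower()
    for keyword, response in RULES.items():
        if keyword in q:
            return response
    return "Sorry, I didn't understand. Could you rephrase?"

print(reactive_agent("What are your opening hours?"))
```

Everything beyond this, such as anticipating needs or acting autonomously, requires state, models, or tools that a pure rule table cannot provide, which is what separates the later categories from this one.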

Identifying the Right Idea for Agentic AI Automation 

A number of AI technologists and investors, like Andrew Ng (Founder, Deeplearning.ai) and Jared Friedman (Group Partner at YC, co-founder and former CTO of Scribd), have predicted that a large number of the tasks we see around us today will be automated with AI agents in the near future, a market that may grow to 10X the size of the SaaS market. But how do we know whether a task or workflow is worth automating? 

  • Repetitive & High-Volume Tasks – Tasks that consume significant human time without requiring creativity are prime candidates. 
  • Data-Driven Processes – AI thrives on data; workflows with structured or semi-structured data are ideal. 
  • Predictive or Pattern-Based Decision Making – If the task involves recognizing trends or making forecasts, AI can be highly effective. 
  • High ROI on Automation – The cost of automation should be significantly lower than the value it provides. For AI agents that replace human intervention, the easiest ROI check is to compare the agent’s running cost against the cost of the human work being replaced. As long as running the agent is cheaper, it’s worth it. 
  • Scalability Potential – Ideally, a solution should be adaptable and scalable, so it can be applied across multiple industries and sectors.
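The ROI check above is simple arithmetic. The sketch below uses purely illustrative numbers (hypothetical wage, hours, and agent cost, not figures from any real deployment):

```python
# Hypothetical ROI check: automation is worthwhile when the annual cost of
# running the agent is well below the labor cost it replaces.
def annual_savings(hourly_wage: float, hours_per_day: float,
                   days_per_year: int, agent_annual_cost: float) -> float:
    labor_cost = hourly_wage * hours_per_day * days_per_year
    return labor_cost - agent_annual_cost

# Illustrative assumptions only: $15/hr, 8 hrs/day, 360 days/yr,
# $12,000/yr to run the agent.
savings = annual_savings(hourly_wage=15.0, hours_per_day=8.0,
                         days_per_year=360, agent_annual_cost=12_000.0)
print(f"Estimated annual savings: ${savings:,.0f}")  # -> $31,200
```

A positive result is necessary but not sufficient; error rates, escalation costs, and integration effort should enter a fuller model.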

RAG-Based AI Agents and Their Architecture 

One of the most powerful techniques in AI automation is Retrieval-Augmented Generation (RAG). RAG-based AI agents combine the capabilities of large language models (LLMs) with external knowledge retrieval, allowing them to provide accurate, contextually relevant responses while staying up to date. In simple terms, we give the LLM a textbook of sorts to refer to before answering the user’s query. An LLM’s knowledge is limited by the dataset it was trained on. To extend that knowledge, we consolidate the additional context in some data format and store it in a database. When the user sends a query, we fetch the relevant data from the database and send it to the LLM along with the query. Remember the vector databases we talked about in the introduction? This is where they come in: vector databases provide efficient semantic search and large-scale information retrieval, which makes them well suited to RAG architectures. 
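To make the retrieval step concrete, here is a minimal sketch of semantic search using cosine similarity over hand-made toy vectors. A real system would use a learned embedding model and a vector store such as Pinecone; the documents and vector values below are invented for illustration:

```python
import math

# Toy stand-in for a vector database: documents paired with hand-made
# 3-dimensional "embeddings". Real embeddings have hundreds of dimensions
# and come from an embedding model.
DOCS = [
    ("Refund policy: refunds are issued within 14 days.", [0.9, 0.1, 0.0]),
    ("Shipping: orders ship within 2 business days.",     [0.1, 0.9, 0.0]),
    ("Warranty: hardware is covered for one year.",       [0.0, 0.2, 0.9]),
]

def cosine(a, b):
    # Cosine similarity: dot product over the product of vector norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_embedding, k=1):
    # Rank documents by semantic similarity to the query embedding.
    ranked = sorted(DOCS, key=lambda d: cosine(query_embedding, d[1]),
                    reverse=True)
    return [text for text, _ in ranked[:k]]

# A query about refunds would embed close to the first document:
print(retrieve([0.8, 0.2, 0.1]))
```

The retrieved text is what gets prepended to the user’s query before the LLM call, which is the "augmentation" in RAG.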

Technical Architecture of RAG-Based AI Agents 

  • User Query Input – The AI agent receives a user request. 
  • Retrieval Component – The system searches an external knowledge base (such as vector databases like Pinecone) to fetch relevant documents. 
  • Augmentation Layer – Retrieved information is processed and formatted to enhance the response. 
  • LLM Processing – The enhanced input is passed to an LLM, which generates a response incorporating both the original query and retrieved knowledge. 
  • Response Generation & Feedback Loop – The system delivers the final output and continuously improves by incorporating user feedback. 
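The steps above can be sketched end to end. In this hypothetical sketch, a keyword lookup stands in for vector search, and `fake_llm` is a stub marking where a real LLM API call would go; the knowledge-base contents are invented:

```python
# End-to-end sketch of the RAG steps, with the LLM stubbed out.
KNOWLEDGE_BASE = {
    "pricing": "The Pro plan costs $29/month and includes priority support.",
    "trial":   "New users get a 14-day free trial with no credit card required.",
}

def retrieve_context(query: str) -> str:
    # Retrieval: fetch relevant documents (keyword match stands in
    # for vector search here).
    return " ".join(doc for key, doc in KNOWLEDGE_BASE.items()
                    if key in query.lower())

def augment(query: str, context: str) -> str:
    # Augmentation: format retrieved information into the prompt.
    return f"Context: {context}\n\nQuestion: {query}\nAnswer using only the context."

def fake_llm(prompt: str) -> str:
    # LLM processing placeholder: echoes the context line as the "answer".
    return prompt.split("\n")[0].removeprefix("Context: ")

def rag_answer(query: str) -> str:
    context = retrieve_context(query)   # retrieval
    prompt = augment(query, context)    # augmentation
    return fake_llm(prompt)             # generation (feedback loop omitted)

print(rag_answer("What does the pricing look like?"))
```

The feedback-loop step is omitted; in practice it means logging responses and user ratings to refine retrieval and prompts over time.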

RAG-based agents are particularly useful for industries that are knowledge-intensive or have a changing knowledge context, such as sales, finance, legal, and research. 

Extending an Agent’s Capabilities with Tool Calling 

Let’s take a case where we are building a coding agent. With what we’ve learned so far, we can build a coding assistant that takes the requirements as a user query and suggests code, or changes to existing code, which the user can then apply to their files. Now, what if we want the agent to be capable of applying the changes by itself? What if the agent could create new relevant files, delete unwanted code snippets, and run the required commands in the terminal? 

This is where ‘tool calling’ comes in. When an AI encounters a query it can’t answer or act on directly, we can give it the option to “call” a specialized tool (e.g., a calculator, a search-engine tool for web browsing, or a filesystem tool that lets the agent access files and folders) to act on the user’s query. This enhances the agent’s functionality: where it could previously only provide a response the user had to act on, the agent can now carry out tasks beyond the LLM’s own scope. In our example above, we would integrate a filesystem management tool into our coding agent to let it create, edit, and delete the relevant files. 
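A minimal sketch of the tool-calling loop, assuming the model has been prompted to emit tool calls as JSON. The schema here is illustrative, not any particular provider’s format, and the `calculator` tool is a hypothetical demo:

```python
import json

# Tool-calling sketch: the model emits a structured "tool call" instead of a
# plain answer, and the application dispatches it to real code.
def calculator(expression: str) -> str:
    # A deliberately tiny, restricted evaluator for the demo.
    allowed = set("0123456789+-*/. ()")
    if not set(expression) <= allowed:
        raise ValueError("unsupported expression")
    return str(eval(expression))

TOOLS = {"calculator": calculator}

def dispatch(tool_call_json: str) -> str:
    # Parse the model's tool call and route it to the matching function.
    call = json.loads(tool_call_json)
    tool = TOOLS[call["name"]]
    return tool(**call["arguments"])

# Pretend the LLM responded with this tool call:
llm_output = '{"name": "calculator", "arguments": {"expression": "6 * 7"}}'
print(dispatch(llm_output))  # -> 42
```

In a full agent, the tool’s result is fed back to the model as a new message so it can compose the final answer or issue further calls.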

Model Context Protocol (MCP) is a standardized method of implementing tool integrations with LLMs. It acts as a bridge between AI agents and external tools, granting them abilities beyond what an LLM can do alone.  
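At the wire level, MCP messages are JSON-RPC 2.0 requests. A tool invocation looks roughly like the sketch below; the `read_file` tool and its argument are hypothetical, and the field names should be verified against the current MCP specification before relying on them:

```python
import json

# Rough shape of an MCP tool invocation: a JSON-RPC 2.0 request asking the
# server to run a named tool with the given arguments.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "read_file",                   # hypothetical filesystem tool
        "arguments": {"path": "src/main.py"},  # illustrative argument
    },
}
print(json.dumps(request, indent=2))
```

Because the protocol is standardized, any MCP-compatible client can discover and call tools from any MCP server without bespoke integration code.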

Conclusion 

As we enter the age of vertical AI agents, businesses that identify the right processes for automation and implement domain-specific solutions will gain significant competitive advantages. By leveraging technologies like RAG and MCP, these specialized AI systems will transform workflows across industries, freeing humans to focus on creative and strategic work while the agents handle specialized, repetitive tasks with unprecedented efficiency and accuracy.

Muhammed Razeen

Muhammed Razeen is a Software Engineer with a passion for developing AI-driven solutions tailored to business challenges. With expertise in Data Science, unsupervised models, Retrieval-Augmented Generation (RAG), and multi-agent systems, he leverages cutting-edge ML models and AI tools to solve complex problems. He has built impactful products such as a log-based infrastructure anomaly detection and root cause analysis agent, as well as a vernacular language customer service agent. Deeply interested in lean product development, Razeen is also an active member of various product communities.