How to build AI Agents 2: How AI Agents Work and Their Key Components
Welcome back to Day 2 of my How to build AI Agents from Scratch adventure! If you're following along, you know I'm on a 6-day challenge, spending just one hour each day to demystify the world of AI agents. Yesterday, we time-traveled through AI history in The Journey from Machine Learning to AI Agents.
Today, we’ll focus on three key areas:
1. What’s wrong with the basic way of using Large Language Models (LLMs).
2. The core components of AI agents.
3. How AI agents actually work in practice.
I am going to tackle something many of you have probably experienced firsthand: the limitations of using Large Language Models (LLMs) in their basic form. You know that feeling when ChatGPT gives you a great answer but can't actually help you get things done? That's exactly what we'll explore.
We'll then get into what I like to call the secret sauce – the core components that transform a simple LLM into a capable AI agent. Think of it as upgrading your smartphone's assistant from just answering questions to actually helping you manage your digital life.
Finally, we'll pull back the curtain and see how AI agents operate in the real world. No more theoretical concepts – we're talking practical, hands-on examples that show these digital assistants in action.
So grab your virtual hard hat, because today we're going to peek under the hood of AI agents and understand what makes them truly special.
What’s Wrong with the Basic Use of LLMs?
Let's start with a reality check: most of us are using Large Language Models (LLMs) like we're playing a sophisticated game of Q&A. You ask a question, you get an answer, and that's it. It's like having a really smart friend who can only talk but never actually help you do anything.
Let's break this down with a real-world example: Planning a vacation.
Basic LLM Usage
Prompt: "I am going on a vacation, what city should I visit?"
Response: "Here's a list of popular cities: Paris, Tokyo, New York..."
Pretty basic, right? The LLM doesn't know your budget, your interests, or whether you're afraid of long flights. Plus, you're still left with all the heavy lifting – researching flights, finding hotels, planning activities, and so on.
Using AI Agents: This is where AI agents shine. Instead of just suggesting destinations, an AI agent approaches your trip planning like a personal travel assistant would:
Reason: "Let me understand your preferences, budget, and constraints."
Plan: "Based on your love for art, moderate budget, and preference for warm weather, here's a detailed travel plan."
Act: Actually help book flights, suggest specific hotels within your budget, and create an itinerary.
This is what we call "ReAct" (Reasoning and Acting) – the secret sauce that makes AI agents so powerful. They don't just stop at giving advice; they combine reasoning with concrete actions to help achieve your goals.
Think of an AI agent as an upgrade from a question-and-answer tool to a capable personal assistant. The LLM serves as the agent's "brain" for reasoning and planning, while various tools (like flight booking APIs, calendar integrations, or weather services) serve as its "hands" for taking action.
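To make the brain-and-hands split concrete, here is a minimal Python sketch of a reason-then-act loop. Treat it as an illustration under assumptions, not a real agent: `fake_llm` is a hard-coded stand-in for a model call, and the tool names (`search_flights`, `suggest_hotel`), the city, and the return strings are all invented. Nothing here talks to a real booking service.

```python
from typing import Callable

# Hypothetical tools: the agent's "hands". Real ones would call external APIs.
def search_flights(city: str) -> str:
    return f"Found a sample flight to {city}"

def suggest_hotel(city: str) -> str:
    return f"Found a sample hotel in {city}"

TOOLS: dict[str, Callable[[str], str]] = {
    "search_flights": search_flights,
    "suggest_hotel": suggest_hotel,
}

def fake_llm(goal: str, observations: list[str]) -> tuple[str, str]:
    """Stand-in for the "brain": pick the next action from what is known so far.

    Returns (tool_name, argument), or ("finish", summary) when done.
    """
    if len(observations) == 0:
        return "search_flights", "Lisbon"
    if len(observations) == 1:
        return "suggest_hotel", "Lisbon"
    return "finish", "; ".join(observations)

def run_agent(goal: str, max_steps: int = 5) -> str:
    observations: list[str] = []
    for _ in range(max_steps):
        tool_name, argument = fake_llm(goal, observations)  # Reason
        if tool_name == "finish":
            return argument
        result = TOOLS[tool_name](argument)                  # Act
        observations.append(result)                          # Observe
    return "Stopped: step limit reached"

print(run_agent("Plan a warm, art-focused trip on a moderate budget"))
```

The `max_steps` limit is a design choice worth keeping even in a toy version: it stops the loop if the "brain" never decides it is finished.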
Imagine building a highly efficient personal assistant. What would they need? A brain to think, hands to act, tools to work with, a memory to remember things, and the ability to use different tools effectively. That's exactly what we're looking at with AI agents. Let's break down these essential components:
Core Components of AI Agents
1. Reasoning & Planning - The Brain
This is where the magic begins. Like a master chef breaking down a complex recipe into simple steps, an AI agent takes your big goal and turns it into a manageable action plan.
This is the phase where the agent breaks a complex task into smaller, actionable steps. Think of it as creating a to-do list to achieve the final goal. Instead of just listing places to visit, it thinks like this: "First, I need to check when the user is free. Then find destinations with nice weather during those dates. After that, I'll look for flights within their budget..."
The heavy lifting here is done by advanced language models like GPT-4 or Llama 3, acting as the agent's decision-making center.
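In practice, a model often hands back its plan as plain numbered text, and the agent has to turn that into steps it can loop over. Here is a small sketch of that parsing step. The plan text is hard-coded to mirror the example above (a real agent would get it from the model), and `parse_plan` is a name I made up for this post.

```python
import re

# Hard-coded example of what a model might return when asked for a plan.
SAMPLE_PLAN_TEXT = """\
1. Check when the user is free
2. Find destinations with nice weather on those dates
3. Look for flights within the budget
"""

def parse_plan(text: str) -> list[str]:
    """Extract the steps from a numbered list such as '1. Do something'."""
    steps = []
    for line in text.splitlines():
        # Accept "1." or "1)" followed by the step text; skip everything else.
        match = re.match(r"\s*\d+[.)]\s+(.*\S)", line)
        if match:
            steps.append(match.group(1))
    return steps

for number, step in enumerate(parse_plan(SAMPLE_PLAN_TEXT), start=1):
    print(f"Step {number}: {step}")
```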
2. Action - The Hands
Once the plan is created, the agent executes the tasks. This is what sets agents apart from simple chatbots and basic LLMs. They don't just talk, they do. When planning your trip, the agent doesn't just suggest the places to visit, it actually:
Opens your calendar application
Scans for free dates
Cross-references with flight availability
Makes real-time price comparisons
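The "scans for free dates" step is easy to picture in code. Below is a sketch that takes a set of busy days and returns the free stretches that are long enough for a trip. The busy days are invented sample data; a real agent would pull them from a calendar API, which is not shown here. `find_free_windows` is a hypothetical helper, not part of any library.

```python
from datetime import date, timedelta

def find_free_windows(
    busy: set[date], start: date, end: date, min_days: int
) -> list[tuple[date, date]]:
    """Return (first_day, last_day) runs of at least min_days free days in [start, end]."""
    windows = []
    run_start = None
    day = start
    # Walk one day past `end` so a free run touching the end gets closed.
    while day <= end + timedelta(days=1):
        is_free = day <= end and day not in busy
        if is_free and run_start is None:
            run_start = day
        elif not is_free and run_start is not None:
            run_end = day - timedelta(days=1)
            if (run_end - run_start).days + 1 >= min_days:
                windows.append((run_start, run_end))
            run_start = None
        day += timedelta(days=1)
    return windows

# Invented sample calendar: two busy days in early July.
BUSY_DAYS = {date(2025, 7, 4), date(2025, 7, 5)}
for first, last in find_free_windows(BUSY_DAYS, date(2025, 7, 1), date(2025, 7, 10), min_days=3):
    print(f"Free from {first} to {last}")
```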
3. Tools - The Toolkit
Tools are essential for completing tasks that require specific actions. Think of these as the agent's Swiss Army knife. Just as you might use different apps on your phone for different tasks, an AI agent has access to:
Web browsers for research
Calculation tools for budgeting
APIs for booking services
Programming environments for data analysis
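One simple way to picture the toolkit is as a registry: each tool has a name, a short description the LLM can read, and a function to run. The sketch below assumes that layout. The `Tool` class, the two sample tools, and `describe_tools` are all made up for illustration, and the "web search" just returns a placeholder string.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Tool:
    name: str
    description: str            # shown to the LLM so it can pick the right tool
    run: Callable[..., object]  # the Python function that does the work

def add_costs(*amounts: float) -> float:
    return sum(amounts)

def fake_web_search(query: str) -> str:
    # Placeholder: a real tool would call a search API or drive a browser.
    return f"(pretend search results for: {query})"

REGISTRY = {
    tool.name: tool
    for tool in [
        Tool("calculator", "Add up a list of costs.", add_costs),
        Tool("web_search", "Look something up on the web.", fake_web_search),
    ]
}

def describe_tools() -> str:
    """Build the tool list that would be included in the LLM's prompt."""
    return "\n".join(f"- {t.name}: {t.description}" for t in REGISTRY.values())

print(describe_tools())
print(REGISTRY["calculator"].run(100, 50.5))
```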
4. Memory - Keeping Track
Unlike a basic LLM call, which retains nothing between messages unless the earlier conversation is sent along again, an AI agent maintains context. It remembers:
Your original request ("I want a relaxing beach vacation")
What it's already done ("I've checked flights to Bali")
What worked and what didn't ("Direct flights are too expensive, looking at alternatives")
Your preferences ("You mentioned you prefer warm weather")
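Those four kinds of memory map neatly onto a small data structure. Here is a sketch, assuming the simplest possible design: plain lists kept in Python and rendered as text that gets placed in front of the next prompt. The `AgentMemory` class and its field names are my own invention; real agent frameworks store memory in many different ways.

```python
from dataclasses import dataclass, field

@dataclass
class AgentMemory:
    request: str                                            # the original request
    preferences: list[str] = field(default_factory=list)    # what the user likes
    actions_done: list[str] = field(default_factory=list)   # what was already tried
    lessons: list[str] = field(default_factory=list)        # what worked or failed

    def as_context(self) -> str:
        """Render everything remembered so far as text for the next prompt."""
        lines = [f"Original request: {self.request}"]
        lines += [f"Preference: {item}" for item in self.preferences]
        lines += [f"Done: {item}" for item in self.actions_done]
        lines += [f"Learned: {item}" for item in self.lessons]
        return "\n".join(lines)

memory = AgentMemory(request="I want a relaxing beach vacation")
memory.preferences.append("prefers warm weather")
memory.actions_done.append("checked flights to Bali")
memory.lessons.append("direct flights are too expensive, looking at alternatives")
print(memory.as_context())
```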
5. Function Calling
This is the agent's ability to speak multiple languages – not human languages, but the different "languages" of various tools and services. It's like having a universal remote that can control all your devices.
For example, when you say "book me a flight," the agent knows how to:
Translate your request into API calls
Send the right commands to flight booking systems
Process the responses
Present the results back to you in plain English
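The four steps above can be sketched in a few lines. The general idea is that the model is shown a description of each function and replies with structured data (commonly JSON) naming the function and its arguments. The exact format differs between LLM providers, so the shape used here is an illustrative assumption, and `book_flight` is a hypothetical function that only pretends to book anything.

```python
import json

# Illustrative tool description in a JSON-Schema-like style.
# The exact format a given LLM provider expects will differ.
BOOK_FLIGHT_SPEC = {
    "name": "book_flight",
    "description": "Book a flight between two cities on a date.",
    "parameters": {
        "type": "object",
        "properties": {
            "origin": {"type": "string"},
            "destination": {"type": "string"},
            "date": {"type": "string", "description": "YYYY-MM-DD"},
        },
        "required": ["origin", "destination", "date"],
    },
}

def book_flight(origin: str, destination: str, date: str) -> dict:
    """Hypothetical booking function; a real one would call a booking API."""
    return {"status": "confirmed", "route": f"{origin} -> {destination}", "date": date}

FUNCTIONS = {"book_flight": book_flight}

def handle_model_call(raw_call: str) -> str:
    """Turn the model's JSON request into a Python call, then into plain English."""
    call = json.loads(raw_call)
    required = BOOK_FLIGHT_SPEC["parameters"]["required"]
    missing = [name for name in required if name not in call["arguments"]]
    if missing:
        return f"I still need: {', '.join(missing)}"
    result = FUNCTIONS[call["name"]](**call["arguments"])
    return f"Your flight {result['route']} on {result['date']} is {result['status']}."

# Hard-coded example of what a model might emit for "book me a flight".
model_output = (
    '{"name": "book_flight", "arguments": '
    '{"origin": "Lagos", "destination": "Lisbon", "date": "2025-07-06"}}'
)
print(handle_model_call(model_output))
```

Checking for missing arguments before calling the function matters because the model's output is just text: it can leave out a field, and the agent should ask for it rather than crash.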
Each of these components works together seamlessly, like a well-oiled machine. The reasoning engine makes decisions, the tools provide capabilities, and the memory maintains context.
This architecture allows AI agents to go beyond simple question-answering to become genuine digital assistants that can help accomplish real-world tasks.
How AI Agents Work
Andrew Ng beautifully captures the essence of AI agents when he says:
"An Agentic workflow is an iterative and collaborative approach to interact with LLMs that mimics human problem solving processes."
Now let's go back to our example, planning a vacation. An AI agent could break the work down as follows.
Step 1: Breaking Down Tasks
The agent uses an LLM, its "brain", to first dissect the massive task of planning a vacation into smaller tasks:
Check calendar for vacation days
Check for cities with good weather based on the vacation days
Check for flights to those cities
Suggest places to visit in those cities
Calculate the total cost of the trip
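Some of these tasks depend on others: you cannot search flights before you know the cities, and you cannot total the cost before you have prices. One way to sketch that is a dependency map that Python's standard `graphlib` module sorts into a valid order. The task names and the dependencies between them are my own reading of this example, not something the LLM produced.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks that must finish before it can start.
# These dependencies are an assumption made for this example.
DEPENDENCIES = {
    "check_calendar": set(),
    "find_good_weather_cities": {"check_calendar"},
    "search_flights": {"find_good_weather_cities"},
    "suggest_places": {"find_good_weather_cities"},
    "calculate_total_cost": {"search_flights", "suggest_places"},
}

# static_order() yields every task after all of its prerequisites.
order = list(TopologicalSorter(DEPENDENCIES).static_order())
for position, task in enumerate(order, start=1):
    print(position, task)
```

Flights and places-to-visit do not depend on each other, so either may come first; the calendar check always comes first and the cost calculation always comes last.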
Step 2: Tools in Action
For each task, the agent deploys a specific tool:
Check calendar for vacation days - Use a Python API to access my Google Calendar and check my vacation dates
Check for cities with good weather based on the vacation days - Use a web browser to look up cities with good weather on those dates
Check for flights to those cities - Use the web browser to search for flights to those cities
Suggest places to visit in those cities - Use the web browser to search for things to do in each city
Calculate the total cost of the trip - Use a calculator to add up the total cost of the trip
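The last task in that list is the easiest one to show as a tool. Here is a sketch of a cost calculator the agent could call once the other tasks have returned prices. The breakdown (flight plus nightly rate times nights plus activities) and every number in the usage example are invented for illustration.

```python
def total_trip_cost(
    flight_price: float,
    nightly_rate: float,
    nights: int,
    activity_costs: list[float],
) -> float:
    """Add up a trip: one flight, a hotel for `nights` nights, and activities."""
    return flight_price + nightly_rate * nights + sum(activity_costs)

# Invented sample figures: 450 + (90 * 5) + (30 + 25 + 40)
print(total_trip_cost(flight_price=450, nightly_rate=90, nights=5, activity_costs=[30, 25, 40]))
```

Handing arithmetic to a plain function like this, rather than asking the LLM to add the numbers itself, is the whole point of giving the agent a calculator tool.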
Step 3: The Feedback Loop
What makes AI agents truly powerful is their ability to adapt and refine. Here's how it works:
Execute: The agent runs each planned task
Evaluate: Results go back to the LLM for analysis
Adjust: If something's not quite right:
Maybe flights to Bali are too expensive
Weather in Paris shows unexpected rain
Your calendar has conflicts
Repeat: The agent adjusts the plan and tries again
Think of it like having a travel agent who:
Constantly checks their work
Adapts to new information
Keeps refining until they find the perfect plan for you
This iterative process continues until all the pieces fit together perfectly - just like how a human would solve a complex problem, but at machine speed and scale.
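Here is the execute, evaluate, adjust, repeat cycle as a small sketch. To keep it runnable, the "evaluation" is a simple budget check instead of an LLM call, and the flight prices are invented sample data standing in for a flight-search tool. `plan_with_feedback` is a hypothetical name.

```python
from typing import Optional

# Invented sample prices; a real agent would get these from a flight-search tool.
SAMPLE_FLIGHT_PRICES = {"Bali": 1400, "Paris": 900, "Lisbon": 520}

def plan_with_feedback(candidates: list[str], budget: int) -> tuple[Optional[str], list[str]]:
    """Try destinations in order until one fits the budget; log every attempt."""
    log = []
    for city in candidates:                       # Repeat
        price = SAMPLE_FLIGHT_PRICES[city]        # Execute: "look up" the flight
        if price <= budget:                       # Evaluate: does it fit?
            log.append(f"{city}: {price} fits the budget")
            return city, log
        # Adjust: note the failure and move on to the next option.
        log.append(f"{city}: {price} is over budget, trying the next option")
    return None, log

choice, attempts = plan_with_feedback(["Bali", "Paris", "Lisbon"], budget=600)
for line in attempts:
    print(line)
print("Chosen destination:", choice)
```

Returning the log alongside the result mirrors what the agent's memory does: the failed attempts are kept so the next round of reasoning can take them into account.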
The beauty of this system is that it combines the strategic thinking of LLMs with the practical capabilities of various tools, creating a truly helpful assistant that can not just plan, but actually help you accomplish your goals.
We've pulled back the curtain today and seen how AI agents are revolutionizing the way we interact with artificial intelligence. No more simple Q&A sessions – we're talking about digital assistants that can actually roll up their sleeves and get things done. From breaking down complex tasks to executing them with precision, AI agents are transforming the landscape of what's possible.
See you tomorrow as we continue our journey into the fascinating world of AI agents. Trust me, you won't want to miss what's coming next! 🚀
Remember: We're not just learning about the future of AI – we're learning how to build it.