How to Build an AI Agent with 40 Lines of Code
A step-by-step guide on how to build an AI agent
Welcome to Day 6 of my "How to Build an AI Agent" series! Over the past five days, we've covered the theoretical foundations of AI agents. Now, it's time to put that knowledge into practice by building a simple yet powerful AI agent. Imagine having your own digital detective that can scour the internet, gather information, and serve you neat summaries while you sip your coffee. That's exactly what we're creating today.
In this article, we'll walk step by step through building a web AI agent that can autonomously open a browser, search the internet, and summarize results for you. With just 40 lines of code, you'll have a fully functional AI-powered web assistant!
Get ready to transform abstract concepts into working code as we build an autonomous web agent that can:
Launch a browser on its own
Navigate the vast landscape of the internet
Process and synthesize information just like a human research assistant
The best part? No complex infrastructure or thousands of lines of code needed. We're keeping it lean, mean, and straightforward.
Prerequisites
Want to build your own AI agent? Let's first make sure you have everything you need. Don't worry – the setup is surprisingly straightforward.
What You Need to Know
Basic Python programming (if you've worked with Python before, you're good to go!)
Familiarity with command-line interfaces
No prior AI experience required – we'll explain everything as we go
Tools Used
Our AI agent's power comes from combining two cutting-edge tools:
LLM (Large Language Model): The brain of our operation, handling the complex reasoning and decision-making. We will use a model from OpenAI's GPT-4o family (GPT-4o mini in the code below). I will switch to an open-source LLM in my next article.
Agent Framework: A powerful framework that helps us build and orchestrate AI agents without the complexity. In this project we will use Microsoft AutoGen. There are other options, which I covered in my previous article, Best AI Agent Frameworks.
To build our AI agent, we'll use:
LLM (Large Language Model): OpenAI GPT-4o
Agent Framework: Microsoft AutoGen
The complete code is available on my GitHub repository. Feel free to check it out!
Setting Up the Environment
To keep things simple, I won’t cover the installation process for dependencies and virtual environments here. If you need help with that, you can find the setup instructions in my GitHub repo.
Now, let's dive into coding!
Step 1: Import Required Libraries
Let's gather all the necessary tools. Think of this as laying out our workspace: each import serves a specific purpose in bringing our AI agent to life.
import os
from autogen_agentchat.agents import AssistantAgent, UserProxyAgent
from autogen_agentchat.conditions import MaxMessageTermination, TextMentionTermination
from autogen_agentchat.teams import SelectorGroupChat
from autogen_agentchat.ui import Console
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_ext.agents.web_surfer import MultimodalWebSurfer
from dotenv import load_dotenv
Let's break down what each import brings to our project:
os and load_dotenv: These handle our environment variables, keeping our API keys secure. Think of them as the key masters of our application.
AssistantAgent and UserProxyAgent: These are our main actors. The AssistantAgent is our AI brain that processes and responds to tasks, while the UserProxyAgent acts as the bridge between us and the AI.
MaxMessageTermination and TextMentionTermination: These are our conversation controllers. They help determine when our agent should stop processing or move on to the next task – like a smart traffic light system for our AI's thoughts.
SelectorGroupChat: This orchestrates communication when we have multiple agents working together, like a skilled project manager coordinating team efforts.
MultimodalWebSurfer: This is our agent's ability to navigate the web, capable of understanding both text and visual content it encounters online.
OpenAIChatCompletionClient: This connects us to OpenAI's chat models (GPT-4o mini in this project), which serve as our agent's knowledge base and reasoning engine.
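Since this article skips the installation steps, a quick sanity check before running the imports can save some head-scratching. The snippet below is only a sketch: missing_modules is a hypothetical helper (not part of AutoGen), and the module names are simply the top-level packages used in the imports above.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Top-level packages taken from the import statements in Step 1.
REQUIRED_MODULES = ["autogen_agentchat", "autogen_ext", "dotenv"]

missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing packages, see the setup instructions in the repo:", ", ".join(missing))
else:
    print("All required packages were found.")
```

If anything shows up as missing, the setup instructions in the GitHub repo are the place to look.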
In the next section, we'll see how these components work together to create an agent that can understand your requests and navigate the web autonomously. Ready to bring these pieces together?
Step 2: Initialize the Large Language Model (LLM)
Every agent needs a powerful brain, and in our case, that's the LLM. Let's set up the language model that will drive our agent's understanding and decision-making capabilities. Here we use OpenAI's GPT-4o mini.
# Load environment variables for secure API access
load_dotenv()
# Initialize the LLM client with the desired model
model_client = OpenAIChatCompletionClient(model="gpt-4o-mini", api_key=os.getenv("OPENAI_API_KEY"))
A quick note on the model choice: While GPT-4o mini isn't free, it offers a good balance of capability and cost for our web-browsing agent. You can use other models too, including open-source ones.
🔐 Security Tip: Notice how we're using environment variables (os.getenv) to handle our API key? This is a crucial security practice to keep your credentials safe and out of your code.
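One thing to keep in mind: os.getenv quietly returns None when a variable is missing, so a forgotten .env entry only surfaces later as an authentication error. Here is a minimal sketch of a fail-fast check. require_env is a hypothetical helper name of my own, not something AutoGen or python-dotenv provides.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise a clear error if it is unset or empty."""
    value = os.getenv(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Add it to your .env file or export it in your shell."
        )
    return value


# Usage, after load_dotenv() has run:
# api_key = require_env("OPENAI_API_KEY")
```

You could then pass api_key to the client instead of calling os.getenv inline.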
Step 3: Define the Agents
Our AI system needs different specialists working together. Let's create three specialized agents, each with its unique role in making our web assistant both powerful and reliable.
Web Surfer Agent
The Web Surfer Agent is responsible for retrieving information from the web. It autonomously browses the internet and collects relevant data based on user queries.
web_surfer_agent = MultimodalWebSurfer(
    name="web_surfer_agent",
    description="Agent that browses the web and retrieves information to solve tasks.",
    model_client=model_client,
    headless=False,  # Set to True for headless execution
)
Assistant Agent
The Assistant Agent processes the data retrieved by the Web Surfer Agent. It verifies the correctness of the information, filters out irrelevant details, and summarizes the key points.
assistant_agent = AssistantAgent(
    name="assistant_agent",
    description="Agent that verifies and summarizes information retrieved by the web surfer agent.",
    system_message="""
    You are an assistant tasked with verifying and summarizing information retrieved by the web surfer agent.
    - If the web surfer agent's response is incomplete, instruct it to continue with 'keep going'.
    - If the task is fully addressed, provide a final summary and conclude with 'TERMINATE'.
    """,
    model_client=model_client,
)
User Proxy Agent
The User Proxy Agent serves as an intermediary when additional input is required from the user. If the other agents need clarification, they will consult the user through this agent.
user_proxy_agent = UserProxyAgent(
    name="user_proxy_agent",
    description="A human user consulted when agents need clarification, preferences, or additional information.",
)
Step 4: Create a Coordinator
Just as any effective team needs a manager, our AI agents need a coordinator to work together seamlessly. The coordinator decides which agent should act at each step and directs tasks accordingly. Let's write the prompt that tells it when to use each agent's strengths.
role_selector_prompt = """
You are the coordinator of this multi-agent role-play system. There are three roles available: {roles}.
- The web_surfer_agent retrieves web information.
- The assistant_agent verifies and summarizes data.
- The user_proxy_agent clarifies when needed.
Based on the task and {history}, decide the next active agent and return its name from {participants}.
"""
Think of this coordinator as an AI orchestra conductor, ensuring each instrument (agent) plays its part at exactly the right moment. The role_selector_prompt is like the conductor's score, containing detailed instructions about when each agent should step in.
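To get a feel for what the {roles}, {history}, and {participants} placeholders turn into, here is a rough illustration using plain str.format. The exact text AutoGen substitutes at run time isn't shown in this article, so treat the stand-in values and the fill_selector_prompt helper as assumptions made for demonstration only.

```python
# Hypothetical stand-in data: AutoGen builds the real values from the team at run time.
agent_descriptions = {
    "web_surfer_agent": "Agent that browses the web and retrieves information to solve tasks.",
    "assistant_agent": "Agent that verifies and summarizes information retrieved by the web surfer agent.",
    "user_proxy_agent": "A human user consulted when agents need clarification.",
}

# A shortened copy of the selector prompt with the same three placeholders.
selector_template = (
    "There are three roles available: {roles}.\n"
    "Based on the task and {history}, decide the next active agent "
    "and return its name from {participants}."
)


def fill_selector_prompt(template, descriptions, history):
    """Fill the three placeholders with plain str.format (illustration only)."""
    roles = "; ".join(f"{name}: {text}" for name, text in descriptions.items())
    participants = ", ".join(descriptions)
    return template.format(
        roles=roles,
        history=" | ".join(history),
        participants=participants,
    )


filled_prompt = fill_selector_prompt(
    selector_template,
    agent_descriptions,
    ["user: Find the latest AI advancements in 2025"],
)
print(filled_prompt)
```

The point is simply that the model picking the next speaker sees the agent names, their descriptions, and the conversation so far, which is why clear descriptions on each agent matter.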
Step 5: Define the Termination Condition
Just like a good meeting needs a clear endpoint, our AI system needs to know when to wrap up its work. The Termination Condition determines when the agent should stop. Our agent will terminate either when the conversation reaches 20 messages or when the keyword "TERMINATE" is detected in the conversation.
Message Limit Guard (MaxMessageTermination):
Prevents the agents from getting stuck in endless loops
Caps the conversation at 20 messages
Ensures efficient resource usage
Completion Signal (TextMentionTermination):
Ends the run as soon as a message contains the text "TERMINATE"
Relies on the assistant agent's system message, which tells it to finish its final summary with that keyword
Works alongside the message limit: whichever condition is met first stops the team
termination_condition = MaxMessageTermination(max_messages=20) | TextMentionTermination("TERMINATE")
💡 Pro Tip: The 20-message limit is a good starting point, but you might want to adjust it based on your specific needs. Complex research tasks might need more rounds, while simple queries could use fewer.
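If the | operator in the termination condition looks a bit magical, it simply means "stop when either condition is met". The sketch below models that logic with plain Python functions. It is not how AutoGen implements its condition classes, and the helper names are hypothetical.

```python
from typing import List


def max_messages_reached(messages: List[str], max_messages: int = 20) -> bool:
    """Mimics the message cap: true once the conversation holds max_messages messages."""
    return len(messages) >= max_messages


def text_mentioned(messages: List[str], text: str = "TERMINATE") -> bool:
    """Mimics the keyword check: true if any message contains the text."""
    return any(text in message for message in messages)


def should_stop(messages: List[str]) -> bool:
    """Logical OR of the two checks, like combining the conditions with "|"."""
    return max_messages_reached(messages) or text_mentioned(messages)


print(should_stop(["keep going"]))                            # False
print(should_stop(["Here is the final summary. TERMINATE"]))  # True
print(should_stop(["keep going"] * 20))                       # True
```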
Step 6: Assemble the Multi-Agent Team
Now, let’s combine everything into a fully functional AI agent team. The multi-agent team is responsible for executing tasks in coordination with each other.
# Initialize the group chat with our coordination system
multi_agent_team = SelectorGroupChat(
    [web_surfer_agent, assistant_agent, user_proxy_agent],
    selector_prompt=role_selector_prompt,
    model_client=model_client,
    termination_condition=termination_condition,
)
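Conceptually, the team runs a simple loop: pick the next agent, let it speak, add its message to the shared history, and check whether it is time to stop. The toy model below shows that loop with ordinary functions standing in for the agents and for the selector. Everything in it is hypothetical and heavily simplified (no LLM, no browser), so read it as a mental model rather than AutoGen's actual implementation.

```python
from typing import Callable, Dict, List

# In this toy model an "agent" is just a function from the history to a reply.
Agent = Callable[[List[str]], str]


def toy_web_surfer(history: List[str]) -> str:
    return "Found three articles about recent AI agent frameworks."


def toy_assistant(history: List[str]) -> str:
    return "Summary: three articles on AI agent frameworks were found. TERMINATE"


def toy_selector(history: List[str], participants: List[str]) -> str:
    """Stand-in for the LLM-based selector: surfer first, then the assistant."""
    return participants[0] if len(history) == 1 else participants[1]


def run_team(task: str, agents: Dict[str, Agent], max_messages: int = 20) -> List[str]:
    history = [f"user: {task}"]  # in this toy model the task counts as the first message
    while len(history) < max_messages:
        name = toy_selector(history, list(agents))
        reply = agents[name](history)
        history.append(f"{name}: {reply}")
        if "TERMINATE" in reply:
            break
    return history


conversation = run_team(
    "Find the latest AI advancements in 2025",
    {"web_surfer_agent": toy_web_surfer, "assistant_agent": toy_assistant},
)
for line in conversation:
    print(line)
```

In the real team, the selector is the LLM guided by role_selector_prompt, and the agents call the model and the browser instead of returning canned strings.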
Step 7: Run the AI Agent
Time to bring our creation to life! To set the multi-agent team in motion, use the following code:
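# Run the multi-agent team asynchronously (top-level await works in a notebook; in a plain script, call this from an async function run with asyncio.run)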
await Console(multi_agent_team.run_stream(task="Find the latest AI advancements in 2025"))
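A bare await like this works in a Jupyter notebook, but in a regular .py file Python rejects await outside of a function. The usual pattern is to wrap the call in an async main() and start it with asyncio.run. The sketch below shows that pattern with a fake stream so it runs without an API key. fake_run_stream is a hypothetical stand-in for multi_agent_team.run_stream and only yields a few canned strings.

```python
import asyncio
from typing import AsyncIterator, List


async def fake_run_stream(task: str) -> AsyncIterator[str]:
    """Hypothetical stand-in for multi_agent_team.run_stream(task=...)."""
    canned_messages = (
        f"user: {task}",
        "web_surfer_agent: (browsing results would appear here)",
        "assistant_agent: (final summary would appear here) TERMINATE",
    )
    for message in canned_messages:
        await asyncio.sleep(0)  # hand control back to the event loop, as a real stream would
        yield message


async def main() -> List[str]:
    # In the real script, this body would be the single line:
    #     await Console(multi_agent_team.run_stream(task="Find the latest AI advancements in 2025"))
    seen = []
    async for message in fake_run_stream("Find the latest AI advancements in 2025"):
        print(message)
        seen.append(message)
    return seen


if __name__ == "__main__":
    asyncio.run(main())
```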
Conclusion
Congratulations! You’ve just built a fully functional AI-powered web agent with around 40 lines of code. In this article, we:
Defined and initialized three specialized agents.
Created a coordinator to manage agent interactions.
Set up a termination condition to control execution.
Assembled and ran the AI agent team.
Each agent plays a vital role:
Web Surfer Agent gathers information.
Assistant Agent verifies and summarizes data.
User Proxy Agent steps in when user input is needed.
Coordinator directs workflow among the agents.
Termination Condition ensures controlled execution.
🔮 Looking Ahead:
In the next article, we’ll explore replacing OpenAI’s GPT-4o with an open-source LLM, making the AI agent free to run. Stay tuned!
Happy coding!