Adding Browsing Capabilities to AG2

Author: Robert Jambrecic Introduction Previously, in our Cross-Framework LLM Tool Integration guide, we combined tools from frameworks like LangChain, CrewAI, and PydanticAI to enhance AG2. Now, we’re taking AG2 even further by integrating Browser Use and Crawl4AI, enabling agents to navigate websites, extract dynamic content, and interact with web pages. This unlocks new possibilities for automated data collection, web automation, and more. Browser Use Integration Installation Browser Use requires Python 3.11 or higher. To get started with the Browser Use integration in AG2, follow these steps: Install AG2 with the browser-use extra: pip install ag2[browser-use] If you have been using autogen or pyautogen, all you need to do is upgrade it using: pip install -U autogen[browser-use] or pip install -U pyautogen[browser-use] as pyautogen, autogen, and ag2 are aliases for the same PyPI package. Set up Playwright: # Installs Playwright and browsers for all OS playwright install # Additional command, mandatory for Linux only playwright install-deps For running the code in Jupyter, use nest_asyncio to allow nested event loops. pip install nest_asyncio You’re all set! Now you can start using browsing features in AG2. Imports import os import nest_asyncio from autogen import AssistantAgent, UserProxyAgent from autogen.tools.experimental import BrowserUseTool nest_asyncio.apply() Agent Configuration Configure the agents for the interaction. config_list defines the LLM configurations, including the model and API key. UserProxyAgent simulates user inputs without requiring actual human interaction (set to NEVER). AssistantAgent represents the AI agent, configured with the LLM settings. Browser Use supports the following models: Supported Models We had great experience with OpenAI, Anthropic, and Gemini. However, DeepSeek and Ollama haven’t performed as well. config_list = [{"model": "gpt-4o-mini", "api_key": os.environ["OPENAI_API_KEY"]}] llm_config = { "config_list": config_list, } user_proxy = UserProxyAgent(name="user_proxy", human_input_mode="NEVER") assistant = AssistantAgent(name="assistant", llm_config=llm_config) Web Browsing with Browser Use The BrowserUseTool llows agents to interact with web pages—navigating, searching, and extracting information. To see the agent’s activity in real-time, set headless to False in the browser_config. If True, the browser runs in the background. browser_use_tool = BrowserUseTool( llm_config=llm_config, browser_config={"headless": False}, ) browser_use_tool.register_for_execution(user_proxy) browser_use_tool.register_for_llm(assistant) Initiate Chat Now, let’s run a task where the assistant searches Reddit for “AG2,” clicks the first post, and retrieves the first comment. result = user_proxy.initiate_chat( recipient=assistant, message="Go to Reddit, search for 'ag2' in the search bar, click on the first post and return the first comment.", max_turns=2, ) user_proxy (to assistant): Go to Reddit, search for 'ag2' in the search bar, click on the first post and return the first comment. -------------------------------------------------------------------------------- assistant (to user_proxy): *** Suggested tool call (call_kHzzd6KnbDpGatDyN5Pm2hLv): browser_use * Arguments: {"task":"Go to Reddit, search for 'ag2', click on the first post and return the first comment."} ************************************************************************** -------------------------------------------------------------------------------- >>>>>>>> EXECUTING FUNCTION browser_use... Call ID: call_kHzzd6KnbDpGatDyN5Pm2hLv Input arguments: {'task': "Go to Reddit, search for 'ag2', click on the first post and return the first comment."} INFO [agent]

Feb 6, 2025 - 19:50

Introduction

Previously, in our Cross-Framework LLM Tool Integration guide, we combined tools from frameworks like LangChain, CrewAI, and PydanticAI to enhance AG2.

Now, we’re taking AG2 even further by integrating Browser Use and Crawl4AI, enabling agents to navigate websites, extract dynamic content, and interact with web pages. This unlocks new possibilities for automated data collection, web automation, and more.

`Browser Use` Integration

Installation

Browser Use requires Python 3.11 or higher.

To get started with the Browser Use integration in AG2, follow these steps:

Install AG2 with the browser-use extra:
```
pip install ag2[browser-use]
```
If you have been using autogen or pyautogen, all you need to do is upgrade it using:
```
pip install -U autogen[browser-use]
```
or
```
pip install -U pyautogen[browser-use]
```
as pyautogen, autogen, and ag2 are aliases for the same PyPI package.

Set up Playwright:

# Installs Playwright and browsers for all OS
playwright install
# Additional command, mandatory for Linux only
playwright install-deps

For running the code in Jupyter, use nest_asyncio to allow nested event loops.
```
pip install nest_asyncio
```

You’re all set! Now you can start using browsing features in AG2.

Imports

import os
import nest_asyncio

from autogen import AssistantAgent, UserProxyAgent
from autogen.tools.experimental import BrowserUseTool

nest_asyncio.apply()

Agent Configuration

Configure the agents for the interaction.

config_list defines the LLM configurations, including the model and API key.
UserProxyAgent simulates user inputs without requiring actual human interaction (set to NEVER).
AssistantAgent represents the AI agent, configured with the LLM settings.

Browser Use supports the following models: Supported Models

We had great experience with OpenAI, Anthropic, and Gemini. However, DeepSeek and Ollama haven’t performed as well.

config_list = [{"model": "gpt-4o-mini", "api_key": os.environ["OPENAI_API_KEY"]}]

llm_config = {
    "config_list": config_list,
}

user_proxy = UserProxyAgent(name="user_proxy", human_input_mode="NEVER")
assistant = AssistantAgent(name="assistant", llm_config=llm_config)

Web Browsing with Browser Use

The BrowserUseTool llows agents to interact with web pages—navigating, searching, and extracting information.

To see the agent’s activity in real-time, set headless to False in the browser_config. If True, the browser runs in the background.

browser_use_tool = BrowserUseTool(
    llm_config=llm_config,
    browser_config={"headless": False},
)

browser_use_tool.register_for_execution(user_proxy)
browser_use_tool.register_for_llm(assistant)

Initiate Chat

Now, let’s run a task where the assistant searches Reddit for “AG2,” clicks the first post, and retrieves the first comment.

result = user_proxy.initiate_chat(
    recipient=assistant,
    message="Go to Reddit, search for 'ag2' in the search bar, click on the first post and return the first comment.",
    max_turns=2,
)

user_proxy (to assistant):

Go to Reddit, search for 'ag2' in the search bar, click on the first post and return the first comment.

--------------------------------------------------------------------------------
assistant (to user_proxy):

***** Suggested tool call (call_kHzzd6KnbDpGatDyN5Pm2hLv): browser_use *****
Arguments:
{"task":"Go to Reddit, search for 'ag2', click on the first post and return the first comment."}
****************************************************************************

--------------------------------------------------------------------------------

>>>>>>>> EXECUTING FUNCTION browser_use...
Call ID: call_kHzzd6KnbDpGatDyN5Pm2hLv
Input arguments: {'task': "Go to Reddit, search for 'ag2', click on the first post and return the first comment."}
INFO     [agent]


                                            
                            
                                Read More                                
                            
                        
                                        
                        Tags:
                        
                                                    
                    
                    
                        
                            
                                                                    
                                        
                                            
                                            Previous Article                                        
                                    
                                    
                                        Rose day Special
                                    
                                                            
                            
                                                                    
                                        
                                            Next Article                                            
                                        
                                    
                                    
                                        Building effective AI agents with Trigger.dev
                                    
                                                            
                        
                    
                                        
                        
                            
                                
                                    
                                        Related Posts
                                    
                                
                                
                                    
                                                                                            
                                                        
                                                                                                                            
                                                                    
                                                                        
                                                                                                                                            
                                                                
                                                                                                                        How I wrote a chess advisor for myself in CSharp/WPF
                                                                Mar 9, 2025
     0

                                                        
                                                    
                                                                                                    
                                                        
                                                                                                                            
                                                                    
                                                                        
                                                                                                                                            
                                                                
                                                                                                                        What is the difference between a core component, an eng...
                                                                Feb 28, 2025
     0

                                                        
                                                    
                                                                                                    
                                                        
                                                                                                                            
                                                                    
                                                                        
                                                                                                                                            
                                                                
                                                                                                                        Trending 50+ Github Repositories for Projects.
                                                                Feb 5, 2025
     0

                                                        
                                                    
                                                                                    
                                
                            
                        
                    
                                            
                            
                                
                                    
                                                                                    
                                                                            
                                    
                                                                                    
                                                    
        
        
        
            
                
                    Name
                    
                
                
                    Email
                    
                
            
        
        
            Comment