Browser-Use provides lifecycle hooks that allow you to execute custom code at specific points during the agent’s execution.
Hook functions can be used to read and modify agent state while it runs, implement custom logic, change configuration, and integrate the Agent with external applications.
Available Hooks
Currently, Browser-Use provides the following hooks:
| Hook | Description | When it's called |
|------|-------------|------------------|
| `on_step_start` | Executed at the beginning of each agent step | Before the agent processes the current state and decides on the next action |
| `on_step_end` | Executed at the end of each agent step | After the agent has executed all the actions for the current step, before it starts the next step |
```python
await agent.run(on_step_start=..., on_step_end=...)
```

Each hook should be an async callable function that accepts the `agent` instance as its only parameter.
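A minimal sketch of what that looks like (the hook below just logs the current task; it can be passed to either parameter of `agent.run()`):

```python
async def log_step(agent):
    # the hook receives the running Agent instance as its only argument
    print('step finished, current task:', agent.task)

await agent.run(on_step_end=log_step, max_steps=10)
```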
Basic Example
```python
from pathlib import Path

from browser_use import Agent
from langchain_openai import ChatOpenAI


async def my_step_hook(agent: Agent):
    # inside a hook you can access all the state and methods under the Agent object:
    #   agent.settings, agent.state, agent.task
    #   agent.controller, agent.llm, agent.browser_session
    #   agent.pause(), agent.resume(), agent.add_new_task(...), etc.

    # You also have direct access to the playwright Page and Browser Context
    # https://playwright.dev/python/docs/api/class-page
    page = await agent.browser_session.get_current_page()

    current_url = page.url
    visit_log = agent.state.history.urls()
    previous_url = visit_log[-2] if len(visit_log) >= 2 else None
    print(f"Agent was last on URL: {previous_url} and is now on {current_url}")

    # Example: listen for events on the page, interact with the DOM, run JS directly, etc.
    page.on('domcontentloaded', lambda page: print('page navigated to a new url...'))
    await page.locator("css=form > input[type=submit]").click()
    await page.evaluate('() => alert(1)')
    await page.context.new_page()  # open a new tab in the same browser context
    await agent.browser_session.browser_context.add_init_script('/* some JS to run on every page */')

    # Example: monitor or intercept all network requests
    async def handle_request(route):
        # Print, modify, block, etc. do anything to the requests here
        #   https://playwright.dev/python/docs/network#handle-requests
        print(route.request, route.request.headers)
        await route.continue_(headers=route.request.headers)
    await page.route("**/*", handle_request)

    # Example: pause agent execution and resume it based on some custom code
    if '/completed' in current_url:
        agent.pause()
        Path('result.txt').write_text(await page.content())
        input('Saved "completed" page content to result.txt, press [Enter] to resume...')
        agent.resume()


agent = Agent(
    task="Search for the latest news about AI",
    llm=ChatOpenAI(model="gpt-4o"),
)

await agent.run(
    on_step_start=my_step_hook,
    # on_step_end=...
    max_steps=10
)
```
Data Available in Hooks
When working with agent hooks, you have access to the entire `Agent` instance. Here are some useful data points you can access (a short example follows this list):

- `agent.task` lets you see what the main task is, `agent.add_new_task(...)` lets you queue up a new one
- `agent.controller` gives access to the `Controller()` object and `Registry()` containing the available actions
  - `agent.controller.registry.execute_action('click_element_by_index', {'index': 123}, browser_session=agent.browser_session)`
- `agent.context` lets you access any user-provided context object passed in to `Agent(context=...)`
- `agent.sensitive_data` contains the sensitive data dict, which can be updated in-place to add/remove/modify items
- `agent.settings` contains all the configuration options passed to the `Agent(...)` at init time
- `agent.llm` gives direct access to the main LLM object (e.g. `ChatOpenAI`)
- `agent.state` gives access to lots of internal state, including agent thoughts, outputs, actions, etc.
  - `agent.state.history.model_thoughts()`: Reasoning from Browser Use's model
  - `agent.state.history.model_outputs()`: Raw outputs from Browser Use's model
  - `agent.state.history.model_actions()`: Actions taken by the agent
  - `agent.state.history.extracted_content()`: Content extracted from web pages
  - `agent.state.history.urls()`: URLs visited by the agent
- `agent.browser_session` gives direct access to the `BrowserSession()` and playwright objects
  - `agent.browser_session.get_current_page()`: Get the current playwright `Page` object the agent is focused on
  - `agent.browser_session.browser_context`: Get the current playwright `BrowserContext` object
  - `agent.browser_session.browser_context.pages`: Get all the tabs currently open in the context
  - `agent.browser_session.get_page_html()`: Current page HTML
  - `agent.browser_session.take_screenshot()`: Screenshot of the current page
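As a small sketch, a hook might combine a few of these accessors, for example to inspect the most recent model reasoning and grab a screenshot of the current page (only attributes from the list above are used):

```python
async def inspect_state(agent):
    # most recent model reasoning, if any steps have completed yet
    thoughts = agent.state.history.model_thoughts()
    if thoughts:
        print('last model thought:', thoughts[-1])

    # base64-encoded screenshot of the page the agent is currently focused on
    screenshot_b64 = await agent.browser_session.take_screenshot()
    print('screenshot captured:', screenshot_b64 is not None)
```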
Tips for Using Hooks
- Avoid blocking operations: Since hooks run in the same execution thread as the agent, try to keep them efficient or use asynchronous patterns.
- Handle exceptions: Make sure your hook functions handle exceptions gracefully to prevent interrupting the agent's main flow (see the sketch after this list).
- Use custom actions instead: hooks are fairly advanced; most things can be implemented with custom action functions.
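For example, a hook that talks to an external service can swallow its own failures so a transient error doesn't abort the run (a sketch; `record_step` is a hypothetical helper standing in for your actual hook logic):

```python
async def safe_hook(agent):
    try:
        await record_step(agent)  # hypothetical helper: whatever your hook actually does
    except Exception as e:
        # log and carry on so the agent's main loop is never interrupted
        print(f'hook failed, continuing anyway: {e}')
```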
Complex Example: Agent Activity Recording System
This example demonstrates a complete implementation for recording and saving Browser-Use agent activity, consisting of a server component and a client component.
Setup Instructions
To use this example, you’ll need to:
- Set up the required dependencies:
  `pip install fastapi uvicorn prettyprinter pyobjtojson python-dotenv browser-use langchain-openai`
- Create two separate Python files:
  - `api.py` - The FastAPI server component
  - `client.py` - The Browser-Use agent with recording hook
- Run both components:
  - Start the API server first: `python api.py`
  - Then run the client: `python client.py`
Server Component (api.py)
The server component handles receiving and storing the agent’s activity data:
```python
#!/usr/bin/env python3
#
# FastAPI API to record and save Browser-Use activity data.
# Save this code to api.py and run with `python api.py`
#
import json
import base64
from pathlib import Path

from fastapi import FastAPI, Request
import prettyprinter
import uvicorn

prettyprinter.install_extras()


# Utility function to save screenshots
def b64_to_png(b64_string: str, output_file):
    """
    Convert a Base64-encoded string to a PNG file.

    :param b64_string: A string containing Base64-encoded data
    :param output_file: The path to the output PNG file
    """
    with open(output_file, "wb") as f:
        f.write(base64.b64decode(b64_string))


# Initialize FastAPI app
app = FastAPI()


@app.post("/post_agent_history_step")
async def post_agent_history_step(request: Request):
    data = await request.json()
    prettyprinter.cpprint(data)

    # Ensure the "recordings" folder exists using pathlib
    recordings_folder = Path("recordings")
    recordings_folder.mkdir(exist_ok=True)

    # Determine the next file number by examining existing .json files
    existing_numbers = []
    for item in recordings_folder.iterdir():
        if item.is_file() and item.suffix == ".json":
            try:
                file_num = int(item.stem)
                existing_numbers.append(file_num)
            except ValueError:
                # In case the file name isn't just a number
                pass

    if existing_numbers:
        next_number = max(existing_numbers) + 1
    else:
        next_number = 1

    # Construct the file path
    file_path = recordings_folder / f"{next_number}.json"

    # Save the JSON data to the file
    with file_path.open("w") as f:
        json.dump(data, f, indent=2)

    # Optionally save screenshot if needed
    # if "website_screenshot" in data and data["website_screenshot"]:
    #     screenshot_folder = Path("screenshots")
    #     screenshot_folder.mkdir(exist_ok=True)
    #     b64_to_png(data["website_screenshot"], screenshot_folder / f"{next_number}.png")

    return {"status": "ok", "message": f"Saved to {file_path}"}


if __name__ == "__main__":
    print("Starting Browser-Use recording API on http://0.0.0.0:9000")
    uvicorn.run(app, host="0.0.0.0", port=9000)
```
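With `api.py` running, you can sanity-check the endpoint before wiring up the agent. This is just a smoke test against the route defined above; the payload shape is arbitrary since the endpoint accepts any JSON body:

```python
# quick smoke test for the recording endpoint (run while api.py is up)
import requests

resp = requests.post(
    "http://127.0.0.1:9000/post_agent_history_step",
    json={"url": "https://example.com", "model_thoughts": "smoke test"},
)
print(resp.json())  # e.g. {'status': 'ok', 'message': 'Saved to recordings/1.json'}
```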
Client Component (client.py)
The client component runs the Browser-Use agent with a recording hook:
```python
#!/usr/bin/env python3
#
# Client to record and save Browser-Use activity.
# Save this code to client.py and run with `python client.py`
#
import asyncio

import requests
from dotenv import load_dotenv
from pyobjtojson import obj_to_json
from langchain_openai import ChatOpenAI

from browser_use import Agent

# Load environment variables (for API keys)
load_dotenv()


def send_agent_history_step(data):
    """Send the agent step data to the recording API"""
    url = "http://127.0.0.1:9000/post_agent_history_step"
    response = requests.post(url, json=data)
    return response.json()


async def record_activity(agent_obj):
    """Hook function that captures and records agent activity at each step"""
    website_html = None
    website_screenshot = None
    urls_json_last_elem = None
    model_thoughts_last_elem = None
    model_outputs_json_last_elem = None
    model_actions_json_last_elem = None
    extracted_content_json_last_elem = None

    print('--- ON_STEP_START HOOK ---')

    # Capture current page state
    website_html = await agent_obj.browser_session.get_page_html()
    website_screenshot = await agent_obj.browser_session.take_screenshot()

    # Make sure we have state history
    if hasattr(agent_obj, "state"):
        history = agent_obj.state.history
    else:
        print("Warning: Agent has no state history")
        return

    # Process model thoughts
    model_thoughts = obj_to_json(
        obj=history.model_thoughts(),
        check_circular=False
    )
    if len(model_thoughts) > 0:
        model_thoughts_last_elem = model_thoughts[-1]

    # Process model outputs
    model_outputs = agent_obj.state.history.model_outputs()
    model_outputs_json = obj_to_json(
        obj=model_outputs,
        check_circular=False
    )
    if len(model_outputs_json) > 0:
        model_outputs_json_last_elem = model_outputs_json[-1]

    # Process model actions
    model_actions = agent_obj.state.history.model_actions()
    model_actions_json = obj_to_json(
        obj=model_actions,
        check_circular=False
    )
    if len(model_actions_json) > 0:
        model_actions_json_last_elem = model_actions_json[-1]

    # Process extracted content
    extracted_content = agent_obj.state.history.extracted_content()
    extracted_content_json = obj_to_json(
        obj=extracted_content,
        check_circular=False
    )
    if len(extracted_content_json) > 0:
        extracted_content_json_last_elem = extracted_content_json[-1]

    # Process URLs
    urls = agent_obj.state.history.urls()
    urls_json = obj_to_json(
        obj=urls,
        check_circular=False
    )
    if len(urls_json) > 0:
        urls_json_last_elem = urls_json[-1]

    # Create a summary of all data for this step
    model_step_summary = {
        "website_html": website_html,
        "website_screenshot": website_screenshot,
        "url": urls_json_last_elem,
        "model_thoughts": model_thoughts_last_elem,
        "model_outputs": model_outputs_json_last_elem,
        "model_actions": model_actions_json_last_elem,
        "extracted_content": extracted_content_json_last_elem
    }

    print("--- MODEL STEP SUMMARY ---")
    print(f"URL: {urls_json_last_elem}")

    # Send data to the API
    result = send_agent_history_step(data=model_step_summary)
    print(f"Recording API response: {result}")


async def run_agent():
    """Run the Browser-Use agent with the recording hook"""
    agent = Agent(
        task="Compare the price of gpt-4o and DeepSeek-V3",
        llm=ChatOpenAI(model="gpt-4o"),
    )
    try:
        print("Starting Browser-Use agent with recording hook")
        await agent.run(
            on_step_start=record_activity,
            max_steps=30
        )
    except Exception as e:
        print(f"Error running agent: {e}")


if __name__ == "__main__":
    # Check if the recording API is running before starting the agent
    try:
        requests.get("http://127.0.0.1:9000")
        print("Recording API is available")
    except requests.exceptions.RequestException:
        print("Warning: Recording API may not be running. Start api.py first.")

    # Run the agent
    asyncio.run(run_agent())
```
Contribution by Carlos A. Planchón.
Working with the Recorded Data
After running the agent, you'll find the recorded data in the `recordings` directory. Here's how you can use this data:
- View recorded sessions: Each JSON file contains a snapshot of agent activity for one step
- Extract screenshots: You can modify the API to save screenshots separately
- Analyze agent behavior: Use the recorded data to study how the agent navigates websites (a sketch for reading the recordings follows this list)
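For instance, a minimal sketch that walks the saved step files and prints the URL recorded at each step (the `url` key comes from the step summary built in client.py, and the numeric file names come from api.py):

```python
# replay the recorded steps in order and list the URL visited at each one
import json
from pathlib import Path

for step_file in sorted(Path("recordings").glob("*.json"), key=lambda p: int(p.stem)):
    step = json.loads(step_file.read_text())
    print(f"{step_file.name}: {step.get('url')}")
```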
Extending the Example
You can extend this recording system in several ways:
- Save screenshots separately: Uncomment the screenshot saving code in the API
- Add a web dashboard: Create a simple web interface to view recorded sessions
- Add session IDs: Modify the API to group steps by agent session (a sketch follows after this list)
- Add filtering: Implement filters to record only specific types of actions
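As a sketch of the session-ID idea, the client could generate one ID per run and attach it to every step summary before sending it, so the API (or later analysis) can group files by run. The `session_id` field and the `tag_with_session` helper below are assumptions for illustration, not part of the original example:

```python
import uuid

# one ID per client.py run; attach it to every step summary before sending
SESSION_ID = str(uuid.uuid4())


def tag_with_session(step_summary: dict) -> dict:
    """Add a session_id field so recorded steps can be grouped by run."""
    return {**step_summary, "session_id": SESSION_ID}

# in record_activity(), send the tagged summary instead:
#     result = send_agent_history_step(data=tag_with_session(model_step_summary))
```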