Basic Function Registration

Functions can be either sync or async. Keep them focused and single-purpose.

from browser_use.controller.service import Controller

# Initialize the controller
controller = Controller()

@controller.action('Ask user for information')
def ask_human(question: str, display_question: bool) -> str:
    return input(f'\n{question}\nInput: ')

Basic Controller has all basic functionality you might need to interact with the browser already implemented.

# ... then pass controller to the agent
agent = Agent(
    task=task,
    llm=llm,
    controller=controller
)

Keep the function name and description short and concise. The Agent use the function solely based on the name and description. The stringified output of the action is passed to the Agent.

Browser-Aware Functions

For actions that need browser access, use the requires_browser=True parameter:

from browser_use.browser.service import Browser

@controller.action('Open website', requires_browser=True)
async def open_website(url: str, browser: Browser):
    page = browser.get_current_page()
    await page.goto(url)

Structured Parameters with Pydantic

For complex actions, you can define parameter schemas using Pydantic models:

from pydantic import BaseModel
from typing import Optional

class JobDetails(BaseModel):
    title: str
    company: str
    job_link: str
    salary: Optional[str] = None

@controller.action(
    'Save job details which you found on page',
    param_model=JobDetails,
    requires_browser=True
)
async def save_job(params: JobDetails, browser: Browser):
    print(f"Saving job: {params.title} at {params.company}")

    # Access browser if needed
    page = browser.get_current_page()
    await page.goto(params.job_link)

Using Custom Actions with multiple agents

You can use the same controller for multiple agents.

controller = Controller()

# ... register actions to the controller

agent = Agent(
    task="Go to website X and find the latest news",
    llm=llm,
    controller=controller
)

# Run the agent
await agent.run()

agent2 = Agent(
    task="Go to website Y and find the latest news",
    llm=llm,
    controller=controller
)

await agent2.run()

The controller is stateless and can be used to register multiple actions and multiple agents.