Custom actions are functions you provide, that are added to our default actions the agent can use to accomplish tasks. Action functions can request arbitrary parameters that the LLM has to come up with + a fixed set of framework-provided arguments for browser APIs / Agent(context=...) / etc.

Our default set of actions is already quite powerful, the built-in Controller provides basics like open_tab, scroll_down, extract_content, and more.

It’s easy to add your own actions to implement additional custom behaviors, integrations with other apps, or performance optimizations.

For examples of custom actions (e.g. uploading files, asking a human-in-the-loop for help, drawing a polygon with the mouse, and more), see examples/custom-functions.

Action Function Registration

To register your own custom functions (which can be sync or async), decorate them with the @controller.action(...) decorator. This saves them into the controller.registry.

from browser_use import Controller, ActionResult

controller = Controller()

@controller.action('Ask human for help with a question', domains=['example.com'])   # pass allowed_domains= or page_filter= to limit actions to certain pages
def ask_human(question: str) -> ActionResult:
    answer = input(f'{question} > ')
    return ActionResult(extracted_content=f'The human responded with: {answer}', include_in_memory=True)
# Then pass your controller to the agent to use it
agent = Agent(
    task='...',
    llm=llm,
    controller=controller,
)

Keep your action function names and descriptions short and concise:

  • The LLM chooses between actions to run solely based on the function name and description
  • The LLM decides how to fill action params based on their names, type hints, & defaults

Action Parameters

Browser Use supports two patterns for defining action parameters: normal function arguments, or a Pydantic model.

Function Arguments

For simple actions that don’t need default values, you can define the action parameters directly as arguments to the function. This one takes a single string argument, css_selector. When the LLM calls an action, it sees its argument names & types, and will provide values that fit.

@controller.action('Click element')
def click_element(css_selector: str, page: Page) -> ActionResult:
    # css_selector is an action param the LLM must provide when calling
    # page is a special framework-provided param to access the browser APIs (see below)
    await page.locator(css_selector).click()
    return ActionResult(extracted_content=f"Clicked element {css_selector}")

Pydantic Model

You can define a pydantic model for the parameters your action expects by setting a @controller.action(..., param_model=MyParams). This allows you to use optional parameters, default values, Annotated[...] types with custom validation, field descriptions, and other features offered by pydantic.

When the agent calls calls your agent function, an instance of your model with the values filled by the LLM will be passed as the argument named params to your action function.

Using a pydantic model is helpful because it allows more flexibility and power to enforce the schema of the values the LLM should provide. The LLM gets the entire pydantic JSON schema for your param_model, it will see the function name & description + individual field names, types, descriptions, and default values.

from typing import Annotated
from pydantic import BaseModel, AfterValidator
from browser_use import ActionResult

class MyParams(BaseModel):
    field1: int
    field2: str = 'default value'
    field3: Annotated[str, AfterValidator(lambda s: s.lower())]  # example: enforce always lowercase
    field4: str = Field(default='abc', description='Detailed description for the LLM')

@controller.action('My action', param_model=MyParams)
def my_action(params: MyParams, page: Page) -> ActionResult:
    await page.keyboard.type(params.field2)
    return ActionResult(extracted_content=f"Inputted {params} on {page.url}")

Any special framework-provided arguments (e.g. page) will be passed as separate positional arguments after params.

Framework-Provided Parameters

These special action parameters are injected by the Controller and are passed as extra args to any actions that expect them.

For example, actions that need to run playwright code to interact with the browser should take the argument page or browser_session.

  • page: Page - The current Playwright page (shortcut for browser_session.get_current_page())
  • browser_session: BrowserSession - The current browser session (and playwright context via browser_session.browser_context)
  • context: AgentContext - Any optional top-level context object passed to the Agent, e.g. Agent(context=user_provided_obj)
  • page_extraction_llm: BaseChatModel - LLM instance used for page content extraction
  • available_file_paths: list[str] - List of available file paths for upload / processing
  • has_sensitive_data: bool - Whether the action content contains sensitive data markers (check this to avoid logging sensitive data to terminal by accident)

Example: Action uses the current page

from playwright.async_api import Page
from browser_use import Controller, ActionResult

controller = Controller()

@controller.action('Type keyboard input into a page')
async def input_text_into_page(text: str, page: Page) -> ActionResult:
    await page.keyboard.type(text)
    return ActionResult(extracted_content='Website opened')

Example: Action uses the browser_context

from browser_use import BrowserSession, Controller, ActionResult

controller = Controller()

@controller.action('Open website')
async def open_website(url: str, browser_session: BrowserSession) -> ActionResult:
    # find matching existing tab by looking through all pages in playwright browser_context
    all_tabs = await browser_session.browser_context.pages
    for tab in all_tabs:
        if tab.url == url:
            await tab.bring_to_foreground()
            return ActionResult(extracted_content=f'Switched to tab with url {url}')
    # otherwise, create a new tab
    new_tab = await browser_session.browser_context.new_page()
    await new_tab.goto(url)
    return ActionResult(extracted_content=f'Opened new tab with url {url}')

Important Rules

  1. Return an ActionResult: All actions should return an ActionResult | str | None. The stringified version of the result is passed back to the LLM, and optionally persisted in the long-term memory when ActionResult(..., include_in_memory=True).
  2. Type hints on arguments are required: They are used to verify that action params don’t conflict with special arguments injected by the controller (e.g. page)
  3. Actions functions called directly must be passed kwargs: When calling actions from other actions or python code, you must pass all parameters as kwargs only, even though the actions are usually defined using positional args (for the same reasons as pluggy). Action arguments are always matched by name and type, not positional order, so this helps prevent ambiguity / reordering issues while keeping action signatures short.
    @controller.action('Fill in the country form field')
    def input_country_field(country: str, page: Page) -> ActionResult:
        await some_action(123, page=page)                                # ❌ not allowed: positional args, use kwarg syntax when calling
        await some_action(abc=123, page=page)                            # ✅ allowed: action params & special kwargs
        await some_other_action(params=OtherAction(abc=123), page=page)  # ✅ allowed: params=model & special kwargs
# Using Pydantic Model to define action params (recommended)
class PinCodeParams(BaseModel):
    code: int
    retries: int = 3                                               # ✅ supports optional/defaults

@controller.action('...', param_model=PinCodeParams)
async def input_pin_code(params: PinCodeParams, page: Page): ...   # ✅ special params at the end

# Using function arguments to define action params
async def input_pin_code(code: int, retries: int, page: Page): ... # ✅ params first, special params second, no defaults
async def input_pin_code(code: int, retries: int=3): ...           # ✅ defaults ok only if no special params needed
async def input_pin_code(code: int, retries: int=3, page: Page): ... # ❌ Python SyntaxError! not allowed

Reusing Custom Actions Across Agents

You can use the same controller for multiple agents.

controller = Controller()

# ... register actions to the controller

agent = Agent(
    task="Go to website X and find the latest news",
    llm=llm,
    controller=controller
)

# Run the agent
await agent.run()

agent2 = Agent(
    task="Go to website Y and find the latest news",
    llm=llm,
    controller=controller
)

await agent2.run()

The controller is stateless and can be used to register multiple actions and multiple agents.

Exclude functions

If you want to exclude some registered actions and make them unavailable to the agent, you can do:

controller = Controller(exclude_actions=['open_tab', 'search_google'])
agent = Agent(controller=controller, ...)

If you want actions to only be available on certain pages, and to not tell the LLM about them on other pages, you can use the allowed_domains and page_filter:

from pydantic import BaseModel
from browser_use import Controller, ActionResult

controller = Controller()

async def is_ai_allowed(page: Page):
    if api.some_service.check_url(page.url):
        logger.warning('Allowing AI agent to visit url:', page.url)
        return True
    return False

@controller.action('Fill out secret_form', allowed_domains=['https://*.example.com'], page_filter=is_ai_allowed)
def fill_out_form(...) -> ActionResult:
    ... will only be runnable by LLM on pages that match https://*.example.com *AND* where is_ai_allowed(page) returns True