Custom Functions
Extend the default agent and write custom action functions to perform specific tasks.
Custom actions are functions you provide that are added to the default actions the agent can use to accomplish tasks.
Action functions can request arbitrary parameters that the LLM has to come up with, plus a fixed set of framework-provided arguments for browser APIs, Agent(context=...), etc.
Our default set of actions is already quite powerful: the built-in Controller provides basics like open_tab, scroll_down, extract_content, and more.
It’s easy to add your own actions to implement additional custom behaviors, integrations with other apps, or performance optimizations.
For examples of custom actions (e.g. uploading files, asking a human-in-the-loop for help, drawing a polygon with the mouse, and more), see examples/custom-functions.
Action Function Registration
To register your own custom functions (which can be sync or async), decorate them with the @controller.action(...) decorator. This saves them into the controller.registry.
Keep your action function names and descriptions short and concise:
- The LLM chooses between actions to run solely based on the function name and description
- The LLM decides how to fill action params based on their names, type hints, & defaults
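The registration mechanism described above can be pictured with a small stdlib-only sketch. MiniController, its run method, and list_actions are hypothetical stand-ins written for illustration; this is not the Browser Use implementation, only one plausible way a decorator can save sync and async functions into a registry keyed by function name.

```python
import asyncio
import inspect
from typing import Any, Callable, Dict, List, Tuple


class MiniController:
    """Hypothetical stand-in for a controller (NOT the Browser Use class)."""

    def __init__(self) -> None:
        # Maps function name -> (description, function).
        self.registry: Dict[str, Tuple[str, Callable[..., Any]]] = {}

    def action(self, description: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
        """Decorator factory: store the function under its own name."""
        def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
            self.registry[func.__name__] = (description, func)
            return func
        return decorator

    async def run(self, name: str, **kwargs: Any) -> Any:
        """Call a registered action by name; await it if it is async."""
        _, func = self.registry[name]
        result = func(**kwargs)
        if inspect.isawaitable(result):
            result = await result
        return result


controller = MiniController()


@controller.action("Add two numbers")
def add(a: int, b: int) -> str:
    return str(a + b)


@controller.action("Echo a message")
async def echo(message: str) -> str:
    await asyncio.sleep(0)  # placeholder for real async work
    return message


def list_actions(ctrl: MiniController) -> List[str]:
    """The name and description are all a model would choose between."""
    return [f"{name}: {desc}" for name, (desc, _) in ctrl.registry.items()]


if __name__ == "__main__":
    print(list_actions(controller))
    print(asyncio.run(controller.run("add", a=2, b=3)))
```

Because the name and description are the only text used to pick an action in this sketch, short and specific wording matters more than anything inside the function body.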
Action Parameters
Browser Use supports two patterns for defining action parameters: normal function arguments, or a Pydantic model.
Function Arguments
For simple actions that don’t need default values, you can define the action parameters directly as arguments to the function. For example, an action might take a single string argument, css_selector.
When the LLM calls an action, it sees its argument names & types, and will provide values that fit.
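To make "argument names & types" concrete, here is a minimal sketch of how a function signature can be turned into a parameter description using only the standard library. describe_params and click_element are hypothetical names for illustration, and the output format is an assumption, not the schema Browser Use actually sends to the LLM.

```python
import inspect
from typing import Any, Callable, Dict, get_type_hints


def describe_params(func: Callable[..., Any]) -> Dict[str, Dict[str, Any]]:
    """Summarize each parameter's type and whether a value is required."""
    hints = get_type_hints(func)
    described: Dict[str, Dict[str, Any]] = {}
    for name, param in inspect.signature(func).parameters.items():
        annotation = hints.get(name)
        entry: Dict[str, Any] = {
            # Fall back to "any" when the argument has no type hint.
            "type": getattr(annotation, "__name__", "any"),
            "required": param.default is inspect.Parameter.empty,
        }
        if not entry["required"]:
            entry["default"] = param.default
        described[name] = entry
    return described


def click_element(css_selector: str) -> str:
    """Example action taking a single string argument."""
    return f"Clicked element matching {css_selector}"


if __name__ == "__main__":
    print(describe_params(click_element))
```

A descriptive argument name such as css_selector carries most of the meaning here, since the name and type are all that this description contains.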
Pydantic Model
You can define a pydantic model for the parameters your action expects by passing it to the decorator: @controller.action(..., param_model=MyParams).
This allows you to use optional parameters, default values, Annotated[...] types with custom validation, field descriptions, and other features offered by pydantic.
When the agent calls your action function, an instance of your model with the values filled by the LLM will be passed as the argument named params to your action function.
Using a pydantic model is helpful because it allows more flexibility and power to enforce the schema of the values the LLM should provide.
The LLM gets the entire pydantic JSON schema for your param_model: it will see the function name & description, plus the individual field names, types, descriptions, and default values.
Any special framework-provided arguments (e.g. page) will be passed as separate positional arguments after params.
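The calling convention described above (a filled-in model passed as params, followed by any framework-provided arguments) can be sketched as follows. To stay stdlib-only, a dataclass stands in for the pydantic model; SearchParams, FakePage, search, and call_with_model are all hypothetical names, and real pydantic models add validation and JSON schema generation that this sketch does not attempt.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional


@dataclass
class SearchParams:
    """Stand-in for a pydantic model: one required field, two with defaults."""
    query: str
    max_results: int = 5
    site: Optional[str] = None


class FakePage:
    """Hypothetical placeholder for the framework-provided page object."""
    url = "https://example.com"


def search(params: SearchParams, page: FakePage) -> str:
    """Action body: LLM-provided values arrive bundled in params."""
    scope = params.site or page.url
    return f"Searching {scope} for {params.query!r} (top {params.max_results})"


def call_with_model(
    func: Callable[..., Any],
    param_model: type,
    llm_values: Dict[str, Any],
    *framework_args: Any,
) -> Any:
    """Build the model from LLM values, then pass it first, framework args after."""
    params = param_model(**llm_values)
    return func(params, *framework_args)


if __name__ == "__main__":
    print(call_with_model(search, SearchParams, {"query": "cats"}, FakePage()))
```

Fields the LLM leaves out fall back to their defaults, which is the main practical gain over plain function arguments in this pattern.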
Framework-Provided Parameters
These special action parameters are injected by the Controller and are passed as extra args to any actions that expect them.
For example, actions that need to run playwright code to interact with the browser should take the argument page or browser_session.
- page: Page - The current Playwright page (shortcut for browser_session.get_current_page())
- browser_session: BrowserSession - The current browser session (and playwright context via browser_session.browser_context)
- context: AgentContext - Any optional top-level context object passed to the Agent, e.g. Agent(context=user_provided_obj)
- page_extraction_llm: BaseChatModel - LLM instance used for page content extraction
- available_file_paths: list[str] - List of available file paths for upload / processing
- has_sensitive_data: bool - Whether the action content contains sensitive data markers (check this to avoid logging sensitive data to terminal by accident)
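The idea that special parameters are passed only to actions that expect them can be illustrated with a small name-matching sketch. call_with_injection, FakePage, and the two example actions are hypothetical; the real Controller may resolve these arguments differently, so treat this as one possible mechanism rather than the library's behavior.

```python
import inspect
from typing import Any, Callable, Dict, List


class FakePage:
    """Hypothetical placeholder for a Playwright page."""
    url = "https://example.com/upload"


def call_with_injection(
    func: Callable[..., Any],
    llm_params: Dict[str, Any],
    special: Dict[str, Any],
) -> Any:
    """Pass LLM params plus only those special args named in func's signature."""
    accepted = inspect.signature(func).parameters
    injected = {name: value for name, value in special.items() if name in accepted}
    return func(**llm_params, **injected)


def current_url(page: FakePage) -> str:
    """Needs the page, so it declares a page argument."""
    return page.url


def upload(filename: str, available_file_paths: List[str]) -> str:
    """Needs the allowed file list, but not the page."""
    if filename not in available_file_paths:
        return f"Refused: {filename} is not an available file"
    return f"Uploading {filename}"


# Everything the framework could offer in this sketch.
SPECIAL = {
    "page": FakePage(),
    "available_file_paths": ["/tmp/report.pdf"],
    "has_sensitive_data": False,
}

if __name__ == "__main__":
    print(call_with_injection(current_url, {}, SPECIAL))
    print(call_with_injection(upload, {"filename": "/tmp/report.pdf"}, SPECIAL))
```

Note that an LLM-provided parameter sharing a name with a special argument would raise a TypeError in this sketch, which hints at why such name conflicts need to be detected up front.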
Example: Action uses the current page
Example: Action uses the browser_context
Important Rules
- Return an ActionResult: All actions should return an ActionResult | str | None. The stringified version of the result is passed back to the LLM, and optionally persisted in the long-term memory when ActionResult(..., include_in_memory=True).
- Type hints on arguments are required: They are used to verify that action params don’t conflict with special arguments injected by the controller (e.g. page).
- Action functions called directly must be passed kwargs: When calling actions from other actions or python code, you must pass all parameters as kwargs only, even though the actions are usually defined using positional args (for the same reasons as pluggy).
Action arguments are always matched by name and type, not positional order, so this helps prevent ambiguity / reordering issues while keeping action signatures short.
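As a rough illustration of the first rule, the sketch below shows how a caller could accept any of the three allowed return types and reduce them to a single shape. Result is a hypothetical stand-in for ActionResult with assumed field names, and normalize is an illustrative helper, not a Browser Use function.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Result:
    """Hypothetical stand-in for ActionResult (field names are assumptions)."""
    extracted_content: Optional[str] = None
    include_in_memory: bool = False


def normalize(value: Union[Result, str, None]) -> Result:
    """Coerce an action's return value into a Result."""
    if isinstance(value, Result):
        return value
    if value is None:
        return Result()
    if isinstance(value, str):
        return Result(extracted_content=value)
    raise TypeError(f"Unsupported action return type: {type(value).__name__}")


if __name__ == "__main__":
    print(normalize("Clicked the button"))
    print(normalize(None))
    print(normalize(Result(extracted_content="Saved", include_in_memory=True)))
```

Returning a plain string is the shortest option; an explicit result object is only needed when extra flags such as include_in_memory matter.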
Reusing Custom Actions Across Agents
You can use the same controller for multiple agents.
The controller is stateless, so you can register multiple actions on it and share it across multiple agents.
Exclude functions
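If you want to exclude some registered actions and make them unavailable to the agent, you can pass their names via exclude_actions when creating the controller, e.g. Controller(exclude_actions=['open_tab']).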
If you want actions to be available only on certain pages, and hidden from the LLM on all other pages, you can use allowed_domains and page_filter.
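The effect of these two filters can be sketched as a single availability check. Everything here is an assumption made for illustration: FakePage and is_action_available are hypothetical names, and glob matching on the hostname is one plausible interpretation of allowed_domains, not a statement of how Browser Use matches patterns.

```python
from fnmatch import fnmatch
from typing import Callable, List, Optional
from urllib.parse import urlparse


class FakePage:
    """Hypothetical placeholder exposing only a URL and a title."""

    def __init__(self, url: str, title: str) -> None:
        self.url = url
        self.title = title


def is_action_available(
    page: FakePage,
    allowed_domains: Optional[List[str]] = None,
    page_filter: Optional[Callable[[FakePage], bool]] = None,
) -> bool:
    """An action is offered only if every configured filter accepts the page."""
    if allowed_domains is not None:
        host = urlparse(page.url).hostname or ""
        if not any(fnmatch(host, pattern) for pattern in allowed_domains):
            return False
    if page_filter is not None and not page_filter(page):
        return False
    return True


bank_login = FakePage("https://mybank.example.com/login", "Login")
other_site = FakePage("https://other.org/", "Home")

if __name__ == "__main__":
    print(is_action_available(bank_login, allowed_domains=["*.example.com"]))
    print(is_action_available(other_site, allowed_domains=["*.example.com"]))
    print(is_action_available(bank_login, page_filter=lambda p: "Login" in p.title))
```

An action with neither filter set is always available in this sketch, which matches the default behavior of unrestricted actions described earlier.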