use_vision (default: "auto"): Vision mode. "auto" includes the screenshot tool but only uses vision when requested; True always includes screenshots; False never includes screenshots and excludes the screenshot tool.
page_extraction_llm (default: same as llm): Separate LLM for page content extraction. A small, fast model is enough because it only needs to extract text from the page.
initial_actions: List of actions to run before the main task, without calling the LLM.
max_actions_per_step (default: 10): Maximum number of actions per step, e.g. for form filling the agent can output 10 fields at once. The actions are executed in order until the page changes.
max_failures (default: 3): Maximum retries for steps with errors
final_response_after_failure (default: True): If True, attempt to force one final model call with the intermediate output once max_failures is reached
use_thinking (default: True): Controls whether the agent uses its internal “thinking” field for explicit reasoning steps.
flash_mode (default: False): Fast mode that skips evaluation, next goal, and thinking, and only uses memory. Enabling flash_mode overrides use_thinking and disables the thinking process entirely.
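
The interplay between some of these options can be easier to see in code. The following is a minimal, standalone sketch of the semantics described above, not the library's implementation: `AgentSettingsSketch`, its methods, and `run_step` are hypothetical names introduced only for illustration. It also assumes that a step is truncated to max_actions_per_step and that execution stops right after the first action that changes the page.

```python
from dataclasses import dataclass
from typing import Callable, List, Union


@dataclass
class AgentSettingsSketch:
    """Hypothetical stand-in for the options listed above (not the real class)."""

    use_vision: Union[bool, str] = "auto"
    max_actions_per_step: int = 10
    use_thinking: bool = True
    flash_mode: bool = False

    def thinking_enabled(self) -> bool:
        # flash_mode overrides use_thinking and disables thinking entirely.
        return self.use_thinking and not self.flash_mode

    def screenshots_always_included(self) -> bool:
        # Only use_vision=True always includes screenshots.
        return self.use_vision is True

    def screenshot_tool_excluded(self) -> bool:
        # Only use_vision=False excludes the screenshot tool.
        return self.use_vision is False


def run_step(
    actions: List[str],
    max_actions_per_step: int,
    page_changed: Callable[[str], bool],
) -> List[str]:
    """Run at most max_actions_per_step actions, stopping once the page changes.

    Assumption: the action that changes the page is still counted as executed.
    """
    executed: List[str] = []
    for action in actions[:max_actions_per_step]:
        executed.append(action)  # placeholder for actually performing the action
        if page_changed(action):
            break
    return executed


if __name__ == "__main__":
    settings = AgentSettingsSketch(use_thinking=True, flash_mode=True)
    print(settings.thinking_enabled())

    planned = ["fill name", "fill email", "click submit", "fill coupon"]
    print(run_step(planned, settings.max_actions_per_step,
                   lambda a: a.startswith("click")))
```

In this sketch, the remaining actions after a page change are simply dropped, on the assumption that the agent plans its next step against the new page state.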