{/* Chat column */}
{/* Live browser view — liveUrl available from session creation */}
);
}
```
---
## Summary
| Method | Purpose |
|--------|---------|
| `client.sessions.create()` | Create a session (returns `liveUrl` immediately) |
| `client.run()` | Send a task and stream messages with `for await` |
| `client.sessions.stop()` | Stop the current task |
| `client.sessions.waitForRecording()` | Get MP4 recording URLs |
# Grow Therapy provider search
Source: https://docs.browser-use.com/cloud/tutorials/grow-therapy-compare
This tutorial builds a provider search tool for [Grow Therapy](https://www.growtherapy.com) — a therapy marketplace that handles insurance credentialing for providers. We combine [structured output](https://docs.browser-use.com/cloud/agent/structured-output) with [deterministic rerun](https://docs.browser-use.com/cloud/agent/cache-script) to build a fast, repeatable search pipeline.
## What you'll build
A script that:
1. Searches Grow Therapy's provider directory with filters (location, insurance, specialty)
2. Extracts therapist profiles with ratings and availability
3. Caches the search so you can sweep across geographies or specialties instantly
---
## Setup
```python Python
import asyncio
import json
from pydantic import BaseModel
from browser_use_sdk.v3 import AsyncBrowserUse
client = AsyncBrowserUse()
```
```typescript TypeScript
import { BrowserUse } from "browser-use-sdk/v3";
import { z } from "zod";
const client = new BrowserUse();
```
## 1. Define the output schema
```python Python
class Provider(BaseModel):
    name: str
    title: str
    specialties: list[str]
    insurance_plans: list[str]
    rating: float | None = None
    next_available: str | None = None


class ProviderSearch(BaseModel):
    providers: list[Provider]
    total_found: int | None = None
    location: str
    specialty: str
```
```typescript TypeScript
const ProviderSearch = z.object({
  providers: z.array(z.object({
    name: z.string(),
    title: z.string(),
    specialties: z.array(z.string()),
    insurancePlans: z.array(z.string()),
    rating: z.number().nullable(),
    nextAvailable: z.string().nullable(),
  })),
  totalFound: z.number().nullable(),
  location: z.string(),
  specialty: z.string(),
});
```
## 2. Create a workspace
```python Python
workspace = await client.workspaces.create(name="grow-therapy-search")
```
```typescript TypeScript
const workspace = await client.workspaces.create({ name: "grow-therapy-search" });
```
## 3. Search for providers
```python Python
result = await client.run(
    "Go to growtherapy.com and search for therapists in {{New York}} "
    "who specialize in {{anxiety}} and accept insurance. "
    "Return the first 5 provider profiles as JSON.",
    workspace_id=str(workspace.id),
    output_schema=ProviderSearch,
)
for p in result.output.providers:
    print(f"{p.name} ({p.title})")
    print(f" Specialties: {', '.join(p.specialties)}")
    print(f" Rating: {p.rating}")
    print(f" Next available: {p.next_available}")
    print()
```
```typescript TypeScript
const result = await client.run(
  "Go to growtherapy.com and search for therapists in {{New York}} " +
    "who specialize in {{anxiety}} and accept insurance. " +
    "Return the first 5 provider profiles as JSON.",
  { workspaceId: workspace.id, schema: ProviderSearch },
);
for (const p of result.output.providers) {
  console.log(`${p.name} (${p.title})`);
  console.log(` Specialties: ${p.specialties.join(", ")}`);
  console.log(` Rating: ${p.rating}`);
  console.log(` Next available: ${p.nextAvailable}`);
}
```
## 4. Sweep across locations and specialties
After the first run caches the search flow, sweep across different parameters at $0 LLM cost:
```python Python
locations = ["Los Angeles", "Chicago", "Houston", "Miami"]
specialties = ["depression", "trauma", "ADHD"]
for location in locations:
    for specialty in specialties:
        result = await client.run(
            f"Go to growtherapy.com and search for therapists in {{{{{location}}}}} "
            f"who specialize in {{{{{specialty}}}}} and accept insurance. "
            f"Return the first 5 provider profiles as JSON.",
            workspace_id=str(workspace.id),
            output_schema=ProviderSearch,
        )
        count = len(result.output.providers)
        print(f"{location} / {specialty}: {count} providers found")
```
```typescript TypeScript
const locations = ["Los Angeles", "Chicago", "Houston", "Miami"];
const specialties = ["depression", "trauma", "ADHD"];
for (const location of locations) {
  for (const specialty of specialties) {
    const result = await client.run(
      `Go to growtherapy.com and search for therapists in {{${location}}} ` +
        `who specialize in {{${specialty}}} and accept insurance. ` +
        `Return the first 5 provider profiles as JSON.`,
      { workspaceId: workspace.id, schema: ProviderSearch },
    );
    console.log(`${location} / ${specialty}: ${result.output.providers.length} providers`);
  }
}
```
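The quintuple braces in the Python f-strings above are easy to mistype: `{{{{` renders as a literal `{{`, and the fifth brace opens the interpolation. As a minimal sketch, the same prompts can be built with a small helper. `wrap_param` and `build_task` are hypothetical names for illustration, not part of the SDK, and the sketch only assumes that the `{{value}}` markers are written exactly as in the prompts above.

```python
from itertools import product

LOCATIONS = ["Los Angeles", "Chicago", "Houston", "Miami"]
SPECIALTIES = ["depression", "trauma", "ADHD"]


def wrap_param(value: str) -> str:
    """Wrap a value in literal double braces, e.g. 'Chicago' -> '{{Chicago}}'."""
    return "{{" + value + "}}"


def build_task(location: str, specialty: str) -> str:
    """Build the search prompt used in this tutorial (hypothetical helper)."""
    return (
        f"Go to growtherapy.com and search for therapists in {wrap_param(location)} "
        f"who specialize in {wrap_param(specialty)} and accept insurance. "
        "Return the first 5 provider profiles as JSON."
    )


# One prompt per (location, specialty) pair: 4 x 3 = 12 sweeps.
tasks = [build_task(loc, spec) for loc, spec in product(LOCATIONS, SPECIALTIES)]
print(len(tasks))
print(tasks[0])
```

Each string in `tasks` could then be passed to `client.run(...)` in place of the inline f-string.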
---
## Summary
| Step | What happens | Cost |
|------|-------------|------|
| First search | Agent navigates Grow Therapy, caches the flow | ~$0.10 |
| 12 cached sweeps (4 cities x 3 specialties) | Script reruns with new params | **$0 LLM each** |
| Site layout change | [Auto-healing](https://docs.browser-use.com/cloud/agent/cache-script#auto-healing) regenerates the script | ~$0.10 |
Therapy platforms have dynamic UIs that can change frequently. [Auto-healing](https://docs.browser-use.com/cloud/agent/cache-script#auto-healing) ensures your cached scripts stay working without manual maintenance.
## Next steps
- [Structured output](https://docs.browser-use.com/cloud/agent/structured-output) — Learn more about extracting typed data with Pydantic and Zod schemas.
- [Human in the loop](https://docs.browser-use.com/cloud/agent/human-in-the-loop) — Let a human review or interact with the browser mid-task, useful for auth flows or approving results before continuing.
- [Deterministic rerun](https://docs.browser-use.com/cloud/agent/cache-script) — Deep dive into how caching and auto-healing work.
# FAQ
Source: https://docs.browser-use.com/cloud/faq
## Which model should I use?
- **Claude Opus 4.6** (`claude-opus-4.6`) — most capable. Use for the hardest tasks that need maximum accuracy.
- **Claude Sonnet 4.6** (`claude-sonnet-4.6`, default) — best balance of capability and cost. Use for complex multi-step workflows.
- **GPT-5.4 mini** (`gpt-5.4-mini`) — fast and efficient. Good for simple, well-defined tasks.
## How do I get the live browser URL?
`live_url` is returned on session creation. Embed it in an iframe or open it in a browser.
```python
session = await client.sessions.create(task="Go to example.com")
print(session.live_url)
```
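Embedding the live view amounts to putting the URL into an `<iframe>` tag. A minimal sketch, assuming you render HTML yourself; `live_view_iframe` is a hypothetical helper, the dimensions are arbitrary, and the URL below is a placeholder rather than a real `live_url`.

```python
from html import escape


def live_view_iframe(live_url: str, width: int = 1280, height: int = 720) -> str:
    """Return an <iframe> tag that embeds a session's live view."""
    # Escape the URL so quotes or ampersands cannot break out of the attribute.
    safe_url = escape(live_url, quote=True)
    return (
        f'<iframe src="{safe_url}" width="{width}" height="{height}" '
        'style="border: 0"></iframe>'
    )


# Placeholder URL for illustration; a real one comes from session.live_url.
print(live_view_iframe("https://example.com/live?session=abc&view=1"))
```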
## Getting blocked by a website
Stealth and proxies are active by default. If you're still getting blocked:
- **Use a profile** with logged-in cookies to bypass login walls.
- **Try a different proxy country** to match the target region.
If it still doesn't work, contact support inside the [Cloud Dashboard](https://cloud.browser-use.com) — send us a link to the page where you're getting blocked.
## Rate limited (429 errors)
The SDK auto-retries 429 responses with exponential backoff. If the errors persist, you may need a higher concurrent-session limit; contact support.
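For readers unfamiliar with the term, exponential backoff means waiting roughly twice as long after each failed attempt. The sketch below illustrates the general technique only. The base delay, cap, and jitter are assumptions chosen for the example, not the SDK's actual retry settings.

```python
import random


def backoff_delays(retries: int, base: float = 0.5, cap: float = 30.0, jitter: bool = True):
    """Yield one delay (in seconds) per retry attempt, doubling each time."""
    for attempt in range(retries):
        delay = min(cap, base * (2 ** attempt))
        # Full jitter spreads retries out so clients do not retry in lockstep.
        yield random.uniform(0, delay) if jitter else delay


print(list(backoff_delays(5, jitter=False)))  # [0.5, 1.0, 2.0, 4.0, 8.0]
```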
## v2 vs v3 — which should I use?
**v3 is recommended for everything.** It's a premium agent (not available in open source) that is significantly more capable than v2:
- **Much better at complex tasks** and multi-step workflows
- **Much better at large data extraction**
- **File system** with persistent memory across tasks
- **Task scheduling** with 1,000+ integrations (Gmail, Slack, and more)
v2 is the closest to the open-source experience: pure browser automation, nothing else. If the open-source library already works well for your use case, v2 is the natural fit. For everything else, use v3.
```python
# v3 (recommended)
from browser_use_sdk.v3 import AsyncBrowserUse
# v2 (simple browser-only tasks)
from browser_use_sdk.v2 import AsyncBrowserUse
```
# Agent (v2)
Source: https://docs.browser-use.com/cloud/legacy/agent
## Models
| Model | API String | Cost per Step |
| ----- | ---------- | ------------- |
| Browser Use 2.0 (default) | `browser-use-2.0` | \$0.006 |
| Browser Use LLM | `browser-use-llm` | \$0.002 |
| O3 | `o3` | \$0.03 |
| Gemini Flash Latest | `gemini-flash-latest` | \$0.0075 |
| Gemini Flash Lite Latest | `gemini-flash-lite-latest` | \$0.005 |
| Claude Sonnet 4.5 | `claude-sonnet-4-5-20250929` | \$0.05 |
| Claude Sonnet 4.6 | `claude-sonnet-4.6` | \$0.05 |
Pass `llm` explicitly to select a model:
```python Python
result = await client.run("...", llm="browser-use-2.0")
```
```typescript TypeScript
const result = await client.run("...", { llm: "browser-use-2.0" });
```
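Because pricing is per step, a rough model cost for a task is the expected step count times the per-step price. A minimal sketch using the prices from the table above; `estimate_cost` is a hypothetical helper, the step count is a guess you supply, and any other charges are outside the scope of this estimate.

```python
from decimal import Decimal

# Per-step prices copied from the table above (USD).
COST_PER_STEP = {
    "browser-use-2.0": Decimal("0.006"),
    "browser-use-llm": Decimal("0.002"),
    "o3": Decimal("0.03"),
    "gemini-flash-latest": Decimal("0.0075"),
    "gemini-flash-lite-latest": Decimal("0.005"),
    "claude-sonnet-4-5-20250929": Decimal("0.05"),
    "claude-sonnet-4.6": Decimal("0.05"),
}


def estimate_cost(llm: str, steps: int) -> Decimal:
    """Estimate model cost for a task as steps x cost per step."""
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return COST_PER_STEP[llm] * steps


for model in ("browser-use-2.0", "o3"):
    print(f"{model}: ${estimate_cost(model, 20)} for 20 steps")
```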
---
## Files
Upload images, PDFs, documents, and text files (10 MB max) to sessions, and download output files from completed tasks.
### Upload a file
Get a presigned URL, then upload via POST.
```python Python
import httpx
from browser_use_sdk import AsyncBrowserUse
client = AsyncBrowserUse()
session = await client.sessions.create()
upload = await client.files.session_url(
    session.id,
    file_name="input.pdf",
    content_type="application/pdf",
    size_bytes=1024,
)
with open("input.pdf", "rb") as f:
    async with httpx.AsyncClient() as http:
        await http.post(upload.url, content=f.read(), headers={"Content-Type": "application/pdf"})
result = await client.run("Summarize the uploaded PDF", session_id=session.id)
```
```typescript TypeScript
import { BrowserUse } from "browser-use-sdk";
import { readFileSync } from "fs";
const client = new BrowserUse();
const session = await client.sessions.create();
const upload = await client.files.sessionUrl(session.id, {
  fileName: "input.pdf",
  contentType: "application/pdf",
  sizeBytes: 1024,
});
await fetch(upload.url, {
  method: "POST",
  body: readFileSync("input.pdf"),
  headers: { "Content-Type": "application/pdf" },
});
const result = await client.run("Summarize the uploaded PDF", { sessionId: session.id });
```
Presigned URLs expire after 120 seconds. Max file size: 10 MB.
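The upload examples hard-code `size_bytes=1024`. In practice you would likely derive the size and content type from the file and check the limit before requesting a presigned URL. A minimal sketch using only the standard library; `describe_upload` is a hypothetical helper, and treating "10 MB" as 10 × 1024 × 1024 bytes is an assumption.

```python
import mimetypes
import os

MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # assumption: "10 MB" means 10 MiB


def describe_upload(path: str) -> dict:
    """Return file_name, content_type, and size_bytes for a local file."""
    size = os.path.getsize(path)
    if size > MAX_UPLOAD_BYTES:
        raise ValueError(f"{path} is {size} bytes; the limit is {MAX_UPLOAD_BYTES}")
    content_type, _ = mimetypes.guess_type(path)
    return {
        "file_name": os.path.basename(path),
        # Fall back to a generic type when the extension is unknown.
        "content_type": content_type or "application/octet-stream",
        "size_bytes": size,
    }
```

The keys match the keyword arguments used above, so the result could be passed as `client.files.session_url(session.id, **describe_upload("input.pdf"))`.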
### Download task output files
```python Python
result = await client.tasks.get(task_id)
for file in result.output_files:
    output = await client.files.task_output(task_id, file.id)
    print(output.download_url)  # download URL
```
```typescript TypeScript
const result = await client.tasks.get(taskId);
for (const file of result.outputFiles) {
  const output = await client.files.taskOutput(taskId, file.id);
  console.log(output.downloadUrl);
}
```
---
## Streaming steps
Use `async for` (Python) or `for await` (TypeScript) to receive steps as the agent works.
```python Python
from browser_use_sdk import AsyncBrowserUse
client = AsyncBrowserUse()
run = client.run("Find the most upvoted post on Reddit r/technology today")
async for step in run:
    print(f"Step {step.number}: {step.next_goal}")
    print(f" URL: {step.url}")
print(run.result.output)  # final result after iteration
```
```typescript TypeScript
import { BrowserUse } from "browser-use-sdk";
const client = new BrowserUse();
const run = client.run("Find the most upvoted post on Reddit r/technology today");
for await (const step of run) {
  console.log(`Step ${step.number}: ${step.nextGoal}`);
  console.log(` URL: ${step.url}`);
}
console.log(run.result?.output); // final result after iteration
```
---
## Key parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| `task` | `str` | What you want the agent to do. 1-50,000 characters. |
| `llm` | `str` | Model override. Default: Browser Use 2.0. |
| `output_schema` / `schema` | Pydantic / Zod | Schema for structured output. |
| `session_id` | `str` | Reuse an existing session. Omit for auto-session. |
| `start_url` | `str` | Initial page URL. Saves steps — send the agent directly there. |
| `secrets` | `dict` | Domain-specific credentials. See [Authentication](https://docs.browser-use.com/cloud/guides/authentication). |
| `allowed_domains` | `list[str]` | Restrict agent to these domains only. |
| `session_settings` | `SessionSettings` | Proxy, profile, browser config. See [Profiles](https://docs.browser-use.com/cloud/guides/authentication). |
| `flash_mode` | `bool` | Faster but less careful navigation. |
| `thinking` | `bool` | Enable extended reasoning. |
| `vision` | `bool \| str` | Enable screenshots for the agent. |
| `highlight_elements` | `bool` | Highlight interactive elements on the page. |
| `system_prompt_extension` | `str` | Append custom instructions to the system prompt. |
| `judge` | `bool` | Enable quality judge to verify output. |
| `skill_ids` | `list[str]` | Skills the agent can use during the task. |
| `op_vault_id` | `str` | 1Password vault ID for auto-fill credentials and 2FA. |
| `metadata` | `dict[str, str]` | Custom metadata attached to the task. |
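
The `task` length limit in the table can be checked locally before a request is sent. A minimal sketch; `validate_task` is a hypothetical helper, and counting characters with Python's `len()` is an assumption about how the service measures length.

```python
MIN_TASK_CHARS = 1
MAX_TASK_CHARS = 50_000


def validate_task(task: str) -> str:
    """Return the task unchanged if its length is within the documented range."""
    length = len(task)
    if not MIN_TASK_CHARS <= length <= MAX_TASK_CHARS:
        raise ValueError(
            f"task must be {MIN_TASK_CHARS}-{MAX_TASK_CHARS} characters, got {length}"
        )
    return task
```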
# Public share links (v2)
Source: https://docs.browser-use.com/cloud/legacy/public-share
Generate a public URL that anyone can open to watch the entire agent session — no API key needed. Useful for sharing with teammates, stakeholders, or embedding in dashboards.
```python Python
share = await client.sessions.create_share(session.id)
print(share.share_url)
```
```typescript TypeScript
const share = await client.sessions.createShare(session.id);
console.log(share.shareUrl);
```
# Skills
Source: https://docs.browser-use.com/cloud/legacy/skills
A skill turns a website interaction into a reusable, reliable API. Describe what you need, Browser Use builds the automation, you call it like a function.
## How skills work
Every skill has two parts:
- **Goal** — the full specification: what parameters it accepts, what data it returns, and the complete scope of work. If you want to extract data from 100 listings, the goal describes extracting from *all* of them.
- **Demonstration** (`agent_prompt`) — shows *how* to perform the task, but only once. Think of it like onboarding a new colleague: you would not walk them through all 100 listings. You would show the first one or two and say "keep going like this for the rest." The demonstration navigates the site, triggers the necessary network requests, and the system builds the actual endpoint logic from that recording.
The demonstration only needs to trigger the right network requests — it does not need to complete the full task. If extracting from many pages, open the first item and maybe paginate once. The system handles the rest.
## Create a skill
```python Python
from browser_use_sdk import AsyncBrowserUse
client = AsyncBrowserUse()
skill = await client.skills.create(
    goal="Extract the top X posts from HackerNews. For each post return: title, URL, score, author, comment count, and rank. X is an input parameter.",
    agent_prompt="Go to https://news.ycombinator.com, click on the first post to load its content, go back to the list, and scroll down to trigger loading of additional posts.",
)
print(skill.id)
```
```typescript TypeScript
import { BrowserUse } from "browser-use-sdk";
const client = new BrowserUse();
const skill = await client.skills.create({
  goal: "Extract the top X posts from HackerNews. For each post return: title, URL, score, author, comment count, and rank. X is an input parameter.",
  agentPrompt: "Go to https://news.ycombinator.com, click on the first post to load its content, go back to the list, and scroll down to trigger loading of additional posts.",
});
console.log(skill.id);
```
Skill creation takes ~30 seconds. You can also create skills visually from the [Cloud Dashboard](https://cloud.browser-use.com/skills) — record yourself performing the task, or let the agent demonstrate it.
## Execute a skill
```python Python
result = await client.skills.execute(
    skill.id,
    parameters={"X": 10},
)
print(result)
```
```typescript TypeScript
const result = await client.skills.execute(skill.id, {
  parameters: { X: 10 },
});
console.log(result);
```
## Refine with feedback
If execution is not quite right, iterate for free:
```python Python
await client.skills.refine(skill.id, feedback="Also extract the product description and available colors")
```
```typescript TypeScript
await client.skills.refine(skill.id, {
  feedback: "Also extract the product description and available colors",
});
```
## Marketplace
Browse and use community-created skills.
```python Python
skills = await client.marketplace.list()
my_skill = await client.marketplace.clone(skill_id)
result = await client.marketplace.execute(skill_id, parameters={...})
```
```typescript TypeScript
const skills = await client.marketplace.list();
const mySkill = await client.marketplace.clone(skillId);
const result = await client.marketplace.execute(skillId, { parameters: { ... } });
```
See [Pricing](https://browser-use.com/pricing) for skill costs.
# 1Password & 2FA
Source: https://docs.browser-use.com/cloud/guides/1password
## Setup
### 1. Create a dedicated vault
Create a new vault in 1Password for Browser Use. Add the credentials you want the agent to access (usernames, passwords, and 2FA/TOTP codes).
### 2. Create a service account token
1. Go to [1Password Developer Tools - Service Accounts](https://my.1password.eu/developer-tools/active/service-accounts)
2. Click **New Service Account**, name it "Browser Use Cloud"
3. Grant **read access** to the dedicated vault
4. Copy the generated token
### 3. Connect to Browser Use Cloud
1. Go to [Browser Use Cloud Settings - Secrets](https://cloud.browser-use.com/settings?tab=secrets)
2. Click **Create Integration**
3. Paste your service account token
## Run tasks with 1Password
```python Python
from browser_use_sdk import AsyncBrowserUse
client = AsyncBrowserUse()
result = await client.run(
    "Log into my Jira account and create a new ticket",
    op_vault_id="your-vault-id",
    allowed_domains=["*.atlassian.net"],
)
print(result.output)
```
```typescript TypeScript
import { BrowserUse } from "browser-use-sdk";
const client = new BrowserUse();
const result = await client.run(
  "Log into my Jira account and create a new ticket",
  {
    opVaultId: "your-vault-id",
    allowedDomains: ["*.atlassian.net"],
  },
);
console.log(result.output);
```
For SSO/OAuth redirects, include all required domains:
```python Python
result = await client.run(
    "Log into Jira and create a ticket for the Q4 release",
    op_vault_id="your-vault-id",
    allowed_domains=["*.atlassian.net", "*.okta.com"],
)
```
```typescript TypeScript
const result = await client.run(
  "Log into Jira and create a ticket for the Q4 release",
  {
    opVaultId: "your-vault-id",
    allowedDomains: ["*.atlassian.net", "*.okta.com"],
  },
);
```
## How it works
When the agent encounters a login form:
1. It identifies the service (e.g., Twitter, GitHub, LinkedIn)
2. Retrieves matching credentials from your 1Password vault
3. Fills in the username and password
4. If 2FA is required and a TOTP code is stored, it generates and enters the code automatically
The agent never sees your actual credentials. The username, password, and 2FA codes are filled in programmatically, which keeps your secrets hidden from the AI model.
# Secrets
Source: https://docs.browser-use.com/cloud/guides/secrets
Pass credentials to the agent scoped by domain. The agent uses them only on matching domains.
```python Python
from browser_use_sdk import AsyncBrowserUse
client = AsyncBrowserUse()
result = await client.run(
    "Log into GitHub and star the browser-use/browser-use repo",
    secrets={"github.com": "username:password123"},
    allowed_domains=["github.com"],
)
```
```typescript TypeScript
import { BrowserUse } from "browser-use-sdk";
const client = new BrowserUse();
const result = await client.run(
  "Log into GitHub and star the browser-use/browser-use repo",
  {
    secrets: { "github.com": "username:password123" },
    allowedDomains: ["github.com"],
  },
);
```
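The examples above inline the credentials for brevity. A common alternative is to read them from environment variables and assemble the same `username:password` string at runtime. A minimal sketch; `secret_from_env` and the variable names `GITHUB_USER` and `GITHUB_PASSWORD` are hypothetical.

```python
import os


def secret_from_env(user_var: str, password_var: str) -> str:
    """Build a 'username:password' secret string from two environment variables."""
    try:
        user = os.environ[user_var]
        password = os.environ[password_var]
    except KeyError as missing:
        raise RuntimeError(f"environment variable {missing} is not set") from None
    return f"{user}:{password}"
```

It could then be used as `secrets={"github.com": secret_from_env("GITHUB_USER", "GITHUB_PASSWORD")}`.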
Use `allowed_domains` to restrict the agent to specific domains. Supports wildcards: `example.com`, `*.example.com`.
For SSO/OAuth redirects, include all domains in the auth flow:
```python Python
result = await client.run(
    "Log into the company portal and download the Q4 report",
    secrets={
        "portal.example.com": "user@company.com:password123",
        "okta.com": "user@company.com:password123",
    },
    allowed_domains=["portal.example.com", "*.okta.com"],
)
```
```typescript TypeScript
const result = await client.run(
  "Log into the company portal and download the Q4 report",
  {
    secrets: {
      "portal.example.com": "user@company.com:password123",
      "okta.com": "user@company.com:password123",
    },
    allowedDomains: ["portal.example.com", "*.okta.com"],
  },
);
```
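To reason about which hosts an `allowed_domains` list such as `["portal.example.com", "*.okta.com"]` would admit, it can help to model the matching locally. The sketch below is an illustration only: the service's exact matching rules are not described here, and this version assumes that `*.example.com` covers subdomains but not `example.com` itself. `domain_allowed` is a hypothetical helper.

```python
def domain_allowed(host: str, patterns: list[str]) -> bool:
    """Return True if host matches any pattern in the allow-list."""
    host = host.lower().rstrip(".")
    for pattern in patterns:
        pattern = pattern.lower()
        if pattern.startswith("*."):
            # "*.example.com" matches any subdomain, but not "example.com" itself
            # (an assumption of this sketch).
            suffix = pattern[1:]
            if host.endswith(suffix) and len(host) > len(suffix):
                return True
        elif host == pattern:
            return True
    return False


allowed = ["portal.example.com", "*.okta.com"]
print(domain_allowed("portal.example.com", allowed))
print(domain_allowed("company.okta.com", allowed))
print(domain_allowed("evil-okta.com", allowed))
```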
# API Reference
Source: https://docs.browser-use.com/cloud/api-reference
## Authentication
All requests require an API key in the `X-Browser-Use-API-Key` header:
```
X-Browser-Use-API-Key: bu_your_key_here
```
Get a key at [cloud.browser-use.com/settings](https://cloud.browser-use.com/settings?tab=api-keys&new=1). Keys start with `bu_`.
## Base URL
```
https://api.browser-use.com/api/v3
```
## Quick example
```bash Create a session
curl -X POST https://api.browser-use.com/api/v3/sessions \
  -H "X-Browser-Use-API-Key: bu_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{"task": "Find the top 3 trending repos on GitHub today"}'
```
```bash Get session result (replace SESSION_ID)
curl https://api.browser-use.com/api/v3/sessions/SESSION_ID \
  -H "X-Browser-Use-API-Key: bu_your_key_here"
```
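The first curl call translates directly to Python's standard library. The sketch below only builds the request, using the URL, header, and body shown above, and does not send it. `build_create_session_request` is a hypothetical helper name.

```python
import json
import urllib.request

BASE_URL = "https://api.browser-use.com/api/v3"


def build_create_session_request(api_key: str, task: str) -> urllib.request.Request:
    """Build (but do not send) the POST /sessions request from the curl example."""
    body = json.dumps({"task": task}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/sessions",
        data=body,
        method="POST",
        headers={
            "X-Browser-Use-API-Key": api_key,
            "Content-Type": "application/json",
        },
    )


req = build_create_session_request(
    "bu_your_key_here", "Find the top 3 trending repos on GitHub today"
)
print(req.get_method(), req.full_url)
# Sending is left out here: urllib.request.urlopen(req) would perform the call.
```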
## Environment variable
Set the key once so SDKs pick it up automatically:
```bash
export BROWSER_USE_API_KEY=bu_your_key_here
```
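If you call the REST API yourself rather than through an SDK, you can read the same variable and fail early when it is missing. A minimal sketch; `load_api_key` is a hypothetical helper, and the prefix check relies only on the statement above that keys start with `bu_`.

```python
import os


def load_api_key(env_var: str = "BROWSER_USE_API_KEY") -> str:
    """Read the API key from the environment and sanity-check its prefix."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set")
    if not key.startswith("bu_"):
        raise RuntimeError(f"{env_var} does not start with the expected 'bu_' prefix")
    return key
```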
---
Prefer the SDK? See the [Agent docs](https://docs.browser-use.com/cloud/agent/quickstart) — the SDK has all API endpoints available as methods, including `client.browsers.create()`.
```bash Python
pip install browser-use-sdk
```
```bash TypeScript
npm install browser-use-sdk
```
# API key
Source: https://docs.browser-use.com/cloud/api-v2-overview
Get a key at [cloud.browser-use.com/settings](https://cloud.browser-use.com/settings?tab=api-keys&new=1), then:
```bash
export BROWSER_USE_API_KEY=your_key
```
Base URL: `https://api.browser-use.com/api/v2`
---
Prefer the SDK? See the [Agent (v2) docs](https://docs.browser-use.com/cloud/legacy/agent).
```bash Python
pip install browser-use-sdk
```
```bash TypeScript
npm install browser-use-sdk
```