Browser Use uses proprietary, private test sets that must never be committed to GitHub and must be fetched through an authorized API request.
Accessing these test sets requires an approved Browser Use account.
There are currently no publicly available test sets, but some may be released in the future.
First, navigate to https://browser-use.tools and log in with an authorized Browser Use account. Then click the “Account” button at the top right of the page, and click the “Cycle New Key” button on that page. Copy the resulting URL and secret key into your .env file. It should look like this:
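The exact variable names for the URL and secret key are not reproduced in this section, so the block below is not the expected .env content. It is a minimal sketch of a sanity check that the two values were actually copied in. The names `EVAL_TOOL_URL` and `EVAL_TOOL_SECRET_KEY` are hypothetical placeholders; substitute whatever names the Account page gives you.

```python
from pathlib import Path


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def missing_keys(env: dict[str, str], required: list[str]) -> list[str]:
    """Return the required keys that are absent or set to an empty value."""
    return [key for key in required if not env.get(key)]


# Hypothetical names (assumption): replace with the names from the Account page.
REQUIRED = ["EVAL_TOOL_URL", "EVAL_TOOL_SECRET_KEY"]

if __name__ == "__main__":
    env_path = Path(".env")
    env = parse_env(env_path.read_text()) if env_path.exists() else {}
    missing = missing_keys(env, REQUIRED)
    print("missing:", missing if missing else "none")
```

Because the secret key lives in .env, it is worth confirming that .env is listed in your .gitignore before committing anything.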
The evaluations webpage has a convenient GUI for generating these commands. To use it, navigate to https://browser-use.tools/dashboard. Then click the “New Eval Run” button on the left panel. This will open an interface with selectors, inputs, sliders, and switches. Enter your desired configuration into the interface and copy the resulting Python command at the bottom. Then run this command as before.