Browser Use
https://browser-use.com/
Make Websites Accessible for Agents
We make websites accessible for AI agents by extracting all interactive elements, so agents can focus on what makes their beer taste better.
Powerful Browser Automation
Browser Use combines advanced AI capabilities with robust browser automation to make web interactions seamless for AI agents.
Vision + HTML Extraction
Combines visual understanding with HTML structure extraction for comprehensive web interaction.
Multi-tab Management
Automatically handles multiple browser tabs for complex workflows and parallel processing.
Element Tracking
Extracts the XPaths of clicked elements and repeats exact LLM actions for consistent automation.
Custom Actions
Add your own actions like saving to files, database operations, notifications, or human input handling.
Self-correcting
Intelligent error handling and automatic recovery for robust automation workflows.
Any LLM Support
Compatible with all LangChain LLMs including GPT-4, Claude 3, and Llama 2.
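To make the idea of "extracting all interactive elements" concrete, here is a minimal sketch using only the Python standard library. It is not Browser Use's actual implementation; the set of tags treated as interactive and the names `InteractiveElementCollector` and `extract_interactive` are assumptions made for illustration.

```python
from html.parser import HTMLParser

# Assumption: these tags, plus anything carrying an onclick attribute,
# are what we treat as "interactive" in this sketch.
INTERACTIVE_TAGS = {"a", "button", "input", "select", "textarea"}


class InteractiveElementCollector(HTMLParser):
    """Collects interactive elements and gives each one a numeric index."""

    def __init__(self):
        super().__init__()
        self.elements = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag in INTERACTIVE_TAGS or "onclick" in attr_map:
            self.elements.append(
                {"index": len(self.elements), "tag": tag, "attrs": attr_map}
            )


def extract_interactive(html: str) -> list:
    """Return the interactive elements found in an HTML string, in document order."""
    collector = InteractiveElementCollector()
    collector.feed(html)
    return collector.elements


page = '<div><a href="/docs">Docs</a><p>text</p><button id="go">Go</button><input name="q"></div>'
for element in extract_interactive(page):
    print(element["index"], element["tag"], element["attrs"])
```

An agent could then refer to elements by index ("click element 1") instead of reasoning over the full page markup.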
eval
https://github.com/browser-use/eval
WebVoyager evaluation for Browser Use
This repository is a fork of the original repo.
Evaluation runs
The file structure is the same as the original repo. The only difference is that run_browser_use.py is modified to add the Browser Use evaluation. We also changed the prompts to suit the Browser Use evaluation (very minimal changes: evaluate multiple images, not just one) and switched to LangChain.
We also keep a list of impossible tasks that can no longer be completed (completely outdated, can't be fixed by changing dates).
We moved the dates in some tasks into the future, because the original dates are now in the past and would make the task impossible (e.g. "Please find me a hotel on 2023-12-01 on booking.com", which is impossible since you can't search for a hotel in the past).
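The date adjustment described above could be automated along the following lines. This is a hypothetical sketch, not the script used in the repo (the source does not say how the tasks were edited); `shift_past_dates` and its "same month and day, next future year" rule are assumptions.

```python
import re
from datetime import date

ISO_DATE = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")


def _with_year(d: date, year: int) -> date:
    """Replace the year, falling back to Feb 28 when Feb 29 does not exist."""
    try:
        return d.replace(year=year)
    except ValueError:
        return d.replace(year=year, day=28)


def shift_past_dates(task: str, today: date) -> str:
    """Move every ISO date that is not in the future to its next future occurrence."""

    def bump(match):
        year, month, day = (int(group) for group in match.groups())
        original = date(year, month, day)
        if original > today:
            return match.group(0)  # already in the future, keep as written
        shifted = _with_year(original, today.year)
        if shifted <= today:
            shifted = _with_year(original, today.year + 1)
        return shifted.isoformat()

    return ISO_DATE.sub(bump, task)


task = "Please find me a hotel on 2023-12-01 on booking.com"
print(shift_past_dates(task, today=date(2025, 1, 20)))
```

Passing `today` explicitly keeps the function deterministic, which makes the rewritten task list reproducible between runs.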
We ran the evaluation on a machine with 16 GB of RAM and 15 concurrent tasks.
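One common way to cap an evaluation at 15 concurrent tasks is an `asyncio.Semaphore`. The sketch below only illustrates that pattern; `run_task` is a hypothetical stand-in for running one evaluation task, and the repo's own runner may be organized differently.

```python
import asyncio

MAX_CONCURRENT = 15  # the concurrency level quoted above


async def run_task(task_id: int) -> str:
    """Hypothetical stand-in for running one evaluation task."""
    await asyncio.sleep(0.01)
    return f"task-{task_id}: done"


async def run_all(task_ids, limit: int = MAX_CONCURRENT) -> list:
    """Run every task, with at most `limit` of them in flight at once."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(task_id):
        async with semaphore:
            return await run_task(task_id)

    # gather() returns results in the same order as the input ids.
    return await asyncio.gather(*(guarded(task_id) for task_id in task_ids))


results = asyncio.run(run_all(range(40)))
print(len(results), results[0])
```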
requirements.txt is missing browser-use on purpose, since we install it by building the package locally.
Manual correction of evaluations
The eval model is not good. That's why we added another success criterion, unknown, for cases where the eval model is not sure.
Most of the assessments are indeed correct, but some tasks were assessed wrongly, and the unknown ones ended up as either success or failed.
We manually reviewed the evaluations for the tasks that were either "unknown" or "failed" and corrected them. This is because the default WebVoyager evaluator is not good.
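The review step can be pictured as applying a table of manual overrides on top of the automatic verdicts. The sketch below is an assumption about how such bookkeeping might look; the task ids and the `apply_manual_review` helper are hypothetical and do not come from the repo.

```python
from collections import Counter

REVIEWABLE = {"unknown", "failed"}    # verdicts that were manually re-checked
FINAL_VERDICTS = {"success", "failed"}


def apply_manual_review(auto: dict, overrides: dict) -> dict:
    """Return final verdicts after applying manual corrections to automatic ones."""
    final = dict(auto)
    for task_id, verdict in overrides.items():
        if verdict not in FINAL_VERDICTS:
            raise ValueError(f"{task_id}: manual verdict must be success or failed")
        if auto.get(task_id) not in REVIEWABLE:
            raise ValueError(f"{task_id}: only unknown or failed tasks are reviewed")
        final[task_id] = verdict
    return final


# Hypothetical automatic verdicts and manual corrections.
auto_verdicts = {"task-1": "success", "task-2": "unknown", "task-3": "failed", "task-4": "unknown"}
manual = {"task-2": "success", "task-3": "success", "task-4": "failed"}

final_verdicts = apply_manual_review(auto_verdicts, manual)
counts = Counter(final_verdicts.values())
print(dict(counts), f"success rate: {counts['success'] / len(final_verdicts):.0%}")
```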
Costs
Running the whole dataset once costs around 250 USD with GPT-4o.
web-ui
https://github.com/browser-use/web-ui
This project builds upon the foundation of browser-use, which is designed to make websites accessible for AI agents.
We would like to officially thank WarmShao for his contribution to this project.
WebUI: Built on Gradio and supports most browser-use functionalities. This UI is designed to be user-friendly and enables easy interaction with the browser agent.
Expanded LLM Support: We've integrated support for various Large Language Models (LLMs), including Gemini, OpenAI, Azure OpenAI, Anthropic, DeepSeek, and Ollama. We plan to add support for even more models in the future.
Custom Browser Support: You can use your own browser with our tool, eliminating the need to re-login to sites or deal with other authentication challenges. This feature also supports high-definition screen recording.
Persistent Browser Sessions: You can choose to keep the browser window open between AI tasks, allowing you to see the complete history and state of AI interactions.
playwright
https://playwright.dev/python/docs/intro
Introduction
Playwright was created specifically to accommodate the needs of end-to-end testing. Playwright supports all modern rendering engines including Chromium, WebKit, and Firefox. Test on Windows, Linux, and macOS, locally or on CI, headless or headed with native mobile emulation.
The Playwright library can be used as a general purpose browser automation tool, providing a powerful set of APIs to automate web applications, for both sync and async Python.
This introduction describes the Playwright Pytest plugin, which is the recommended way to write end-to-end tests.
Playwright enables reliable end-to-end testing for modern web apps.
Any browser • Any platform • One API
Cross-browser. Playwright supports all modern rendering engines including Chromium, WebKit, and Firefox.
Cross-platform. Test on Windows, Linux, and macOS, locally or on CI, headless or headed.
Cross-language. Use the Playwright API in TypeScript, JavaScript, Python, .NET, Java.
Test Mobile Web. Native mobile emulation of Google Chrome for Android and Mobile Safari. The same rendering engine works on your Desktop and in the Cloud.
Resilient • No flaky tests
Auto-wait. Playwright waits for elements to be actionable prior to performing actions. It also has a rich set of introspection events. The combination of the two eliminates the need for artificial timeouts - the primary cause of flaky tests.
Web-first assertions. Playwright assertions are created specifically for the dynamic web. Checks are automatically retried until the necessary conditions are met.
Tracing. Configure test retry strategy, capture execution trace, videos, screenshots to eliminate flakes.
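The idea behind auto-waiting and retried assertions can be illustrated without a browser: poll a condition until it holds or a deadline passes, instead of sleeping for a fixed time. This is a conceptual sketch only, not Playwright's implementation; `expect_eventually` and `FakeElement` are hypothetical names.

```python
import time


def expect_eventually(condition, timeout: float = 5.0, interval: float = 0.1) -> None:
    """Re-check `condition` until it returns True, or fail once `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return
        if time.monotonic() >= deadline:
            raise AssertionError(f"condition not met within {timeout}s")
        time.sleep(interval)


class FakeElement:
    """Simulated element that becomes visible after a few checks."""

    def __init__(self, visible_after_polls: int):
        self.polls = 0
        self.visible_after_polls = visible_after_polls

    def is_visible(self) -> bool:
        self.polls += 1
        return self.polls > self.visible_after_polls


element = FakeElement(visible_after_polls=3)
expect_eventually(element.is_visible, timeout=2.0, interval=0.01)
print("visible after", element.polls, "polls")
```

The check returns as soon as the condition holds, so a fast page costs almost nothing, while a slow page still gets the full timeout.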
No trade-offs • No limits
Browsers run web content belonging to different origins in different processes. Playwright is aligned with the modern browsers architecture and runs tests out-of-process. This makes Playwright free of the typical in-process test runner limitations.
Multiple everything. Test scenarios that span multiple tabs, multiple origins and multiple users. Create scenarios with different contexts for different users and run them against your server, all in one test.
Trusted events. Hover elements, interact with dynamic controls, produce trusted events. Playwright uses real browser input pipeline indistinguishable from the real user.
Test frames, pierce Shadow DOM. Playwright selectors pierce shadow DOM and allow entering frames seamlessly.
Full isolation • Fast execution
Browser contexts. Playwright creates a browser context for each test. Browser context is equivalent to a brand new browser profile. This delivers full test isolation with zero overhead. Creating a new browser context only takes a handful of milliseconds.
Log in once. Save the authentication state of the context and reuse it in all the tests. This bypasses repetitive log-in operations in each test, yet delivers full isolation of independent tests.
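The "log in once" pattern maps onto the `storage_state` option of Playwright's Python API. The sketch below assumes Playwright and its Chromium build are installed; the helper names, the state-file handling, and the omitted login steps are assumptions, and a pytest-based suite would more likely wrap this in a fixture.

```python
from pathlib import Path


def context_options(state_file: Path) -> dict:
    """Return new_context() keyword arguments, reusing saved state if it exists."""
    if state_file.exists():
        return {"storage_state": str(state_file)}
    return {}


def visit_with_saved_state(url: str, state_file: Path) -> str:
    """Open `url` in a fresh browser context, then save cookies and local storage."""
    # Imported lazily so context_options() works without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        # Each context is isolated, like a brand new browser profile.
        context = browser.new_context(**context_options(state_file))
        page = context.new_page()
        page.goto(url)
        # Site-specific login steps would go here on the first run.
        title = page.title()
        context.storage_state(path=str(state_file))
        browser.close()
    return title
```

Calling `visit_with_saved_state("https://example.com", Path("state.json"))` twice would start the second run from the state saved by the first, while still using a separate context each time.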
Powerful Tooling
Codegen. Generate tests by recording your actions. Save them into any language.
Playwright inspector. Inspect page, generate selectors, step through the test execution, see click points, explore execution logs.
Trace Viewer. Capture all the information to investigate the test failure. Playwright trace contains test execution screencast, live DOM snapshots, action explorer, test source, and many more.
From: https://www.cnblogs.com/lightsong/p/18673521