
Browser Use -- playwright

Published: 2025-01-15 17:44:12
Tags: web, Use, playwright, use, Playwright, test, Browser, browser

Browser Use

https://browser-use.com/

Make Websites Accessible for Agents

We make websites accessible for AI agents by extracting all interactive elements, so agents can focus on what makes their beer taste better.

 

Powerful Browser Automation

Browser Use combines advanced AI capabilities with robust browser automation to make web interactions seamless for AI agents.

 

Vision + HTML Extraction

Combines visual understanding with HTML structure extraction for comprehensive web interaction.  

Multi-tab Management

Automatically handles multiple browser tabs for complex workflows and parallel processing.  

Element Tracking

Extracts the XPaths of clicked elements and repeats the exact LLM actions for consistent automation.  
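
To make the extraction and element-tracking ideas above more concrete, here is a minimal sketch using only the Python standard library. It is not Browser Use's actual implementation: the set of tags treated as interactive, the class name `InteractiveElementExtractor`, and the positional XPath format are all assumptions chosen for illustration, and the parser assumes well-formed markup.

```python
from html.parser import HTMLParser

# Tags treated as interactive here; this set is an assumption for illustration.
INTERACTIVE_TAGS = {"a", "button", "input", "select", "textarea"}
# Void elements never get a closing tag, so they must not stay on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class InteractiveElementExtractor(HTMLParser):
    """Collect interactive elements together with a positional XPath."""

    def __init__(self):
        super().__init__()
        self.path = []      # XPath steps of the currently open elements
        self.counts = [{}]  # per open element: child tag -> count seen so far
        self.elements = []  # tuples of (index, tag, attributes, xpath)

    def handle_starttag(self, tag, attrs):
        siblings = self.counts[-1]
        siblings[tag] = siblings.get(tag, 0) + 1
        step = f"{tag}[{siblings[tag]}]"
        xpath = "/" + "/".join(self.path + [step])
        if tag in INTERACTIVE_TAGS:
            self.elements.append((len(self.elements), tag, dict(attrs), xpath))
        if tag not in VOID_TAGS:
            self.path.append(step)
            self.counts.append({})

    def handle_endtag(self, tag):
        # Assumes the closing tag matches the innermost open element.
        if self.path and self.path[-1].split("[")[0] == tag:
            self.path.pop()
            self.counts.pop()


def extract_interactive(html):
    """Return every interactive element found in the given HTML string."""
    parser = InteractiveElementExtractor()
    parser.feed(html)
    return parser.elements


SAMPLE = (
    '<html><body><div><a href="/docs">Docs</a><button>Go</button></div>'
    '<div><input name="q"><a href="/x">X</a></div></body></html>'
)

for index, tag, attrs, xpath in extract_interactive(SAMPLE):
    print(index, tag, attrs, xpath)
```

An agent that receives such an indexed list can refer to an element by its index, and the stored XPath gives a way to find the same element again when a recorded action is replayed.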

Custom Actions

Add your own actions like saving to files, database operations, notifications, or human input handling.  

Self-correcting

Intelligent error handling and automatic recovery for robust automation workflows.  

Any LLM Support

Compatible with all LangChain LLMs including GPT-4, Claude 3, and Llama 2.
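
As a rough illustration of the LangChain compatibility described above, the sketch below follows the quick-start pattern from the browser-use README around the time of this post: an `Agent` built from a task string and a LangChain chat model, then `await agent.run()`. It assumes the `browser-use` and `langchain-openai` packages are installed and an OpenAI API key is configured. The function name `run_task`, the model name, and the task text are illustrative only, and the API may differ in later releases.

```python
import asyncio


async def run_task(task, model="gpt-4o"):
    """Run one natural-language task with a browser-use agent."""
    # Third-party imports are kept local so this module loads even where the
    # packages are not installed.
    from browser_use import Agent
    from langchain_openai import ChatOpenAI

    agent = Agent(task=task, llm=ChatOpenAI(model=model))
    return await agent.run()


# Example call (starts a real browser and uses the LLM API, so it is left
# commented out here):
# asyncio.run(run_task("Find the top post on Hacker News"))
```

Under this pattern, switching providers would presumably amount to passing a different LangChain chat model as `llm`.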

 

eval

https://github.com/browser-use/eval

WebVoyager evaluation for Browser Use

This repository is a fork of the original WebVoyager repo.

Evaluation runs

The file structure is the same as in the original repo. The only differences are that run_browser_use.py is modified to add the Browser Use evaluation, the prompts were adapted to the Browser Use evaluation (VERY minimal changes: evaluate multiple images, not just one), and the code was switched to LangChain.

We also keep a list of tasks that are no longer possible (completely outdated, and not fixable by changing dates).

We moved the dates in some tasks from the past into the future, because outdated dates would make the task impossible (e.g. "Please find me a hotel on 2023-12-01 on booking.com" cannot be completed, since you can't search for a hotel in the past).
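
The repository does not say how the dates were changed, and they may well have been edited by hand. Purely as an illustration of the idea, a small helper could rewrite past ISO dates in a task description; the function name `shift_past_dates` and the rule "same month and day, next year" are assumptions, not the authors' method.

```python
import re
from datetime import date

DATE_PATTERN = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")


def shift_past_dates(task, today, years_ahead=1):
    """Move every past ISO date in `task` into the future, keeping month and day."""

    def replace(match):
        year, month, day = (int(group) for group in match.groups())
        try:
            original = date(year, month, day)
        except ValueError:
            return match.group(0)  # not a real calendar date, leave untouched
        if original >= today:
            return match.group(0)  # already in the future
        new_year = today.year + years_ahead
        try:
            shifted = original.replace(year=new_year)
        except ValueError:
            shifted = date(new_year, 2, 28)  # Feb 29 in a non-leap target year
        return shifted.isoformat()

    return DATE_PATTERN.sub(replace, task)


task = "Please find me a hotel on 2023-12-01 on booking.com"
print(shift_past_dates(task, today=date(2025, 1, 15)))
# Please find me a hotel on 2026-12-01 on booking.com
```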

We ran the evaluation on 16 GB of RAM with 15 concurrent tasks.
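
One common way to cap concurrency at a fixed number such as 15 is an `asyncio.Semaphore`. The sketch below shows that technique in isolation; it is an assumption about how such a limit could be enforced, not the evaluation script's actual code, and `run_all` and `fake_worker` are hypothetical names.

```python
import asyncio


async def run_all(tasks, worker, max_concurrent=15):
    """Run `worker(task)` for every task with at most `max_concurrent` in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def bounded(task):
        async with semaphore:
            return await worker(task)

    # gather preserves the order of the inputs in its result list.
    return await asyncio.gather(*(bounded(task) for task in tasks))


async def fake_worker(task):
    """Stand-in for one evaluation run; a real worker would drive a browser."""
    await asyncio.sleep(0.01)
    return task * 2


results = asyncio.run(run_all(list(range(40)), fake_worker))
print(len(results))
```

Each browser session holds memory for as long as it runs, so the cap is what keeps a run like this within a fixed RAM budget.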

requirements.txt is missing browser-use on purpose since we install it by building the package locally.

Manual correction of evaluations

The eval model is unreliable, so we added a third verdict, "unknown", for cases where the eval model is not sure.

Most tasks were assessed correctly, but some received a wrong assessment, and every "unknown" verdict had to be resolved to either success or failed.

We manually reviewed and corrected the evaluations of all tasks marked "unknown" or "failed", because the default WebVoyager evaluator is unreliable.
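
The review workflow described above can be sketched as a three-valued verdict plus a manual override step. This is only an illustration under assumed names (`Verdict`, `needs_review`, `apply_review`) and made-up task ids; the numbers are not results from the evaluation.

```python
from collections import Counter
from enum import Enum


class Verdict(Enum):
    SUCCESS = "success"
    FAILED = "failed"
    UNKNOWN = "unknown"


def needs_review(auto_verdicts):
    """Task ids whose automatic verdict is 'unknown' or 'failed'."""
    return sorted(t for t, v in auto_verdicts.items() if v is not Verdict.SUCCESS)


def apply_review(auto_verdicts, manual_verdicts):
    """Overlay manual verdicts on the automatic ones; 'unknown' must be resolved."""
    final = {**auto_verdicts, **manual_verdicts}
    unresolved = [t for t, v in final.items() if v is Verdict.UNKNOWN]
    if unresolved:
        raise ValueError(f"still unknown after review: {unresolved}")
    return final


def success_rate(final_verdicts):
    """Fraction of tasks whose final verdict is success."""
    counts = Counter(final_verdicts.values())
    return counts[Verdict.SUCCESS] / len(final_verdicts)


# Hypothetical automatic verdicts for four tasks.
auto = {
    "task-1": Verdict.SUCCESS,
    "task-2": Verdict.UNKNOWN,
    "task-3": Verdict.FAILED,
    "task-4": Verdict.FAILED,
}
# A reviewer resolves task-2 and overturns task-3; task-4 stays failed.
manual = {"task-2": Verdict.SUCCESS, "task-3": Verdict.SUCCESS}

final = apply_review(auto, manual)
print(needs_review(auto), success_rate(final))
```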

Costs

Running the whole dataset once costs around 250 USD with GPT-4o.

 

web-ui

https://github.com/browser-use/web-ui

This project builds upon browser-use, which is designed to make websites accessible for AI agents.

We would like to officially thank WarmShao for his contribution to this project.

WebUI: Built on Gradio, it supports most browser-use functionality. The UI is designed to be user-friendly and makes it easy to interact with the browser agent.

Expanded LLM Support: We've integrated support for various Large Language Models (LLMs), including Gemini, OpenAI, Azure OpenAI, Anthropic, DeepSeek, and Ollama, and we plan to add support for more models in the future.

Custom Browser Support: You can use your own browser with our tool, eliminating the need to re-login to sites or deal with other authentication challenges. This feature also supports high-definition screen recording.

Persistent Browser Sessions: You can choose to keep the browser window open between AI tasks, allowing you to see the complete history and state of AI interactions.

 

playwright

https://playwright.dev/python/docs/intro

Introduction

Playwright was created specifically to accommodate the needs of end-to-end testing. Playwright supports all modern rendering engines including Chromium, WebKit, and Firefox. Test on Windows, Linux, and macOS, locally or on CI, headless or headed with native mobile emulation.

The Playwright library can be used as a general purpose browser automation tool, providing a powerful set of APIs to automate web applications, for both sync and async Python.

This introduction describes the Playwright Pytest plugin, which is the recommended way to write end-to-end tests.
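
Outside the Pytest plugin, the library can also be driven directly. The sketch below uses the sync API (`sync_playwright`, `launch`, `new_page`, `goto`, `title`) to open a page and read its title. It assumes Playwright and its browsers are installed (`pip install playwright` followed by `playwright install`); the helper name `page_title` is hypothetical.

```python
def page_title(url, browser_name="chromium", headless=True):
    """Open `url` in the chosen engine and return the page title."""
    # Imported here so the module can be loaded without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        # p.chromium, p.firefox and p.webkit expose the three supported engines.
        browser_type = getattr(p, browser_name)
        browser = browser_type.launch(headless=headless)
        try:
            page = browser.new_page()
            page.goto(url)
            return page.title()
        finally:
            browser.close()


# Example call (launches a real browser, so it is left commented out here):
# print(page_title("https://playwright.dev/python/docs/intro"))
```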

 

Playwright enables reliable end-to-end testing for modern web apps.

Any browser • Any platform • One API

Cross-browser. Playwright supports all modern rendering engines including Chromium, WebKit, and Firefox.

Cross-platform. Test on Windows, Linux, and macOS, locally or on CI, headless or headed.

Cross-language. Use the Playwright API in TypeScript, JavaScript, Python, .NET, Java.

Test Mobile Web. Native mobile emulation of Google Chrome for Android and Mobile Safari. The same rendering engine works on your Desktop and in the Cloud.

   

Resilient • No flaky tests

Auto-wait. Playwright waits for elements to be actionable prior to performing actions. It also has a rich set of introspection events. The combination of the two eliminates the need for artificial timeouts - the primary cause of flaky tests.

Web-first assertions. Playwright assertions are created specifically for the dynamic web. Checks are automatically retried until the necessary conditions are met.

Tracing. Configure test retry strategy, capture execution trace, videos, screenshots to eliminate flakes.
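
The retry behavior behind auto-waiting and web-first assertions can be illustrated without a browser: rather than sleeping for a fixed time and checking once, the check is polled until it passes or a deadline expires. The helper `expect_eventually` below is a hypothetical stand-in for that idea, not a Playwright API; in Playwright itself the equivalent retrying is built into `expect(...)` assertions.

```python
import time


def expect_eventually(condition, timeout=5.0, interval=0.1):
    """Poll `condition` until it returns True or `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return
        if time.monotonic() >= deadline:
            raise AssertionError(f"condition not met within {timeout} s")
        time.sleep(interval)


# Simulate an element that only becomes ready on the third check.
checks = {"count": 0}


def element_ready():
    checks["count"] += 1
    return checks["count"] >= 3


expect_eventually(element_ready, timeout=1.0, interval=0.01)
print(checks["count"])
```

Because the check returns as soon as the condition holds, a fast page is not slowed down, and a slow page gets the full timeout instead of failing against an arbitrary fixed delay.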

No trade-offs • No limits

Browsers run web content belonging to different origins in different processes. Playwright is aligned with the architecture of modern browsers and runs tests out-of-process. This makes Playwright free of the typical in-process test runner limitations.

Multiple everything. Test scenarios that span multiple tabs, multiple origins and multiple users. Create scenarios with different contexts for different users and run them against your server, all in one test.

Trusted events. Hover elements, interact with dynamic controls, produce trusted events. Playwright uses the real browser input pipeline, which is indistinguishable from a real user.

Test frames, pierce Shadow DOM. Playwright selectors pierce shadow DOM and allow entering frames seamlessly.

   

Full isolation • Fast execution

Browser contexts. Playwright creates a browser context for each test. Browser context is equivalent to a brand new browser profile. This delivers full test isolation with zero overhead. Creating a new browser context only takes a handful of milliseconds.

Log in once. Save the authentication state of the context and reuse it in all the tests. This bypasses repetitive log-in operations in each test, yet delivers full isolation of independent tests.
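
The "log in once" pattern above maps onto two library calls: `context.storage_state(path=...)` saves cookies and local storage to a file, and `browser.new_context(storage_state=...)` starts a fresh, isolated context from that file. The sketch below assumes Playwright is installed; the function names, the `state.json` path, and the placeholder for the login steps are illustrative only.

```python
def save_login_state(login_url, state_path="state.json"):
    """Log in once and persist the authenticated state to `state_path`."""
    # Imported here so the module can be loaded without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        context = browser.new_context()
        page = context.new_page()
        page.goto(login_url)
        # Site-specific login steps (filling in the form, submitting) go here.
        context.storage_state(path=state_path)
        browser.close()


def open_authenticated_page(url, state_path="state.json"):
    """Open `url` in a new context that starts from the saved state."""
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        # Each call gets its own context, so runs stay isolated from each other.
        context = browser.new_context(storage_state=state_path)
        page = context.new_page()
        page.goto(url)
        title = page.title()
        browser.close()
        return title
```

Every test still runs in its own context, so isolation is kept while the login itself happens only once.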

Powerful Tooling

Codegen. Generate tests by recording your actions. Save them into any language.

Playwright inspector. Inspect page, generate selectors, step through the test execution, see click points, explore execution logs.

Trace Viewer. Capture all the information to investigate the test failure. Playwright trace contains test execution screencast, live DOM snapshots, action explorer, test source, and many more.

 

From: https://www.cnblogs.com/lightsong/p/18673521
