💻 Browser Use
browser-use
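An AI agent controls the browser: it visits websites, summarizes the information it finds, and draws a conclusion.
It supports multiple models, such as gpt-4o and deepseek-r1.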
Installation
$ pip install browser-use
# install playwright
$ playwright install
Installing the package with pip failed with the following error:
(.venv) ➜ browser-use-demo pip install browser-use -i https://pypi.org/simple
ERROR: Could not find a version that satisfies the requirement browser-use (from versions: none)
ERROR: No matching distribution found for browser-use
# Checking PyPI shows that browser-use requires Python>=3.11, so upgrade Python accordingly and retry
# This installation step takes a while..
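Because the error above comes down to the interpreter version, a quick check before running pip can save a retry. The snippet below is only a sketch: the helper name `meets_minimum` is hypothetical, and the (3, 11) threshold is taken from the PyPI requirement mentioned above.

```python
import sys

# Minimum version taken from the PyPI note above (browser-use needs Python>=3.11).
MIN_PYTHON = (3, 11)


def meets_minimum(version=None, minimum=MIN_PYTHON):
    """Return True if `version` (major, minor, ...) is at least `minimum`.

    `meets_minimum` is a hypothetical helper used for illustration only.
    """
    if version is None:
        version = sys.version_info
    # Compare only (major, minor); patch level does not matter here.
    return tuple(version[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if meets_minimum():
        print(f"Python {current} is new enough for browser-use")
    else:
        print(f"Python {current} is too old; need >= 3.11")
```

Running this inside the virtual environment shows which interpreter the venv actually uses, which may differ from the system Python.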
Demo
API key: create a .env file and add OPENAI_API_KEY to it. For the API key itself, see: OpenAI API Key
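A missing key usually only surfaces once the agent makes its first model call, so it can help to fail early instead. Here is a minimal sketch using only the standard library; `require_env` is a hypothetical helper, not part of browser-use or python-dotenv, and it assumes the variable has already been loaded into the process environment (for example by `load_dotenv()`).

```python
import os


def require_env(name, environ=None):
    """Return the value of `name`, or raise a clear error if it is missing or empty.

    `require_env` is a hypothetical helper used for illustration only.
    """
    if environ is None:
        environ = os.environ
    value = environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it to .env or export it in the shell"
        )
    return value
```

Calling `require_env("OPENAI_API_KEY")` right after `load_dotenv()` would stop the script with a readable message before any browser is opened.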
The demo given in the README is as follows:
from langchain_openai import ChatOpenAI
from browser_use import Agent
import asyncio
from dotenv import load_dotenv

load_dotenv()

async def main():
    agent = Agent(
        task="Compare the price of gpt-4o and DeepSeek-V3",
        llm=ChatOpenAI(model="gpt-4o"),
    )
    await agent.run()

asyncio.run(main())
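The demo uses the gpt-4o model, but I don't have an OpenAI API key, so I switched to DeepSeek instead. (The script below is named after deepseek-r1, but the model id it actually passes is deepseek-chat.)
For the models that are supported, see: supported-models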
# agent-deepseek-r1.py
from langchain_openai import ChatOpenAI
from browser_use import Agent
from pydantic import SecretStr
import asyncio
# Initialize the model
api_key = "xxxxxxxxxxxxxxxxxxxxxxxxxxx"
llm=ChatOpenAI(base_url='https://api.deepseek.com/v1', model='deepseek-chat', api_key=SecretStr(api_key))
# Create agent with the model
async def main():
    agent = Agent(
        task="Can you compare the icicibank and hdfcbank fundamental from screener.in?",
        llm=llm,
        use_vision=False
    )
    await agent.run()

asyncio.run(main())
Run: python3 agent-deepseek-r1.py. Playwright opens a browser and starts executing the task. Here is a screenshot:
After running for a while, it finally prints the result, as follows:
As shown above, one complete task has finished. (The browser steps are a bit slow.. 😂)
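I then asked another question: Compare the differences of playwright and selenium?
The browser process looked like this:
The result is below, copied from the console. Judge it for yourself: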
...
...
INFO [agent] 📄 Result: Here are the key differences between Playwright and Selenium:
1. **Speed and Performance**: Playwright is often chosen for speed and offers significantly better performance than Selenium.
2. **Learning Curve**: Playwright has an easier learning curve and provides value faster compared to Selenium.
3. **Browser Support**: Playwright supports Chromium, Firefox, and WebKit browsers, while Selenium offers a wide variety of browser support including Chrome, Firefox, IE, Edge, Opera, and more.
4. **Debugging Capabilities**: Playwright has more advanced debugging capabilities compared to Selenium.
5. **Community and Ecosystem**: Selenium has a broader support and established ecosystem, while Playwright has limited community support due to its recent entry into the market.
6. **Setup and Parallel Execution**: Playwright has easier setup with built-in parallelism, whereas Selenium requires more setup, especially for parallel execution.
7. **Modern Features**: Playwright is a newer, open-source tool developed by Microsoft, while Selenium is an open-source tool that has been in the industry for a long time.
INFO [agent] ✅ Task completed
INFO [agent] ✅ Successfully
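Copying the answer out of the console by hand gets tedious after a few runs. If the console output is saved to a file, the result could be pulled out with a small script. This is only a sketch that assumes the log format shown above (a `📄 Result:` marker, followed by lines that continue until the next `INFO [agent]`-style record); `extract_result` is a hypothetical helper, not part of browser-use.

```python
import re

RESULT_MARKER = "📄 Result:"
# Assumption: a new log record starts with a level name and a bracketed logger name.
LOG_PREFIX = re.compile(r"^(DEBUG|INFO|WARNING|ERROR)\s+\[[^\]]+\]")


def extract_result(log_text):
    """Return the text of the first 'Result:' record, or None if there is none.

    `extract_result` is a hypothetical helper used for illustration only.
    """
    lines = []
    collecting = False
    for line in log_text.splitlines():
        if not collecting:
            if RESULT_MARKER in line:
                collecting = True
                # Keep only what follows the marker on the first line.
                lines.append(line.split(RESULT_MARKER, 1)[1].strip())
        elif LOG_PREFIX.match(line):
            # The next log record ends the multi-line result.
            break
        else:
            lines.append(line.strip())
    return "\n".join(lines) if collecting else None
```

For example, after running the script with its output redirected to a file, passing that file's contents to `extract_result` would return just the answer text without the surrounding log lines.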