docs: update extension related docs
@@ -22,6 +22,7 @@ npm start # Start website dev server
 npm run build # Build all packages
 npm run build:libs # Build all libraries
 npm run lint # ESLint with TypeScript strict rules
+npm run zip -w @page-agent/ext # Zip the extension package
 ```
 
 ## Architecture
@@ -36,7 +37,7 @@ packages/
 ├── page-agent/ # npm: "page-agent" entry class (with UI + controller + demo builds)
 ├── website/ # @page-agent/website (private)
 ├── llms/ # @page-agent/llms
-├── extension/ # 🚧 WIP: Browser extension (WXT + React)
+├── extension/ # Browser extension (WXT + React)
 ├── page-controller/ # @page-agent/page-controller
 └── ui/ # @page-agent/ui
 ```
@@ -20,10 +20,11 @@ Thank you for your interest in contributing to Page-Agent! We welcome contributi
 
 ### Project Structure
 
-This is a **monorepo** with npm workspaces containing **3 main packages**:
+This is a **monorepo** with npm workspaces containing **4 main packages**:
 
 - **Page Agent** (`packages/page-agent/`) - Main entry with built-in UI Panel, published as `page-agent` on npm
 - **Core** (`packages/core/`) - Core agent logic without UI (npm: `@page-agent/core`)
+- **Extension** (`packages/extension/`) - Chrome extension for multi-page tasks and browser-level automation
 - **Website** (`packages/website/`) - React documentation and landing page. Also as demo and test page for the core lib. private package `@page-agent/website`
 
 We use a simplified monorepo solution with `native npm-workspace + ts reference + vite alias`. No fancy tooling. Hoisting is required.
@@ -145,6 +146,16 @@ If your lame AI assistant does not support [AGENTS.md](https://agents.md/). Add
 npm start
 ```
 
+### Extension Development
+
+```bash
+npm run dev -w @page-agent/ext
+npm run zip -w @page-agent/ext
+```
+
+- Load extension in Chrome via `chrome://extensions` -> **Load unpacked**
+- Use `packages/extension/docs/extension_api.md` (EN) or `packages/extension/docs/extension_api_zh.md` (ZH) for API integration details
+
 ### Testing on Other Websites
 
 - Start and serve a local `iife` script
@@ -193,14 +204,6 @@ By contributing to this project, you agree that your contributions will be licen
 
 > You may need to sign a github CLA before you create a PR.
 
-### Browser-Use Attribution
-
-Parts of this project are derived from the [browser-use](https://github.com/browser-use/browser-use) project (MIT License). When contributing to DOM-related functionality:
-
-- Maintain existing attribution comments
-- Follow similar patterns for consistency
-- Credit browser-use for derived concepts
-
 ## 💬 Questions?
 
 - Open a GitHub issue for technical questions
47
README-zh.md
@@ -1,4 +1,4 @@
-# PageAgent 🤖🪄
+# Page Agent
 
 <picture>
 <source media="(prefers-color-scheme: dark)" srcset="https://img.alicdn.com/imgextra/i4/O1CN01qKig1P1FnhpFKNdi6_!!6000000000532-2-tps-1280-256.png">
@@ -19,18 +19,16 @@
 
 ## ✨ Features
 
 - **🎯 轻松集成**
-- 无需 Python,无需无头浏览器,无需浏览器插件。纯页面内脚本。
-- **🔐 端侧运行**
-- **🧠 HTML 脱水**
-- **💬 自然语言接口**
-- **🎨 HITL 交互界面**
-
-以及 😉
-
-- **🧪 实验性的 Chrome 扩展,支持跨页面控制** - `packages/extension`
+- 无需 `浏览器插件` / `Python` / `无头浏览器`。
+- 纯页面内 JavaScript,一切都在你的网页中完成。
+- The best tool for your agent to control web pages.
+- **📖 基于文本的 DOM 操作**
+- 无需截图,无需 OCR 或多模态模型。
+- 无需特殊权限。
+- **🧠 用你自己的 LLM**
+- **🎨 精美 UI,支持人机协同**
+- **🐙 可选的 [Chrome 扩展](https://alibaba.github.io/page-agent/#/docs/features/chrome-extension),支持跨页面任务。**
 
-👉 [**🗺️ Roadmap**](https://github.com/alibaba/page-agent/issues/96)
-
 ## 🚀 快速开始
 
@@ -39,20 +37,16 @@
 通过我们免费的 Demo LLM 快速体验 PageAgent:
 
 ```html
-<script
-src="https://registry.npmmirror.com/page-agent/1.2.0/files/dist/iife/page-agent.demo.js"
-crossorigin="true"
-></script>
+<script src="{URL}" crossorigin="true"></script>
 ```
 
-> - **⚠️ 仅用于技术评估。** Demo LLM 有速率和使用限制,可能随时变更。
-> - **🌷 建议使用自己的 LLM API。**
-
 | Mirrors | URL |
 | ------- | ---------------------------------------------------------------------------------- |
 | Global | https://cdn.jsdelivr.net/npm/page-agent@1.2.0/dist/iife/page-agent.demo.js |
 | China | https://registry.npmmirror.com/page-agent/1.2.0/files/dist/iife/page-agent.demo.js |
 
+> **⚠️ 仅用于技术评估。** Demo LLM 有速率和使用限制,速度较慢,可能随时变更。
+
 ### NPM 安装
 
 ```bash
@@ -72,7 +66,7 @@ const agent = new PageAgent({
 await agent.execute('点击登录按钮')
 ```
 
-适用于无法使用 NPM 的环境,我们也提供了 IIFE 构建的 CDN 方式。[@see CDN Usage](https://alibaba.github.io/page-agent/#/docs/integration/cdn-setup)
+更多编程用法,请参阅 [📖 文档](https://alibaba.github.io/page-agent/#/docs/introduction/overview)。
 
 ## 🏗️ 架构设计
 
@@ -80,12 +74,13 @@ PageAgent adopts a simplified monorepo structure:
 
 ```
 packages/
-├── core/ # ** Core agent logic without UI(npm: @page-agent/core) **
-├── page-agent/ # Exported agent and demo(npm: page-agent)
+├── core/ # ** Core agent logic (npm: @page-agent/core) **
 ├── llms/ # LLM 客户端 (npm: @page-agent/llms)
-├── page-controller/ # DOM 操作 & 蒙层 & 模拟鼠标 (npm: @page-agent/page-controller)
-├── ui/ # 面板 & i18n (npm: @page-agent/ui)
-└── website/ # 文档站点
+├── page-controller/ # DOM 操作 (npm: @page-agent/page-controller)
+├── ui/ # 面板 UI (npm: @page-agent/ui)
+├── page-agent/ # 入口类 & iife 包 (npm: page-agent)
+├── extension/ # Chrome 扩展,支持跨页面任务
+└── website/ # 网站 & 文档站点
 ```
 
 ## 🤝 贡献
47
README.md
@@ -1,4 +1,4 @@
-# PageAgent 🤖🪄
+# Page Agent
 
 <picture>
 <source media="(prefers-color-scheme: dark)" srcset="https://img.alicdn.com/imgextra/i4/O1CN01qKig1P1FnhpFKNdi6_!!6000000000532-2-tps-1280-256.png">
@@ -19,18 +19,16 @@ The GUI Agent Living in Your Webpage. Control web interfaces with natural langua
 
 ## ✨ Features
 
-- **🎯 Easy Integration**
-- No python. No headless browser. No browser extension. Just in-page scripts.
-- **🔐 Client-Side Processing**
-- **🧠 DOM Extraction**
-- **💬 Natural Language Interface**
-- **🎨 UI with Human in the loop**
-
-And 😉
-
-- **🧪 `cross-page` control with an experimental chrome extension** - `packages/extension`
+- **🎯 Easy integration**
+- No need for `browser extension` / `python` / `headless browser`.
+- Just in-page javascript. Everything happens in your web page.
+- The best tool for your agent to control web pages.
+- **📖 Text-based DOM manipulation**
+- No screenshots. No OCR or multi-modal LLMs needed.
+- No special permissions required.
+- **🧠 Bring your own LLMs**
+- **🎨 Pretty UI with human-in-the-loop**
+- **🐙 Optional [chrome extension](https://alibaba.github.io/page-agent/#/docs/features/chrome-extension) for multi-page tasks.**
 
-👉 [**🗺️ Roadmap**](https://github.com/alibaba/page-agent/issues/96)
-
 ## 🚀 Quick Start
 
@@ -39,20 +37,16 @@ And 😉
 Fastest way to try PageAgent with our free Demo LLM:
 
 ```html
-<script
-src="https://cdn.jsdelivr.net/npm/page-agent@1.2.0/dist/iife/page-agent.demo.js"
-crossorigin="true"
-></script>
+<script src="{URL}" crossorigin="true"></script>
 ```
 
-> - **⚠️ For technical evaluation only.** Demo LLM has rate limits and usage restrictions. May change without notice.
-> - **🌷 Bring your own LLM API.**
-
 | Mirrors | URL |
 | ------- | ---------------------------------------------------------------------------------- |
 | Global | https://cdn.jsdelivr.net/npm/page-agent@1.2.0/dist/iife/page-agent.demo.js |
 | China | https://registry.npmmirror.com/page-agent/1.2.0/files/dist/iife/page-agent.demo.js |
 
+> **⚠️ For technical evaluation only.** Demo LLM has rate limits and usage restrictions. Slow. May change without notice.
+
 ### NPM Installation
 
 ```bash
@@ -72,18 +66,21 @@ const agent = new PageAgent({
 await agent.execute('Click the login button')
 ```
 
+For more programmatic usage, see [📖 Documentations](https://alibaba.github.io/page-agent/#/docs/introduction/overview).
+
 ## 🏗️ Structure
 
 PageAgent adopts a simplified monorepo structure:
 
 ```
 packages/
-├── core/ # ** Core agent logic without UI(npm: @page-agent/core) **
-├── page-agent/ # Exported agent and demo(npm: page-agent)
+├── core/ # ** Core agent logic (npm: @page-agent/core) **
 ├── llms/ # LLM client (npm: @page-agent/llms)
-├── page-controller/ # DOM operations & Visual Mask (npm: @page-agent/page-controller)
-├── ui/ # Panel & i18n (npm: @page-agent/ui)
-└── website/ # Demo & Documentation site
+├── page-controller/ # DOM operations (npm: @page-agent/page-controller)
+├── ui/ # Panel UI (npm: @page-agent/ui)
+├── page-agent/ # Entry class and iife builds(npm: page-agent)
+├── extension/ # Chrome extension for multi-page tasks
+└── website/ # Website & Documentation site
 ```
 
 ## 🤝 Contributing
@@ -1,12 +1,18 @@
 # Page Agent Extension API
 
-This document describes how to integrate the Page Agent browser extension into your web application.
+Integrate the Page Agent extension into your web app and trigger multi-page browser tasks from page JavaScript.
 
 ## Installation
 
 ### 1. Install the browser extension
 
-Install the Page Agent extension from the Chrome Web Store.
+Primary channel:
+
+- Chrome Web Store: https://chromewebstore.google.com/detail/page-agent-ext/akldabonmimlicnjlflnapfeklbfemhj
+
+Latest updates are often published earlier on:
+
+- GitHub Releases: https://github.com/alibaba/page-agent/releases
 
 ### 2. Install type definitions (recommended)
 
@@ -14,11 +20,19 @@ Install the Page Agent extension from the Chrome Web Store.
 npm install @page-agent/core --save-dev
 ```
 
-### 3. Set up authentication
+### 3. Authorization (Token)
 
-The extension only injects APIs when it detects a valid token in `localStorage`.
+The token allows your page JS to call the extension API (`window.PAGE_AGENT_EXT`) and execute multi-page tasks.
 
-1. Open the extension's side panel to get your authorization token
+Why token-based access is required:
+
+- The extension has broad browser permissions (page access, navigation, multi-tab control).
+- If abused, it can harm user privacy and security.
+- Users must explicitly provide the token only to applications they trust.
+
+Setup:
+
+1. Open the extension side panel and copy your auth token.
 2. Set the token in your page:
 
 ```typescript
@@ -60,36 +74,36 @@ if (await waitForExtension()) {
 
 ## Global API
 
-The extension injects the following APIs into the `window` object:
+After token match, the extension injects APIs into `window`.
 
 ### `window.PAGE_AGENT_EXT_VERSION`
 
-Extension version string (e.g., `"1.0.0"`). This is exposed separately to allow version checking before accessing the main API object.
+Extension version string (for capability checks before using the main API).
 
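The capability check that `PAGE_AGENT_EXT_VERSION` is meant for could look like the sketch below. It assumes the version is a dotted numeric string such as `"1.0.0"` (the form quoted in the earlier wording of this section). `ExtGlobals`, `compareVersions`, `isExtensionUsable`, and the minimum version passed in are hypothetical names and values chosen for illustration, not part of the extension API.

```typescript
// Hypothetical shape of the two globals described in this document.
// Taking it as a parameter keeps the sketch runnable outside a browser.
interface ExtGlobals {
  PAGE_AGENT_EXT_VERSION?: string
  PAGE_AGENT_EXT?: unknown
}

// Compare two dotted numeric versions (assumption: they look like "1.0.0").
// Returns a negative number, zero, or a positive number, like a sort comparator.
function compareVersions(a: string, b: string): number {
  const pa = a.split('.').map((part) => Number.parseInt(part, 10) || 0)
  const pb = b.split('.').map((part) => Number.parseInt(part, 10) || 0)
  const length = Math.max(pa.length, pb.length)
  for (let i = 0; i < length; i++) {
    // Missing segments count as 0, so "1.0" equals "1.0.0".
    const diff = (pa[i] ?? 0) - (pb[i] ?? 0)
    if (diff !== 0) return diff
  }
  return 0
}

// True only when the API object has been injected and the version is at least `minVersion`.
function isExtensionUsable(globals: ExtGlobals, minVersion: string): boolean {
  const version = globals.PAGE_AGENT_EXT_VERSION
  if (!version || !globals.PAGE_AGENT_EXT) return false
  return compareVersions(version, minVersion) >= 0
}
```

In a page this would be called with the real global, for example `isExtensionUsable(window, '1.0.0')`, before touching `window.PAGE_AGENT_EXT`.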
 ### `window.PAGE_AGENT_EXT`
 
-Main API namespace object containing:
+Main namespace object.
 
 #### `PAGE_AGENT_EXT.execute(task, config)`
 
-Execute an agent task.
+Execute one agent task.
 
-**Parameters:**
+Parameters:
 
 | Name | Type | Required | Description |
-|------|------|----------|-------------|
+| ---- | ---- | -------- | ----------- |
 | `task` | `string` | Yes | Task description |
-| `config` | `ExecuteConfig` | Yes | Execution configuration (LLM settings, options, and event callbacks) |
+| `config` | `ExecuteConfig` | Yes | LLM settings, options, and callbacks |
 
-**Returns:** `Promise<ExecutionResult>`
+Returns: `Promise<ExecutionResult>`
 
 #### `PAGE_AGENT_EXT.dispose()`
 
-Stop and destroy the current running agent.
+Stop the current task.
 
 ## Types
 
-Install `@page-agent/core` for full type definitions:
+Install `@page-agent/core` for complete types:
 
 ```typescript
 import type {
@@ -104,10 +118,7 @@ export interface ExecuteConfig {
 apiKey: string
 model: string
 
-/**
-* Whether to include the initial tab (that holds this main world script) in the task.
-* @default true
-*/
+// Include the initial tab where page JS starts. Default: true.
 includeInitialTab?: boolean
 
 onStatusChange?: (status: AgentStatus) => void
@@ -119,20 +130,13 @@ export interface ExecuteConfig {
 export type Execute = (task: string, config: ExecuteConfig) => Promise<ExecutionResult>
 ```
 
-### AgentStatus
+`AgentStatus`
 
 ```typescript
 type AgentStatus = 'idle' | 'running' | 'completed' | 'error'
 ```
 
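A small lookup can keep the meaning of each `AgentStatus` value next to the type. The descriptions below are taken from the status table that this commit removes; `STATUS_DESCRIPTIONS` and `describeStatus` are hypothetical names used only for this sketch.

```typescript
type AgentStatus = 'idle' | 'running' | 'completed' | 'error'

// Record<AgentStatus, string> makes the compiler flag any status left without a description.
const STATUS_DESCRIPTIONS: Record<AgentStatus, string> = {
  idle: 'Agent is idle, ready to execute',
  running: 'Agent is executing a task',
  completed: 'Task completed successfully',
  error: 'Task failed with an error',
}

// Hypothetical helper, e.g. for use inside an onStatusChange callback.
function describeStatus(status: AgentStatus): string {
  return STATUS_DESCRIPTIONS[status]
}
```

A callback such as `onStatusChange: (status) => console.log(describeStatus(status))` would then log a readable message instead of the raw status string.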
-| Status | Description |
-|--------|-------------|
-| `idle` | Agent is idle, ready to execute |
-| `running` | Agent is executing a task |
-| `completed` | Task completed successfully |
-| `error` | Task failed with an error |
-
-### AgentActivity
+`AgentActivity`
 
 ```typescript
 type AgentActivity =
@@ -143,15 +147,7 @@ type AgentActivity =
 | { type: 'error'; message: string }
 ```
 
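One way to consume `onActivity` events is a `switch` on `activity.type`. The sketch below uses `SketchActivity`, a simplified stand-in for `AgentActivity`: the five type names and the `tool`, `duration`, and `message` fields come from the table and callback example that this commit removes, and any other fields of the real type are left out. `formatActivity` is a hypothetical helper.

```typescript
// Simplified stand-in for AgentActivity (assumption: the real variants may carry more fields).
type SketchActivity =
  | { type: 'thinking' }
  | { type: 'executing'; tool: string }
  | { type: 'executed'; tool: string; duration: number }
  | { type: 'retrying' }
  | { type: 'error'; message: string }

// Turn an activity event into a one-line message for a UI or a log.
// Switching on the `type` discriminant lets TypeScript narrow each branch.
function formatActivity(activity: SketchActivity): string {
  switch (activity.type) {
    case 'thinking':
      return 'Agent is thinking...'
    case 'executing':
      return `Executing: ${activity.tool}`
    case 'executed':
      return `${activity.tool} completed in ${activity.duration}ms`
    case 'retrying':
      return 'Retrying after a failure'
    case 'error':
      return `Error: ${activity.message}`
  }
}
```

Because the `switch` covers every variant, adding a new variant to the union without a matching `case` becomes a compile-time error under strict settings, which is the main reason to prefer this shape over chained `if` checks.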
-| Type | Description |
-|------|-------------|
-| `thinking` | Agent is analyzing the page and planning |
-| `executing` | Agent is executing a tool action |
-| `executed` | Tool execution completed |
-| `retrying` | Retrying after a failure |
-| `error` | An error occurred |
-
-### HistoricalEvent
+`HistoricalEvent`
 
 ```typescript
 type HistoricalEvent =
@@ -162,7 +158,7 @@ type HistoricalEvent =
 | { type: 'error'; message: string; rawResponse?: unknown }
 ```
 
-### ExecutionResult
+`ExecutionResult`
 
 ```typescript
 interface ExecutionResult {
@@ -183,81 +179,22 @@ const result = await window.PAGE_AGENT_EXT!.execute(
 baseURL: 'https://api.openai.com/v1',
 apiKey: process.env.OPENAI_API_KEY!,
 model: 'gpt-5.2',
-}
-)
-
-if (result.success) {
-console.log('Task completed:', result.data)
-} else {
-console.error('Task failed')
-}
-```
-
-### Exclude Initial Tab
-
-By default, the agent includes the initial tab (where the script runs) in the task. Set `includeInitialTab: false` to exclude it:
-
-```typescript
-const result = await window.PAGE_AGENT_EXT!.execute(
-'Open a new tab and search for page-agent on GitHub',
-{
-baseURL: 'https://api.openai.com/v1',
-apiKey: process.env.OPENAI_API_KEY!,
-model: 'gpt-5.2',
-includeInitialTab: false, // Agent will open new tabs only
+includeInitialTab: false, // Optional: exclude current tab
+onStatusChange: (status) => console.log(status),
+onActivity: (activity) => console.log(activity),
 }
 )
 ```
 
-### With Event Callbacks
+### Stop the Current Task
 
 ```typescript
-await window.PAGE_AGENT_EXT!.execute('Navigate to the settings page', {
-baseURL: 'https://api.openai.com/v1',
-apiKey: process.env.OPENAI_API_KEY!,
-model: 'gpt-5.2',
-onStatusChange: (status) => {
-updateUI({ agentStatus: status })
-},
-onActivity: (activity) => {
-switch (activity.type) {
-case 'thinking':
-showSpinner('Agent is thinking...')
-break
-case 'executing':
-showSpinner(`Executing: ${activity.tool}`)
-break
-case 'executed':
-log(`${activity.tool} completed in ${activity.duration}ms`)
-break
-case 'error':
-showError(activity.message)
-break
-}
-},
-onHistoryUpdate: (history) => {
-renderHistory(history)
-},
-})
-```
-
-### Stop Execution
-
-```typescript
-// Start a task
-window.PAGE_AGENT_EXT!.execute('Scroll through all pages', {
-baseURL: 'https://api.openai.com/v1',
-apiKey: process.env.OPENAI_API_KEY!,
-model: 'gpt-5.2',
-})
-
-// Later, stop it
 window.PAGE_AGENT_EXT!.dispose()
 ```
 
 ## Window Type Declaration
 
-If not using `@page-agent/core`, add this to your project:
+If you are not importing `@page-agent/core`, add:
 
 ```typescript
 import type {
@@ -283,7 +220,7 @@ declare global {
 PAGE_AGENT_EXT_VERSION?: string
 PAGE_AGENT_EXT?: {
 version: string
-execute: (task: string, config: ExecuteConfig) => Promise<ExecutionResult>
+execute: Execute
 dispose: () => void
 }
 }
@@ -1,12 +1,18 @@
 # Page Agent 浏览器插件 API
 
-本文档介绍如何在网页应用中接入 Page Agent 浏览器插件。
+在你的网页应用中接入 Page Agent 插件,并通过页面 JavaScript 发起多页面浏览器任务。
 
 ## 安装
 
 ### 1. 安装浏览器插件
 
-从 Chrome 应用商店安装 Page Agent 插件。
+首选渠道:
+
+- Chrome 应用商店:https://chromewebstore.google.com/detail/page-agent-ext/akldabonmimlicnjlflnapfeklbfemhj
+
+通常更快提供最新构建的渠道:
+
+- GitHub Releases:https://github.com/alibaba/page-agent/releases
 
 ### 2. 安装类型定义(推荐)
 
@@ -14,11 +20,19 @@
 npm install @page-agent/core --save-dev
 ```
 
-### 3. 配置认证
+### 3. 授权(Token)
 
-插件在页面加载后检测 `localStorage` 中的 token,匹配时才会注入 API。
+token 用于让页面 JS 调用扩展 API(`window.PAGE_AGENT_EXT`)并执行多页面任务。
 
-1. 打开插件的侧边栏面板,获取授权 token
+为什么必须使用 token:
+
+- 插件具备较广的浏览器权限(页面访问、导航、多标签控制)。
+- 若被滥用,可能危害用户隐私与安全。
+- 用户必须主动将 token 提供给其信任的应用。
+
+配置步骤:
+
+1. 在扩展侧边栏中复制 auth token。
 2. 在页面中设置 token:
 
 ```typescript
@@ -60,32 +74,32 @@ if (await waitForExtension()) {
 
 ## 全局 API
 
-插件在 `window` 对象上注入以下 API:
+token 匹配后,插件会在 `window` 上注入 API。
 
 ### `window.PAGE_AGENT_EXT_VERSION`
 
-插件版本号字符串(例如 `"1.0.0"`)。单独暴露版本号,方便在访问主 API 对象前进行版本检查。
+插件版本号字符串,可用于在访问主 API 前做能力检查。
 
 ### `window.PAGE_AGENT_EXT`
 
-主 API 命名空间对象,包含:
+主命名空间对象。
 
 #### `PAGE_AGENT_EXT.execute(task, config)`
 
 执行 Agent 任务。
 
-**参数:**
+参数:
 
 | 名称 | 类型 | 必填 | 说明 |
-|------|------|------|------|
+| ---- | ---- | ---- | ---- |
 | `task` | `string` | 是 | 任务描述 |
-| `config` | `ExecuteConfig` | 是 | 执行配置(LLM 设置、选项和事件回调) |
+| `config` | `ExecuteConfig` | 是 | LLM 设置、执行选项和回调 |
 
-**返回:** `Promise<ExecutionResult>`
+返回:`Promise<ExecutionResult>`
 
 #### `PAGE_AGENT_EXT.dispose()`
 
-停止并销毁当前运行的 Agent。
+停止当前任务。
 
 ## 类型定义
 
@@ -104,10 +118,7 @@ export interface ExecuteConfig {
 apiKey: string
 model: string
 
-/**
-* 是否将初始标签页(运行此脚本的页面)包含在任务中。
-* @default true
-*/
+// 是否包含启动脚本所在标签页。默认 true。
 includeInitialTab?: boolean
 
 onStatusChange?: (status: AgentStatus) => void
@@ -119,20 +130,13 @@ export interface ExecuteConfig {
 export type Execute = (task: string, config: ExecuteConfig) => Promise<ExecutionResult>
 ```
 
-### AgentStatus
+`AgentStatus`
 
 ```typescript
 type AgentStatus = 'idle' | 'running' | 'completed' | 'error'
 ```
 
-| 状态 | 说明 |
-|------|------|
-| `idle` | 空闲,准备执行 |
-| `running` | 正在执行任务 |
-| `completed` | 任务成功完成 |
-| `error` | 任务执行失败 |
-
-### AgentActivity
+`AgentActivity`
 
 ```typescript
 type AgentActivity =
@@ -143,15 +147,7 @@ type AgentActivity =
 | { type: 'error'; message: string }
 ```
 
-| 类型 | 说明 |
-|------|------|
-| `thinking` | Agent 正在分析页面并规划 |
-| `executing` | 正在执行工具操作 |
-| `executed` | 工具执行完成 |
-| `retrying` | 失败后重试 |
-| `error` | 发生错误 |
-
-### HistoricalEvent
+`HistoricalEvent`
 
 ```typescript
 type HistoricalEvent =
@@ -162,7 +158,7 @@ type HistoricalEvent =
 | { type: 'error'; message: string; rawResponse?: unknown }
 ```
 
-### ExecutionResult
+`ExecutionResult`
 
 ```typescript
 interface ExecutionResult {
@@ -183,81 +179,22 @@ const result = await window.PAGE_AGENT_EXT!.execute(
 baseURL: 'https://api.openai.com/v1',
 apiKey: process.env.OPENAI_API_KEY!,
 model: 'gpt-5.2',
-}
-)
-
-if (result.success) {
-console.log('任务完成:', result.data)
-} else {
-console.error('任务失败')
-}
-```
-
-### 排除初始标签页
-
-默认情况下,Agent 会将初始标签页(运行脚本的页面)包含在任务中。设置 `includeInitialTab: false` 可以排除它:
-
-```typescript
-const result = await window.PAGE_AGENT_EXT!.execute(
-'打开新标签页并在 GitHub 上搜索 page-agent',
-{
-baseURL: 'https://api.openai.com/v1',
-apiKey: process.env.OPENAI_API_KEY!,
-model: 'gpt-5.2',
-includeInitialTab: false, // Agent 只会打开新标签页
+includeInitialTab: false, // 可选:排除当前标签页
+onStatusChange: (status) => console.log(status),
+onActivity: (activity) => console.log(activity),
 }
 )
 ```
 
-### 使用事件回调
+### 停止当前任务
 
 ```typescript
-await window.PAGE_AGENT_EXT!.execute('导航到设置页面', {
-baseURL: 'https://api.openai.com/v1',
-apiKey: process.env.OPENAI_API_KEY!,
-model: 'gpt-5.2',
-onStatusChange: (status) => {
-updateUI({ agentStatus: status })
-},
-onActivity: (activity) => {
-switch (activity.type) {
-case 'thinking':
-showSpinner('Agent 正在思考...')
-break
-case 'executing':
-showSpinner(`正在执行: ${activity.tool}`)
-break
-case 'executed':
-log(`${activity.tool} 完成,耗时 ${activity.duration}ms`)
-break
-case 'error':
-showError(activity.message)
-break
-}
-},
-onHistoryUpdate: (history) => {
-renderHistory(history)
-},
-})
-```
-
-### 停止执行
-
-```typescript
-// 启动任务
-window.PAGE_AGENT_EXT!.execute('滚动浏览所有页面', {
-baseURL: 'https://api.openai.com/v1',
-apiKey: process.env.OPENAI_API_KEY!,
-model: 'gpt-5.2',
-})
-
-// 稍后停止
 window.PAGE_AGENT_EXT!.dispose()
 ```
 
 ## Window 类型声明
 
-如果不使用 `@page-agent/core`,可以添加以下声明:
+如果你不直接引入 `@page-agent/core`,可添加以下声明:
 
 ```typescript
 import type {
@@ -283,7 +220,7 @@ declare global {
 PAGE_AGENT_EXT_VERSION?: string
 PAGE_AGENT_EXT?: {
 version: string
-execute: (task: string, config: ExecuteConfig) => Promise<ExecutionResult>
+execute: Execute
 dispose: () => void
 }
 }
@@ -1,11 +1,13 @@
-import { siGithub } from 'simple-icons'
+import { siChromewebstore, siGithub } from 'simple-icons'
 
-import BetaNotice from '@/components/BetaNotice'
 import CodeEditor from '@/components/CodeEditor'
 import { useLanguage } from '@/i18n/context'
 
 export default function ChromeExtension() {
 const { isZh } = useLanguage()
+const chromeWebStoreUrl =
+'https://chromewebstore.google.com/detail/page-agent-ext/akldabonmimlicnjlflnapfeklbfemhj'
+const githubReleasesUrl = 'https://github.com/alibaba/page-agent/releases'
 
 return (
 <div>
@@ -13,70 +15,92 @@ export default function ChromeExtension() {
|
|||||||
|
|
||||||
<p className="text-xl text-gray-600 dark:text-gray-300 mb-8 leading-relaxed">
|
<p className="text-xl text-gray-600 dark:text-gray-300 mb-8 leading-relaxed">
|
||||||
{isZh
|
{isZh
|
||||||
? '可选的 Chrome 扩展,解锁多页任务和第三方 API 集成。'
|
? '可选的 Chrome 扩展。PageAgent.js 继续负责页面内自动化;扩展 API 额外提供多页面任务、浏览器级控制,以及从浏览器外部发起任务的能力。'
|
||||||
: 'Optional Chrome extension that unlocks multi-page tasks and third-party API integration.'}
|
: 'An optional Chrome extension. PageAgent.js keeps handling in-page automation, while the extension API adds multi-page tasks, browser-level control, and tasks initiated from outside the browser.'}
|
||||||
</p>
|
</p>
|
||||||
|
|
||||||
<BetaNotice />
|
|
||||||
|
|
||||||
<div className="space-y-8 mt-8">
|
<div className="space-y-8 mt-8">
|
||||||
{/* Hero Section */}
|
|
||||||
<section className="p-6 bg-linear-to-r from-blue-50 to-purple-50 dark:from-blue-900/20 dark:to-purple-900/20 rounded-xl">
|
|
||||||
<div className="flex items-start gap-4">
|
|
||||||
<div>
|
|
||||||
<p className="text-gray-600 dark:text-gray-300">
|
|
||||||
{isZh
|
|
||||||
? '解锁多页任务!借助 Chrome 扩展,Agent 可以跨标签页和页面导航,突破单页限制。'
|
|
||||||
: 'Unlock multi-page tasks! With the Chrome extension, your agent can navigate across tabs and pages, breaking the single-page limitation.'}
|
|
||||||
</p>
|
|
||||||
</div>
|
|
||||||
</div>
|
|
||||||
</section>
|
|
||||||
|
|
||||||
{/* Features */}
|
{/* Features */}
|
||||||
<section>
|
<section>
|
||||||
<h2 className="text-2xl font-bold mb-4">{isZh ? '核心特性' : 'Key Features'}</h2>
|
<h2 className="text-2xl font-bold mb-4">{isZh ? '核心特性' : 'Key Features'}</h2>
|
||||||
<div className="grid md:grid-cols-2 gap-4">
|
<div className="grid md:grid-cols-3 gap-4">
|
||||||
<div className="p-4 bg-gray-50 dark:bg-gray-800 rounded-lg">
|
<div className="p-4 bg-gray-50 dark:bg-gray-800 rounded-lg">
|
||||||
<h3 className="font-semibold mb-2">🔓 {isZh ? '多页任务' : 'Multi-Page Tasks'}</h3>
|
<h3 className="font-semibold mb-2">🔓 {isZh ? '多页任务' : 'Multi-Page Tasks'}</h3>
|
||||||
<p className="text-gray-600 dark:text-gray-300 text-sm">
|
<p className="text-gray-600 dark:text-gray-300 text-sm">
|
||||||
{isZh
|
{isZh
|
||||||
? '跨多个页面和标签页执行任务,不再局限于单页操作。'
|
? '跨多个页面和标签页连续执行任务,不再受限于单页上下文。'
|
||||||
: 'Execute tasks across multiple pages and tabs. No longer limited to single-page operations.'}
|
: 'Run tasks across multiple pages and tabs without being limited to a single page context.'}
|
||||||
</p>
|
</p>
|
||||||
</div>
|
</div>
|
||||||
<div className="p-4 bg-gray-50 dark:bg-gray-800 rounded-lg">
|
<div className="p-4 bg-gray-50 dark:bg-gray-800 rounded-lg">
|
||||||
<h3 className="font-semibold mb-2">
|
<h3 className="font-semibold mb-2">
|
||||||
🔌 {isZh ? '开放第三方接口' : 'Third-Party API'}
|
🧭 {isZh ? '浏览器级控制' : 'Browser-Level Control'}
|
||||||
</h3>
|
</h3>
|
||||||
<p className="text-gray-600 dark:text-gray-300 text-sm">
|
<p className="text-gray-600 dark:text-gray-300 text-sm">
|
||||||
{isZh
|
{isZh
|
||||||
? '用户授权后,你的网页、本地 Agent 或云端 Agent 都能通过扩展操作用户浏览器!'
|
? '支持跨标签导航、页面切换和更完整的浏览器自动化能力。'
|
||||||
: 'After user authorization, your webpage, local agent, or cloud agent can control the browser through the extension.'}
|
: 'Enable richer browser automation, including cross-tab navigation and page switching.'}
|
||||||
|
</p>
|
||||||
|
</div>
|
||||||
|
<div className="p-4 bg-gray-50 dark:bg-gray-800 rounded-lg">
|
||||||
|
<h3 className="font-semibold mb-2">
|
||||||
|
🔌 {isZh ? '开放集成接口' : 'Open Integration API'}
|
||||||
|
</h3>
|
||||||
|
<p className="text-gray-600 dark:text-gray-300 text-sm">
|
||||||
|
{isZh
|
||||||
|
? '用户主动授权后,页面 JS、本地 Agent 或云端 Agent 可通过扩展发起多页面任务。'
|
||||||
|
: 'With explicit user authorization, page JS, local agents, or cloud agents can trigger multi-page tasks through the extension.'}
|
||||||
</p>
|
</p>
|
||||||
</div>
|
</div>
|
||||||
</div>
|
</div>
|
||||||
</section>
|
</section>
|
||||||
|
|
||||||
{/* Download */}
|
{/* Install */}
|
||||||
<section>
|
<section>
|
||||||
<h2 className="text-2xl font-bold mb-4">{isZh ? '下载测试版' : 'Download Beta'}</h2>
|
<h2 className="text-2xl font-bold mb-4">{isZh ? '获取扩展' : 'Get the Extension'}</h2>
|
||||||
<p className="text-gray-600 dark:text-gray-300 mb-4">
|
<div className="flex flex-wrap gap-3">
|
||||||
{isZh
|
<a
|
||||||
? '扩展目前处于 Beta 阶段,请从 GitHub Releases 下载最新版本。'
|
href={chromeWebStoreUrl}
|
||||||
: 'The extension is currently in beta. Download the latest version from GitHub Releases.'}
|
target="_blank"
|
||||||
</p>
|
rel="noopener noreferrer"
|
||||||
<a
|
className="inline-flex items-center gap-2 px-6 py-3 bg-blue-600 hover:bg-blue-700 text-white font-medium rounded-lg transition-colors"
|
||||||
href="https://github.com/alibaba/page-agent/releases"
|
>
|
||||||
target="_blank"
|
<svg className="w-5 h-5" fill="currentColor" viewBox="0 0 24 24">
|
||||||
rel="noopener noreferrer"
|
<path d={siChromewebstore.path} />
|
||||||
className="inline-flex items-center gap-2 px-6 py-3 bg-blue-600 hover:bg-blue-700 text-white font-medium rounded-lg transition-colors"
|
</svg>
|
||||||
>
|
{isZh ? '从 Chrome 应用商店安装' : 'Install from Chrome Web Store'}
|
||||||
<svg className="w-5 h-5" fill="currentColor" viewBox="0 0 24 24">
|
</a>
|
||||||
<path d={siGithub.path} />
|
<a
|
||||||
</svg>
|
href={githubReleasesUrl}
|
||||||
{isZh ? '前往 GitHub Releases 下载' : 'Download from GitHub Releases'}
|
target="_blank"
|
||||||
</a>
|
rel="noopener noreferrer"
|
||||||
|
className="inline-flex items-center gap-2 px-6 py-3 bg-gray-900 hover:bg-gray-800 dark:bg-gray-700 dark:hover:bg-gray-600 text-white font-medium rounded-lg transition-colors"
|
||||||
|
>
|
||||||
|
<svg className="w-5 h-5" fill="currentColor" viewBox="0 0 24 24">
|
||||||
|
<path d={siGithub.path} />
|
||||||
|
</svg>
|
||||||
|
{isZh ? 'GitHub Releases(更新版本)' : 'GitHub Releases (faster updates)'}
|
||||||
|
</a>
|
||||||
|
</div>
|
||||||
|
</section>
|
||||||
|
|
||||||
|
{/* Relationship with PageAgent.js */}
|
||||||
|
<section>
|
||||||
|
<h2 className="text-2xl font-bold mb-4">
|
||||||
|
{isZh ? '与 PageAgent.js 的关系' : 'How It Relates to PageAgent.js'}
|
||||||
|
</h2>
|
||||||
|
<div className="p-5 bg-gray-50 dark:bg-gray-800 rounded-lg space-y-3 text-gray-600 dark:text-gray-300">
|
||||||
|
<p>
|
||||||
|
{isZh
|
||||||
|
? 'PageAgent.js 本身即可在页面内完成自动化。Chrome 扩展是可选的能力扩展。'
|
||||||
|
: 'PageAgent.js already works for in-page automation. The Chrome extension is optional, not a dependency.'}
|
||||||
|
</p>
|
||||||
|
<p>
|
||||||
|
{isZh
|
||||||
|
? '通过扩展,你可以执行多页面任务、控制浏览器,以及从浏览器外部(本地服务或云端服务)发起任务。'
|
||||||
|
: 'With the extension, you can perform multi-page tasks, browser-level control, and tasks triggered outside the browser (local or cloud services).'}
|
||||||
|
</p>
|
||||||
|
</div>
|
||||||
</section>
|
</section>
|
||||||
|
|
||||||
{/* Third-party Integration */}
|
{/* Third-party Integration */}
|
||||||
@@ -86,32 +110,33 @@ export default function ChromeExtension() {
|
|||||||
</h2>
|
</h2>
|
||||||
<p className="text-gray-600 dark:text-gray-300 mb-4">
|
<p className="text-gray-600 dark:text-gray-300 mb-4">
|
||||||
{isZh
|
{isZh
|
||||||
? '用户授权后,外部应用可以调用扩展 API 来控制浏览器。'
|
? '通过页面 JavaScript 调用 `window.PAGE_AGENT_EXT`,你的应用可以发起跨页面任务并控制浏览器行为。'
|
||||||
: 'After user authorization, external applications can call the extension API to control the browser.'}
|
: 'By calling `window.PAGE_AGENT_EXT` from page JavaScript, your app can trigger multi-page tasks and control browser behavior.'}
|
||||||
</p>
|
</p>
|
||||||
|
|
||||||
{/* Auth Flow */}
|
<h3 className="text-xl font-semibold mb-3">
|
||||||
<h3 className="text-xl font-semibold mb-3">{isZh ? '授权流程' : 'Authorization Flow'}</h3>
|
{isZh ? '授权与安全' : 'Authorization and Security'}
|
||||||
|
</h3>
|
||||||
<p className="text-gray-600 dark:text-gray-300 mb-4">
|
<p className="text-gray-600 dark:text-gray-300 mb-4">
|
||||||
{isZh
|
{isZh
|
||||||
? '扩展使用基于 Token 的授权机制,扩展端和页面端必须持有匹配的 Token。'
|
? '扩展权限范围较广(例如页面访问、导航、多标签控制)。若被滥用,可能危害用户隐私。为此,调用能力由 Token 保护,用户必须主动将 Token 提供给其信任的应用。'
|
||||||
: 'The extension uses a token-based authorization mechanism. Both extension and page must have matching tokens.'}
|
: 'The extension has broad permissions (such as page access, navigation, and multi-tab control). If abused, it can harm user privacy. That is why access is protected by a token, and users must actively share the token only with applications they trust.'}
|
||||||
</p>
|
</p>
|
||||||
|
|
||||||
<CodeEditor
|
<CodeEditor
|
||||||
code={
|
code={
|
||||||
isZh
|
isZh
|
||||||
? `// 1. 用户安装扩展并在扩展设置中配置 auth token
|
? `// 1) 用户在扩展侧边栏获取 auth token
|
||||||
// 2. 你的页面读取相同的 token 并存入 localStorage
|
// 2) 仅在可信应用中设置该 token
|
||||||
// 3. Token 匹配后,扩展会暴露 window.PAGE_AGENT_EXT 对象
|
// 3) token 匹配后,扩展会暴露 window.PAGE_AGENT_EXT
|
||||||
|
|
||||||
// ⚠️ 请在扩展弹窗中查看你的 auth token,然后填入下方
|
// ⚠️ 不要把 token 提供给不可信页面或脚本
|
||||||
localStorage.setItem('PageAgentExtUserAuthToken', '<从扩展中获取的-token>')`
|
localStorage.setItem('PageAgentExtUserAuthToken', '<从扩展中获取的-token>')`
|
||||||
: `// 1. User installs extension and sets an auth token in extension settings
|
: `// 1) Get auth token from the extension side panel
|
||||||
// 2. Your page reads the same token and stores it in localStorage
|
// 2) Set it only in trusted applications
|
||||||
// 3. After token match, extension exposes window.PAGE_AGENT_EXT object
|
// 3) After token match, extension exposes window.PAGE_AGENT_EXT
|
||||||
|
|
||||||
// ⚠️ Check your extension popup for the auth token
|
// ⚠️ Never provide the token to untrusted pages or scripts
|
||||||
localStorage.setItem('PageAgentExtUserAuthToken', '<your-token-from-extension>')`
|
localStorage.setItem('PageAgentExtUserAuthToken', '<your-token-from-extension>')`
|
||||||
}
|
}
|
||||||
language="javascript"
|
language="javascript"
|
||||||
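The token comments in the hunk above boil down to writing a single `localStorage` key. As a rough sketch of doing that defensively, the helper below refuses empty input and an unfilled `<...>` placeholder before storing the value. Only the key name `PageAgentExtUserAuthToken` comes from the diff; `saveAuthToken`, `StorageLike`, and the placeholder check are assumptions made for illustration.

```typescript
// Subset of the Web Storage API used here, so the helper can also be
// exercised outside a browser.
interface StorageLike {
  getItem(key: string): string | null
  setItem(key: string, value: string): void
}

// Key name as it appears in the documentation diff above.
const AUTH_TOKEN_KEY = 'PageAgentExtUserAuthToken'

// Hypothetical helper: stores the token only if it looks filled in.
// Returns true when the token was stored.
function saveAuthToken(storage: StorageLike, token: string): boolean {
  const trimmed = token.trim()
  // Reject empty input and copy-pasted placeholders such as
  // '<your-token-from-extension>'.
  if (trimmed === '' || /^<.*>$/.test(trimmed)) return false
  storage.setItem(AUTH_TOKEN_KEY, trimmed)
  return true
}
```

In a page this might be called as `saveAuthToken(localStorage, userInput)`, and only from an application the user has chosen to trust, as the security note in the diff stresses.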
@@ -152,13 +177,87 @@ localStorage.setItem('PageAgentExtUserAuthToken', '<your-token-from-extension>')
             </div>
           </section>

-          <h3 className="text-xl font-semibold my-3">PAGE_AGENT_EXT.execute(task, config)</h3>
+          {/* TypeScript Declaration */}
+          <h2 className="text-2xl font-bold mb-4">
+            {isZh ? 'TypeScript 类型声明' : 'TypeScript Declaration'}
+          </h2>
           <p className="text-gray-600 dark:text-gray-300 mb-4">
             {isZh
-              ? '使用配置执行任务。返回一个 Promise,在任务完成时 resolve。config 参数包含 LLM 设置、选项和事件回调。'
-              : 'Execute a task with configuration. Returns a Promise that resolves when the task completes. Config includes LLM settings, options, and event callbacks.'}
+              ? '推荐把 `execute` 的类型声明加入你的项目,获得完整类型提示。'
+              : 'Add this `execute` declaration to your project for full type support.'}
           </p>

+          <CodeEditor
+            code={
+              isZh
+                ? `import type {
+  AgentActivity,
+  AgentStatus,
+  ExecutionResult,
+  HistoricalEvent
+} from '@page-agent/core'
+
+interface ExecuteConfig {
+  baseURL: string // LLM API 端点
+  apiKey: string // API 密钥
+  model: string // 模型名称
+
+  includeInitialTab?: boolean
+  onStatusChange?: (status: AgentStatus) => void
+  onActivity?: (activity: AgentActivity) => void
+  onHistoryUpdate?: (history: HistoricalEvent[]) => void
+  onDispose?: () => void
+}
+
+type Execute = (task: string, config: ExecuteConfig) => Promise<ExecutionResult>
+
+declare global {
+  interface Window {
+    PAGE_AGENT_EXT_VERSION?: string
+    PAGE_AGENT_EXT?: {
+      version: string
+      execute: Execute
+      dispose: () => void
+    }
+  }
+}`
+                : `import type {
+  AgentActivity,
+  AgentStatus,
+  ExecutionResult,
+  HistoricalEvent
+} from '@page-agent/core'
+
+interface ExecuteConfig {
+  baseURL: string // LLM API endpoint
+  apiKey: string // API key
+  model: string // Model name
+
+  includeInitialTab?: boolean
+  onStatusChange?: (status: AgentStatus) => void
+  onActivity?: (activity: AgentActivity) => void
+  onHistoryUpdate?: (history: HistoricalEvent[]) => void
+  onDispose?: () => void
+}
+
+type Execute = (task: string, config: ExecuteConfig) => Promise<ExecutionResult>
+
+declare global {
+  interface Window {
+    PAGE_AGENT_EXT_VERSION?: string
+    PAGE_AGENT_EXT?: {
+      version: string
+      execute: Execute
+      dispose: () => void
+    }
+  }
+}`
+            }
+            language="typescript"
+          />
+
+          <h3 className="text-xl font-semibold mt-6 mb-3">PAGE_AGENT_EXT.execute(task, config)</h3>
+
           <CodeEditor
             code={
               isZh
@@ -168,7 +267,7 @@ const result = await window.PAGE_AGENT_EXT.execute(
   {
     baseURL: 'https://api.openai.com/v1',
     apiKey: 'your-api-key',
-    model: 'gpt-5-2',
+    model: 'gpt-5.2',
     // includeInitialTab: false, // 设为 false 排除初始标签页
     onStatusChange: status => console.log('状态变化:', status),
     onActivity: activity => console.log('活动:', activity),
@@ -184,7 +283,7 @@ const result = await window.PAGE_AGENT_EXT.execute(
   {
     baseURL: 'https://api.openai.com/v1',
     apiKey: 'your-api-key',
-    model: 'gpt-5-2',
+    model: 'gpt-5.2',
     // includeInitialTab: false, // Set to false to exclude initial tab
     onStatusChange: status => console.log('Status change:', status),
     onActivity: activity => console.log('Activity:', activity),
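The samples in these hunks call `execute(task, config)` and, separately, `dispose()` to stop a task. One way to combine the two is a timeout guard that disposes the agent if the task has not settled in time. This is only a sketch under stated assumptions: `executeWithTimeout` and `ExecutorLike` are hypothetical names, and how the promise returned by `execute` behaves after `dispose()` (resolve, reject, or stay pending) is not stated in the source, so the helper races it against its own timer instead of relying on it.

```typescript
// Minimal shape of the two extension methods used here.
interface ExecutorLike<C, R> {
  execute: (task: string, config: C) => Promise<R>
  dispose: () => void
}

// Hypothetical helper: run a task, and if it has not settled within
// timeoutMs, call dispose() and reject with a timeout error.
async function executeWithTimeout<C, R>(
  ext: ExecutorLike<C, R>,
  task: string,
  config: C,
  timeoutMs: number
): Promise<R> {
  let timer: ReturnType<typeof setTimeout> | undefined
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => {
      ext.dispose() // ask the extension to stop the running task
      reject(new Error(`task timed out after ${timeoutMs} ms`))
    }, timeoutMs)
  })
  try {
    // Whichever settles first wins; the loser's result is ignored.
    return await Promise.race([ext.execute(task, config), timeout])
  } finally {
    if (timer !== undefined) clearTimeout(timer)
  }
}
```

Passing `window.PAGE_AGENT_EXT` as `ext` would presumably work once its presence has been checked, since it exposes both `execute` and `dispose` in the declaration above.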
@@ -217,111 +316,29 @@ window.PAGE_AGENT_EXT.dispose()`
           />
         </section>

-        {/* ExecuteConfig */}
-        <section>
-          <h2 className="text-2xl font-bold mb-4">{isZh ? '执行配置' : 'Execute Configuration'}</h2>
-          <p className="text-gray-600 dark:text-gray-300 mb-4">
-            {isZh
-              ? 'config 参数包含 LLM 设置、选项和事件回调,用于控制任务执行行为。'
-              : 'The config parameter includes LLM settings, options, and event callbacks to control task execution behavior.'}
-          </p>
-
-          <CodeEditor
-            code={
-              isZh
-                ? `interface ExecuteConfig {
-  baseURL: string // LLM API 端点
-  apiKey: string // API 密钥
-  model: string // 模型名称
-
-  // 是否将初始标签页包含在任务中,默认 true
-  includeInitialTab?: boolean
-
-  // Agent 状态变化时调用(idle, running, error, completed 等)
-  onStatusChange?: (status: AgentStatus) => void
-
-  // Agent 执行活动时调用(如点击、输入、导航等操作)
-  onActivity?: (activity: AgentActivity) => void
-
-  // 历史记录更新时调用(包含完整的事件历史)
-  onHistoryUpdate?: (history: HistoricalEvent[]) => void
-
-  // Agent 被停止时调用
-  onDispose?: () => void
-}`
-                : `interface ExecuteConfig {
-  baseURL: string // LLM API endpoint
-  apiKey: string // API key
-  model: string // Model name
-
-  // Whether to include the initial tab in the task, default true
-  includeInitialTab?: boolean
-
-  // Called when agent status changes (idle, running, error, completed, etc.)
-  onStatusChange?: (status: AgentStatus) => void
-
-  // Called when agent performs an activity (click, input, navigation, etc.)
-  onActivity?: (activity: AgentActivity) => void
-
-  // Called when history is updated (contains full event history)
-  onHistoryUpdate?: (history: HistoricalEvent[]) => void
-
-  // Called when agent is disposed
-  onDispose?: () => void
-}`
-            }
-            language="typescript"
-          />
-        </section>
-
-        {/* Security Notice */}
-        <section className="p-4 bg-yellow-50 dark:bg-yellow-900/20 rounded-lg">
-          <h3 className="text-lg font-semibold text-yellow-900 dark:text-yellow-300 mb-2">
-            ⚠️ {isZh ? '安全须知' : 'Security Notes'}
-          </h3>
-          <ul className="text-gray-600 dark:text-gray-300 space-y-1 text-sm">
-            <li>
-              •{' '}
-              {isZh
-                ? '用户必须在扩展设置中显式授权每个域名'
-                : 'Users must explicitly authorize each domain in extension settings'}
-            </li>
-            <li>
-              •{' '}
-              {isZh
-                ? '生产环境建议使用后端代理 LLM API Key'
-                : 'Consider using backend proxy for LLM API keys in production'}
-            </li>
-          </ul>
-        </section>
-
         {/* Integration Guide */}
         <section>
           <h2 className="text-2xl font-bold mb-4">
             {isZh
-              ? '将 MultiPageAgent 融入你自己的插件'
+              ? '将 MultiPageAgent 集成你自己的插件'
               : 'Integrate MultiPageAgent into Your Extension'}
           </h2>
           <p className="text-gray-600 dark:text-gray-300 mb-4">
             {isZh
-              ? '你可以将 MultiPageAgent 集成到自己的浏览器扩展中,实现跨页面的 AI 自动化能力。'
-              : 'You can integrate MultiPageAgent into your own browser extension for cross-page AI automation capabilities.'}
+              ? '建议先阅读扩展 API 文档,再参考 background entry implementation。'
+              : 'Start with the extension API docs, then use the background entry implementation as a reference.'}
+            <a
+              href="https://github.com/alibaba/page-agent/blob/main/packages/extension/src/entrypoints/background.ts"
+              target="_blank"
+              rel="noopener noreferrer"
+              className="inline-flex items-center gap-2 text-blue-600 hover:text-blue-700 dark:text-blue-400 dark:hover:text-blue-300"
+            >
+              <svg className="w-5 h-5" fill="currentColor" viewBox="0 0 24 24">
+                <path d={siGithub.path} />
+              </svg>
+              packages/extension/src/entrypoints/background.ts
+            </a>
           </p>
-          <p className="text-gray-600 dark:text-gray-300 mb-4">TODO</p>
-          <p className="text-gray-600 dark:text-gray-300 mb-4">
-            {isZh ? '参考源码实现:' : 'Reference implementation:'}
-          </p>
-          <a
-            href="https://github.com/alibaba/page-agent/blob/main/packages/extension/src/entrypoints/background.ts"
-            target="_blank"
-            rel="noopener noreferrer"
-            className="inline-flex items-center gap-2 text-blue-600 hover:text-blue-700 dark:text-blue-400 dark:hover:text-blue-300"
-          >
-            <svg className="w-5 h-5" fill="currentColor" viewBox="0 0 24 24">
-              <path d={siGithub.path} />
-            </svg>
-            packages/extension/src/entrypoints/background.ts
-          </a>
         </section>
       </div>
     </div>

@@ -28,20 +28,5 @@
     "erasableSyntaxOnly": true,
     "noFallthroughCasesInSwitch": true,
     "noUncheckedSideEffectImports": true
-
-    // "paths": {
-    //   // Simplified monorepo solution (raw npm workspace with hoisting)
-    //   "@page-agent/page-controller": ["./packages/page-controller/src/PageController.ts"],
-    //   "page-agent": ["./packages/page-agent/src/PageAgent.ts"]
-    // }
   }
-  // "references": [
-  //   { "path": "./packages/page-controller" },
-  //   { "path": "./packages/page-agent" },
-  //   { "path": "./packages/website" }
-  // ],
-  // "include": ["packages/*/src/**/*.ts", "packages/*/src/**/*.tsx"],
-  // "exclude": ["node_modules", "dist", "packages/*/dist"]
-  // "files": ["env.d.ts"]
-  // "files": []
 }

@@ -1,3 +1,5 @@
+// this is only for IDE ts language server to work.
+// do not use this for building or linting.
 {
   "extends": "./tsconfig.base.json",
   "references": [