docs: update extension-related docs

2026-02-12 17:19:14 +08:00
parent 11d66f42c4
commit f19b3cc2cc
9 changed files with 312 additions and 438 deletions
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -22,6 +22,7 @@ npm start                    # Start website dev server
 npm run build                # Build all packages
 npm run build:libs           # Build all libraries
 npm run lint                 # ESLint with TypeScript strict rules
+npm run zip -w @page-agent/ext # Zip the extension package
 ```

 ## Architecture
@@ -36,7 +37,7 @@ packages/
 ├── page-agent/              # npm: "page-agent" entry class (with UI + controller + demo builds)
 ├── website/                 # @page-agent/website (private)
 ├── llms/                    # @page-agent/llms
-├── extension/               # 🚧 WIP: Browser extension (WXT + React)
+├── extension/               # Browser extension (WXT + React)
 ├── page-controller/         # @page-agent/page-controller
 └── ui/                      # @page-agent/ui
 ```
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -20,10 +20,11 @@ Thank you for your interest in contributing to Page-Agent! We welcome contributi

 ### Project Structure

-This is a **monorepo** with npm workspaces containing **3 main packages**:
+This is a **monorepo** with npm workspaces containing **4 main packages**:

 - **Page Agent** (`packages/page-agent/`) - Main entry with built-in UI Panel, published as `page-agent` on npm
 - **Core** (`packages/core/`) - Core agent logic without UI (npm: `@page-agent/core`)
+- **Extension** (`packages/extension/`) - Chrome extension for multi-page tasks and browser-level automation
 - **Website** (`packages/website/`) - React documentation and landing page; also serves as the demo and test page for the core lib. Private package: `@page-agent/website`

 We use a simplified monorepo solution with `native npm-workspace + ts reference + vite alias`. No fancy tooling. Hoisting is required.
@@ -145,6 +146,16 @@ If your lame AI assistant does not support [AGENTS.md](https://agents.md/). Add
 npm start
 ```

+### Extension Development
+
+```bash
+npm run dev -w @page-agent/ext
+npm run zip -w @page-agent/ext
+```
+
+- Load the extension in Chrome via `chrome://extensions` -> **Load unpacked** (requires Developer mode)
+- See `packages/extension/docs/extension_api.md` (EN) or `packages/extension/docs/extension_api_zh.md` (ZH) for API integration details
+
 ### Testing on Other Websites

 - Start and serve a local `iife` script
@@ -193,14 +204,6 @@ By contributing to this project, you agree that your contributions will be licen

 > You may need to sign a GitHub CLA before you create a PR.

-### Browser-Use Attribution
-
-Parts of this project are derived from the [browser-use](https://github.com/browser-use/browser-use) project (MIT License). When contributing to DOM-related functionality:
-
-- Maintain existing attribution comments
-- Follow similar patterns for consistency
-- Credit browser-use for derived concepts
-
 ## 💬 Questions?

 - Open a GitHub issue for technical questions
--- a/README-zh.md
+++ b/README-zh.md
@@ -1,4 +1,4 @@
-# PageAgent 🤖🪄
+# Page Agent

 <picture>
  <source media="(prefers-color-scheme: dark)" srcset="https://img.alicdn.com/imgextra/i4/O1CN01qKig1P1FnhpFKNdi6_!!6000000000532-2-tps-1280-256.png">
@@ -20,17 +20,15 @@
 ## ✨ Features

 - **🎯 轻松集成** 
-    - 无需 Python，无需无头浏览器，无需浏览器插件。纯页面内脚本。
-- **🔐 端侧运行**
-- **🧠 HTML 脱水**
-- **💬 自然语言接口**
-- **🎨 HITL 交互界面**
-
-以及 😉
-
-- **🧪 实验性的 Chrome 扩展，支持跨页面控制** - `packages/extension`
-
-👉 [**🗺️ Roadmap**](https://github.com/alibaba/page-agent/issues/96)
+  - 无需 `浏览器插件` / `Python` / `无头浏览器`。
+  - 纯页面内 JavaScript，一切都在你的网页中完成。
+  - The best tool for your agent to control web pages.
+- **📖 基于文本的 DOM 操作**
+  - 无需截图，无需 OCR 或多模态模型。
+  - 无需特殊权限。
+- **🧠 用你自己的 LLM**
+- **🎨 精美 UI，支持人机协同**
+- **🐙 可选的 [Chrome 扩展](https://alibaba.github.io/page-agent/#/docs/features/chrome-extension)，支持跨页面任务。**

 ## 🚀 快速开始

@@ -39,20 +37,16 @@
 通过我们免费的 Demo LLM 快速体验 PageAgent：

 ```html
-<script
-    src="https://registry.npmmirror.com/page-agent/1.2.0/files/dist/iife/page-agent.demo.js"
-    crossorigin="true"
-></script>
+<script src="{URL}" crossorigin="true"></script>
 ```

-> - **⚠️ 仅用于技术评估。** Demo LLM 有速率和使用限制，可能随时变更。
-> - **🌷 建议使用自己的 LLM API。**
-
 | Mirrors | URL                                                                                |
 | ------- | ---------------------------------------------------------------------------------- |
 | Global  | https://cdn.jsdelivr.net/npm/page-agent@1.2.0/dist/iife/page-agent.demo.js         |
 | China   | https://registry.npmmirror.com/page-agent/1.2.0/files/dist/iife/page-agent.demo.js |

+> **⚠️ 仅用于技术评估。** Demo LLM 有速率和使用限制，速度较慢，可能随时变更。
+
 ### NPM 安装

 ```bash
@@ -72,7 +66,7 @@ const agent = new PageAgent({
 await agent.execute('点击登录按钮')
 ```

-适用于无法使用 NPM 的环境，我们也提供了 IIFE 构建的 CDN 方式。[@see CDN Usage](https://alibaba.github.io/page-agent/#/docs/integration/cdn-setup)
+更多编程用法，请参阅 [📖 文档](https://alibaba.github.io/page-agent/#/docs/introduction/overview)。

 ## 🏗️ 架构设计

@@ -80,12 +74,13 @@ PageAgent adopts a simplified monorepo structure:

 ```
 packages/
-├── core/                # ** Core agent logic without UI(npm: @page-agent/core) **
-├── page-agent/          # Exported agent and demo(npm: page-agent)
+├── core/                # ** Core agent logic (npm: @page-agent/core) **
 ├── llms/                # LLM 客户端 (npm: @page-agent/llms)
-├── page-controller/     # DOM 操作 & 蒙层 & 模拟鼠标 (npm: @page-agent/page-controller)
-├── ui/                  # 面板 & i18n (npm: @page-agent/ui)
-└── website/             # 文档站点
+├── page-controller/     # DOM 操作 (npm: @page-agent/page-controller)
+├── ui/                  # 面板 UI (npm: @page-agent/ui)
+├── page-agent/          # 入口类 & iife 包 (npm: page-agent)
+├── extension/           # Chrome 扩展，支持跨页面任务
+└── website/             # 网站 & 文档站点
 ```

 ## 🤝 贡献
--- a/README.md
+++ b/README.md
@@ -1,4 +1,4 @@
-# PageAgent 🤖🪄
+# Page Agent

 <picture>
  <source media="(prefers-color-scheme: dark)" srcset="https://img.alicdn.com/imgextra/i4/O1CN01qKig1P1FnhpFKNdi6_!!6000000000532-2-tps-1280-256.png">
@@ -19,18 +19,16 @@ The GUI Agent Living in Your Webpage. Control web interfaces with natural langua

 ## ✨ Features

-- **🎯 Easy Integration**
-    - No python. No headless browser. No browser extension. Just in-page scripts.
-- **🔐 Client-Side Processing**
-- **🧠 DOM Extraction**
-- **💬 Natural Language Interface**
-- **🎨 UI with Human in the loop**
-
-And 😉
-
-- **🧪 `cross-page` control with an experimental chrome extension** - `packages/extension`
-
-👉 [**🗺️ Roadmap**](https://github.com/alibaba/page-agent/issues/96)
+- **🎯 Easy integration**
+  - No need for `browser extension` / `Python` / `headless browser`.
+  - Just in-page JavaScript. Everything happens in your web page.
+  - The best tool for your agent to control web pages.
+- **📖 Text-based DOM manipulation**
+  - No screenshots. No OCR or multi-modal LLMs needed.
+  - No special permissions required.
+- **🧠 Bring your own LLMs**
+- **🎨 Pretty UI with human-in-the-loop**
+- **🐙 Optional [Chrome extension](https://alibaba.github.io/page-agent/#/docs/features/chrome-extension) for multi-page tasks.**

 ## 🚀 Quick Start

@@ -39,20 +37,16 @@ And 😉
 Fastest way to try PageAgent with our free Demo LLM:

 ```html
-<script
-    src="https://cdn.jsdelivr.net/npm/page-agent@1.2.0/dist/iife/page-agent.demo.js"
-    crossorigin="true"
-></script>
+<script src="{URL}" crossorigin="true"></script>
 ```

-> - **⚠️ For technical evaluation only.** Demo LLM has rate limits and usage restrictions. May change without notice.
-> - **🌷 Bring your own LLM API.**
-
 | Mirrors | URL                                                                                |
 | ------- | ---------------------------------------------------------------------------------- |
 | Global  | https://cdn.jsdelivr.net/npm/page-agent@1.2.0/dist/iife/page-agent.demo.js         |
 | China   | https://registry.npmmirror.com/page-agent/1.2.0/files/dist/iife/page-agent.demo.js |

+> **⚠️ For technical evaluation only.** The demo LLM is slow, has rate limits and usage restrictions, and may change without notice.
+
 ### NPM Installation

 ```bash
@@ -72,18 +66,21 @@ const agent = new PageAgent({
 await agent.execute('Click the login button')
 ```

+For more programmatic usage, see the [📖 Documentation](https://alibaba.github.io/page-agent/#/docs/introduction/overview).
+
 ## 🏗️ Structure

 PageAgent adopts a simplified monorepo structure:

 ```
 packages/
-├── core/                # ** Core agent logic without UI(npm: @page-agent/core) **
-├── page-agent/          # Exported agent and demo(npm: page-agent)
+├── core/                # ** Core agent logic (npm: @page-agent/core) **
 ├── llms/                # LLM client (npm: @page-agent/llms)
-├── page-controller/     # DOM operations & Visual Mask (npm: @page-agent/page-controller)
-├── ui/                  # Panel & i18n (npm: @page-agent/ui)
-└── website/             # Demo & Documentation site
+├── page-controller/     # DOM operations (npm: @page-agent/page-controller)
+├── ui/                  # Panel UI (npm: @page-agent/ui)
+├── page-agent/          # Entry class and iife builds (npm: page-agent)
+├── extension/           # Chrome extension for multi-page tasks
+└── website/             # Website & Documentation site
 ```

 ## 🤝 Contributing
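
Reviewer note: the README hunks above replace the hard-coded script `src` with a `{URL}` placeholder that readers fill in from the Mirrors table. Below is a minimal sketch of resolving that placeholder in code. `Mirror` and `demoScriptUrl` are hypothetical names used only for illustration and are not part of page-agent; the two URL patterns are copied from the table.

```typescript
// Hypothetical helper, not shipped by page-agent.
// The two URL patterns below are taken verbatim from the Mirrors table in the README.
type Mirror = 'global' | 'china'

function demoScriptUrl(mirror: Mirror, version: string): string {
  // Path of the demo IIFE bundle inside the published package.
  const file = 'dist/iife/page-agent.demo.js'
  return mirror === 'global'
    ? `https://cdn.jsdelivr.net/npm/page-agent@${version}/${file}`
    : `https://registry.npmmirror.com/page-agent/${version}/files/${file}`
}
```

The returned string is what would go in place of `{URL}` in the `<script>` tag, for example `demoScriptUrl('global', '1.2.0')`.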
--- a/packages/extension/docs/extension_api.md
+++ b/packages/extension/docs/extension_api.md
@@ -1,12 +1,18 @@
 # Page Agent Extension API

-This document describes how to integrate the Page Agent browser extension into your web application.
+Integrate the Page Agent extension into your web app and trigger multi-page browser tasks from page JavaScript.

 ## Installation

 ### 1. Install the browser extension

-Install the Page Agent extension from the Chrome Web Store.
+Primary channel:
+
+- Chrome Web Store: https://chromewebstore.google.com/detail/page-agent-ext/akldabonmimlicnjlflnapfeklbfemhj
+
+Latest updates are often published earlier on:
+
+- GitHub Releases: https://github.com/alibaba/page-agent/releases

 ### 2. Install type definitions (recommended)

@@ -14,11 +20,19 @@ Install the Page Agent extension from the Chrome Web Store.
 npm install @page-agent/core --save-dev
 ```

-### 3. Set up authentication
+### 3. Authorization (Token)

-The extension only injects APIs when it detects a valid token in `localStorage`.
+The token allows your page JS to call the extension API (`window.PAGE_AGENT_EXT`) and execute multi-page tasks.

-1. Open the extension's side panel to get your authorization token
+Why token-based access is required:
+
+- The extension has broad browser permissions (page access, navigation, multi-tab control).
+- If abused, it can harm user privacy and security.
+- Users must explicitly provide the token only to applications they trust.
+
+Setup:
+
+1. Open the extension side panel and copy your auth token.
 2. Set the token in your page:

 ```typescript
@@ -60,36 +74,36 @@ if (await waitForExtension()) {

 ## Global API

-The extension injects the following APIs into the `window` object:
+Once the token matches, the extension injects its APIs into `window`.

 ### `window.PAGE_AGENT_EXT_VERSION`

-Extension version string (e.g., `"1.0.0"`). This is exposed separately to allow version checking before accessing the main API object.
+Extension version string (for capability checks before using the main API).

 ### `window.PAGE_AGENT_EXT`

-Main API namespace object containing:
+Main namespace object.

 #### `PAGE_AGENT_EXT.execute(task, config)`

-Execute an agent task.
+Execute one agent task.

-**Parameters:**
+Parameters:

 | Name | Type | Required | Description |
-|------|------|----------|-------------|
+| ---- | ---- | -------- | ----------- |
 | `task` | `string` | Yes | Task description |
-| `config` | `ExecuteConfig` | Yes | Execution configuration (LLM settings, options, and event callbacks) |
+| `config` | `ExecuteConfig` | Yes | LLM settings, options, and callbacks |

-**Returns:** `Promise<ExecutionResult>`
+Returns: `Promise<ExecutionResult>`

 #### `PAGE_AGENT_EXT.dispose()`

-Stop and destroy the current running agent.
+Stop the current task.

 ## Types

-Install `@page-agent/core` for full type definitions:
+Install `@page-agent/core` for complete types:

 ```typescript
 import type {
@@ -104,10 +118,7 @@ export interface ExecuteConfig {
  apiKey: string
  model: string

-  /**
-   * Whether to include the initial tab (that holds this main world script) in the task.
-   * @default true
-   */
+  // Include the initial tab where page JS starts. Default: true.
  includeInitialTab?: boolean

  onStatusChange?: (status: AgentStatus) => void
@@ -119,20 +130,13 @@ export interface ExecuteConfig {
 export type Execute = (task: string, config: ExecuteConfig) => Promise<ExecutionResult>
 ```

-### AgentStatus
+`AgentStatus`

 ```typescript
 type AgentStatus = 'idle' | 'running' | 'completed' | 'error'
 ```

-| Status | Description |
-|--------|-------------|
-| `idle` | Agent is idle, ready to execute |
-| `running` | Agent is executing a task |
-| `completed` | Task completed successfully |
-| `error` | Task failed with an error |
-
-### AgentActivity
+`AgentActivity`

 ```typescript
 type AgentActivity =
@@ -143,15 +147,7 @@ type AgentActivity =
  | { type: 'error'; message: string }
 ```

-| Type | Description |
-|------|-------------|
-| `thinking` | Agent is analyzing the page and planning |
-| `executing` | Agent is executing a tool action |
-| `executed` | Tool execution completed |
-| `retrying` | Retrying after a failure |
-| `error` | An error occurred |
-
-### HistoricalEvent
+`HistoricalEvent`

 ```typescript
 type HistoricalEvent =
@@ -162,7 +158,7 @@ type HistoricalEvent =
  | { type: 'error'; message: string; rawResponse?: unknown }
 ```

-### ExecutionResult
+`ExecutionResult`

 ```typescript
 interface ExecutionResult {
@@ -183,81 +179,22 @@ const result = await window.PAGE_AGENT_EXT!.execute(
    baseURL: 'https://api.openai.com/v1',
    apiKey: process.env.OPENAI_API_KEY!,
    model: 'gpt-5.2',
-  }
-)
-
-if (result.success) {
-  console.log('Task completed:', result.data)
-} else {
-  console.error('Task failed')
-}
-```
-
-### Exclude Initial Tab
-
-By default, the agent includes the initial tab (where the script runs) in the task. Set `includeInitialTab: false` to exclude it:
-
-```typescript
-const result = await window.PAGE_AGENT_EXT!.execute(
-  'Open a new tab and search for page-agent on GitHub',
-  {
-    baseURL: 'https://api.openai.com/v1',
-    apiKey: process.env.OPENAI_API_KEY!,
-    model: 'gpt-5.2',
-    includeInitialTab: false,  // Agent will open new tabs only
+    includeInitialTab: false, // Optional: exclude current tab
+    onStatusChange: (status) => console.log(status),
+    onActivity: (activity) => console.log(activity),
  }
 )
 ```

-### With Event Callbacks
+### Stop the Current Task

 ```typescript
-await window.PAGE_AGENT_EXT!.execute('Navigate to the settings page', {
-  baseURL: 'https://api.openai.com/v1',
-  apiKey: process.env.OPENAI_API_KEY!,
-  model: 'gpt-5.2',
-  onStatusChange: (status) => {
-    updateUI({ agentStatus: status })
-  },
-  onActivity: (activity) => {
-    switch (activity.type) {
-      case 'thinking':
-        showSpinner('Agent is thinking...')
-        break
-      case 'executing':
-        showSpinner(`Executing: ${activity.tool}`)
-        break
-      case 'executed':
-        log(`${activity.tool} completed in ${activity.duration}ms`)
-        break
-      case 'error':
-        showError(activity.message)
-        break
-    }
-  },
-  onHistoryUpdate: (history) => {
-    renderHistory(history)
-  },
-})
-```
-
-### Stop Execution
-
-```typescript
-// Start a task
-window.PAGE_AGENT_EXT!.execute('Scroll through all pages', {
-  baseURL: 'https://api.openai.com/v1',
-  apiKey: process.env.OPENAI_API_KEY!,
-  model: 'gpt-5.2',
-})
-
-// Later, stop it
 window.PAGE_AGENT_EXT!.dispose()
 ```

 ## Window Type Declaration

-If not using `@page-agent/core`, add this to your project:
+If you are not importing `@page-agent/core`, add:

 ```typescript
 import type {
@@ -283,7 +220,7 @@ declare global {
    PAGE_AGENT_EXT_VERSION?: string
    PAGE_AGENT_EXT?: {
      version: string
-      execute: (task: string, config: ExecuteConfig) => Promise<ExecutionResult>
+      execute: Execute
      dispose: () => void
    }
  }
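
Reviewer note: this diff drops the longer `onActivity` example in favor of a one-line `console.log`. For readers who still want readable progress messages, here is a minimal sketch of the same idea as a pure function. `ActivityLike` is a local stand-in, not the real `AgentActivity` from `@page-agent/core`; the `tool` and `duration` fields are assumptions based on the removed example, and the real union may differ.

```typescript
// Local stand-in for AgentActivity. Only the `error` variant is shown in full in the docs;
// the other variants and their fields are assumptions drawn from the removed example.
type ActivityLike =
  | { type: 'thinking' }
  | { type: 'executing'; tool: string }
  | { type: 'executed'; tool: string; duration: number }
  | { type: 'retrying' }
  | { type: 'error'; message: string }

// Map one activity event to a human-readable log line.
function describeActivity(activity: ActivityLike): string {
  switch (activity.type) {
    case 'thinking':
      return 'Agent is thinking...'
    case 'executing':
      return `Executing: ${activity.tool}`
    case 'executed':
      return `${activity.tool} completed in ${activity.duration}ms`
    case 'retrying':
      return 'Retrying after a failure'
    case 'error':
      return `Error: ${activity.message}`
  }
}
```

If the real `AgentActivity` shape matches, this could be wired in as `onActivity: (activity) => console.log(describeActivity(activity))`.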
--- a/packages/extension/docs/extension_api_zh.md
+++ b/packages/extension/docs/extension_api_zh.md
@@ -1,12 +1,18 @@
 # Page Agent 浏览器插件 API

-本文档介绍如何在网页应用中接入 Page Agent 浏览器插件。
+在你的网页应用中接入 Page Agent 插件，并通过页面 JavaScript 发起多页面浏览器任务。

 ## 安装

 ### 1. 安装浏览器插件

-从 Chrome 应用商店安装 Page Agent 插件。
+首选渠道：
+
+- Chrome 应用商店：https://chromewebstore.google.com/detail/page-agent-ext/akldabonmimlicnjlflnapfeklbfemhj
+
+通常更快提供最新构建的渠道：
+
+- GitHub Releases：https://github.com/alibaba/page-agent/releases

 ### 2. 安装类型定义（推荐）

@@ -14,11 +20,19 @@
 npm install @page-agent/core --save-dev
 ```

-### 3. 配置认证
+### 3. 授权（Token）

-插件在页面加载后检测 `localStorage` 中的 token，匹配时才会注入 API。
+token 用于让页面 JS 调用扩展 API（`window.PAGE_AGENT_EXT`）并执行多页面任务。

-1. 打开插件的侧边栏面板，获取授权 token
+为什么必须使用 token：
+
+- 插件具备较广的浏览器权限（页面访问、导航、多标签控制）。
+- 若被滥用，可能危害用户隐私与安全。
+- 用户必须主动将 token 提供给其信任的应用。
+
+配置步骤：
+
+1. 在扩展侧边栏中复制 auth token。
 2. 在页面中设置 token：

 ```typescript
@@ -60,32 +74,32 @@ if (await waitForExtension()) {

 ## 全局 API

-插件在 `window` 对象上注入以下 API：
+token 匹配后，插件会在 `window` 上注入 API。

 ### `window.PAGE_AGENT_EXT_VERSION`

-插件版本号字符串（例如 `"1.0.0"`）。单独暴露版本号，方便在访问主 API 对象前进行版本检查。
+插件版本号字符串，可用于在访问主 API 前做能力检查。

 ### `window.PAGE_AGENT_EXT`

-主 API 命名空间对象，包含：
+主命名空间对象。

 #### `PAGE_AGENT_EXT.execute(task, config)`

 执行 Agent 任务。

-**参数：**
+参数：

 | 名称 | 类型 | 必填 | 说明 |
-|------|------|------|------|
+| ---- | ---- | ---- | ---- |
 | `task` | `string` | 是 | 任务描述 |
-| `config` | `ExecuteConfig` | 是 | 执行配置（LLM 设置、选项和事件回调） |
+| `config` | `ExecuteConfig` | 是 | LLM 设置、执行选项和回调 |

-**返回：** `Promise<ExecutionResult>`
+返回：`Promise<ExecutionResult>`

 #### `PAGE_AGENT_EXT.dispose()`

-停止并销毁当前运行的 Agent。
+停止当前任务。

 ## 类型定义

@@ -104,10 +118,7 @@ export interface ExecuteConfig {
  apiKey: string
  model: string

-  /**
-   * 是否将初始标签页（运行此脚本的页面）包含在任务中。
-   * @default true
-   */
+  // 是否包含启动脚本所在标签页。默认 true。
  includeInitialTab?: boolean

  onStatusChange?: (status: AgentStatus) => void
@@ -119,20 +130,13 @@ export interface ExecuteConfig {
 export type Execute = (task: string, config: ExecuteConfig) => Promise<ExecutionResult>
 ```

-### AgentStatus
+`AgentStatus`

 ```typescript
 type AgentStatus = 'idle' | 'running' | 'completed' | 'error'
 ```

-| 状态 | 说明 |
-|------|------|
-| `idle` | 空闲，准备执行 |
-| `running` | 正在执行任务 |
-| `completed` | 任务成功完成 |
-| `error` | 任务执行失败 |
-
-### AgentActivity
+`AgentActivity`

 ```typescript
 type AgentActivity =
@@ -143,15 +147,7 @@ type AgentActivity =
  | { type: 'error'; message: string }
 ```

-| 类型 | 说明 |
-|------|------|
-| `thinking` | Agent 正在分析页面并规划 |
-| `executing` | 正在执行工具操作 |
-| `executed` | 工具执行完成 |
-| `retrying` | 失败后重试 |
-| `error` | 发生错误 |
-
-### HistoricalEvent
+`HistoricalEvent`

 ```typescript
 type HistoricalEvent =
@@ -162,7 +158,7 @@ type HistoricalEvent =
  | { type: 'error'; message: string; rawResponse?: unknown }
 ```

-### ExecutionResult
+`ExecutionResult`

 ```typescript
 interface ExecutionResult {
@@ -183,81 +179,22 @@ const result = await window.PAGE_AGENT_EXT!.execute(
    baseURL: 'https://api.openai.com/v1',
    apiKey: process.env.OPENAI_API_KEY!,
    model: 'gpt-5.2',
-  }
-)
-
-if (result.success) {
-  console.log('任务完成:', result.data)
-} else {
-  console.error('任务失败')
-}
-```
-
-### 排除初始标签页
-
-默认情况下，Agent 会将初始标签页（运行脚本的页面）包含在任务中。设置 `includeInitialTab: false` 可以排除它：
-
-```typescript
-const result = await window.PAGE_AGENT_EXT!.execute(
-  '打开新标签页并在 GitHub 上搜索 page-agent',
-  {
-    baseURL: 'https://api.openai.com/v1',
-    apiKey: process.env.OPENAI_API_KEY!,
-    model: 'gpt-5.2',
-    includeInitialTab: false,  // Agent 只会打开新标签页
+    includeInitialTab: false, // 可选：排除当前标签页
+    onStatusChange: (status) => console.log(status),
+    onActivity: (activity) => console.log(activity),
  }
 )
 ```

-### 使用事件回调
+### 停止当前任务

 ```typescript
-await window.PAGE_AGENT_EXT!.execute('导航到设置页面', {
-  baseURL: 'https://api.openai.com/v1',
-  apiKey: process.env.OPENAI_API_KEY!,
-  model: 'gpt-5.2',
-  onStatusChange: (status) => {
-    updateUI({ agentStatus: status })
-  },
-  onActivity: (activity) => {
-    switch (activity.type) {
-      case 'thinking':
-        showSpinner('Agent 正在思考...')
-        break
-      case 'executing':
-        showSpinner(`正在执行: ${activity.tool}`)
-        break
-      case 'executed':
-        log(`${activity.tool} 完成，耗时 ${activity.duration}ms`)
-        break
-      case 'error':
-        showError(activity.message)
-        break
-    }
-  },
-  onHistoryUpdate: (history) => {
-    renderHistory(history)
-  },
-})
-```
-
-### 停止执行
-
-```typescript
-// 启动任务
-window.PAGE_AGENT_EXT!.execute('滚动浏览所有页面', {
-  baseURL: 'https://api.openai.com/v1',
-  apiKey: process.env.OPENAI_API_KEY!,
-  model: 'gpt-5.2',
-})
-
-// 稍后停止
 window.PAGE_AGENT_EXT!.dispose()
 ```

 ## Window 类型声明

-如果不使用 `@page-agent/core`，可以添加以下声明：
+如果你不直接引入 `@page-agent/core`，可添加以下声明：

 ```typescript
 import type {
@@ -283,7 +220,7 @@ declare global {
    PAGE_AGENT_EXT_VERSION?: string
    PAGE_AGENT_EXT?: {
      version: string
-      execute: (task: string, config: ExecuteConfig) => Promise<ExecutionResult>
+      execute: Execute
      dispose: () => void
    }
  }
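
Reviewer note: both API docs now describe `window.PAGE_AGENT_EXT_VERSION` as a string for capability checks, but neither shows a check. Here is a minimal sketch, assuming the version is a dotted numeric string such as `"1.0.0"` (the example given in the removed text). `isVersionAtLeast` is a hypothetical helper, not part of the extension API.

```typescript
// Hypothetical helper: compares dotted numeric versions segment by segment.
// Assumes versions look like "1.0.0"; pre-release suffixes are not handled.
function isVersionAtLeast(current: string | undefined, minimum: string): boolean {
  // An undefined version means the extension has not injected its globals.
  if (!current) return false
  const a = current.split('.').map((part) => parseInt(part, 10) || 0)
  const b = minimum.split('.').map((part) => parseInt(part, 10) || 0)
  const length = Math.max(a.length, b.length)
  for (let i = 0; i < length; i++) {
    const left = a[i] ?? 0
    const right = b[i] ?? 0
    // The first differing segment decides the comparison.
    if (left !== right) return left > right
  }
  return true
}
```

A page could guard its calls with `isVersionAtLeast(window.PAGE_AGENT_EXT_VERSION, '1.0.0')` before touching `window.PAGE_AGENT_EXT`. The numeric comparison avoids the string-ordering pitfall where `"1.10.0"` sorts before `"1.2.0"`.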
--- a/packages/website/src/pages/docs/features/chrome-extension/page.tsx
+++ b/packages/website/src/pages/docs/features/chrome-extension/page.tsx
@@ -1,11 +1,13 @@
-import { siGithub } from 'simple-icons'
+import { siChromewebstore, siGithub } from 'simple-icons'

-import BetaNotice from '@/components/BetaNotice'
 import CodeEditor from '@/components/CodeEditor'
 import { useLanguage } from '@/i18n/context'

 export default function ChromeExtension() {
 	const { isZh } = useLanguage()
+	const chromeWebStoreUrl =
+		'https://chromewebstore.google.com/detail/page-agent-ext/akldabonmimlicnjlflnapfeklbfemhj'
+	const githubReleasesUrl = 'https://github.com/alibaba/page-agent/releases'

 	return (
 		<div>
@@ -13,70 +15,92 @@ export default function ChromeExtension() {

 			<p className="text-xl text-gray-600 dark:text-gray-300 mb-8 leading-relaxed">
 				{isZh
-					? '可选的 Chrome 扩展，解锁多页任务和第三方 API 集成。'
-					: 'Optional Chrome extension that unlocks multi-page tasks and third-party API integration.'}
+					? '可选的 Chrome 扩展。PageAgent.js 继续负责页面内自动化；扩展 API 额外提供多页面任务、浏览器级控制，以及从浏览器外部发起任务的能力。'
+					: 'An optional Chrome extension. PageAgent.js keeps handling in-page automation, while the extension API adds multi-page tasks, browser-level control, and tasks initiated from outside the browser.'}
 			</p>

-			<BetaNotice />
-
 			<div className="space-y-8 mt-8">
-				{/* Hero Section */}
-				<section className="p-6 bg-linear-to-r from-blue-50 to-purple-50 dark:from-blue-900/20 dark:to-purple-900/20 rounded-xl">
-					<div className="flex items-start gap-4">
-						<div>
-							<p className="text-gray-600 dark:text-gray-300">
-								{isZh
-									? '解锁多页任务！借助 Chrome 扩展，Agent 可以跨标签页和页面导航，突破单页限制。'
-									: 'Unlock multi-page tasks! With the Chrome extension, your agent can navigate across tabs and pages, breaking the single-page limitation.'}
-							</p>
-						</div>
-					</div>
-				</section>
-
 				{/* Features */}
 				<section>
 					<h2 className="text-2xl font-bold mb-4">{isZh ? '核心特性' : 'Key Features'}</h2>
-					<div className="grid md:grid-cols-2 gap-4">
+					<div className="grid md:grid-cols-3 gap-4">
 						<div className="p-4 bg-gray-50 dark:bg-gray-800 rounded-lg">
 							<h3 className="font-semibold mb-2">🔓 {isZh ? '多页任务' : 'Multi-Page Tasks'}</h3>
 							<p className="text-gray-600 dark:text-gray-300 text-sm">
 								{isZh
-									? '跨多个页面和标签页执行任务，不再局限于单页操作。'
-									: 'Execute tasks across multiple pages and tabs. No longer limited to single-page operations.'}
+									? '跨多个页面和标签页连续执行任务，不再受限于单页上下文。'
+									: 'Run tasks across multiple pages and tabs without being limited to a single page context.'}
 							</p>
 						</div>
 						<div className="p-4 bg-gray-50 dark:bg-gray-800 rounded-lg">
 							<h3 className="font-semibold mb-2">
-								🔌 {isZh ? '开放第三方接口' : 'Third-Party API'}
+								🧭 {isZh ? '浏览器级控制' : 'Browser-Level Control'}
 							</h3>
 							<p className="text-gray-600 dark:text-gray-300 text-sm">
 								{isZh
-									? '用户授权后，你的网页、本地 Agent 或云端 Agent 都能通过扩展操作用户浏览器！'
-									: 'After user authorization, your webpage, local agent, or cloud agent can control the browser through the extension.'}
+									? '支持跨标签导航、页面切换和更完整的浏览器自动化能力。'
+									: 'Enable richer browser automation, including cross-tab navigation and page switching.'}
+							</p>
+						</div>
+						<div className="p-4 bg-gray-50 dark:bg-gray-800 rounded-lg">
+							<h3 className="font-semibold mb-2">
+								🔌 {isZh ? '开放集成接口' : 'Open Integration API'}
+							</h3>
+							<p className="text-gray-600 dark:text-gray-300 text-sm">
+								{isZh
+									? '用户主动授权后，页面 JS、本地 Agent 或云端 Agent 可通过扩展发起多页面任务。'
+									: 'With explicit user authorization, page JS, local agents, or cloud agents can trigger multi-page tasks through the extension.'}
 							</p>
 						</div>
 					</div>
 				</section>

-				{/* Download */}
+				{/* Install */}
 				<section>
-					<h2 className="text-2xl font-bold mb-4">{isZh ? '下载测试版' : 'Download Beta'}</h2>
-					<p className="text-gray-600 dark:text-gray-300 mb-4">
-						{isZh
-							? '扩展目前处于 Beta 阶段，请从 GitHub Releases 下载最新版本。'
-							: 'The extension is currently in beta. Download the latest version from GitHub Releases.'}
-					</p>
+					<h2 className="text-2xl font-bold mb-4">{isZh ? '获取扩展' : 'Get the Extension'}</h2>
+					<div className="flex flex-wrap gap-3">
 						<a
-						href="https://github.com/alibaba/page-agent/releases"
+							href={chromeWebStoreUrl}
 							target="_blank"
 							rel="noopener noreferrer"
 							className="inline-flex items-center gap-2 px-6 py-3 bg-blue-600 hover:bg-blue-700 text-white font-medium rounded-lg transition-colors"
+						>
+							<svg className="w-5 h-5" fill="currentColor" viewBox="0 0 24 24">
+								<path d={siChromewebstore.path} />
+							</svg>
+							{isZh ? '从 Chrome 应用商店安装' : 'Install from Chrome Web Store'}
+						</a>
+						<a
+							href={githubReleasesUrl}
+							target="_blank"
+							rel="noopener noreferrer"
+							className="inline-flex items-center gap-2 px-6 py-3 bg-gray-900 hover:bg-gray-800 dark:bg-gray-700 dark:hover:bg-gray-600 text-white font-medium rounded-lg transition-colors"
 						>
 							<svg className="w-5 h-5" fill="currentColor" viewBox="0 0 24 24">
 								<path d={siGithub.path} />
 							</svg>
-						{isZh ? '前往 GitHub Releases 下载' : 'Download from GitHub Releases'}
+							{isZh ? 'GitHub Releases（更新版本）' : 'GitHub Releases (faster updates)'}
 						</a>
+					</div>
+				</section>
+
+				{/* Relationship with PageAgent.js */}
+				<section>
+					<h2 className="text-2xl font-bold mb-4">
+						{isZh ? '与 PageAgent.js 的关系' : 'How It Relates to PageAgent.js'}
+					</h2>
+					<div className="p-5 bg-gray-50 dark:bg-gray-800 rounded-lg space-y-3 text-gray-600 dark:text-gray-300">
+						<p>
+							{isZh
+								? 'PageAgent.js 本身即可在页面内完成自动化。Chrome 扩展是可选的能力扩展。'
+								: 'PageAgent.js already works for in-page automation. The Chrome extension is optional, not a dependency.'}
+						</p>
+						<p>
+							{isZh
+								? '通过扩展，你可以执行多页面任务、控制浏览器，以及从浏览器外部（本地服务或云端服务）发起任务。'
+								: 'With the extension, you can run multi-page tasks, control the browser, and trigger tasks from outside the browser (local or cloud services).'}
+						</p>
+					</div>
 				</section>

 				{/* Third-party Integration */}
@@ -86,32 +110,33 @@ export default function ChromeExtension() {
 					</h2>
 					<p className="text-gray-600 dark:text-gray-300 mb-4">
 						{isZh
-							? '用户授权后，外部应用可以调用扩展 API 来控制浏览器。'
-							: 'After user authorization, external applications can call the extension API to control the browser.'}
+							? '通过页面 JavaScript 调用 `window.PAGE_AGENT_EXT`，你的应用可以发起跨页面任务并控制浏览器行为。'
+							: 'By calling `window.PAGE_AGENT_EXT` from page JavaScript, your app can trigger multi-page tasks and control browser behavior.'}
 					</p>

-					{/* Auth Flow */}
-					<h3 className="text-xl font-semibold mb-3">{isZh ? '授权流程' : 'Authorization Flow'}</h3>
+					<h3 className="text-xl font-semibold mb-3">
+						{isZh ? '授权与安全' : 'Authorization and Security'}
+					</h3>
 					<p className="text-gray-600 dark:text-gray-300 mb-4">
 						{isZh
-							? '扩展使用基于 Token 的授权机制，扩展端和页面端必须持有匹配的 Token。'
-							: 'The extension uses a token-based authorization mechanism. Both extension and page must have matching tokens.'}
+							? '扩展权限范围较广（例如页面访问、导航、多标签控制）。若被滥用，可能危害用户隐私。为此，调用能力由 Token 保护，用户必须主动将 Token 提供给其信任的应用。'
+							: 'The extension has broad permissions (such as page access, navigation, and multi-tab control). If abused, it can harm user privacy. That is why access is protected by a token, and users must actively share the token only with applications they trust.'}
 					</p>

 					<CodeEditor
 						code={
 							isZh
-								? `// 1. 用户安装扩展并在扩展设置中配置 auth token
-// 2. 你的页面读取相同的 token 并存入 localStorage
-// 3. Token 匹配后，扩展会暴露 window.PAGE_AGENT_EXT 对象
+								? `// 1) 用户在扩展侧边栏获取 auth token
+// 2) 仅在可信应用中设置该 token
+// 3) token 匹配后，扩展会暴露 window.PAGE_AGENT_EXT

-// ⚠️ 请在扩展弹窗中查看你的 auth token，然后填入下方
+// ⚠️ 不要把 token 提供给不可信页面或脚本
 localStorage.setItem('PageAgentExtUserAuthToken', '<从扩展中获取的-token>')`
-								: `// 1. User installs extension and sets an auth token in extension settings
-// 2. Your page reads the same token and stores it in localStorage
-// 3. After token match, extension exposes window.PAGE_AGENT_EXT object
+								: `// 1) Get auth token from the extension side panel
+// 2) Set it only in trusted applications
+// 3) After token match, extension exposes window.PAGE_AGENT_EXT

-// ⚠️ Check your extension popup for the auth token
+// ⚠️ Never provide the token to untrusted pages or scripts
 localStorage.setItem('PageAgentExtUserAuthToken', '<your-token-from-extension>')`
 						}
 						language="javascript"
@@ -152,13 +177,87 @@ localStorage.setItem('PageAgentExtUserAuthToken', '<your-token-from-extension>')
 						</div>
 					</section>

-					<h3 className="text-xl font-semibold my-3">PAGE_AGENT_EXT.execute(task, config)</h3>
+					{/* TypeScript Declaration */}
+					<h2 className="text-2xl font-bold mb-4">
+						{isZh ? 'TypeScript 类型声明' : 'TypeScript Declaration'}
+					</h2>
 					<p className="text-gray-600 dark:text-gray-300 mb-4">
 						{isZh
-							? '使用配置执行任务。返回一个 Promise，在任务完成时 resolve。config 参数包含 LLM 设置、选项和事件回调。'
-							: 'Execute a task with configuration. Returns a Promise that resolves when the task completes. Config includes LLM settings, options, and event callbacks.'}
+							? '推荐把 `execute` 的类型声明加入你的项目，获得完整类型提示。'
+							: 'Add this `execute` declaration to your project for full type support.'}
 					</p>

+					<CodeEditor
+						code={
+							isZh
+								? `import type {
+	AgentActivity,
+	AgentStatus,
+	ExecutionResult,
+	HistoricalEvent
+} from '@page-agent/core'
+
+interface ExecuteConfig {
+	baseURL: string   // LLM API 端点
+	apiKey: string    // API 密钥
+	model: string     // 模型名称
+
+	includeInitialTab?: boolean
+	onStatusChange?: (status: AgentStatus) => void
+	onActivity?: (activity: AgentActivity) => void
+	onHistoryUpdate?: (history: HistoricalEvent[]) => void
+	onDispose?: () => void
+}
+
+type Execute = (task: string, config: ExecuteConfig) => Promise<ExecutionResult>
+
+declare global {
+	interface Window {
+		PAGE_AGENT_EXT_VERSION?: string
+		PAGE_AGENT_EXT?: {
+			version: string
+			execute: Execute
+			dispose: () => void
+		}
+	}
+}`
+								: `import type {
+	AgentActivity,
+	AgentStatus,
+	ExecutionResult,
+	HistoricalEvent
+} from '@page-agent/core'
+
+interface ExecuteConfig {
+	baseURL: string   // LLM API endpoint
+	apiKey: string    // API key
+	model: string     // Model name
+
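+	includeInitialTab?: boolean // Whether to include the initial tab in the task (default: true)
+	onStatusChange?: (status: AgentStatus) => void // Called when the agent status changes
+	onActivity?: (activity: AgentActivity) => void // Called when the agent performs an activity (click, input, navigation, etc.)
+	onHistoryUpdate?: (history: HistoricalEvent[]) => void // Called when history is updated (contains the full event history)
+	onDispose?: () => void // Called when the agent is disposed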
+}
+
+type Execute = (task: string, config: ExecuteConfig) => Promise<ExecutionResult>
+
+declare global {
+	interface Window {
+		PAGE_AGENT_EXT_VERSION?: string
+		PAGE_AGENT_EXT?: {
+			version: string
+			execute: Execute
+			dispose: () => void
+		}
+	}
+}`
+						}
+						language="typescript"
+					/>
+
+					<h3 className="text-xl font-semibold mt-6 mb-3">PAGE_AGENT_EXT.execute(task, config)</h3>
+
 					<CodeEditor
 						code={
 							isZh
@@ -168,7 +267,7 @@ const result = await window.PAGE_AGENT_EXT.execute(
 	{
 		baseURL: 'https://api.openai.com/v1',
 		apiKey: 'your-api-key',
-		model: 'gpt-5-2',
+		model: 'gpt-5.2',
 		// includeInitialTab: false, // 设为 false 排除初始标签页
 		onStatusChange: status => console.log('状态变化:', status),
 		onActivity: activity => console.log('活动:', activity),
@@ -184,7 +283,7 @@ const result = await window.PAGE_AGENT_EXT.execute(
 	{
 		baseURL: 'https://api.openai.com/v1',
 		apiKey: 'your-api-key',
-		model: 'gpt-5-2',
+		model: 'gpt-5.2',
 		// includeInitialTab: false, // Set to false to exclude initial tab
 		onStatusChange: status => console.log('Status change:', status),
 		onActivity: activity => console.log('Activity:', activity),
@@ -217,100 +316,17 @@ window.PAGE_AGENT_EXT.dispose()`
 					/>
 				</section>

-				{/* ExecuteConfig */}
-				<section>
-					<h2 className="text-2xl font-bold mb-4">{isZh ? '执行配置' : 'Execute Configuration'}</h2>
-					<p className="text-gray-600 dark:text-gray-300 mb-4">
-						{isZh
-							? 'config 参数包含 LLM 设置、选项和事件回调，用于控制任务执行行为。'
-							: 'The config parameter includes LLM settings, options, and event callbacks to control task execution behavior.'}
-					</p>
-
-					<CodeEditor
-						code={
-							isZh
-								? `interface ExecuteConfig {
-	baseURL: string   // LLM API 端点
-	apiKey: string    // API 密钥
-	model: string     // 模型名称
-
-	// 是否将初始标签页包含在任务中，默认 true
-	includeInitialTab?: boolean
-
-	// Agent 状态变化时调用（idle, running, error, completed 等）
-	onStatusChange?: (status: AgentStatus) => void
-
-	// Agent 执行活动时调用（如点击、输入、导航等操作）
-	onActivity?: (activity: AgentActivity) => void
-
-	// 历史记录更新时调用（包含完整的事件历史）
-	onHistoryUpdate?: (history: HistoricalEvent[]) => void
-
-	// Agent 被停止时调用
-	onDispose?: () => void
-}`
-								: `interface ExecuteConfig {
-	baseURL: string   // LLM API endpoint
-	apiKey: string    // API key
-	model: string     // Model name
-
-	// Whether to include the initial tab in the task, default true
-	includeInitialTab?: boolean
-
-	// Called when agent status changes (idle, running, error, completed, etc.)
-	onStatusChange?: (status: AgentStatus) => void
-
-	// Called when agent performs an activity (click, input, navigation, etc.)
-	onActivity?: (activity: AgentActivity) => void
-
-	// Called when history is updated (contains full event history)
-	onHistoryUpdate?: (history: HistoricalEvent[]) => void
-
-	// Called when agent is disposed
-	onDispose?: () => void
-}`
-						}
-						language="typescript"
-					/>
-				</section>
-
-				{/* Security Notice */}
-				<section className="p-4 bg-yellow-50 dark:bg-yellow-900/20 rounded-lg">
-					<h3 className="text-lg font-semibold text-yellow-900 dark:text-yellow-300 mb-2">
-						⚠️ {isZh ? '安全须知' : 'Security Notes'}
-					</h3>
-					<ul className="text-gray-600 dark:text-gray-300 space-y-1 text-sm">
-						<li>
-							•{' '}
-							{isZh
-								? '用户必须在扩展设置中显式授权每个域名'
-								: 'Users must explicitly authorize each domain in extension settings'}
-						</li>
-						<li>
-							•{' '}
-							{isZh
-								? '生产环境建议使用后端代理 LLM API Key'
-								: 'Consider using backend proxy for LLM API keys in production'}
-						</li>
-					</ul>
-				</section>
-
 				{/* Integration Guide */}
 				<section>
 					<h2 className="text-2xl font-bold mb-4">
 						{isZh
-							? '将 MultiPageAgent 融入你自己的插件'
+							? '将 MultiPageAgent 集成到你自己的插件中'
 							: 'Integrate MultiPageAgent into Your Extension'}
 					</h2>
 					<p className="text-gray-600 dark:text-gray-300 mb-4">
 						{isZh
-							? '你可以将 MultiPageAgent 集成到自己的浏览器扩展中，实现跨页面的 AI 自动化能力。'
-							: 'You can integrate MultiPageAgent into your own browser extension for cross-page AI automation capabilities.'}
-					</p>
-					<p className="text-gray-600 dark:text-gray-300 mb-4">TODO</p>
-					<p className="text-gray-600 dark:text-gray-300 mb-4">
-						{isZh ? '参考源码实现：' : 'Reference implementation:'}
-					</p>
+							? '建议先阅读扩展 API 文档，再参考 background 入口的实现。'
+							: 'Start with the extension API docs, then use the background entry implementation as a reference.'}
 						<a
 							href="https://github.com/alibaba/page-agent/blob/main/packages/extension/src/entrypoints/background.ts"
 							target="_blank"
@@ -322,6 +338,7 @@ window.PAGE_AGENT_EXT.dispose()`
 							</svg>
 							packages/extension/src/entrypoints/background.ts
 						</a>
+					</p>
 				</section>
 			</div>
 		</div>
--- a/tsconfig.base.json
+++ b/tsconfig.base.json
@@ -28,20 +28,5 @@
        "erasableSyntaxOnly": true,
        "noFallthroughCasesInSwitch": true,
        "noUncheckedSideEffectImports": true
-
-        // "paths": {
-        // 	// Simplified monorepo solution (raw npm workspace with hoisting)
-        // 	"@page-agent/page-controller": ["./packages/page-controller/src/PageController.ts"],
-        // 	"page-agent": ["./packages/page-agent/src/PageAgent.ts"]
-        // }
    }
-    // "references": [
-    // 	{ "path": "./packages/page-controller" },
-    // 	{ "path": "./packages/page-agent" },
-    // 	{ "path": "./packages/website" }
-    // ],
-    // "include": ["packages/*/src/**/*.ts", "packages/*/src/**/*.tsx"],
-    // "exclude": ["node_modules", "dist", "packages/*/dist"]
-    // "files": ["env.d.ts"]
-    // "files": []
 }
--- a/tsconfig.json
+++ b/tsconfig.json
@@ -1,3 +1,5 @@
+// This file exists only so that the IDE's TypeScript language server works.
+// Do not use it for building or linting.
 {
    "extends": "./tsconfig.base.json",
    "references": [