refactor(PageController): implement PageController

Simon
2025-12-05 16:18:01 +08:00
parent ad19a26a57
commit 683602bb6b
33 changed files with 823 additions and 363 deletions

136
AGENTS.md

@@ -7,6 +7,10 @@ This is a **monorepo** with npm workspaces containing **two main packages**:
1. **Core Library** (`packages/page-agent/`) - Pure JavaScript/TypeScript AI agent library for browser DOM automation, published as `page-agent` on npm
2. **Website** (`packages/website/`) - React documentation and landing page. Also serves as the demo and test page for the core lib. Private package `@page-agent/website`
Other internal packages include:
- **Page Controller** (`packages/page-controller/`) - DOM operations and element interactions module. Independent of the LLM, so it can be covered by unit tests.
## Development Commands
### Core Commands
@@ -35,30 +39,66 @@ npm run build --workspace=@page-agent/website
### Monorepo Structure
```
We adopt a very simple monorepo solution: ts reference + vite alias.
We use the same vite config for dev and bundling. Local packages (even when they are published to npm) will be bundled into artifacts instead of installed from npm.
That is why we put local packages in devDependencies (with version "*") rather than dependencies.
You must update the relevant tsconfig and vite config if you add/remove/rename a package.
```bash
packages/
├── page-agent/ # npm: "page-agent"
│   ├── src/ # Core library source
├── page-agent/ # npm: "page-agent" ⭐ MAIN
│   ├── src/ # AI agent source
│   │   ├── PageAgent.ts # Main AI agent class
│   │   ├── tools/ # LLM tool definitions
│   │   ├── llms/ # LLM integration
│   │   └── ui/ # UI components
│   ├── vite.config.js # Library build (ES + UMD)
│   └── package.json
└── website/ # npm: "@page-agent/website" (private)
    ├── src/ # Website source (formerly pages/)
    ├── index.html
    ├── vite.config.js # Website build
    └── package.json
├── website/ # npm: "@page-agent/website" (private) ⭐ MAIN
│   ├── src/ # Website source
│   └── index.html # Entry of vite webpage
# ...internal packages below...
└── page-controller/ # npm: "@page-agent/page-controller"
    └── src/ # DOM operations source
        ├── PageController.ts # Main controller class
        ├── actions.ts # Element interaction actions
        └── dom/ # DOM tree extraction
```
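The "ts reference + vite alias" approach described above could be sketched roughly as follows. This is an illustration only: the helper name `localPackageAliases`, the `/repo` root, and the assumption that each local package exposes its API from `src/index.ts` are hypothetical and not taken from the repository's actual config.

```typescript
// Hypothetical helper: maps local package names to their source entry files
// so the bundler resolves them from source instead of from node_modules.
type AliasMap = Record<string, string>

function localPackageAliases(root: string, packages: Record<string, string>): AliasMap {
  const aliases: AliasMap = {}
  for (const [packageName, dir] of Object.entries(packages)) {
    // Assumption: every local package exports its API from src/index.ts
    aliases[packageName] = `${root}/packages/${dir}/src/index.ts`
  }
  return aliases
}

// Example with one internal package and an assumed repository root.
const aliases = localPackageAliases('/repo', {
  '@page-agent/page-controller': 'page-controller',
})
```

A map like this would typically be passed as `resolve.alias` in the shared vite config, with a matching project reference in the consuming package's tsconfig.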
### Module Boundaries (Critical)
- **Core library** (`packages/page-agent/`): NEVER import from website - must remain pure JavaScript
- **Website** (`packages/website/`): CAN import from `page-agent` for demos. Alias `@/` maps to `website/src/`
- **Page Agent** (`packages/page-agent/`): The core lib. Imports from all internal packages. Never import from website.
- **Page Controller** (`packages/page-controller/`): Internal lib. Pure DOM operations, NO LLM dependency. Never import from page-agent.
### PageController ↔ PageAgent Communication
All communication between PageAgent and PageController is async and isolated:
```typescript
// PageAgent delegates DOM operations to PageController
await this.pageController.updateTree() // Refresh DOM state
await this.pageController.clickElement(index) // Click by index
await this.pageController.inputText(index, text)
await this.pageController.scroll({ down: true, numPages: 1 })
// PageController exposes state via async methods
const simplifiedHTML = await this.pageController.getSimplifiedHTML()
const pageInfo = await this.pageController.getPageInfo()
```
DOM element references and internal state (selectorMap, elementTextMap) are encapsulated in PageController.
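Because the boundary is a set of async methods, agent-side code can be exercised against a stand-in controller without a browser. The sketch below assumes a subset of the methods named above; the return types, the `StubPageController` class, and the `[0]<button>` text format are illustrative assumptions, not the real implementation.

```typescript
// Assumed shape: method names come from the snippet above, return types are guesses.
interface PageControllerLike {
  updateTree(): Promise<void>
  clickElement(index: number): Promise<void>
  inputText(index: number, text: string): Promise<void>
  getSimplifiedHTML(): Promise<string>
}

// Hypothetical in-memory stub that records calls instead of touching the DOM.
class StubPageController implements PageControllerLike {
  readonly calls: string[] = []
  private html = ''

  async updateTree(): Promise<void> {
    this.calls.push('updateTree')
    this.html = '[0]<button>Submit</button>' // assumed simplified format
  }

  async clickElement(index: number): Promise<void> {
    this.calls.push(`clickElement:${index}`)
  }

  async inputText(index: number, text: string): Promise<void> {
    this.calls.push(`inputText:${index}:${text}`)
  }

  async getSimplifiedHTML(): Promise<string> {
    return this.html
  }
}

// Agent-side code depends only on the interface, never on DOM references.
async function demo(controller: PageControllerLike): Promise<string> {
  await controller.updateTree()
  await controller.clickElement(0)
  return controller.getSimplifiedHTML()
}

const stub = new StubPageController()
const demoRun = demo(stub)
```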
### DOM Pipeline
1. **DOM Extraction**: Convert live DOM to `FlatDomTree` via `src/dom/dom_tree/`
1. **DOM Extraction**: Convert live DOM to `FlatDomTree` via `page-controller/src/dom/dom_tree/`
2. **Dehydration**: DOM tree → simplified text for LLM processing
3. **LLM Processing**: AI model returns action plans
4. **Indexed Operations**: Map LLM responses back to specific DOM elements
3. **LLM Processing**: AI model returns action plans (in page-agent)
4. **Indexed Operations**: PageAgent calls PageController methods by element index
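Steps 2 and 4 of the pipeline can be illustrated with a small sketch. The `FlatNode` shape, the `dehydrate` function, and the `[index]<tag>` text format are simplified assumptions for illustration; the real `FlatDomTree` and dehydration logic are likely richer.

```typescript
// Hypothetical, simplified node shape.
interface FlatNode {
  tag: string
  text: string
  interactive: boolean
}

interface Dehydrated {
  text: string // simplified text handed to the LLM
  selectorMap: Map<number, string> // index -> node id, kept inside the controller
}

// Assign an index to each interactive node and emit one text line per node.
function dehydrate(nodes: Record<string, FlatNode>): Dehydrated {
  const lines: string[] = []
  const selectorMap = new Map<number, string>()
  let nextIndex = 0
  for (const [id, node] of Object.entries(nodes)) {
    if (node.interactive) {
      selectorMap.set(nextIndex, id)
      lines.push(`[${nextIndex}]<${node.tag}>${node.text}</${node.tag}>`)
      nextIndex++
    } else if (node.text) {
      lines.push(node.text)
    }
  }
  return { text: lines.join('\n'), selectorMap }
}

const dehydrated = dehydrate({
  n1: { tag: 'h1', text: 'Login', interactive: false },
  n2: { tag: 'input', text: '', interactive: true },
  n3: { tag: 'button', text: 'Submit', interactive: true },
})
```

When the LLM answers with an index, the controller resolves it through `selectorMap` (here `dehydrated.selectorMap.get(1)` yields `'n3'`), so element references never leave the controller.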
### Event Bus Communication
@@ -91,27 +131,41 @@ Library auto-initializes when injected via script tag:
Query params configure `PageAgentConfig` automatically in `src/entry.ts`.
## File Organization
## Key Files Reference
### Core Library (`packages/page-agent/src/`)
### Page Agent (`packages/page-agent/`)
- `entry.ts` - CDN/UMD entry point with auto-initialization
- `PageAgent.ts` - **Main AI agent class** orchestrating DOM operations
- `tools/` - Agent tool implementations for web actions
- `ui/` - UI components (Panel, SimulatorMask) with CSS modules
- `utils/bus.ts` - **Type-safe event bus** for decoupled communication
- `patches/` - Framework-specific optimizations (React, Antd compatibility)
- `llms/` - LLM integration and communication layer
- `dom/` - HTML serialization and page analysis utilities
- `config/` - Configuration constants and settings
| File | Description |
|------|-------------|
| `src/PageAgent.ts` | ⭐ Main AI agent class orchestrating tools and LLM |
| `src/entry.ts` | CDN/UMD entry point with auto-initialization |
| `src/tools/` | Tool definitions that call PageController methods |
| `src/utils/bus.ts` | Type-safe event bus for decoupled communication |
| `src/ui/` | UI components (Panel, SimulatorMask) with CSS modules |
| `src/llms/` | LLM integration and communication layer |
| `src/patches/` | Framework-specific optimizations (React, Antd) |
| `vite.config.js` | Library build configuration (ES + UMD) |
### Website (`packages/website/src/`)
### Page Controller (`packages/page-controller/`)
- `main.tsx` - Site entry with hash routing setup
- `router.tsx` - **Manual route definitions** (requires explicit registration)
- `components/DocsLayout.tsx` - Navigation structure (hardcoded nav items)
- `docs/[section]/[topic]/page.tsx` - Documentation pages
- `test-pages/` - Library integration test pages
| File | Description |
|------|-------------|
| `src/PageController.ts` | ⭐ Main controller class managing DOM state and actions |
| `src/actions.ts` | Element interaction implementations (click, input, scroll) |
| `src/dom/dom_tree/index.js` | Core DOM extraction engine (ported from browser-use) |
| `src/dom/getPageInfo.ts` | Page scroll/size information |
| `src/types.ts` | TypeScript interfaces for controller |
### Website (`packages/website/`)
| File | Description |
|------|-------------|
| `src/router.tsx` | ⭐ Central routing (manual registration required) |
| `src/components/DocsLayout.tsx` | Navigation structure (hardcoded nav items) |
| `src/main.tsx` | Site entry with hash routing setup |
| `src/docs/[section]/[topic]/page.tsx` | Documentation pages |
| `src/test-pages/` | Library integration test pages |
| `vite.config.js` | Website build configuration |
## Adding New Features
@@ -123,9 +177,15 @@ Query params configure `PageAgentConfig` automatically in `src/entry.ts`.
### New Agent Tool
1. Implement under `packages/page-agent/src/tools/`
2. Export via `packages/page-agent/src/tools/index.ts`
3. Wire into `PageAgent.ts` if needed
1. Implement tool in `packages/page-agent/src/tools/index.ts`
2. If tool needs DOM operations, add method to PageController first
3. Tool calls `this.pageController.methodName()` for DOM interactions
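A tool following these steps might look like the sketch below. The tool object shape, the `ToolContext` type, and the `hoverElement` controller method are hypothetical; the actual definitions in `tools/index.ts` may differ.

```typescript
// Only the controller surface this tool needs; `hoverElement` is a made-up method.
interface ToolContext {
  pageController: {
    hoverElement(index: number): Promise<void>
  }
}

// Hypothetical tool: it holds no DOM references and delegates to the controller.
const hoverTool = {
  description: 'Hover over the element with the given index',
  async execute(this: ToolContext, input: { index: number }): Promise<string> {
    await this.pageController.hoverElement(input.index)
    return `Hovered element ${input.index}`
  },
}

// Usage with a fake agent context that records the indexes it receives.
const hoveredIndexes: number[] = []
const fakeAgent: ToolContext = {
  pageController: {
    async hoverElement(index: number): Promise<void> {
      hoveredIndexes.push(index)
    },
  },
}
const toolRun = hoverTool.execute.call(fakeAgent, { index: 2 })
```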
### New PageController Action
1. Add action implementation in `packages/page-controller/src/actions.ts`
2. Expose via async method in `PageController.ts`
3. Export from `packages/page-controller/src/index.ts`
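The first two steps could be sketched as below. The names `focusElementAction`, `PageControllerSketch`, and `register` are hypothetical, and a structural `Focusable` type stands in for a real DOM element so the sketch runs without a browser.

```typescript
// --- actions.ts (sketch) ---
// Stand-in for HTMLElement; real code would operate on DOM elements.
interface Focusable {
  focus(): void
}

// Hypothetical action implementation.
function focusElementAction(element: Focusable): void {
  element.focus()
}

// --- PageController.ts (sketch) ---
class PageControllerSketch {
  // Assumed: the index -> element lookup stays private to the controller.
  private selectorMap = new Map<number, Focusable>()

  // Stand-in for the tree update that would normally populate the map.
  register(index: number, element: Focusable): void {
    this.selectorMap.set(index, element)
  }

  // Async wrapper that exposes the action by element index.
  async focusElement(index: number): Promise<void> {
    const element = this.selectorMap.get(index)
    if (!element) throw new Error(`No element at index ${index}`)
    focusElementAction(element)
  }
}

// Usage: a fake element that logs when it receives focus.
const focusLog: number[] = []
const controller = new PageControllerSketch()
controller.register(0, { focus: () => focusLog.push(0) })
const focusRun = controller.focusElement(0)
```

Step 3 would then re-export the controller (and any new types) from the package's `src/index.ts`.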
### New UI Component
@@ -153,18 +213,6 @@ Query params configure `PageAgentConfig` automatically in `src/entry.ts`.
- Relative imports last
- Blank lines between groups
## Critical Files to Understand
- `packages/page-agent/src/PageAgent.ts` - Core AI agent class with DOM manipulation
- `packages/page-agent/src/dom/dom_tree/index.js` - DOM extraction engine
- `packages/page-agent/src/utils/bus.ts` - Type-safe event bus system
- `packages/page-agent/src/entry.ts` - Library entry point for CDN usage
- `packages/page-agent/vite.config.js` - Library build configuration
- `packages/website/src/router.tsx` - Central routing definition (manual registration required)
- `packages/website/src/components/DocsLayout.tsx` - Navigation structure
- `packages/website/vite.config.js` - Website build configuration
## Debugging Common Issues
### Blank Documentation Pages