refactor(PageController): implement PageController
AGENTS.md
@@ -7,6 +7,10 @@ This is a **monorepo** with npm workspaces containing **two main packages**:
1. **Core Library** (`packages/page-agent/`) - Pure JavaScript/TypeScript AI agent library for browser DOM automation, published as `page-agent` on npm
2. **Website** (`packages/website/`) - React documentation and landing page. Also serves as the demo and test page for the core lib. Private package `@page-agent/website`

Other internal packages include:

- **Page Controller** (`packages/page-controller/`) - DOM operations and element interactions module. Independent of the LLM, so it can be covered by unit tests.

## Development Commands

### Core Commands

@@ -35,30 +39,66 @@ npm run build --workspace=@page-agent/website

### Monorepo Structure

```
We adopt a very simple monorepo solution: TypeScript project references + Vite alias.

We use the same Vite config for dev and bundling. Local packages (even those published to npm) are bundled into the build artifacts instead of being installed from npm.
That is why we list local packages in devDependencies (with version "*") rather than dependencies.

You must update the relevant tsconfig and Vite config whenever you add, remove, or rename a package.

```bash
packages/
├── page-agent/               # npm: "page-agent"
│   ├── src/                  # Core library source
├── page-agent/               # npm: "page-agent" ⭐ MAIN
│   ├── src/                  # AI agent source
│   │   ├── PageAgent.ts      # Main AI agent class
│   │   ├── tools/            # LLM tool definitions
│   │   ├── llms/             # LLM integration
│   │   └── ui/               # UI components
│   ├── vite.config.js        # Library build (ES + UMD)
│   └── package.json
└── website/                  # npm: "@page-agent/website" (private)
    ├── src/                  # Website source (formerly pages/)
    ├── index.html
    ├── vite.config.js        # Website build
    └── package.json
├── website/                  # npm: "@page-agent/website" (private) ⭐ MAIN
│   ├── src/                  # Website source
│   └── index.html            # Entry of vite webpage
│
│   # ...internal packages below...
│
└── page-controller/          # npm: "@page-agent/page-controller"
    └── src/                  # DOM operations source
        ├── PageController.ts # Main controller class
        ├── actions.ts        # Element interaction actions
        └── dom/              # DOM tree extraction
```
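
The "Vite alias" half of the setup described above could look roughly like the sketch below. The helper name `localPackageAliases`, the `/repo/packages` root, and the choice of `src/index.ts` as the alias target are assumptions made for illustration; the repository's actual Vite config may be organized differently.

```typescript
// Hypothetical helper: maps each local npm package name to its source entry,
// so the bundler resolves workspace source instead of a copy installed from npm.
type AliasMap = Record<string, string>

function localPackageAliases(packagesDir: string, packages: Record<string, string>): AliasMap {
  const aliases: AliasMap = {}
  for (const npmName of Object.keys(packages)) {
    // Assumption: every local package exposes its API from src/index.ts.
    aliases[npmName] = `${packagesDir}/${packages[npmName]}/src/index.ts`
  }
  return aliases
}

const aliases = localPackageAliases('/repo/packages', {
  '@page-agent/page-controller': 'page-controller',
})

// In a Vite config, a map like this would be passed as `resolve.alias`, e.g.
//   export default defineConfig({ resolve: { alias: aliases } })
```

Adding, removing, or renaming a package would then mean touching this map and the matching `references` entry in the tsconfig, which is the update step the text warns about.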

### Module Boundaries (Critical)

- **Core library** (`packages/page-agent/`): NEVER import from website - must remain pure JavaScript
- **Website** (`packages/website/`): CAN import from `page-agent` for demos. Alias `@/` → `website/src/`
- **Page Agent** (`packages/page-agent/`): The core lib. Imports from all internal packages. Never import from website.
- **Page Controller** (`packages/page-controller/`): Internal lib. Pure DOM operations, NO LLM dependency. Never import from page-agent.

### PageController ↔ PageAgent Communication

All communication between PageAgent and PageController is async and isolated:

```typescript
// PageAgent delegates DOM operations to PageController
await this.pageController.updateTree() // Refresh DOM state
await this.pageController.clickElement(index) // Click by index
await this.pageController.inputText(index, text)
await this.pageController.scroll({ down: true, numPages: 1 })

// PageController exposes state via async methods
const simplifiedHTML = await this.pageController.getSimplifiedHTML()
const pageInfo = await this.pageController.getPageInfo()
```

DOM element references and internal state (selectorMap, elementTextMap) are encapsulated in PageController.
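
One consequence of this boundary is that agent-side code can be exercised against a stub with no browser involved. The sketch below illustrates that idea only: the method names come from the snippet above, while the parameter and return types, the `fillAndSubmit` helper, and `createStub` are assumptions.

```typescript
// Assumed signatures: method names are taken from the documented calls,
// the parameter and return types are guesses for illustration.
interface PageControllerContract {
  updateTree(): Promise<void>
  clickElement(index: number): Promise<void>
  inputText(index: number, text: string): Promise<void>
  getSimplifiedHTML(): Promise<string>
}

// Hypothetical agent-side helper: it handles indexes and strings, never DOM nodes.
async function fillAndSubmit(
  controller: PageControllerContract,
  inputIndex: number,
  buttonIndex: number,
  text: string
): Promise<string> {
  await controller.updateTree()
  await controller.inputText(inputIndex, text)
  await controller.clickElement(buttonIndex)
  return controller.getSimplifiedHTML()
}

// Recording stub that satisfies the same contract without touching a real page.
function createStub(calls: string[]): PageControllerContract {
  return {
    async updateTree() {
      calls.push('updateTree')
    },
    async clickElement(index) {
      calls.push(`clickElement(${index})`)
    },
    async inputText(index, text) {
      calls.push(`inputText(${index}, ${text})`)
    },
    async getSimplifiedHTML() {
      return '[0]<input></input>'
    },
  }
}
```

Because only plain values cross the boundary, swapping the real controller for a stub requires no changes on the agent side.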

### DOM Pipeline

1. **DOM Extraction**: Convert live DOM to `FlatDomTree` via `src/dom/dom_tree/`
1. **DOM Extraction**: Convert live DOM to `FlatDomTree` via `page-controller/src/dom/dom_tree/`
2. **Dehydration**: DOM tree → simplified text for LLM processing
3. **LLM Processing**: AI model returns action plans
4. **Indexed Operations**: Map LLM responses back to specific DOM elements
3. **LLM Processing**: AI model returns action plans (in page-agent)
4. **Indexed Operations**: PageAgent calls PageController methods by element index
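
The dehydration and indexed-operation steps can be pictured with a small, self-contained sketch. Everything here is a simplification under stated assumptions: `FlatNode` is a hypothetical stand-in for the much richer `FlatDomTree`, and the `[index]<tag>text</tag>` output format is an illustrative guess, not the library's confirmed format.

```typescript
// Hypothetical, heavily simplified node shape; the real `FlatDomTree` is richer.
interface FlatNode {
  tag: string
  text: string
  interactive: boolean
}

// Dehydration: keep only interactive nodes and give each an index the LLM can cite.
function dehydrate(nodes: FlatNode[]): { simplified: string; selectorMap: FlatNode[] } {
  const selectorMap = nodes.filter((node) => node.interactive)
  const simplified = selectorMap
    .map((node, index) => `[${index}]<${node.tag}>${node.text}</${node.tag}>`)
    .join('\n')
  return { simplified, selectorMap }
}

// Indexed operation: resolve the index from an LLM action back to a node.
function resolveIndex(selectorMap: FlatNode[], index: number): FlatNode {
  const node = selectorMap[index]
  if (!node) throw new Error(`Unknown element index: ${index}`)
  return node
}

const dehydrated = dehydrate([
  { tag: 'h1', text: 'Sign in', interactive: false },
  { tag: 'input', text: '', interactive: true },
  { tag: 'button', text: 'Submit', interactive: true },
])
const clickTarget = resolveIndex(dehydrated.selectorMap, 1)
```

In this sketch the heading is dropped from the LLM-facing text, and index `1` resolves to the button, which is the element an action such as `clickElement(1)` would act on.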

### Event Bus Communication

@@ -91,27 +131,41 @@ Library auto-initializes when injected via script tag:

Query params configure `PageAgentConfig` automatically in `src/entry.ts`.
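
As a rough illustration of query-param configuration, the sketch below assumes the parameters are read from the script's own URL. The `SketchConfig` fields, the `model` and `lang` parameter names, and the example CDN URL are hypothetical; the real `PageAgentConfig` and the logic in `src/entry.ts` may differ.

```typescript
// Hypothetical config shape; the real `PageAgentConfig` fields may differ.
interface SketchConfig {
  model?: string
  language?: string
}

// Reads known query parameters from a script URL and ignores everything else.
function configFromScriptUrl(scriptSrc: string): SketchConfig {
  const params = new URL(scriptSrc).searchParams
  const config: SketchConfig = {}
  const model = params.get('model')
  if (model !== null) config.model = model
  const language = params.get('lang')
  if (language !== null) config.language = language
  return config
}

const config = configFromScriptUrl(
  'https://cdn.example.com/page-agent.js?model=demo-model&lang=en-US'
)
```

In a browser, an entry file could obtain the URL from `document.currentScript`, which is one plausible way to implement the auto-initialization described here.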

## File Organization
## Key Files Reference

### Core Library (`packages/page-agent/src/`)
### Page Agent (`packages/page-agent/`)

- `entry.ts` - CDN/UMD entry point with auto-initialization
- `PageAgent.ts` - **Main AI agent class** orchestrating DOM operations
- `tools/` - Agent tool implementations for web actions
- `ui/` - UI components (Panel, SimulatorMask) with CSS modules
- `utils/bus.ts` - **Type-safe event bus** for decoupled communication
- `patches/` - Framework-specific optimizations (React, Antd compatibility)
- `llms/` - LLM integration and communication layer
- `dom/` - HTML serialization and page analysis utilities
- `config/` - Configuration constants and settings

| File | Description |
|------|-------------|
| `src/PageAgent.ts` | ⭐ Main AI agent class orchestrating tools and LLM |
| `src/entry.ts` | CDN/UMD entry point with auto-initialization |
| `src/tools/` | Tool definitions that call PageController methods |
| `src/utils/bus.ts` | Type-safe event bus for decoupled communication |
| `src/ui/` | UI components (Panel, SimulatorMask) with CSS modules |
| `src/llms/` | LLM integration and communication layer |
| `src/patches/` | Framework-specific optimizations (React, Antd) |
| `vite.config.js` | Library build configuration (ES + UMD) |

### Website (`packages/website/src/`)
### Page Controller (`packages/page-controller/`)

- `main.tsx` - Site entry with hash routing setup
- `router.tsx` - **Manual route definitions** (requires explicit registration)
- `components/DocsLayout.tsx` - Navigation structure (hardcoded nav items)
- `docs/[section]/[topic]/page.tsx` - Documentation pages
- `test-pages/` - Library integration test pages

| File | Description |
|------|-------------|
| `src/PageController.ts` | ⭐ Main controller class managing DOM state and actions |
| `src/actions.ts` | Element interaction implementations (click, input, scroll) |
| `src/dom/dom_tree/index.js` | Core DOM extraction engine (ported from browser-use) |
| `src/dom/getPageInfo.ts` | Page scroll/size information |
| `src/types.ts` | TypeScript interfaces for controller |

### Website (`packages/website/`)

| File | Description |
|------|-------------|
| `src/router.tsx` | ⭐ Central routing (manual registration required) |
| `src/components/DocsLayout.tsx` | Navigation structure (hardcoded nav items) |
| `src/main.tsx` | Site entry with hash routing setup |
| `src/docs/[section]/[topic]/page.tsx` | Documentation pages |
| `src/test-pages/` | Library integration test pages |
| `vite.config.js` | Website build configuration |

## Adding New Features

@@ -123,9 +177,15 @@ Query params configure `PageAgentConfig` automatically in `src/entry.ts`.

### New Agent Tool

1. Implement under `packages/page-agent/src/tools/`
2. Export via `packages/page-agent/src/tools/index.ts`
3. Wire into `PageAgent.ts` if needed
1. Implement tool in `packages/page-agent/src/tools/index.ts`
2. If tool needs DOM operations, add method to PageController first
3. Tool calls `this.pageController.methodName()` for DOM interactions
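
A tool following these steps might be shaped like the sketch below. Only `pageController` and `clickElement` appear in the source; the `SketchTool` interface, the `ToolHost` type, and the `execute` signature are hypothetical and do not claim to match the library's real tool definition format.

```typescript
// Minimal stand-ins; names other than `pageController` and `clickElement` are hypothetical.
interface ControllerLike {
  clickElement(index: number): Promise<void>
}

interface ToolHost {
  pageController: ControllerLike
}

interface SketchTool<Input> {
  description: string
  // `this` is the agent-like host, so tools reach the controller through it.
  execute(this: ToolHost, input: Input): Promise<string>
}

const clickTool: SketchTool<{ index: number }> = {
  description: 'Click the element with the given index',
  async execute(input) {
    // DOM work is delegated; the tool itself never touches the DOM.
    await this.pageController.clickElement(input.index)
    return `Clicked element ${input.index}`
  },
}
```

Keeping the tool free of direct DOM access is what allows the controller method to be added and unit tested first, as step 2 suggests.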

### New PageController Action

1. Add action implementation in `packages/page-controller/src/actions.ts`
2. Expose via async method in `PageController.ts`
3. Export from `packages/page-controller/src/index.ts`
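
The layering in these three steps can be sketched in a single file. The `clearInput` action, the `InputLike` type, and `SketchController` are hypothetical examples, and the file boundaries are indicated only by comments; the real `actions.ts` and `PageController.ts` may be structured differently.

```typescript
// --- actions.ts (sketch): a plain function that operates on one element ---
// `InputLike` stands in for an HTMLInputElement so the sketch runs outside a browser.
interface InputLike {
  value: string
}

async function clearInput(element: InputLike): Promise<void> {
  element.value = ''
}

// --- PageController.ts (sketch): resolve the index, then delegate to the action ---
class SketchController {
  // Index → element map stays private to the controller.
  private selectorMap: Record<number, InputLike> = {}

  constructor(elements: InputLike[]) {
    elements.forEach((element, index) => {
      this.selectorMap[index] = element
    })
  }

  async clearInput(index: number): Promise<void> {
    const element = this.selectorMap[index]
    if (!element) throw new Error(`No element at index ${index}`)
    await clearInput(element) // calls the module-level action above
  }
}

// --- index.ts (sketch) would then re-export the controller for page-agent to import ---
```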

### New UI Component

@@ -153,18 +213,6 @@ Query params configure `PageAgentConfig` automatically in `src/entry.ts`.
- Relative imports last
- Blank lines between groups

## Critical Files to Understand

- `packages/page-agent/src/PageAgent.ts` - Core AI agent class with DOM manipulation
- `packages/page-agent/src/dom/dom_tree/index.js` - DOM extraction engine
- `packages/page-agent/src/utils/bus.ts` - Type-safe event bus system
- `packages/page-agent/src/entry.ts` - Library entry point for CDN usage
- `packages/page-agent/vite.config.js` - Library build configuration

- `packages/website/src/router.tsx` - Central routing definition (manual registration required)
- `packages/website/src/components/DocsLayout.tsx` - Navigation structure
- `packages/website/vite.config.js` - Website build configuration

## Debugging Common Issues

### Blank Documentation Pages
