# Instructions for coding assistants ## Project Overview This is a **monorepo** with npm workspaces containing **two main packages**: 1. **Core Library** (`packages/page-agent/`) - Pure JavaScript/TypeScript AI agent library for browser DOM automation, published as `page-agent` on npm 2. **Website** (`packages/website/`) - React documentation and landing page. Also as demo and test page for the core lib. private package `@page-agent/website` And other internal packages. Such as: - **Page Controller** (`packages/page-controller/`) - DOM operations and element interactions module. Independent of LLM, can be tested in unit tests. ## Development Commands ### Core Commands ```bash npm start # Start website dev server npm run dev # Same as start npm run build # Build all packages npm run build:lib # Build page-agent library only npm run lint # ESLint with TypeScript strict rules ``` ### Package-specific Commands ```bash # Core library npm run build --workspace=page-agent npm run build:watch --workspace=page-agent # Website npm run dev --workspace=@page-agent/website npm run build --workspace=@page-agent/website ``` ## Architecture & Critical Patterns ### Monorepo Structure We adopt a very simple monorepo solution: ts reference + vite alias. We use the same vite config for dev and bundling. Local packages (even when they are published to npm) will be bundled into artifacts instead of installed from npm. That is why we put local packages in devDependencies (with version "*") rather than dependencies. You must update relative tsconfig and vite config if you add/remove/rename a package. ```bash packages/ ├── page-agent/ # npm: "page-agent" ⭐ MAIN │ ├── src/ # AI agent source │ │ ├── PageAgent.ts # Main AI agent class │ │ ├── tools/ # LLM tool definitions │ │ ├── llms/ # LLM integration │ │ └── ui/ # UI components │ ├── vite.config.js # Library build (ES + UMD) │ └── package.json ├── website/ # npm: "@page-agent/website" (private) ⭐ MAIN │ ├── src/ # Website source │ └── index.html # Entry of vite webpage │ │ # ...internal packages below... │ └── page-controller/ # npm: "@page-agent/page-controller" └── src/ # DOM operations source ├── PageController.ts # Main controller class ├── actions.ts # Element interaction actions └── dom/ # DOM tree extraction ``` ### Module Boundaries (Critical) - **Website** (`packages/website/`): CAN import from `page-agent` for demos. Alias `@/` → `website/src/` - **Page Agent** (`packages/page-agent/`): The core lib. Imports from all internal packages. Never import from website. - **Page Controller** (`packages/page-controller/`): Internal lib. Pure DOM operations, NO LLM dependency. Never import from page-agent. ### PageController ↔ PageAgent Communication All communication between PageAgent and PageController is async and isolated: ```typescript // PageAgent delegates DOM operations to PageController await this.pageController.updateTree() // Refresh DOM state await this.pageController.clickElement(index) // Click by index await this.pageController.inputText(index, text) await this.pageController.scroll({ down: true, numPages: 1 }) // PageController exposes state via async methods const simplifiedHTML = await this.pageController.getSimplifiedHTML() const pageInfo = await this.pageController.getPageInfo() ``` DOM element references and internal state (selectorMap, elementTextMap) are encapsulated in PageController. ### DOM Pipeline 1. **DOM Extraction**: Convert live DOM to `FlatDomTree` via `page-controller/src/dom/dom_tree/` 2. **Dehydration**: DOM tree → simplified text for LLM processing 3. **LLM Processing**: AI model returns action plans (in page-agent) 4. **Indexed Operations**: PageAgent calls PageController methods by element index ### Event Bus Communication Use `src/utils/bus.ts` for decoupled PageAgent ↔ UI communication: ```typescript // Emit from PageAgent getEventBus().emit('panel:show') getEventBus().emit('panel:update', { status: 'thinking' }) // Listen in UI components getEventBus().on('panel:show', () => panel.show()) ``` ### Hash Routing Requirement Uses wouter with `useHashLocation` for static hosting: ```tsx // Always hash-based routes ``` ### CDN Auto-Injection Pattern Library auto-initializes when injected via script tag: ```html ``` Query params configure `PageAgentConfig` automatically in `src/entry.ts`. ## Key Files Reference ### Page Agent (`packages/page-agent/`) | File | Description | |------|-------------| | `src/PageAgent.ts` | ⭐ Main AI agent class orchestrating tools and LLM | | `src/entry.ts` | CDN/UMD entry point with auto-initialization | | `src/tools/` | Tool definitions that call PageController methods | | `src/utils/bus.ts` | Type-safe event bus for decoupled communication | | `src/ui/` | UI components (Panel, SimulatorMask) with CSS modules | | `src/llms/` | LLM integration and communication layer | | `src/patches/` | Framework-specific optimizations (React, Antd) | | `vite.config.js` | Library build configuration (ES + UMD) | ### Page Controller (`packages/page-controller/`) | File | Description | |------|-------------| | `src/PageController.ts` | ⭐ Main controller class managing DOM state and actions | | `src/actions.ts` | Element interaction implementations (click, input, scroll) | | `src/dom/dom_tree/index.js` | Core DOM extraction engine (ported from browser-use) | | `src/dom/getPageInfo.ts` | Page scroll/size information | | `src/types.ts` | TypeScript interfaces for controller | ### Website (`packages/website/`) | File | Description | |------|-------------| | `src/router.tsx` | ⭐ Central routing (manual registration required) | | `src/components/DocsLayout.tsx` | Navigation structure (hardcoded nav items) | | `src/main.tsx` | Site entry with hash routing setup | | `src/docs/[section]/[topic]/page.tsx` | Documentation pages | | `src/test-pages/` | Library integration test pages | | `vite.config.js` | Website build configuration | ## Adding New Features ### New Documentation Page 1. Create `packages/website/src/docs/
//page.tsx` 2. Add route to `packages/website/src/router.tsx` with `
+ ` wrapper 3. Add navigation item to `DocsLayout.tsx` ### New Agent Tool 1. Implement tool in `packages/page-agent/src/tools/index.ts` 2. If tool needs DOM operations, add method to PageController first 3. Tool calls `this.pageController.methodName()` for DOM interactions ### New PageController Action 1. Add action implementation in `packages/page-controller/src/actions.ts` 2. Expose via async method in `PageController.ts` 3. Export from `packages/page-controller/src/index.ts` ### New UI Component 1. Create in `packages/page-agent/src/ui/` with colocated CSS modules 2. Use event bus for PageAgent communication ## Code Standards ### TypeScript - Explicit typing for exported/public APIs - ESLint relaxes some unsafe rules for rapid iteration ### CSS & Styling - **Prefer Tailwind CSS over custom CSS** - Custom CSS variables for theme gradients in `src/index.css` - Dark mode support via `dark:` classes - CSS modules for component-specific styles ### Import Organization - External libraries first - Internal modules (`@/`, `@pages/`) - Relative imports last - Blank lines between groups ## Debugging Common Issues ### Blank Documentation Pages 1. Verify route exists in `packages/website/src/router.tsx` 2. Check component import path 3. Verify CSS isn't hiding content (check dark mode classes) 4. Test with minimal component first ### Library Integration Issues 1. Check `packages/page-agent/dist/lib/page-agent.umd.js` builds correctly 2. Test CDN injection with query params 3. Verify event bus communications are properly typed 4. Use `packages/website/src/test-pages/` for isolated testing