Page Agent
The GUI Agent Living in Your Webpage. Control web interfaces with natural language.
🌐 English | 中文
👉 🚀 Demo | 📖 Documentation | 📢 Join HN Discussion
✨ Features
- 🎯 Easy integration
- No need for browser extension/Python/headless browser.
- Just in-page JavaScript. Everything happens in your web page.
- The best tool for your agent to control web pages.
- 📖 Text-based DOM manipulation
- No screenshots. No OCR or multi-modal LLMs needed.
- No special permissions required.
- 🧠 Bring your own LLMs
- 🎨 Pretty UI with human-in-the-loop
- 🐙 Optional Chrome extension for multi-page tasks.
💡 Use Cases
- SaaS AI Copilot: Ship an AI copilot in your product in a few lines of code. No backend rewrite needed.
- Smart Form Filling: Turn 20-click workflows into one sentence. Perfect for ERP, CRM, and admin systems.
- Accessibility: Make any web app accessible through natural language. Voice commands, screen readers, zero barrier.
- Multi-page Agent: Extend your agent's reach across browser tabs with the optional Chrome extension.
🚀 Quick Start
One-line integration
The fastest way to try PageAgent is with our free demo LLM:
<script src="{URL}" crossorigin="true"></script>
| Mirrors | URL |
|---|---|
| Global | https://cdn.jsdelivr.net/npm/page-agent@1.5.4/dist/iife/page-agent.demo.js |
| China | https://registry.npmmirror.com/page-agent/1.5.4/files/dist/iife/page-agent.demo.js |
⚠️ For technical evaluation only. This demo CDN uses our free testing LLM API. By using it, you agree to its terms.
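If you would rather pick the mirror at runtime than hard-code the tag, the two URLs in the table above differ only in host layout and share the same version segment, so they can be built by a small helper. The sketch below is only an illustration: `demoScriptUrl` and `injectDemoScript` are hypothetical names, not part of PageAgent, and the default version simply mirrors the one in the table.

```javascript
// Hypothetical helpers (not part of PageAgent): build the demo script URL
// for one of the two mirrors listed in the table, then inject it into a page.
const MIRRORS = new Map([
  ['global', (v) => `https://cdn.jsdelivr.net/npm/page-agent@${v}/dist/iife/page-agent.demo.js`],
  ['china', (v) => `https://registry.npmmirror.com/page-agent/${v}/files/dist/iife/page-agent.demo.js`],
])

function demoScriptUrl(mirror = 'global', version = '1.5.4') {
  const build = MIRRORS.get(mirror)
  if (!build) throw new Error(`Unknown mirror: ${mirror}`)
  return build(version)
}

// `doc` is passed in explicitly so the function can be exercised without a browser.
function injectDemoScript(doc, mirror = 'global', version = '1.5.4') {
  const script = doc.createElement('script')
  script.src = demoScriptUrl(mirror, version)
  // Request the script in anonymous CORS mode.
  script.crossOrigin = 'anonymous'
  doc.head.appendChild(script)
  return script
}
```

In a browser this would be called as `injectDemoScript(document, 'china')`. The same evaluation-only caveat applies, since the script loaded is still the demo build.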
NPM Installation
npm install page-agent
import { PageAgent } from 'page-agent'
const agent = new PageAgent({
  model: 'qwen3.5-plus',
  baseURL: 'https://dashscope.aliyuncs.com/compatible-mode/v1',
  apiKey: 'YOUR_API_KEY',
  language: 'en-US',
})
await agent.execute('Click the login button')
For more programmatic usage, see 📖 Documentation.
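When a workflow needs several instructions in a row, one option is to run them sequentially and stop at the first failure. The following is a minimal sketch, not an API the library provides: `runTasks` is a hypothetical helper, and the only thing it assumes about the agent is the `execute(instruction)` method used above. What `execute` resolves to is not specified here, so the value is passed through untouched, and the sketch assumes failures surface as rejected promises.

```javascript
// Hypothetical helper (not part of PageAgent): run natural-language
// instructions one at a time, since each step may depend on the page state
// left behind by the previous one.
async function runTasks(agent, instructions) {
  const results = []
  for (const instruction of instructions) {
    try {
      // The only agent method assumed here is execute(), as used above.
      const value = await agent.execute(instruction)
      results.push({ instruction, ok: true, value })
    } catch (error) {
      results.push({ instruction, ok: false, error })
      break // stop at the first failure; later steps likely depend on it
    }
  }
  return results
}
```

With the `agent` from the snippet above, a call might look like `await runTasks(agent, ['Click the login button', 'Open the settings page'])`; the second instruction is just an example string.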
🤝 Contributing
We welcome contributions from the community! Follow our instructions in CONTRIBUTING.md for environment setup and local development.
Please read the Code of Conduct before contributing.
👏 Acknowledgments
This project builds upon the excellent work of browser-use.
PageAgent is designed for client-side web enhancement, not server-side automation.
DOM processing components and prompt are derived from browser-use:
Browser Use
Copyright (c) 2024 Gregor Zunic
Licensed under the MIT License
Original browser-use project: <https://github.com/browser-use/browser-use>
We gratefully acknowledge the browser-use project and its contributors for their
excellent work on web automation and DOM interaction patterns that helped make
this project possible.
Third-party dependencies and their licenses can be found in the package.json
file and in the node_modules directory after installation.
📄 License
⭐ Star this repo if you find PageAgent helpful!