Files
page-agent/packages/llms/README.md
2025-12-22 16:12:34 +08:00

83 lines
3.1 KiB
Markdown

# @page-agent/llms
LLM client with a **reflection-before-action** mental model for page-agent.
## Why This Package Exists
The LLM module and the agent logic are inherently coupled. This package exists not to decouple them, but to **define the interface contract** between the LLM and the agent.
The core abstraction is the `MacroToolInput` — a structured output format that **forces the model to reflect before acting**.
## The Reflection-Before-Action Model
Every tool call must first output its reasoning state before the actual action:
```typescript
interface MacroToolInput {
// Reflection (mandatory before any action)
evaluation_previous_goal?: string // How well did the previous action work?
memory?: string // Key information to remember
next_goal?: string // What to accomplish next
// Action (the actual operation)
action: Record<string, any>
}
```
This design ensures that:
1. **The model evaluates its previous action** before deciding the next step
2. **Working memory is explicitly maintained** across conversation turns
3. **Goals are clearly stated**, making the agent's reasoning transparent and debuggable
## Architecture
```
┌─────────────────────────────────────────────────────┐
│ PageAgent │
│ - Maintains agent state and history │
│ - Orchestrates tool execution │
│ - Assembles prompts with browser state │
└─────────────────────┬───────────────────────────────┘
│ uses
┌─────────────────────────────────────────────────────┐
│ @page-agent/llms │
│ - Defines MacroToolInput contract │
│ - Handles LLM API calls │
│ - Parses and validates structured output │
│ - Executes tool calls │
└─────────────────────────────────────────────────────┘
```
## Key Components
| Export | Description |
|--------|-------------|
| `LLM` | Main LLM client class with retry logic |
| `MacroToolInput` | The reflection-before-action input schema |
| `AgentBrain` | Agent's thinking state (eval, memory, goal) |
| `LLMConfig` | Configuration for LLM connection |
| `parseLLMConfig` | Parse and apply defaults to config |
## Usage
This package is used internally by `page-agent`. Direct usage:
```typescript
import { LLM, type MacroToolInput } from '@page-agent/llms'
const llm = new LLM({
model: 'gpt-4o',
apiKey: 'your-api-key',
baseURL: 'https://api.openai.com/v1',
})
const result = await llm.invoke(messages, tools, abortSignal)
```
## License
MIT