# @page-agent/llms
LLM client with a **reflection-before-action** mental model for page-agent.
## Why This Package Exists
The LLM module and the agent logic are inherently coupled. This package exists not to decouple them, but to **define the interface contract** between the LLM and the agent.

The core abstraction is `MacroToolInput`, a structured output format that **forces the model to reflect before acting**.
## The Reflection-Before-Action Model
Every tool call must first output its reasoning state before the actual action:
```typescript
interface MacroToolInput {
  // Reflection (mandatory before any action)
  evaluation_previous_goal?: string // How well did the previous action work?
  memory?: string // Key information to remember
  next_goal?: string // What to accomplish next

  // Action (the actual operation)
  action: Record<string, any>
}
```
This design ensures that:
1. **The model evaluates its previous action** before deciding the next step
2. **Working memory is explicitly maintained** across conversation turns
3. **Goals are clearly stated**, making the agent's reasoning transparent and debuggable
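
As a rough illustration of the contract, the sketch below builds one reflection-then-action payload and checks its shape after parsing. It is not taken from the package: `isMacroToolInput` is a hypothetical helper, and the `click_element` action with its `index` field is an invented example of what an action might look like.

```typescript
// Local copy of the interface from this README, so the sketch is self-contained.
interface MacroToolInput {
  evaluation_previous_goal?: string
  memory?: string
  next_goal?: string
  action: Record<string, any>
}

// Hypothetical type guard (not a package export): checks the shape of parsed model output.
function isMacroToolInput(value: unknown): value is MacroToolInput {
  if (typeof value !== 'object' || value === null) return false
  const v = value as Record<string, unknown>
  // Reflection fields are optional, but must be strings when present.
  for (const key of ['evaluation_previous_goal', 'memory', 'next_goal']) {
    if (v[key] !== undefined && typeof v[key] !== 'string') return false
  }
  // The action must be a plain object.
  return typeof v.action === 'object' && v.action !== null && !Array.isArray(v.action)
}

// Example payload; the action name `click_element` and its `index` field are invented.
const raw = JSON.stringify({
  evaluation_previous_goal: 'Success: the search box is now focused.',
  memory: 'User wants the cheapest flight to Tokyo.',
  next_goal: 'Submit the search form.',
  action: { click_element: { index: 12 } },
})

const parsed: unknown = JSON.parse(raw)
const valid = isMacroToolInput(parsed)
const missingAction = isMacroToolInput({ next_goal: 'Submit the search form.' })
```

Because the reflection fields travel in the same object as the action, each step's evaluation, memory, and goal can be logged next to the action that followed them.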
## Architecture
```
┌─────────────────────────────────────────────────────┐
│                      PageAgent                      │
│  - Maintains agent state and history                │
│  - Orchestrates tool execution                      │
│  - Assembles prompts with browser state             │
└─────────────────────┬───────────────────────────────┘
                      │ uses
                      ▼
┌─────────────────────────────────────────────────────┐
│                  @page-agent/llms                   │
│  - Defines MacroToolInput contract                  │
│  - Handles LLM API calls                            │
│  - Parses and validates structured output           │
│  - Executes tool calls                              │
└─────────────────────────────────────────────────────┘
```
## Key Components
| Export | Description |
|--------|-------------|
| `LLM` | Main LLM client class with retry logic |
| `MacroToolInput` | The reflection-before-action input schema |
| `AgentBrain` | Agent's thinking state (eval, memory, goal) |
| `LLMConfig` | Configuration for LLM connection |
| `parseLLMConfig` | Parse and apply defaults to config |
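
To make the "parse and apply defaults" idea concrete, here is a minimal sketch of that pattern. It does not reproduce `parseLLMConfig` or `LLMConfig`: the names `SketchConfig`, `SKETCH_DEFAULTS`, and `applyConfigDefaults` are hypothetical, only `model`, `apiKey`, and `baseURL` appear in this README, and the `temperature` and `maxRetries` fields and all default values are assumptions made for illustration.

```typescript
// Hypothetical config shape. Only model, apiKey, and baseURL come from this README;
// temperature and maxRetries are invented for the sketch.
interface SketchConfig {
  model: string
  apiKey: string
  baseURL: string
  temperature: number
  maxRetries: number
}

// Illustrative defaults, not the package's real values.
const SKETCH_DEFAULTS = {
  baseURL: 'https://api.openai.com/v1',
  temperature: 0.7,
  maxRetries: 2,
}

// Callers must supply model and apiKey; everything else is optional.
type SketchInput = Pick<SketchConfig, 'model' | 'apiKey'> & Partial<SketchConfig>

function applyConfigDefaults(input: SketchInput): SketchConfig {
  if (!input.model) throw new Error('model is required')
  if (!input.apiKey) throw new Error('apiKey is required')
  // `??` falls back to the default only when the field is null or undefined.
  return {
    model: input.model,
    apiKey: input.apiKey,
    baseURL: input.baseURL ?? SKETCH_DEFAULTS.baseURL,
    temperature: input.temperature ?? SKETCH_DEFAULTS.temperature,
    maxRetries: input.maxRetries ?? SKETCH_DEFAULTS.maxRetries,
  }
}

const config = applyConfigDefaults({ model: 'gpt-4o', apiKey: 'your-api-key' })
```

Resolving defaults once, up front, means the rest of the client can treat every field as present instead of re-checking optional values at each call site.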
## Usage
This package is used internally by `page-agent`, but it can also be used directly:
```typescript
import { LLM, type MacroToolInput } from '@page-agent/llms'

const llm = new LLM({
  model: 'gpt-4o',
  apiKey: 'your-api-key',
  baseURL: 'https://api.openai.com/v1',
})

// `messages`, `tools`, and `abortSignal` are supplied by the caller
const result = await llm.invoke(messages, tools, abortSignal)
```
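
The third argument to `invoke` is not documented here. Assuming it is a standard `AbortSignal`, the sketch below shows how a caller might create one and cancel a pending request. `fakeInvoke` is a hypothetical stand-in that only mimics a slow call so the example runs without a network; it is not part of this package.

```typescript
// Hypothetical stand-in for a slow LLM call; it only honours the abort signal.
function fakeInvoke(signal: AbortSignal): Promise<string> {
  return new Promise((resolve, reject) => {
    if (signal.aborted) {
      reject(new Error('aborted'))
      return
    }
    const timer = setTimeout(() => resolve('done'), 1000)
    signal.addEventListener(
      'abort',
      () => {
        // Cancel the pending work and surface the cancellation to the caller.
        clearTimeout(timer)
        reject(new Error('aborted'))
      },
      { once: true },
    )
  })
}

const controller = new AbortController()

// Attach the error handler up front so the rejection is never unhandled.
const outcome = fakeInvoke(controller.signal).catch((err: unknown) =>
  err instanceof Error ? err.message : String(err),
)

// Cancel the in-flight call, for example when the user stops the agent.
controller.abort()
```

Under the same assumption, `controller.signal` is the value that would be passed where `abortSignal` appears in the usage example.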
## License
MIT