feat: create llms package and mv files

2025-12-22 16:12:34 +08:00
parent b36a0c0261
commit 7c2d000e29
19 changed files with 217 additions and 1 deletions
--- a/packages/llms/README.md
+++ b/packages/llms/README.md
@@ -0,0 +1,82 @@
+# @page-agent/llms
+
+LLM client with a **reflection-before-action** mental model for page-agent.
+
+## Why This Package Exists
+
+The LLM module and the agent logic are inherently coupled. This package exists not to decouple them, but to **define the interface contract** between the LLM and the agent.
+
+The core abstraction is the `MacroToolInput` — a structured output format that **forces the model to reflect before acting**.
+
+## The Reflection-Before-Action Model
+
+Every tool call must output its reasoning state before the actual action:
+
+```typescript
+interface MacroToolInput {
+  // Reflection (mandatory before any action)
+  evaluation_previous_goal?: string  // How well did the previous action work?
+  memory?: string                     // Key information to remember
+  next_goal?: string                  // What to accomplish next
+
+  // Action (the actual operation)
+  action: Record<string, any>
+}
+```
+
+This design ensures that:
+
+1. **The model evaluates its previous action** before deciding the next step
+2. **Working memory is explicitly maintained** across conversation turns
+3. **Goals are clearly stated**, making the agent's reasoning transparent and debuggable
+
+## Architecture
+
+```
+┌─────────────────────────────────────────────────────┐
+│                    PageAgent                        │
+│  - Maintains agent state and history                │
+│  - Orchestrates tool execution                      │
+│  - Assembles prompts with browser state             │
+└─────────────────────┬───────────────────────────────┘
+                      │ uses
+                      ▼
+┌─────────────────────────────────────────────────────┐
+│                 @page-agent/llms                    │
+│  - Defines MacroToolInput contract                  │
+│  - Handles LLM API calls                            │
+│  - Parses and validates structured output           │
+│  - Executes tool calls                              │
+└─────────────────────────────────────────────────────┘
+```
+
+## Key Components
+
+| Export | Description |
+|--------|-------------|
+| `LLM` | Main LLM client class with retry logic |
+| `MacroToolInput` | The reflection-before-action input schema |
+| `AgentBrain` | Agent's thinking state (eval, memory, goal) |
+| `LLMConfig` | Configuration for LLM connection |
+| `parseLLMConfig` | Parse and apply defaults to config |
+
+## Usage
+
+This package is used internally by `page-agent`. To use it directly:
+
+```typescript
+import { LLM, type MacroToolInput } from '@page-agent/llms'
+
+const llm = new LLM({
+  model: 'gpt-4o',
+  apiKey: 'your-api-key',
+  baseURL: 'https://api.openai.com/v1',
+})
+
+const result = await llm.invoke(messages, tools, abortSignal) // messages, tools, and abortSignal are supplied by the caller
+```
+
+## License
+
+MIT
+
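Note on the README above: it describes `MacroToolInput` as the contract between the LLM and the agent, and lists "Parses and validates structured output" as a responsibility of the package. As an illustration only, here is a minimal sketch of what validating that contract at runtime could look like. It is not part of this commit: `parseMacroToolInput` and `REFLECTION_KEYS` are hypothetical names, the `click` action in the usage example is invented, and `action` is typed as `Record<string, unknown>` rather than the README's `Record<string, any>` so the sketch stays strictly typed.

```typescript
// Field names are copied from the README; everything else is illustrative.
interface MacroToolInput {
  evaluation_previous_goal?: string
  memory?: string
  next_goal?: string
  action: Record<string, unknown>
}

const REFLECTION_KEYS = ['evaluation_previous_goal', 'memory', 'next_goal'] as const

// Returns true only for non-null, non-array objects.
function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !Array.isArray(value)
}

// Hypothetical helper, NOT an export of @page-agent/llms.
// Checks untrusted model output against the MacroToolInput shape.
function parseMacroToolInput(raw: unknown): MacroToolInput {
  if (!isPlainObject(raw)) {
    throw new TypeError('MacroToolInput must be a plain object')
  }
  if (!isPlainObject(raw.action)) {
    throw new TypeError('action must be an object')
  }

  const result: MacroToolInput = { action: raw.action }

  // Reflection fields are optional in the type, but must be strings when present.
  for (const key of REFLECTION_KEYS) {
    const value = raw[key]
    if (typeof value === 'string') {
      result[key] = value
    } else if (value !== undefined) {
      throw new TypeError(`${key} must be a string when present`)
    }
  }
  return result
}

// Usage: the `click` action and its payload are made up for this example.
const parsed = parseMacroToolInput({
  evaluation_previous_goal: 'Success: the search box is focused.',
  next_goal: 'Click the submit button.',
  action: { click: { index: 3 } },
})
console.log(parsed.next_goal)
```

Rejecting a malformed payload before it reaches the agent keeps the reflection fields trustworthy when they are written into history. Whether the real package validates this way, or uses a schema library instead, is not stated in the diff.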