162 lines
4.3 KiB
Markdown
162 lines
4.3 KiB
Markdown
# Intelligent Cabin Agent
|
|
|
|
## Quick Start
|
|
|
|
1. Create and activate the Python 3.11 virtual environment:
|
|
|
|
```bash
|
|
uv venv .venv --python 3.11
|
|
source .venv/bin/activate
|
|
```
|
|
2. Install dependencies:
|
|
|
|
```bash
|
|
pip install -r requirements.txt
|
|
```
|
|
|
|
3. Start the service:
|
|
|
|
```bash
|
|
.venv/bin/uvicorn app.main:app --host 127.0.0.1 --port 8000
|
|
```
|
|
|
|
4. Open the API docs:
|
|
|
|
```text
|
|
http://127.0.0.1:8000/docs
|
|
```
|
|
|
|
Demo UI:
|
|
|
|
```text
|
|
http://127.0.0.1:8000/demo
|
|
```
|
|
|
|
Architecture and flow review:
|
|
|
|
```text
|
|
solution_review.md
|
|
```
|
|
|
|
The demo console supports:
|
|
|
|
- local browser session history restore
|
|
- runtime matcher/classifier/session backend switching
|
|
- matcher routing debug panel with Top-K candidates
|
|
- local `rewrite -> keyword/classifier/retrieval -> fusion` decision trace
|
|
- direct display of classifier backend / raw label / fallback reason / raw candidate payload
|
|
- workflow JSON visualization
|
|
|
|
## Core APIs
|
|
|
|
- `POST /api/v1/agent/chat`
|
|
- `POST /api/v1/agent/fill-slots`
|
|
- `GET /health`
|
|
|
|
## Current Scope
|
|
|
|
- Configurable session backend: memory / Redis
|
|
- Config-driven intent registry
|
|
- Router layer with pluggable matcher / extractor
|
|
- Rule-based fast-path intent routing
|
|
- Basic slot extraction
|
|
- Plugin registry with mock handlers
|
|
- Workflow response payloads
|
|
|
|
## Runtime Config
|
|
|
|
- `AGENT_SESSION_BACKEND=memory|redis`
|
|
- `AGENT_REDIS_URL=redis://127.0.0.1:6379/0`
|
|
- `AGENT_REDIS_KEY_PREFIX=agent:session`
|
|
- `AGENT_SESSION_TTL_SECONDS=86400`
|
|
- `AGENT_MATCHER_PIPELINE=keyword`
|
|
- `AGENT_SLOT_EXTRACTOR_BACKEND=heuristic`
|
|
- `AGENT_CLASSIFIER_BACKEND=mock`
|
|
- `AGENT_CLASSIFIER_THRESHOLD=1.2`
|
|
- `AGENT_CLASSIFIER_BERT_THRESHOLD=0.0`
|
|
- `AGENT_CLASSIFIER_MODEL_PATH=/path/to/model`
|
|
- `AGENT_CLASSIFIER_LABEL_MAP_PATH=/path/to/label_map.json`
|
|
- `AGENT_CLASSIFIER_REMOTE_URL=http://127.0.0.1:9000/classify`
|
|
- `AGENT_CLASSIFIER_REMOTE_TIMEOUT_SECONDS=3.0`
|
|
- `AGENT_LOCAL_EXECUTE_THRESHOLD=1.65`
|
|
- `AGENT_LOCAL_ROUTE_TO_CLOUD_THRESHOLD=0.75`
|
|
- `AGENT_LOCAL_CLARIFY_MARGIN_THRESHOLD=0.12`
|
|
- `AGENT_PLANNER_BACKEND=heuristic|dashscope`
|
|
- `AGENT_PLANNER_BASE_URL=https://your-base-url/v1`
|
|
- `AGENT_PLANNER_API_KEY=your-api-key`
|
|
- `AGENT_PLANNER_MODEL_NAME=qwen3.5-plus`
|
|
- `AGENT_PLANNER_TIMEOUT_SECONDS=6.0`
|
|
|
|
Matcher pipeline examples:
|
|
|
|
- `AGENT_MATCHER_PIPELINE=keyword`
|
|
- `AGENT_MATCHER_PIPELINE=keyword,classifier,retrieval`
|
|
- `AGENT_MATCHER_PIPELINE=keyword,retrieval`
|
|
- `AGENT_MATCHER_PIPELINE=classifier`
|
|
- `AGENT_MATCHER_PIPELINE=retrieval`
|
|
|
|
Classifier backend examples:
|
|
|
|
- `AGENT_CLASSIFIER_BACKEND=mock`
|
|
- `AGENT_CLASSIFIER_BACKEND=bert`
|
|
- `AGENT_CLASSIFIER_BACKEND=remote`
|
|
- `AGENT_CLASSIFIER_TOP_K=3`
|
|
|
|
For local BERT models:
|
|
|
|
- install optional runtime deps such as `transformers` and a backend like `torch`
|
|
- point `AGENT_CLASSIFIER_MODEL_PATH` to the local model directory
|
|
- if the model outputs labels like `LABEL_0`, provide `AGENT_CLASSIFIER_LABEL_MAP_PATH`
|
|
- use `AGENT_CLASSIFIER_BERT_THRESHOLD` instead of the mock threshold
|
|
|
|
Example label map:
|
|
|
|
```json
|
|
{
|
|
"LABEL_0": "cs_query_order",
|
|
"LABEL_1": "cs_cancel_order",
|
|
"LABEL_2": "cabin_play_music"
|
|
}
|
|
```
|
|
|
|
Remote classifier expected request payload:
|
|
|
|
```json
|
|
{
|
|
"text": "我的订单现在什么情况 A808001",
|
|
"top_k": 3,
|
|
"labels": ["cs_query_order", "cs_cancel_order", "cabin_play_music"]
|
|
}
|
|
```
|
|
|
|
Remote classifier response payload:
|
|
|
|
```json
|
|
{
|
|
"intent_id": "cs_query_order",
|
|
"label": "LABEL_0",
|
|
"score": 0.982,
|
|
"model_name": "bert-remote-v1",
|
|
"candidates": [
|
|
{"label": "LABEL_0", "intent_id": "cs_query_order", "score": 0.982},
|
|
{"label": "LABEL_1", "intent_id": "cs_cancel_order", "score": 0.011},
|
|
{"label": "LABEL_7", "intent_id": "cs_query_logistics", "score": 0.007}
|
|
]
|
|
}
|
|
```
|
|
|
|
When `bert` or `remote` is unavailable or below threshold, the classifier falls back to `mock`, and the demo debug panel shows both the attempted backend and the fallback reason.
|
|
|
|
Planner notes:
|
|
|
|
- keep the planner key in environment variables, not in source code or front-end code
|
|
- the planner uses `POST {base_url}/chat/completions`
|
|
- for DashScope OpenAI-compatible endpoints, use the compatible `v1` base URL and set `AGENT_PLANNER_BACKEND=dashscope`
|
|
- when cloud planning is unavailable, the service falls back to a local heuristic planner for multi-command splitting
|
|
|
|
## Next Steps
|
|
|
|
- Replace rule router with classifier + retrieval + LLM
|
|
- Connect real business plugins
|
|
- Add automated tests
|