Update project and configurations

Zou-Seay
2026-06-11 16:28:00 +08:00
parent 12d3922091
commit a29a91867d
237 changed files with 164880 additions and 90 deletions


@@ -0,0 +1,65 @@
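# Local BERT intent recognition test report
## Overview
- Model directory: `/Users/hwp/Documents/trae_projects/intelligent_cabin/models/local_bert_intent`
- Evaluation set: `/Users/hwp/Documents/trae_projects/intelligent_cabin/app/data/bert_intent_eval_independent.jsonl`
- Evaluation threshold: `0.0`
- Number of test samples: `42`
- Overall accuracy: `0.9762`
## Training summary
- Base model: `hfl/chinese-macbert-base`
- Training set / validation set: `1557 / 401`
- Best validation accuracy: `0.9875`
- Training device: `mps`
## Per-category results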
- `business`: 33/34 = 0.9706
- `out_of_scope`: 4/4 = 1.0
- `social`: 4/4 = 1.0
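
The overall accuracy in the overview can be cross-checked against the per-category counts. A minimal sketch, with the counts copied from the list above; the variable and function names are illustrative and not taken from the project's evaluation script:

```python
# Per-category (correct, total) counts copied from the report.
per_category = {
    "business": (33, 34),
    "out_of_scope": (4, 4),
    "social": (4, 4),
}


def overall_accuracy(counts):
    """Micro-averaged accuracy: total correct over total samples."""
    correct = sum(c for c, _ in counts.values())
    total = sum(t for _, t in counts.values())
    return correct / total


print(round(overall_accuracy(per_category), 4))  # 0.9762 (41 of 42)
```

Because every business label has a single test sample, one mistake moves the overall figure by about 2.4 points, so the number is best read together with the per-label list.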
## Per-label results
- `__out_of_scope__` (out_of_scope): 4/4 = 1.0
- `__social__` (social): 4/4 = 1.0
- `cabin_ac_off` (business): 1/1 = 1.0
- `cabin_ac_on` (business): 1/1 = 1.0
- `cabin_defog_front_on` (business): 1/1 = 1.0
- `cabin_defog_rear_on` (business): 1/1 = 1.0
- `cabin_fan_down` (business): 1/1 = 1.0
- `cabin_fan_up` (business): 1/1 = 1.0
- `cabin_lights_off` (business): 1/1 = 1.0
- `cabin_lights_on` (business): 1/1 = 1.0
- `cabin_lock_doors` (business): 1/1 = 1.0
- `cabin_mirror_fold` (business): 1/1 = 1.0
- `cabin_mirror_unfold` (business): 1/1 = 1.0
- `cabin_nav_cancel` (business): 1/1 = 1.0
- `cabin_nav_to` (business): 1/1 = 1.0
- `cabin_next_track` (business): 1/1 = 1.0
- `cabin_pause_music` (business): 1/1 = 1.0
- `cabin_play_music` (business): 1/1 = 1.0
- `cabin_previous_track` (business): 1/1 = 1.0
- `cabin_seat_heat_off` (business): 1/1 = 1.0
- `cabin_seat_heat_on` (business): 1/1 = 1.0
- `cabin_set_ac` (business): 1/1 = 1.0
- `cabin_sunroof_close` (business): 1/1 = 1.0
- `cabin_sunroof_open` (business): 1/1 = 1.0
- `cabin_unlock_doors` (business): 1/1 = 1.0
- `cabin_volume_down` (business): 1/1 = 1.0
- `cabin_volume_mute` (business): 1/1 = 1.0
- `cabin_volume_up` (business): 1/1 = 1.0
- `cabin_window_close` (business): 1/1 = 1.0
- `cabin_window_open` (business): 0/1 = 0.0
- `cabin_wiper_off` (business): 1/1 = 1.0
- `cabin_wiper_on` (business): 1/1 = 1.0
- `cs_cancel_order` (business): 1/1 = 1.0
- `cs_query_logistics` (business): 1/1 = 1.0
- `cs_query_order` (business): 1/1 = 1.0
- `cs_transfer_human` (business): 1/1 = 1.0
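## Error samples
- Text: `左前窗打开一点` ("open the front-left window a little") | Category: `business` | Expected: `cabin_window_open` | Predicted: `cabin_defog_front_on` | Score: `0.9951`
## Conclusions
- The current local MacBERT model already recognizes business intents well (33/34 on this set) and can serve as the classifier on the local fast path.
- Misclassifications concentrate on control commands that are opposite in direction or semantically close; the only error in this run is `cabin_window_open` predicted as `cabin_defog_front_on`. The next step is to add adversarial samples and real colloquial expressions.
- Before going live, continue adding samples with ASR typos, short multi-turn utterances, and clause-level multi-intent utterances.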


@@ -0,0 +1,426 @@
{
"model_dir": "/Users/hwp/Documents/trae_projects/intelligent_cabin/models/local_bert_intent",
"threshold": 0.0,
"test_path": "/Users/hwp/Documents/trae_projects/intelligent_cabin/app/data/bert_intent_eval_independent.jsonl",
"test_case_count": 42,
"accuracy": 0.9762,
"train_summary": {
"base_model": "hfl/chinese-macbert-base",
"epochs": 16,
"batch_size": 8,
"learning_rate": 2e-05,
"train_size": 1557,
"dev_size": 401,
"best_dev_accuracy": 0.9875,
"device": "mps"
},
"per_category": [
{
"category": "business",
"total": 34,
"correct": 33,
"accuracy": 0.9706
},
{
"category": "out_of_scope",
"total": 4,
"correct": 4,
"accuracy": 1.0
},
{
"category": "social",
"total": 4,
"correct": 4,
"accuracy": 1.0
}
],
"per_label": [
{
"label": "__out_of_scope__",
"category": "out_of_scope",
"total": 4,
"correct": 4,
"accuracy": 1.0
},
{
"label": "__social__",
"category": "social",
"total": 4,
"correct": 4,
"accuracy": 1.0
},
{
"label": "cabin_ac_off",
"category": "business",
"total": 1,
"correct": 1,
"accuracy": 1.0
},
{
"label": "cabin_ac_on",
"category": "business",
"total": 1,
"correct": 1,
"accuracy": 1.0
},
{
"label": "cabin_defog_front_on",
"category": "business",
"total": 1,
"correct": 1,
"accuracy": 1.0
},
{
"label": "cabin_defog_rear_on",
"category": "business",
"total": 1,
"correct": 1,
"accuracy": 1.0
},
{
"label": "cabin_fan_down",
"category": "business",
"total": 1,
"correct": 1,
"accuracy": 1.0
},
{
"label": "cabin_fan_up",
"category": "business",
"total": 1,
"correct": 1,
"accuracy": 1.0
},
{
"label": "cabin_lights_off",
"category": "business",
"total": 1,
"correct": 1,
"accuracy": 1.0
},
{
"label": "cabin_lights_on",
"category": "business",
"total": 1,
"correct": 1,
"accuracy": 1.0
},
{
"label": "cabin_lock_doors",
"category": "business",
"total": 1,
"correct": 1,
"accuracy": 1.0
},
{
"label": "cabin_mirror_fold",
"category": "business",
"total": 1,
"correct": 1,
"accuracy": 1.0
},
{
"label": "cabin_mirror_unfold",
"category": "business",
"total": 1,
"correct": 1,
"accuracy": 1.0
},
{
"label": "cabin_nav_cancel",
"category": "business",
"total": 1,
"correct": 1,
"accuracy": 1.0
},
{
"label": "cabin_nav_to",
"category": "business",
"total": 1,
"correct": 1,
"accuracy": 1.0
},
{
"label": "cabin_next_track",
"category": "business",
"total": 1,
"correct": 1,
"accuracy": 1.0
},
{
"label": "cabin_pause_music",
"category": "business",
"total": 1,
"correct": 1,
"accuracy": 1.0
},
{
"label": "cabin_play_music",
"category": "business",
"total": 1,
"correct": 1,
"accuracy": 1.0
},
{
"label": "cabin_previous_track",
"category": "business",
"total": 1,
"correct": 1,
"accuracy": 1.0
},
{
"label": "cabin_seat_heat_off",
"category": "business",
"total": 1,
"correct": 1,
"accuracy": 1.0
},
{
"label": "cabin_seat_heat_on",
"category": "business",
"total": 1,
"correct": 1,
"accuracy": 1.0
},
{
"label": "cabin_set_ac",
"category": "business",
"total": 1,
"correct": 1,
"accuracy": 1.0
},
{
"label": "cabin_sunroof_close",
"category": "business",
"total": 1,
"correct": 1,
"accuracy": 1.0
},
{
"label": "cabin_sunroof_open",
"category": "business",
"total": 1,
"correct": 1,
"accuracy": 1.0
},
{
"label": "cabin_unlock_doors",
"category": "business",
"total": 1,
"correct": 1,
"accuracy": 1.0
},
{
"label": "cabin_volume_down",
"category": "business",
"total": 1,
"correct": 1,
"accuracy": 1.0
},
{
"label": "cabin_volume_mute",
"category": "business",
"total": 1,
"correct": 1,
"accuracy": 1.0
},
{
"label": "cabin_volume_up",
"category": "business",
"total": 1,
"correct": 1,
"accuracy": 1.0
},
{
"label": "cabin_window_close",
"category": "business",
"total": 1,
"correct": 1,
"accuracy": 1.0
},
{
"label": "cabin_window_open",
"category": "business",
"total": 1,
"correct": 0,
"accuracy": 0.0
},
{
"label": "cabin_wiper_off",
"category": "business",
"total": 1,
"correct": 1,
"accuracy": 1.0
},
{
"label": "cabin_wiper_on",
"category": "business",
"total": 1,
"correct": 1,
"accuracy": 1.0
},
{
"label": "cs_cancel_order",
"category": "business",
"total": 1,
"correct": 1,
"accuracy": 1.0
},
{
"label": "cs_query_logistics",
"category": "business",
"total": 1,
"correct": 1,
"accuracy": 1.0
},
{
"label": "cs_query_order",
"category": "business",
"total": 1,
"correct": 1,
"accuracy": 1.0
},
{
"label": "cs_transfer_human",
"category": "business",
"total": 1,
"correct": 1,
"accuracy": 1.0
}
],
"errors": [
{
"text": "左前窗打开一点",
"category": "business",
"expected_label": "cabin_window_open",
"predicted_label": "cabin_defog_front_on",
"score": 0.9951,
"raw_label": "cabin_defog_front_on",
"ok": false,
"top_candidates": [
{
"intent_id": "cabin_defog_front_on",
"score": 0.9951
},
{
"intent_id": "cabin_sunroof_open",
"score": 0.0005
},
{
"intent_id": "cabin_lights_on",
"score": 0.0004
}
]
}
],
"confusion": {
"cabin_ac_off": {
"cabin_ac_off": 1
},
"cabin_ac_on": {
"cabin_ac_on": 1
},
"cabin_defog_front_on": {
"cabin_defog_front_on": 1
},
"cabin_defog_rear_on": {
"cabin_defog_rear_on": 1
},
"cabin_fan_down": {
"cabin_fan_down": 1
},
"cabin_fan_up": {
"cabin_fan_up": 1
},
"cabin_lights_off": {
"cabin_lights_off": 1
},
"cabin_lights_on": {
"cabin_lights_on": 1
},
"cabin_lock_doors": {
"cabin_lock_doors": 1
},
"cabin_mirror_fold": {
"cabin_mirror_fold": 1
},
"cabin_mirror_unfold": {
"cabin_mirror_unfold": 1
},
"cabin_nav_cancel": {
"cabin_nav_cancel": 1
},
"cabin_nav_to": {
"cabin_nav_to": 1
},
"cabin_next_track": {
"cabin_next_track": 1
},
"cabin_pause_music": {
"cabin_pause_music": 1
},
"cabin_play_music": {
"cabin_play_music": 1
},
"cabin_previous_track": {
"cabin_previous_track": 1
},
"cabin_seat_heat_off": {
"cabin_seat_heat_off": 1
},
"cabin_seat_heat_on": {
"cabin_seat_heat_on": 1
},
"cabin_set_ac": {
"cabin_set_ac": 1
},
"cabin_sunroof_close": {
"cabin_sunroof_close": 1
},
"cabin_sunroof_open": {
"cabin_sunroof_open": 1
},
"cabin_unlock_doors": {
"cabin_unlock_doors": 1
},
"cabin_volume_down": {
"cabin_volume_down": 1
},
"cabin_volume_mute": {
"cabin_volume_mute": 1
},
"cabin_volume_up": {
"cabin_volume_up": 1
},
"cabin_window_close": {
"cabin_window_close": 1
},
"cabin_window_open": {
"cabin_defog_front_on": 1
},
"cabin_wiper_off": {
"cabin_wiper_off": 1
},
"cabin_wiper_on": {
"cabin_wiper_on": 1
},
"cs_cancel_order": {
"cs_cancel_order": 1
},
"cs_query_logistics": {
"cs_query_logistics": 1
},
"cs_query_order": {
"cs_query_order": 1
},
"cs_transfer_human": {
"cs_transfer_human": 1
},
"__social__": {
"__social__": 4
},
"__out_of_scope__": {
"__out_of_scope__": 4
}
}
}
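
The `confusion` object above maps each expected label to the labels that were predicted for it, so both the accuracy and the list of confused pairs can be recovered from it. The sketch below assumes that reading of the structure and works on a three-label excerpt embedded as a string; the helper names are hypothetical and not part of the project:

```python
import json

# Excerpt of the "confusion" object: expected label -> {predicted label: count}.
report_excerpt = json.loads("""
{
  "confusion": {
    "cabin_window_close": {"cabin_window_close": 1},
    "cabin_window_open": {"cabin_defog_front_on": 1},
    "__social__": {"__social__": 4}
  }
}
""")


def confusion_pairs(confusion):
    """Return (expected, predicted, count) for every off-diagonal cell."""
    return [
        (expected, predicted, count)
        for expected, row in confusion.items()
        for predicted, count in row.items()
        if predicted != expected
    ]


def accuracy_from_confusion(confusion):
    """Diagonal mass divided by total mass."""
    total = sum(sum(row.values()) for row in confusion.values())
    correct = sum(row.get(label, 0) for label, row in confusion.items())
    return correct / total


print(confusion_pairs(report_excerpt["confusion"]))
print(accuracy_from_confusion(report_excerpt["confusion"]))
```

Applied to the full object rather than the excerpt, the diagonal sums to 41 of 42 samples, which matches the reported `accuracy` of `0.9762`.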


@@ -0,0 +1,47 @@
# Local multi-label detector independent evaluation report
## Overview
- Model directory: `/Users/hwp/Documents/trae_projects/intelligent_cabin/models/local_bert_multi_intent`
- Evaluation set: `/Users/hwp/Documents/trae_projects/intelligent_cabin/app/data/bert_intent_multilabel_eval_independent.jsonl`
- Number of samples: `37`
- Threshold / top_k / max_labels: `0.45 / 8 / 4`
- `micro_precision`: `0.9362`
- `micro_recall`: `0.6377`
- `micro_f1`: `0.7586`
- `exact_match`: `0.5135`
- `multi_sentence_recall`: `0.4138`
- `single_guard_false_alarm_rate`: `0.0`
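
The report does not say how the threshold, `top_k`, and `max_labels` settings interact. One plausible reading, shown here purely as an assumption, is that the detector ranks labels by score, keeps the `top_k` best, drops those below the threshold, and caps the result at `max_labels`. The function name and the example scores are made up for illustration:

```python
def decode_labels(scores, threshold=0.45, top_k=8, max_labels=4):
    """Hypothetical decoding rule, not the project's confirmed logic.

    scores: mapping from label to a per-label probability in [0, 1].
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    candidates = ranked[:top_k]  # keep only the top_k highest-scoring labels
    kept = [label for label, score in candidates if score >= threshold]
    return sorted(kept[:max_labels])  # cap the count, then sort for stable output


# Made-up scores; they do not come from the evaluated model.
example_scores = {
    "cabin_ac_on": 0.91,
    "cabin_play_music": 0.62,
    "cabin_fan_up": 0.31,
}
print(decode_labels(example_scores))  # ['cabin_ac_on', 'cabin_play_music']
```

Under a rule of this kind, a label that scores below the threshold is dropped silently, which is one way a detector can end up with high precision and low recall.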
## Per-category results
- `cabin_parallel`: count=15 micro_f1=0.807 exact_match=0.4667
- `cabin_sequence`: count=9 micro_f1=0.5385 exact_match=0.3333
- `cs_conditional`: count=3 micro_f1=0.9091 exact_match=0.6667
- `cs_sequence`: count=2 micro_f1=0.6667 exact_match=0.0
- `single_guard`: count=8 micro_f1=0.875 exact_match=0.875
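
The micro-averaged metrics and `exact_match` can be sketched as follows. This assumes the standard definitions (label-level true positives, false positives, and false negatives pooled over all samples); the report does not confirm that the project's script computes them this way. The three sample pairs are taken from the error list in this report:

```python
def multilabel_metrics(samples):
    """samples: list of (expected_labels, predicted_labels) pairs."""
    tp = fp = fn = exact = 0
    for expected, predicted in samples:
        e, p = set(expected), set(predicted)
        tp += len(e & p)   # labels both expected and predicted
        fp += len(p - e)   # predicted but not expected
        fn += len(e - p)   # expected but missed
        exact += e == p    # whole label set must match
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "micro_precision": precision,
        "micro_recall": recall,
        "micro_f1": f1,
        "exact_match": exact / len(samples),
    }


samples = [
    (["cabin_lock_doors", "cabin_mirror_fold"], []),
    (["cabin_ac_on", "cabin_sunroof_open"], ["cabin_ac_on", "cabin_window_open"]),
    (["cabin_fan_up", "cabin_nav_to"], ["cabin_nav_to"]),
]
print(multilabel_metrics(samples))
```

The overview figures are consistent with this definition of F1: 2 × 0.9362 × 0.6377 / (0.9362 + 0.6377) rounds to 0.7586.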
## Main confusions
- Missed `cabin_sunroof_open` while falsely reporting `cabin_window_open`: `1`
- Missed `cabin_pause_music` while falsely reporting `cabin_play_music`: `1`
- Missed `cabin_window_open` while falsely reporting `cabin_defog_front_on`: `1`
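## Error samples
- Text: `锁车门,再把后视镜收起来` ("lock the doors, then fold the mirrors") | Category: `cabin_sequence` | Expected: `['cabin_lock_doors', 'cabin_mirror_fold']` | Predicted: `[]`
- Text: `把车门解锁,再把镜子展开` ("unlock the doors, then unfold the mirrors") | Category: `cabin_sequence` | Expected: `['cabin_mirror_unfold', 'cabin_unlock_doors']` | Predicted: `[]`
- Text: `路线别导了,音乐也停一下` ("stop navigating, and pause the music too") | Category: `cabin_parallel` | Expected: `['cabin_nav_cancel', 'cabin_pause_music']` | Predicted: `[]`
- Text: `雨停了,雨刮关掉,再把窗开一点` ("the rain has stopped, turn off the wipers, then open the window a little") | Category: `cabin_sequence` | Expected: `['cabin_window_open', 'cabin_wiper_off']` | Predicted: `[]`
- Text: `把天窗合上,然后把音乐暂停` ("close the sunroof, then pause the music") | Category: `cabin_sequence` | Expected: `['cabin_pause_music', 'cabin_sunroof_close']` | Predicted: `[]`
- Text: `先把音量调大,再切下一首` ("turn the volume up first, then skip to the next track") | Category: `cabin_parallel` | Expected: `['cabin_next_track', 'cabin_volume_up']` | Predicted: `[]`
- Text: `静音之后切回上一首` ("mute, then go back to the previous track") | Category: `cabin_sequence` | Expected: `['cabin_previous_track', 'cabin_volume_mute']` | Predicted: `[]`
- Text: `把天窗打开透口气,再开空调` ("open the sunroof for some air, then turn on the AC") | Category: `cabin_parallel` | Expected: `['cabin_ac_on', 'cabin_sunroof_open']` | Predicted: `['cabin_ac_on', 'cabin_window_open']`
- Text: `音乐停一下,然后导航到公司` ("pause the music, then navigate to the office") | Category: `cabin_sequence` | Expected: `['cabin_nav_to', 'cabin_pause_music']` | Predicted: `['cabin_nav_to', 'cabin_play_music']`
- Text: `把左前窗降一点` ("lower the front-left window a little") | Category: `single_guard` | Expected: `['cabin_window_open']` | Predicted: `['cabin_defog_front_on']`
- Text: `车里闷,给我透个气,再放点轻松的歌` ("it's stuffy in the car, let some air in, then play some relaxing songs") | Category: `cabin_parallel` | Expected: `['cabin_play_music', 'cabin_window_open']` | Predicted: `['cabin_play_music']`
- Text: `把空调开了,风别太小,再来首歌` ("turn on the AC, don't set the fan too low, and play a song") | Category: `cabin_parallel` | Expected: `['cabin_ac_on', 'cabin_fan_up', 'cabin_play_music']` | Predicted: `['cabin_ac_on', 'cabin_play_music']`
- Text: `开导航去徐家汇,顺便把风量调大` ("navigate to Xujiahui, and turn the fan up while you're at it") | Category: `cabin_parallel` | Expected: `['cabin_fan_up', 'cabin_nav_to']` | Predicted: `['cabin_nav_to']`
- Text: `温度调到二十三度,风稍微小一点` ("set the temperature to 23 degrees, and turn the fan down a bit") | Category: `cabin_parallel` | Expected: `['cabin_fan_down', 'cabin_set_ac']` | Predicted: `['cabin_set_ac']`
- Text: `帮我看A812302物流要是太慢就转人工` ("check the logistics for A812302, and if it is too slow transfer me to a human agent") | Category: `cs_conditional` | Expected: `['cs_query_logistics', 'cs_transfer_human']` | Predicted: `['cs_query_logistics']`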
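## Conclusions and recommendations
- First check whether multi-intent sentences show systematic missed recall, then check whether single-intent sentences are falsely reported as multi-intent. In this run `multi_sentence_recall` is `0.4138`, and seven of the listed errors return an empty prediction.
- If `single_guard_false_alarm_rate` is high, tighten the detector threshold or add single-intent negative samples first, and only then move on to NER. It is `0.0` in this run.
- If `multi_sentence_recall` is unstable, keep adding conditional sentences, weakly connected sentences, and colloquial multi-action utterances.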

File diff suppressed because it is too large


@@ -0,0 +1,40 @@
# Joint NLU independent evaluation report
## Overview
- Model directory: `/Users/hwp/Documents/trae_projects/intelligent_cabin/models/local_joint_bert_nlu`
- Evaluation set: `/Users/hwp/Documents/trae_projects/intelligent_cabin/app/data/joint_nlu_eval_independent.jsonl`
- Number of samples: `43`
- `intent_accuracy`: `0.9302`
- `slot_exact_match`: `1.0`
- `joint_exact_match`: `0.9302`
- `slot_micro_precision`: `1.0`
- `slot_micro_recall`: `1.0`
- `slot_micro_f1`: `1.0`
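
The three headline metrics relate to each other in a simple way if `joint_exact_match` requires both the intent and the full slot dictionary to be correct. That definition is an assumption (the report does not spell it out), and the field names below are hypothetical. The three samples are the failure cases listed in this report, with the predicted intent `None` represented as Python `None`:

```python
def joint_nlu_metrics(samples):
    """samples: list of dicts holding expected/predicted intent and slots."""
    n = len(samples)
    intent_ok = [s["expected_intent"] == s["predicted_intent"] for s in samples]
    slot_ok = [s["expected_slots"] == s["predicted_slots"] for s in samples]
    return {
        "intent_accuracy": sum(intent_ok) / n,
        "slot_exact_match": sum(slot_ok) / n,
        # A sample counts as a joint match only if intent AND slots are both right.
        "joint_exact_match": sum(i and s for i, s in zip(intent_ok, slot_ok)) / n,
    }


failures = [
    {"expected_intent": "cabin_window_open", "predicted_intent": None,
     "expected_slots": {}, "predicted_slots": {}},
    {"expected_intent": "cabin_window_open", "predicted_intent": "cabin_play_music",
     "expected_slots": {}, "predicted_slots": {}},
    {"expected_intent": "cabin_fan_up", "predicted_intent": "cabin_fan_down",
     "expected_slots": {}, "predicted_slots": {}},
]
print(joint_nlu_metrics(failures))
```

On these three samples the slot score stays at 1.0 while the intent and joint scores are 0.0, which mirrors the overview: with `slot_exact_match` at `1.0`, `joint_exact_match` equals `intent_accuracy`.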
## Training summary
- Training set / evaluation set: `337 / 10`
- Training-stage `intent_accuracy`: `1.0`
- Training-stage `slot_exact_match`: `0.8`
## Per-category results
- `failure_replay`: count=12 intent_acc=0.75 slot_exact=1.0 joint_exact=0.75
- `no_slot_control`: count=14 intent_acc=1.0 slot_exact=1.0 joint_exact=1.0
- `slot_destination`: count=4 intent_acc=1.0 slot_exact=1.0 joint_exact=1.0
- `slot_music`: count=5 intent_acc=1.0 slot_exact=1.0 joint_exact=1.0
- `slot_order`: count=4 intent_acc=1.0 slot_exact=1.0 joint_exact=1.0
- `slot_temperature`: count=4 intent_acc=1.0 slot_exact=1.0 joint_exact=1.0
## Main intent confusions
- Expected `cabin_window_open`, predicted `None`: `1`
- Expected `cabin_window_open`, predicted `cabin_play_music`: `1`
- Expected `cabin_fan_up`, predicted `cabin_fan_down`: `1`
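## Failure sample replay
- Text: `把左前窗降一点` ("lower the front-left window a little") | Category: `failure_replay` | Expected intent: `cabin_window_open` | Predicted intent: `None` | Expected slots: `{}` | Predicted slots: `{}` | Missing slots: `[]` | Extra slots: `[]`
- Text: `给我透个气` ("let some air in") | Category: `failure_replay` | Expected intent: `cabin_window_open` | Predicted intent: `cabin_play_music` | Expected slots: `{}` | Predicted slots: `{}` | Missing slots: `[]` | Extra slots: `[]`
- Text: `风别太小` ("don't set the fan too low") | Category: `failure_replay` | Expected intent: `cabin_fan_up` | Predicted intent: `cabin_fan_down` | Expected slots: `{}` | Predicted slots: `{}` | Missing slots: `[]` | Extra slots: `[]`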
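## Conclusions
- First check whether the `failure_replay` cases are still wrong; this shows directly whether the earlier multi-intent failures come from the joint model itself or from the composition layer above it.
- If `slot_music` and `slot_destination` are still unstable, prioritize adding span annotations rather than falling back to rule-based slot extraction.
- If `no_slot_control` is stable but `failure_replay` still contains many errors (3 of 12 in this run), the next step is to add long-tail control-semantics data rather than rushing to a more complex architecture.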

File diff suppressed because it is too large