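Ruizhen (睿真) is a paid third-party OCR service. These are my notes, kept for later reuse.
I am building a CRM expense reimbursement system that needs to recognize invoices, and it uses Ruizhen's multi-invoice recognition feature.
Ruizhen itself is not hard to use, but when there are many invoice types, every scenario has to be tried out. It is very much hands-on work.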
Usage
Multi-invoice recognition
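The full response is fairly large, and a formatter could not pretty-print it, so we take only the main content.
The main content is in identify_results, which is a list.
It starts at the [{ that follows identify_results and ends at }]. Everything between those two markers is the main information.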
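The extraction described above can be sketched in Python. This is only a minimal sketch: the function name `extract_identify_results` and the outer envelope in `sample` are made up for illustration, and only the `identify_results` key comes from the actual response. It locates the key, finds the next `[`, and lets the standard library's `json.JSONDecoder.raw_decode` parse exactly one JSON value from that position, so the rest of the response does not need to be parsed or formatted.

```python
import json


def extract_identify_results(raw: str) -> list:
    """Return the list stored under "identify_results" in a raw response string."""
    key_pos = raw.find('"identify_results"')
    if key_pos == -1:
        raise ValueError("identify_results not found in response")
    start = raw.find("[", key_pos)
    if start == -1:
        raise ValueError("no list found after identify_results")
    # raw_decode parses one JSON value beginning at `start` and ignores
    # whatever follows it, so the closing "}]" is found for us.
    results, _end = json.JSONDecoder().raw_decode(raw, start)
    return results


# Hypothetical envelope: only "identify_results" is taken from the real response.
sample = (
    '{"code": 0, "data": {"identify_results": '
    '[{"type": "10503", "details": {"total": "151.00"}}]}}'
)

if __name__ == "__main__":
    for item in extract_identify_results(sample):
        print(item["type"], item["details"]["total"])
```

Because the parser stops at the end of the list, this also tolerates text after the list that would otherwise break a whole-document parse.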
Digital e-invoice train ticket (数电火车票)
Response:
[{
"details": {
"number": "25119110025000063959",
"date": "2025年01月24日",
"time": "21:35",
"name": "楚世云",
"station_geton": "北京丰台",
"station_getoff": "石家庄",
"train_number": "G6739",
"seat": "二等座",
"total": "151.00",
"kind": "交通",
"serial_number": "1002556086011699730102025",
"user_id": "1305281988****7839",
"seat_number": "08车05D号",
"electronic_mark": "1",
"type_of_business": "售",
"buyer": "楚世云",
"type_of_voucher": "电子发票(铁路电子客票)",
"date_of_issue": "2025年02月26日",
"phonics_of_departure_station": "Beijingfengtai",
"phonics_of_destination_station": "Shijiazhuang",
"voucher_mark": "1"
},
"extra": {
"qrcode": ["01,51,,25119110025000063959,151.00,20250226,,75ff"]
},
"orientation": 0,
"region": [0, 0, 924, 615],
"image_size": [924, 615],
"page": 0,
"type": "10503"
}]
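Notes:
Digital e-invoice train ticket
date # usually this field is the invoice issue date, but here it is the travel date
date_of_issue # this is the invoice issue date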
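A small sketch of how the date quirk above might be handled when reading the invoice issue date. The helper names are hypothetical, and treating type "10503" as the train ticket type is an assumption based only on the sample above. The date format string matches the values shown in the response ("2025年02月26日").

```python
from datetime import date, datetime

DATE_FORMAT = "%Y年%m月%d日"   # format of the date strings in the sample response
TRAIN_TICKET_TYPE = "10503"    # assumption: type code seen in the train ticket sample


def parse_cn_date(text: str) -> date:
    """Parse a date such as "2025年02月26日"."""
    return datetime.strptime(text, DATE_FORMAT).date()


def invoice_issue_date(item: dict) -> date:
    """Pick the invoice issue date from one identify_results entry."""
    details = item["details"]
    # For train tickets, "date" is the travel date, so prefer "date_of_issue".
    if item.get("type") == TRAIN_TICKET_TYPE and details.get("date_of_issue"):
        return parse_cn_date(details["date_of_issue"])
    return parse_cn_date(details["date"])


ticket = {
    "type": "10503",
    "details": {"date": "2025年01月24日", "date_of_issue": "2025年02月26日"},
}

if __name__ == "__main__":
    print(invoice_issue_date(ticket))
```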
Air ticket (electronic itinerary)
Response:
[{
"details": {
"user_name": "张三",
"user_id": "220204********1512",
"number": "8362484282352",
"check_code": "5629",
"date": "2025年05月25日",
"agentcode": "SJ778/83631195",
"issue_by": "北京我遥我控科技有限公司",
"fare": "651.38",
"tax": "60.27",
"fuel_surcharge": "18.35",
"caac_development_fund": "50.00",
"insurance": "XXX",
"total": "780.00",
"flights": [{
"from": "重庆江北T3",
"to": "北京大兴",
"flight_number": "NS8036",
"date": "2025年05月24日",
"time": "20:15",
"seat": "T",
"carrier": "河北",
"class_name": "经济舱",
"allow": "20K",
"fare_basis": "T",
"not_valid_before": "",
"not_valid_after": "",
"flight_segment": "1"
}],
"kind": "交通",
"international_flag": "国内",
"endorsement": "不得签转/更改退票收费",
"electronic_mark": "1",
"issuing_status": "正常",
"qrcode": "01,61,,25138836112011770001,780.00,20250525,,89E5",
"receipt_number": "25138836112011770001",
"prompt_information": "",
"buyer": "测试 公司",
"buyer_tax_id": "9111010833980001",
"tax_rate": "9%",
"other_taxes": "0.00",
"seller": "",
"voucher_mark": "1",
"type_of_business": "售"
},
"extra": {
"check_code_candidates": ["5629"]
},
"orientation": 0,
"region": [0, 0, 1131, 658],
"image_size": [1131, 658],
"page": 0,
"type": "10506"
}]
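Notes:
Electronic itinerary
The invoice number is the receipt_number field.
The amount excluding tax is the fare field.
The check code (check_code) is only 4 digits. Fortunately, invoice verification and similar calls do not need this check code.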
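The field mapping in these notes could be captured in a small helper like the one below. The function names and output keys are hypothetical. The sanity check relies on an observation from this one sample only, where fare, tax, fuel_surcharge and caac_development_fund add up to total. That is not a documented rule.

```python
from decimal import Decimal


def summarize_itinerary(item: dict) -> dict:
    """Map electronic itinerary fields to the names used in the notes above."""
    d = item["details"]
    return {
        "invoice_number": d["receipt_number"],    # invoice number
        "amount_excl_tax": Decimal(d["fare"]),    # amount excluding tax
        "total": Decimal(d["total"]),
    }


def components_match_total(item: dict) -> bool:
    """Check whether the four amount components add up to total (seen in the sample)."""
    d = item["details"]
    parts = ("fare", "tax", "fuel_surcharge", "caac_development_fund")
    return sum(Decimal(d[p]) for p in parts) == Decimal(d["total"])


itinerary = {
    "type": "10506",
    "details": {
        "receipt_number": "25138836112011770001",
        "fare": "651.38",
        "tax": "60.27",
        "fuel_surcharge": "18.35",
        "caac_development_fund": "50.00",
        "total": "780.00",
    },
}

if __name__ == "__main__":
    print(summarize_itinerary(itinerary))
    print(components_match_total(itinerary))
```

Decimal is used instead of float so that amounts such as 651.38 compare exactly.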