๐Ÿ” OCR Module Guide

The OCR module calls the Baidu AI Cloud API and provides features such as recognizing VAT invoices and exporting the results to Excel.

Core Scenarios

Recognize a Single Invoice

import office

office.ocr.VatInvoiceOCR2Excel(
    input_path='./invoice_001.jpg',
    output_path='./output/'
)

Batch-Recognize a Folder

office.ocr.VatInvoiceOCR2Excel(
    input_path='./all_invoices/',
    output_path='./output/',
    output_excel='monthly_invoice_summary.xlsx',
    file_name=True
)
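
Before pointing a batch run at a folder, it can help to see which files it contains. The helper below is a minimal standard-library sketch, not part of the office package: the name list_invoice_files and the extension set are hypothetical, and the file types the OCR call actually accepts are not stated on this page.

```python
from pathlib import Path

# Assumed image extensions; adjust to the formats your invoices actually use.
INVOICE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def list_invoice_files(folder, extensions=INVOICE_EXTENSIONS):
    """Return files directly inside `folder` whose suffix is in `extensions`, sorted by path."""
    root = Path(folder)
    if not root.is_dir():
        raise NotADirectoryError(f"{folder!r} is not a directory")
    return sorted(
        p for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in extensions
    )


# Example (hypothetical folder name):
# for path in list_invoice_files("./all_invoices/"):
#     print(path.name)
```

The check is non-recursive and case-insensitive on the suffix, so a subfolder or a stray text file would not be counted.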

Recognize an Image from a URL

office.ocr.VatInvoiceOCR2Excel(
    img_url='https://example.com/invoice.jpg',
    output_path='./output/'
)

Recognition Result

The generated Excel file automatically includes the following fields:

  • Invoice code, invoice number, issue date
  • Seller / buyer info (name, tax ID)
  • Amount, tax amount, tax-exclusive amount, tax rate
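
The amount fields listed above are normally related by simple arithmetic, which gives a cheap sanity check on a recognized row. The sketch below assumes that "amount" is the tax-inclusive total and that the tax rate is a fraction such as 0.13. Both are assumptions, as are the function and parameter names, and the actual column names in the Excel file may differ.

```python
from decimal import Decimal


def check_invoice_amounts(tax_exclusive, tax_rate, tax, total, tolerance="0.01"):
    """Return a list of problems found; an empty list means the amounts are consistent.

    Assumed relationships (not confirmed by the OCR module):
        tax   ~= tax_exclusive * tax_rate
        total ~= tax_exclusive + tax
    """
    net = Decimal(str(tax_exclusive))
    rate = Decimal(str(tax_rate))
    tax_amount = Decimal(str(tax))
    gross = Decimal(str(total))
    tol = Decimal(str(tolerance))

    problems = []
    if abs(net * rate - tax_amount) > tol:
        problems.append(f"tax {tax_amount} does not match {net} x {rate}")
    if abs(net + tax_amount - gross) > tol:
        problems.append(f"total {gross} does not match {net} + {tax_amount}")
    return problems


# Example with made-up numbers:
# check_invoice_amounts("100.00", "0.13", "13.00", "113.00")
```

Decimal avoids binary floating-point rounding on currency values. Invoices with many line items may round each line separately, so a larger tolerance may be needed there.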

Configuring the Baidu OCR API

  1. Visit Baidu AI Cloud and register an account
  2. Create a "Text Recognition OCR" application
  3. Get the API Key and Secret Key
  4. Pass the credentials when calling the function:

office.ocr.VatInvoiceOCR2Excel(
    input_path='./invoices/',
    id='your_api_id',
    key='your_api_secret'
)
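
To keep credentials out of source files, they can be read from environment variables and passed through the id and key parameters shown above. This is a minimal sketch: the helper name and the variable names BAIDU_OCR_API_KEY and BAIDU_OCR_SECRET_KEY are hypothetical, and mapping the API Key to id and the Secret Key to key is an assumption based on the placeholders in the example.

```python
import os


def load_ocr_credentials(id_var="BAIDU_OCR_API_KEY",
                         key_var="BAIDU_OCR_SECRET_KEY",
                         env=None):
    """Read OCR credentials from environment variables (names are hypothetical).

    Returns a dict using the `id` / `key` parameter names from the example above.
    """
    env = os.environ if env is None else env
    missing = [name for name in (id_var, key_var) if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variable(s): " + ", ".join(missing))
    return {"id": env[id_var], "key": env[key_var]}


# Possible usage, assuming both variables are set in the shell:
# import office
# creds = load_ocr_credentials()
# office.ocr.VatInvoiceOCR2Excel(input_path='./invoices/', **creds)
```

Failing early with the names of the missing variables is easier to debug than an authentication error returned by the remote API.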

For the full API, see the OCR API Reference.
