What is PaddleOCR?

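PaddleOCR is a comprehensive OCR toolkit with several specialized pipelines for advanced document understanding: general text recognition (PaddleOCREngine), document structure recognition (PPStructureV3Engine), chat-based document understanding (PPChatOCRv4Engine), and vision-language document understanding (PaddleOCRVLEngine).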

Usage

from upsonic.ocr import OCR
from upsonic.ocr.layer_1.engines import (
    PaddleOCREngine, PPStructureV3Engine,
    PPChatOCRv4Engine, PaddleOCRVLEngine
)
# Also available: from upsonic.ocr import PaddleOCREngine, PPStructureV3Engine, ...

# General OCR (PP-OCRv5)
engine = PaddleOCREngine(lang='en', ocr_version='PP-OCRv5')
ocr = OCR(layer_1_ocr_engine=engine)
text = ocr.get_text('document.pdf')

# Advanced document structure recognition
engine_structure = PPStructureV3Engine(
    use_table_recognition=True,
    use_formula_recognition=True
)
ocr_structure = OCR(layer_1_ocr_engine=engine_structure)
result = ocr_structure.process_file('research_paper.pdf')

# Chat-based document understanding
engine_chat = PPChatOCRv4Engine(
    use_table_recognition=True,
    use_seal_recognition=True
)
ocr_chat = OCR(layer_1_ocr_engine=engine_chat)

# Vision-Language document understanding
engine_vl = PaddleOCRVLEngine(
    use_layout_detection=True,
    use_chart_recognition=True,
    format_block_content=True
)
ocr_vl = OCR(layer_1_ocr_engine=engine_vl)

PaddleOCREngine (General OCR)

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| lang | str | 'en' | Language code |
| ocr_version | str | 'PP-OCRv5' | OCR version ('PP-OCRv3', 'PP-OCRv4', 'PP-OCRv5') |
| use_doc_orientation_classify | bool | None | Enable document orientation classification |
| use_doc_unwarping | bool | None | Enable document unwarping |
| use_textline_orientation | bool | None | Enable text line orientation detection |
| text_det_limit_side_len | int | None | Limit on detection input side length |
| text_rec_score_thresh | float | None | Text recognition score threshold |
| return_word_box | bool | None | Return word-level bounding boxes |
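As a minimal sketch of how the parameters above might be assembled, the helper below validates `ocr_version` against the three values listed in the table and drops options left as `None`. The names `build_paddleocr_kwargs` and `make_ocr` are hypothetical and not part of Upsonic, and treating `None` as "defer to the engine's own default" is an assumption inferred from the Default column, not documented behavior.

```python
from typing import Any, Dict

# Versions listed in the parameter table above.
SUPPORTED_OCR_VERSIONS = ("PP-OCRv3", "PP-OCRv4", "PP-OCRv5")


def build_paddleocr_kwargs(
    lang: str = "en",
    ocr_version: str = "PP-OCRv5",
    **optional: Any,
) -> Dict[str, Any]:
    """Hypothetical helper: collect keyword arguments for PaddleOCREngine."""
    if ocr_version not in SUPPORTED_OCR_VERSIONS:
        raise ValueError(
            f"ocr_version must be one of {SUPPORTED_OCR_VERSIONS}, got {ocr_version!r}"
        )
    kwargs: Dict[str, Any] = {"lang": lang, "ocr_version": ocr_version}
    # Assumption: None means "use the engine's default", so unset options are omitted.
    kwargs.update({name: value for name, value in optional.items() if value is not None})
    return kwargs


def make_ocr(kwargs: Dict[str, Any]):
    """Hypothetical helper: build an OCR instance; requires upsonic to be installed."""
    from upsonic.ocr import OCR, PaddleOCREngine

    return OCR(layer_1_ocr_engine=PaddleOCREngine(**kwargs))


kwargs = build_paddleocr_kwargs(
    ocr_version="PP-OCRv4",
    use_doc_orientation_classify=True,
    text_rec_score_thresh=0.5,
    return_word_box=None,  # left unset, so it is dropped from the result
)
```

With Upsonic installed, `make_ocr(kwargs).get_text('document.pdf')` would then follow the same pattern as the usage example at the top of this page.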

PPStructureV3Engine (Document Structure)

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| use_table_recognition | bool | None | Enable table recognition |
| use_formula_recognition | bool | None | Enable formula recognition |
| use_seal_recognition | bool | None | Enable seal text recognition |
| use_chart_recognition | bool | None | Enable chart recognition |
| layout_threshold | float | None | Layout detection score threshold |
| lang | str | 'en' | Language code |

PPChatOCRv4Engine (Chat-based OCR)

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| use_table_recognition | bool | None | Enable table recognition |
| use_seal_recognition | bool | None | Enable seal recognition |
| mllm_chat_bot_config | dict | None | Multimodal LLM configuration |
| retriever_config | dict | None | Retriever configuration for vector search |

PaddleOCRVLEngine (Vision-Language)

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| use_layout_detection | bool | None | Enable layout detection |
| use_chart_recognition | bool | None | Enable chart recognition |
| format_block_content | bool | None | Format content as Markdown |
| vl_rec_backend | str | 'local' | VL recognition backend |
| temperature | float | None | Sampling temperature for VLM |
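To summarize which of the four engines fits which job, the sketch below maps a task label to an engine class name. The task labels and the `pick_engine_name` function are hypothetical, chosen only for illustration; the engine names and their roles come from the section headings on this page.

```python
# Task labels are hypothetical; engine names come from the sections above.
ENGINE_FOR_TASK = {
    "general": "PaddleOCREngine",               # plain text extraction
    "structure": "PPStructureV3Engine",         # tables, formulas, seals, charts, layout
    "chat": "PPChatOCRv4Engine",                # chat-based document understanding
    "vision_language": "PaddleOCRVLEngine",     # vision-language document understanding
}


def pick_engine_name(task: str) -> str:
    """Return the engine class name suggested for a task label."""
    try:
        return ENGINE_FOR_TASK[task]
    except KeyError:
        valid = ", ".join(sorted(ENGINE_FOR_TASK))
        raise ValueError(f"unknown task {task!r}; expected one of: {valid}") from None


engine_name = pick_engine_name("structure")
```

The returned name can be resolved to the actual class with one of the imports shown in the Usage section, for example `from upsonic.ocr import PPStructureV3Engine`.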

Supported Languages

PP-OCRv5 supports 40+ languages. PP-OCRv3 offers extensive coverage of Asian, European, and Middle Eastern languages.