Attributes

OCR Orchestrator Parameters

The OCR orchestrator accepts the following parameters:

Parameter	Type	Default	Description
`layer_1_ocr_engine`	Engine instance	—	The configured Layer 1 engine instance
`layer_1_timeout`	float | None	`None`	Timeout in seconds for Layer 1 processing. Raises `OCRTimeoutError` on expiry
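To make the timeout semantics concrete, here is a minimal, library-independent sketch of the behavior the table describes: a job that runs longer than the timeout raises an error, and `None` means no limit. The `OCRTimeoutError` class below is a local stand-in (the real class's import path is not shown on this page), and `slow_layer_1` and `run_with_timeout` are hypothetical names used only for illustration. The orchestrator's actual implementation may differ.

```python
import asyncio
from typing import Optional


class OCRTimeoutError(Exception):
    """Local stand-in for the library's OCRTimeoutError (assumption)."""


async def slow_layer_1(delay: float) -> str:
    """Hypothetical stand-in for Layer 1 processing that takes `delay` seconds."""
    await asyncio.sleep(delay)
    return "extracted text"


async def run_with_timeout(delay: float, timeout: Optional[float]) -> str:
    """Run the stand-in job; timeout=None waits indefinitely."""
    try:
        return await asyncio.wait_for(slow_layer_1(delay), timeout=timeout)
    except asyncio.TimeoutError:
        # Translate the asyncio timeout into the OCR-specific error
        raise OCRTimeoutError(f"Layer 1 exceeded {timeout} s") from None


# A fast job finishes within the limit
print(asyncio.run(run_with_timeout(delay=0.01, timeout=1.0)))

# A slow job hits the limit and raises
try:
    asyncio.run(run_with_timeout(delay=0.5, timeout=0.05))
except OCRTimeoutError as exc:
    print(f"Timed out: {exc}")
```

In application code, the same `try`/`except` pattern would wrap the call to the orchestrator, so that a slow document can be retried or skipped rather than blocking the pipeline.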

Engine Configuration

Each engine is instantiated with its own parameters. Common parameters shared across most engines:

Parameter	Type	Default	Description
`languages`	List[str]	`['en']`	Languages to detect
`confidence_threshold`	float	`0.0`	Minimum confidence threshold (0.0-1.0) for accepting OCR results
`rotation_fix`	bool	`False`	Enable automatic rotation correction for skewed images
`enhance_contrast`	bool	`False`	Enhance image contrast before OCR processing
`remove_noise`	bool	`False`	Apply noise reduction filter to improve text clarity
`pdf_dpi`	int	`300`	DPI resolution for PDF rendering (higher = better quality, slower)
`preserve_formatting`	bool	`True`	Try to preserve text formatting (line breaks, spacing)

Each engine may have additional provider-specific parameters. See the individual engine pages for details.
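As an illustration of what `confidence_threshold` means, the sketch below filters a list of recognized text fragments by confidence score. It is not the library's code: `filter_by_confidence` and the sample data are hypothetical, and it assumes that a score equal to the threshold is accepted, which the parameter description ("minimum") suggests but does not confirm.

```python
from typing import List, Tuple

# Each OCR result is modeled as (text, confidence); this shape is an assumption
OCRResult = Tuple[str, float]


def filter_by_confidence(
    results: List[OCRResult], confidence_threshold: float = 0.0
) -> List[OCRResult]:
    """Keep only results whose confidence is at least the threshold."""
    if not 0.0 <= confidence_threshold <= 1.0:
        raise ValueError("confidence_threshold must be between 0.0 and 1.0")
    return [(text, conf) for text, conf in results if conf >= confidence_threshold]


# Hypothetical sample data, for illustration only
sample = [("Invoice", 0.98), ("T0tal", 0.41), ("2024-01-15", 0.87)]

# The default of 0.0 accepts everything
print(filter_by_confidence(sample))

# A threshold of 0.6 drops the low-confidence fragment
print(filter_by_confidence(sample, confidence_threshold=0.6))
```

A higher threshold trades recall for precision: fewer misread fragments reach the output, but faint or unusual text may be dropped entirely.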

Configuration Example

from upsonic.ocr import OCR
from upsonic.ocr.layer_1.engines import EasyOCREngine

# Create engine with full configuration
engine = EasyOCREngine(
    languages=['en'],
    confidence_threshold=0.6,
    rotation_fix=True,
    enhance_contrast=True,
    remove_noise=True,
    pdf_dpi=300,
    gpu=True
)

# Create orchestrator with engine and timeout
ocr = OCR(layer_1_ocr_engine=engine, layer_1_timeout=120)

# Extract text from document
text = ocr.get_text('document.pdf')
print(text)
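The example above renders PDF pages at `pdf_dpi=300`. To get a feel for the quality/speed trade-off, the following sketch computes the pixel size of a rendered page from its size in PDF points (72 points per inch). `rendered_size` is a hypothetical helper written for this page, not part of the library, and the engine's actual rendering path may round differently.

```python
from typing import Tuple

POINTS_PER_INCH = 72  # PDF user-space unit: 1 point = 1/72 inch


def rendered_size(width_pt: float, height_pt: float, dpi: int) -> Tuple[int, int]:
    """Return the (width, height) in pixels of a page rendered at `dpi`."""
    width_px = round(width_pt * dpi / POINTS_PER_INCH)
    height_px = round(height_pt * dpi / POINTS_PER_INCH)
    return width_px, height_px


# US Letter is 612 x 792 points (8.5 x 11 inches)
w300, h300 = rendered_size(612, 792, dpi=300)
w150, h150 = rendered_size(612, 792, dpi=150)

print(w300, h300)  # 2550 3300
print(w150, h150)  # 1275 1650

# Doubling the DPI quadruples the number of pixels to process
print((w300 * h300) / (w150 * h150))  # 4.0
```

Because pixel count grows with the square of the DPI, raising `pdf_dpi` well above 300 usually costs much more processing time than it gains in accuracy for ordinary printed text.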

Async Usage

import asyncio
from upsonic.ocr import OCR
from upsonic.ocr.layer_1.engines import EasyOCREngine

engine = EasyOCREngine(languages=['en'], gpu=True)
ocr = OCR(layer_1_ocr_engine=engine, layer_1_timeout=30.0)

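async def main():
    # Extract plain text from the document
    text = await ocr.get_text_async('document.pdf')
    print(text)

    # Process the same file and keep the returned result
    result = await ocr.process_file_async('document.pdf')
    print(result)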

asyncio.run(main())
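One common reason to use the async methods is to work on several documents at once. The sketch below shows the general `asyncio.gather` pattern with a stand-in coroutine, `fake_get_text_async`, in place of `ocr.get_text_async`, so that it runs without the library installed. Whether a single engine instance can safely serve concurrent calls is not stated on this page, so treat this as an assumption to verify before relying on it.

```python
import asyncio
from typing import List


async def fake_get_text_async(path: str) -> str:
    """Hypothetical stand-in for ocr.get_text_async(path)."""
    await asyncio.sleep(0.01)  # simulate OCR work
    return f"text from {path}"


async def extract_all(paths: List[str]) -> List[str]:
    """Start one extraction per path and wait for all of them.

    asyncio.gather returns results in the same order as the inputs.
    """
    return await asyncio.gather(*(fake_get_text_async(p) for p in paths))


texts = asyncio.run(extract_all(["a.pdf", "b.pdf", "c.pdf"]))
for text in texts:
    print(text)
```

With the real orchestrator, `fake_get_text_async(p)` would be replaced by `ocr.get_text_async(p)`.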
