Research
Multi-pass Context-Aware Processing
Agentic OCR: Beyond Traditional Document Processing
A multi-pass, context-aware approach to document processing that analyzes both the structure and the meaning of each document.
Deep dives into our core foundations: Agentic OCR, privacy-first PII detection, and comprehensive Document AI tools. Discover how we're solving the biggest bottlenecks in enterprise AI adoption.
Multi-Task Document Understanding
Open-source vision language model trained with reinforcement learning to understand and reason about documents, handling math, LaTeX OCR, invoices, and handwriting.
Multi-Domain OCR
Unified OCR solution that handles mathematical formulas, handwritten notes, and structured invoices in a single model built on the Qwen2.5-VL architecture.
Legal Document OCR
Specialized vision-language model fine-tuned for legal document processing, trained on synthetic documents from the Caselaw Access Project.
Privacy-First Summarization
Fine-tuned document intelligence model built on IBM Granite that is fast, runs locally, and keeps documents private through on-premises processing.
Privacy-First PII Detection
Lightweight, accurate PII detection model fine-tuned from IBM Granite 4.0 that identifies and tags 20 categories of sensitive information.