Cernis Intelligence

Research & Insights

Deep dives into our core foundations: Agentic OCR, privacy-first PII detection, and comprehensive Document AI tools. Discover how we're solving the biggest bottlenecks in enterprise AI adoption.

Studio

No Code Required

Day 7: Document AI Studio - Intelligent Document Processing with State-of-the-Art AI

Production-grade document intelligence accessible to everyone. All six primitives in a visual playground - no code required, no setup, no infrastructure to manage.

Dec 7, 2025 | Engineering

Edit Mode

Cursor for PDFs

Day 6: Building Cursor for PDFs - Edit Mode for Documents

What if you could edit PDFs like you edit code? We're building Cursor for PDFs - upload a form, paste your context, and let AI auto-fill intelligently.

Dec 6, 2025 | Engineering

Specialized

Task-Specific Models Win

Day 5: Why Specialized Models Matter - Building Task and Domain-Specific Intelligence

Why specialized models beat general-purpose frontier models on privacy, cost, and domain accuracy. Deep dive into Sentinel-PII, Precis, and Cernis-Legal-OCR.

Dec 5, 2025 | Engineering

Agentic

Human-Like Document Processing

Day 4: Agentic OCR - Processing Documents The Way Humans Do

Why do OCR systems extract text perfectly yet fail to understand what they're reading? Agentic OCR processes documents the way humans do, combining layout analysis, dual OCR, and structured output.

Dec 4, 2025 | Engineering

Scale

1M+ Documents/Day Architecture

Day 3: Building a Document Processing Pipeline That Scales to 1M+ Documents/Day

How we built document AI infrastructure capable of handling 1 million documents per day through component isolation, stateless workers, and tiered GPU compute.

Dec 3, 2025 | Engineering

Training

Consumer Hardware, Big Results

Day 2: Training and Deploying SOTA Document Intelligence Models on a Budget

How we built production-grade OCR and reasoning models using consumer hardware, efficient fine-tuning, and reinforcement learning for under $50 in GPU time.

Dec 2, 2025 | Engineering

SDK

Six Core Primitives

Day 1: Creating a Document AI SDK Users Actually Want

Six primitives that transform unstructured documents into production-ready data. Deep dive into ocr(), extract(), classify(), summarize(), chunk(), and count_tokens().

Dec 1, 2025 | Engineering

New Model

Privacy-First PII Detection

Announcing Sentinel-PII: A Fast, Accurate PII Detection Model

Lightweight, accurate PII detection model fine-tuned from IBM Granite 4.0 that identifies and tags 20 categories of sensitive information.

Nov 2, 2025 | Announcement

New Model

Multi-Task Document Understanding

Introducing Cernis-Thinking: A Multi-Task Vision Language Model

Open-source vision language model trained with reinforcement learning to understand and reason about documents, handling math, LaTeX OCR, invoices, and handwriting.

Sep 15, 2025 | Announcement

New Model

Multi-Domain OCR

Announcing CernisOCR: A Faster, Lighter Multi-Domain OCR Model

Unified OCR solution that handles mathematical formulas, handwritten notes, and structured invoices in a single model built on the Qwen2.5-VL architecture.

Sep 12, 2025 | Announcement

New Model

Legal Document OCR

Building Cernis-Legal-OCR: A Specialized Vision Model for Legal Documents

Specialized vision-language model fine-tuned for legal document processing, trained on synthetic documents from the Caselaw Access Project.

Sep 10, 2025 | Announcement

New Model

Privacy-First Summarization

Introducing Precis: A Fast, Privacy-Focused Document Summarization Model

Fine-tuned document intelligence model built on IBM Granite. It is fast, runs locally, and keeps documents private for on-premise processing.

Sep 8, 2025 | Announcement