Mistral launches OCR 4, turning document extraction into a full enterprise AI play
Mistral AI on Tuesday released OCR 4, a document intelligence model that moves beyond raw text extraction to return structured representations of entire documents, complete with bounding boxes, block-type classification, and per-word confidence scores. The release marks Mistral's fourth generation of optical character recognition technology in roughly 15 months and lands at a moment when the company's pitch for European AI sovereignty has never been more commercially relevant.

The model supports 170 languages across 10 language groups, accepts PDF, DOC, PPT, and OpenDocument formats, and can be deployed as a single container on an organization's own infrastructure, a capability Mistral is positioning directly at enterprises in regulated industries that cannot route sensitive documents through U.S.-jurisdiction cloud APIs.

"Mistral OCR 4 extracts and structures content from a wide range of documents," the company said in its announcement. "Where previous generations focused on converting a page into clean text and tables, OCR 4 returns a structured representation of the document."

The model is…
Full story on VentureBeat