OCR Studio
Extract text from uploaded images, scanned notes, screenshots, or posters using Tesseract.js, then inspect the recognized text and optional word or line boxes.
How OCR fits computer vision
OCR combines image preprocessing with text recognition. Before the recognizer can read letters, the image often needs contrast adjustment, grayscale conversion, thresholding, or scaling.
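As a rough illustration of two of the preprocessing steps named above, the sketch below converts RGBA pixels to grayscale and then applies a fixed threshold. It assumes the pixel layout used by canvas `ImageData` (four bytes per pixel); the function names `toGrayscale` and `threshold` and the cutoff value are hypothetical and are not taken from this page's code.

```typescript
// Minimal sketch: grayscale conversion followed by a fixed binary threshold.
// Assumes RGBA byte order, 4 bytes per pixel (the canvas ImageData layout).
function toGrayscale(rgba: Uint8ClampedArray): Uint8ClampedArray {
  const gray = new Uint8ClampedArray(rgba.length / 4);
  for (let i = 0; i < gray.length; i++) {
    const r = rgba[i * 4];
    const g = rgba[i * 4 + 1];
    const b = rgba[i * 4 + 2];
    // Rec. 601 luma weights; the alpha channel is ignored.
    gray[i] = Math.round(0.299 * r + 0.587 * g + 0.114 * b);
  }
  return gray;
}

// Pixels at or above the cutoff become white (255), everything else black (0).
function threshold(gray: Uint8ClampedArray, cutoff: number): Uint8ClampedArray {
  const out = new Uint8ClampedArray(gray.length);
  for (let i = 0; i < gray.length; i++) {
    out[i] = gray[i] >= cutoff ? 255 : 0;
  }
  return out;
}

// Usage: one white, one black, and one pure red pixel.
const samplePixels = new Uint8ClampedArray([
  255, 255, 255, 255,
  0, 0, 0, 255,
  255, 0, 0, 255,
]);
const sampleGray = toGrayscale(samplePixels);
const sampleBinary = threshold(sampleGray, 128);
```

A fixed cutoff is the simplest choice; adaptive methods that pick the cutoff per image or per region are common alternatives for unevenly lit scans.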
What this page shows
Students can upload notes, screenshots, posters, or scanned pages, then compare the extracted text with optional word or line boxes drawn over the original image.
Why bounding boxes matter
Document analysis is not only about the final text string. Word and line boxes reveal where the recognizer found text regions and make OCR errors much easier to diagnose.
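One way boxes can support error diagnosis is to pick out the words the recognizer was least sure about, so they can be highlighted on the image. The sketch below assumes each word carries its text, a confidence score from 0 to 100, and a box with `x0`, `y0`, `x1`, `y1` corners. That shape is modeled on typical OCR word output and is an assumption here; `flagLowConfidence` and the cutoff of 60 are hypothetical.

```typescript
// Assumed word shape: text, a 0-100 confidence score, and a corner-based box.
interface BBox { x0: number; y0: number; x1: number; y1: number }
interface OcrWord { text: string; confidence: number; bbox: BBox }

// Return the words whose confidence falls below the given cutoff.
function flagLowConfidence(words: OcrWord[], minConfidence: number): OcrWord[] {
  return words.filter((w) => w.confidence < minConfidence);
}

// Usage with made-up sample data (not real OCR output).
const sampleWords: OcrWord[] = [
  { text: "Invoice", confidence: 96, bbox: { x0: 10, y0: 10, x1: 90, y1: 30 } },
  { text: "t0tal", confidence: 41, bbox: { x0: 10, y0: 40, x1: 60, y1: 60 } },
  { text: "due", confidence: 88, bbox: { x0: 70, y0: 40, x1: 100, y1: 60 } },
];
const suspectWords = flagLowConfidence(sampleWords, 60);
```

The boxes of the flagged words show where on the page to look, which is usually faster than scanning the whole text string for mistakes.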
Visual Overlay
Word boxes are useful when students want to inspect local OCR mistakes. Line boxes are useful when students want to study document layout and reading order.
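The relationship between word boxes and line boxes can be sketched as a union: a line box is the smallest rectangle that covers all of its word boxes. The example below assumes corner-based boxes (`x0`, `y0`, `x1`, `y1`) in image pixels; `unionBox` and `sortTopToBottom` are hypothetical helpers, and sorting by top edge is only a rough stand-in for reading order on single-column pages.

```typescript
// Assumed box shape: top-left (x0, y0) and bottom-right (x1, y1) in pixels.
interface BBox { x0: number; y0: number; x1: number; y1: number }

// Smallest box covering every input box, or null when there are none.
function unionBox(boxes: BBox[]): BBox | null {
  if (boxes.length === 0) return null;
  return boxes.reduce((acc, b) => ({
    x0: Math.min(acc.x0, b.x0),
    y0: Math.min(acc.y0, b.y0),
    x1: Math.max(acc.x1, b.x1),
    y1: Math.max(acc.y1, b.y1),
  }));
}

// Order boxes top to bottom, then left to right (returns a new array).
function sortTopToBottom(lines: BBox[]): BBox[] {
  return [...lines].sort((a, b) => a.y0 - b.y0 || a.x0 - b.x0);
}

// Usage with made-up word boxes from one line of text.
const lineBox = unionBox([
  { x0: 10, y0: 20, x1: 50, y1: 40 },
  { x0: 60, y0: 18, x1: 120, y1: 42 },
]);
const orderedLines = sortTopToBottom([
  { x0: 0, y0: 100, x1: 200, y1: 120 },
  { x0: 0, y0: 10, x1: 200, y1: 30 },
]);
```

Multi-column layouts need more than a top-edge sort, which is one reason line boxes are worth inspecting when studying reading order.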
Preprocessing Controls
OCR Results
Extracted text will appear here after OCR runs.
Stats
Words: 0
Lines: 0
Characters: 0
Avg confidence: n/a
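The four statistics above could be derived from the recognized text and the per-word confidence scores roughly as follows. This is a sketch under stated assumptions: how this page actually counts words, lines, and characters is not documented here, so the rules below (whitespace-separated words, non-empty lines, characters of the trimmed text) and the name `computeStats` are hypothetical.

```typescript
interface OcrStats {
  words: number;
  lines: number;
  characters: number;
  avgConfidence: number | null; // null when there is nothing to average
}

// Assumed counting rules: whitespace-separated words, non-empty lines,
// and the character count of the trimmed text.
function computeStats(text: string, wordConfidences: number[]): OcrStats {
  const trimmed = text.trim();
  const words = trimmed === "" ? 0 : trimmed.split(/\s+/).length;
  const lines = trimmed === ""
    ? 0
    : trimmed.split(/\r?\n/).filter((line) => line.trim() !== "").length;
  const avgConfidence = wordConfidences.length === 0
    ? null
    : wordConfidences.reduce((sum, c) => sum + c, 0) / wordConfidences.length;
  return { words, lines, characters: trimmed.length, avgConfidence };
}

// Usage with made-up text and confidence scores.
const sampleStats = computeStats("Hello world\nsecond line", [90, 80, 70, 60]);
const emptyStats = computeStats("", []);
```

Returning `null` for the average when no words were recognized keeps the empty state distinct from a genuine score of zero.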
Teaching Notes
- Grayscale conversion and contrast adjustment help separate text from the background before recognition begins.
- Thresholding can help clean high-contrast receipts or scans, but it can also destroy faint handwriting if used too aggressively.
- Scaling up small images often improves OCR because letter shapes become easier for the recognizer to distinguish.
- Bounding boxes help students compare recognition output with the actual text layout on the page.
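The scaling note above can be illustrated with a minimal upscaling sketch on a grayscale image stored as one byte per pixel in row-major order. Nearest-neighbor sampling is used only because it keeps the example short; smoother interpolation is usually preferable for OCR. The name `upscaleNearest` and the integer-factor restriction are assumptions made for this illustration.

```typescript
interface GrayImage { data: Uint8ClampedArray; width: number; height: number }

// Enlarge a grayscale image by an integer factor using nearest-neighbor sampling.
function upscaleNearest(src: GrayImage, factor: number): GrayImage {
  if (!Number.isInteger(factor) || factor < 1) {
    throw new Error("factor must be a positive integer");
  }
  if (src.data.length !== src.width * src.height) {
    throw new Error("data length does not match width * height");
  }
  const width = src.width * factor;
  const height = src.height * factor;
  const data = new Uint8ClampedArray(width * height);
  for (let y = 0; y < height; y++) {
    const srcY = Math.floor(y / factor);
    for (let x = 0; x < width; x++) {
      const srcX = Math.floor(x / factor);
      // Each output pixel copies the source pixel it falls inside.
      data[y * width + x] = src.data[srcY * src.width + srcX];
    }
  }
  return { data, width, height };
}

// Usage: a 2x1 image (black, white) doubled to 4x2.
const tinyImage: GrayImage = { data: new Uint8ClampedArray([0, 255]), width: 2, height: 1 };
const enlarged = upscaleNearest(tinyImage, 2);
```

Upscaling adds no new detail; it only gives each stroke more pixels, which can make small letter shapes easier to tell apart.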