MuhammadLab
Computer Vision · Browser-based · No API required · Pixel-level prediction · Educational demo

Image Segmentation Studio

Explore how machine learning models divide images into meaningful regions by predicting labels at the pixel level.

This tool demonstrates image segmentation. Instead of only saying what is in an image, segmentation shows where different regions are by creating masks.

What is Image Segmentation?

Image segmentation is a computer vision task where a model assigns a label to pixels or regions in an image. Instead of giving one label to the whole image, segmentation produces a mask that shows which pixels belong to a person, background, object, road, sky, animal, or another category.

If an image contains a person standing in front of a background, image segmentation can separate person pixels from background pixels. In a road scene, segmentation can label the road, cars, pedestrians, sky, buildings, and trees.

How does Image Segmentation work?

  1. The input image is resized and converted into pixel values.
  2. A neural network processes the image.
  3. Instead of predicting one class for the whole image, the model predicts a class for each pixel or region.
  4. The output is a segmentation mask.
  5. The mask can be drawn over the original image using colors or transparency.
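Steps 3 and 4 above can be sketched in a few lines. This is a minimal illustration, not the demo's actual code: it assumes the network has already produced one score per class for every pixel, and the class order (0 = background, 1 = person, 2 = car) is a hypothetical choice made for this example.

```python
def argmax_mask(scores):
    """Return a category mask: for each pixel, the index of the highest score.

    `scores` is a nested list shaped [height][width][num_classes].
    """
    mask = []
    for row in scores:
        mask_row = []
        for pixel_scores in row:
            # Pick the class whose score is largest for this pixel.
            best_class = max(range(len(pixel_scores)), key=lambda c: pixel_scores[c])
            mask_row.append(best_class)
        mask.append(mask_row)
    return mask


# Hypothetical 2x3 image with scores for three classes per pixel.
scores = [
    [[0.9, 0.1, 0.0], [0.2, 0.7, 0.1], [0.1, 0.8, 0.1]],
    [[0.8, 0.1, 0.1], [0.3, 0.3, 0.4], [0.1, 0.2, 0.7]],
]
mask = argmax_mask(scores)
print(mask)  # [[0, 1, 1], [0, 2, 2]]
```

The result has the same height and width as the input, but each entry is a class index instead of a color.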

What is a segmentation mask?

A segmentation mask is an image-like output where each pixel represents a predicted category. For example, person pixels may be white and background pixels may be black. In multi-class segmentation, each class can be shown in a different color.
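To make the idea concrete, the sketch below turns a mask of class indices into colors and blends it over an image with an adjustable opacity. The palette, the function names, and the tiny one-row image are assumptions made for illustration; the demo's real colors and drawing code may differ.

```python
# Hypothetical palette: class index -> RGB color (not the demo's actual colors).
PALETTE = {0: (0, 0, 0), 1: (255, 255, 255), 2: (255, 0, 0)}


def colorize(mask, palette=PALETTE):
    """Turn a mask of class indices into an image of RGB tuples."""
    return [[palette[c] for c in row] for row in mask]


def overlay(image, color_mask, opacity):
    """Alpha-blend the colored mask over the image; opacity is in [0, 1]."""
    blended = []
    for img_row, mask_row in zip(image, color_mask):
        out_row = []
        for img_px, mask_px in zip(img_row, mask_row):
            out_row.append(tuple(
                round((1 - opacity) * i + opacity * m)
                for i, m in zip(img_px, mask_px)
            ))
        blended.append(out_row)
    return blended


image = [[(100, 100, 100), (205, 205, 205)]]  # one row, two gray pixels
mask = [[0, 1]]                               # background, then person
color_mask = colorize(mask)
print(overlay(image, color_mask, 0.5))  # [[(50, 50, 50), (230, 230, 230)]]
```

An opacity of 0 returns the original image and an opacity of 1 returns the mask-only view, which is why changing opacity changes how easy the mask is to compare against the picture.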

Visible notes

  • Image segmentation predicts regions at the pixel level. It is more detailed than object detection because it estimates object shape rather than only drawing a bounding box.
  • This browser demo is educational. Results may be inaccurate for images that differ from the model's training data.
  • Images are processed locally in the browser and are not uploaded to a server.

Types of Image Segmentation

Semantic, instance, panoptic, and person/background

Type | Meaning | Example
Semantic Segmentation | Labels each pixel by class, but does not separate individual objects of the same class. | All cars are labeled as car.
Instance Segmentation | Labels each pixel and separates each individual object instance. | Car 1, Car 2, and Car 3 are separated.
Panoptic Segmentation | Combines semantic and instance segmentation. | Road and sky are labeled while each person and car is separated.
Person/Background Segmentation | Separates a person from the background. | Used in video calls for background blur.
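
The difference between a semantic mask and an instance mask can be shown with a toy example. The sketch below takes a semantic mask in which every car pixel has the same class index and gives each connected blob its own instance id. This is only a naive stand-in using connected-component labeling: real instance segmentation models predict instances directly and can separate objects that touch. The class index and function name are hypothetical.

```python
from collections import deque


def split_instances(semantic_mask, target_class):
    """Give each 4-connected blob of `target_class` its own id (1, 2, ...).

    Pixels of other classes get 0. Returns (instance_mask, number_of_instances).
    """
    height, width = len(semantic_mask), len(semantic_mask[0])
    instances = [[0] * width for _ in range(height)]
    next_id = 0
    for y in range(height):
        for x in range(width):
            if semantic_mask[y][x] != target_class or instances[y][x] != 0:
                continue
            # Found an unlabeled pixel of the class: flood-fill its blob.
            next_id += 1
            instances[y][x] = next_id
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and semantic_mask[ny][nx] == target_class
                            and instances[ny][nx] == 0):
                        instances[ny][nx] = next_id
                        queue.append((ny, nx))
    return instances, next_id


CAR = 2  # hypothetical class index
semantic = [
    [2, 2, 0, 0, 2],
    [2, 2, 0, 0, 2],
    [0, 0, 0, 0, 0],
    [0, 2, 2, 0, 0],
]
instances, count = split_instances(semantic, CAR)
print(count)  # 3
for row in instances:
    print(row)
```

In the semantic mask all three cars share the value 2; in the instance mask they become 1, 2, and 3.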

Current demo model

This page uses a browser-based MediaPipe Image Segmenter with a semantic segmentation model. It predicts pixel classes such as background, person, car, dog, road-scene objects, and other categories from its training data.
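
Once a category mask is available, the classes present in it and the share of the image each one covers can be summarized with a simple count. The sketch below is an illustration only: the label list and its order are assumptions, and the real model's label set may differ.

```python
from collections import Counter

# Hypothetical label list; index 0 is treated as background.
LABELS = ["background", "person", "car", "dog"]


def summarize_mask(mask, labels=LABELS):
    """Return per-class pixel coverage and the foreground labels present."""
    counts = Counter(c for row in mask for c in row)
    total = sum(counts.values())
    coverage = {labels[c]: counts[c] / total for c in sorted(counts)}
    active = [labels[c] for c in sorted(counts) if c != 0]
    return coverage, active


mask = [
    [0, 0, 1, 1],
    [0, 3, 1, 1],
]
coverage, active = summarize_mask(mask)
print(active)    # ['person', 'dog']
print(coverage)  # {'background': 0.375, 'person': 0.5, 'dog': 0.125}
```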

Comparison

Image Segmentation vs Other Computer Vision Tasks

Task | Main Question | Output | Example
Image Classification | What is this image? | One label + confidence | Dog
Object Detection | Where are the objects? | Bounding boxes + labels | Box around dog
Image Segmentation | Which pixels belong to each region? | Pixel-level mask | Exact dog outline
Object Identification | Which specific object/person is this? | Identity/name | This is Person A
Object Verification | Does this match the reference? | Yes/No or similarity score | Does this face match the ID photo?
Pose Estimation | Where are the body joints? | Keypoints/skeleton | Elbows, knees, shoulders

Segmentation is more detailed than detection because it does not just draw a rectangle. It estimates the actual shape of the object or region.
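
A small calculation shows how much a rectangle can overstate an object's extent. The sketch below derives a bounding box from a mask and compares the two areas. The class index, function names, and the plus-shaped toy mask are hypothetical and chosen only for illustration.

```python
def bounding_box(mask, target_class):
    """Return (top, left, bottom, right), inclusive, or None if the class is absent."""
    coords = [(y, x) for y, row in enumerate(mask)
              for x, c in enumerate(row) if c == target_class]
    if not coords:
        return None
    ys = [y for y, _ in coords]
    xs = [x for _, x in coords]
    return min(ys), min(xs), max(ys), max(xs)


def box_vs_mask_area(mask, target_class):
    """Return (mask pixel count, bounding-box area) for one class."""
    box = bounding_box(mask, target_class)
    if box is None:
        return 0, 0
    top, left, bottom, right = box
    box_area = (bottom - top + 1) * (right - left + 1)
    mask_area = sum(row.count(target_class) for row in mask)
    return mask_area, box_area


DOG = 3  # hypothetical class index
mask = [
    [0, 0, 3, 0],
    [0, 3, 3, 3],
    [0, 0, 3, 0],
]
print(bounding_box(mask, DOG))      # (0, 1, 2, 3)
print(box_vs_mask_area(mask, DOG))  # (5, 9)
```

Here the box covers 9 pixels but only 5 of them belong to the object, so the box alone would count 4 background pixels as part of the dog.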

Try This in Class

Student tasks

  1. Upload a simple image with one clear object.
  2. Upload an image with a person and background.
  3. Upload a crowded image with multiple objects.
  4. Upload a dark or blurry image.
  5. Compare original image, mask-only view, and overlay view.
  6. Adjust mask opacity and observe how the interpretation changes.
  7. Discuss why pixel-level prediction is harder than classification.

Discussion questions

  1. Why is segmentation more detailed than object detection?
  2. What is the difference between a bounding box and a mask?
  3. Why might segmentation fail on unusual images?
  4. How could segmentation be used in medicine?
  5. How could segmentation be used in self-driving cars?
  6. What privacy issues arise when using webcam-based segmentation?
  7. Why should browser-based processing be preferred for sensitive images?

Medical imaging

Tumor segmentation, organ segmentation, and cell segmentation.

Autonomous driving

Road, lane, pedestrian, car, and sign segmentation.

Video conferencing

Background blur and background replacement.

Agriculture

Plant disease region segmentation and crop/weed separation.

Satellite imaging

Land cover, water, forest, and urban area segmentation.

Digital forensics

Separating foreground/background regions or identifying manipulated regions.

Creative tools

Background removal and image editing.

Technical Notes

Library: @mediapipe/tasks-vision
Task: Image segmentation
Input: Image, video frame, or canvas
Output: Segmentation mask
Execution: Browser/client-side
Privacy: Images remain in the browser
Limitation: The model only recognizes categories it was trained to segment