Hand Pose Detection
Use your webcam to see how an AI model detects hand landmarks in real time.
Webcam preview
Click Start Camera to enable your webcam.
Camera controls
Start webcam and track hands
The demo does not start the webcam automatically.
Camera not started
Permission not requested yet
Privacy note
- Webcam frames are processed locally in the browser.
- No images/video are uploaded to a server by this demo.
- The camera does not start automatically; click “Start Camera”.
- You can stop the camera at any time.
Model & camera status
Model ready
Loading hand model...
Detected hands
0
No hand detected yet.
Performance
—
0 frames processed
Camera
Camera not started
Permission not requested
Interpretation
What the model is doing now
Loading the hand model...
Left hand
Not detected
Right hand
Not detected
Visual controls
Landmarks & skeleton
Adjust what you see, and use the confidence threshold to hide uncertain points.
Higher threshold hides more low-confidence keypoints.
Gesture recognition (educational)
From landmarks to simple gestures
This demo focuses on landmark detection first. The gesture indicator below is a simple rule-based educational example that uses relative landmark positions and distances (not a trained gesture classifier).
Current gesture
—
Based on the latest detected landmarks.
How it works (rules)
- Open hand: fingertips appear above the lower finger joints.
- Fist: fingertips appear below the lower finger joints.
- Pointing: index finger is more extended than other fingers.
- Pinch: thumb tip is close to index fingertip.
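The four rules above can be sketched in a few lines of TypeScript. This is a minimal illustration, not the demo's actual code. It assumes a 21-point hand layout (wrist at index 0, then four points per finger), pixel coordinates where y grows downward, and an upright hand. The names `classifyGesture`, `TIPS`, `PIPS`, and the `pinchRatio` default are hypothetical.

```typescript
// Hypothetical landmark shape: pixel coordinates, y grows downward.
interface Point { x: number; y: number; }

// Assumed 21-point layout: wrist = 0, then four points per finger.
const TIPS = { thumb: 4, index: 8, middle: 12, ring: 16, pinky: 20 };
const PIPS = { index: 6, middle: 10, ring: 14, pinky: 18 }; // "lower" finger joints

type Gesture = "Pinch" | "Open hand" | "Fist" | "Pointing" | "Unknown";

const distance = (a: Point, b: Point): number => Math.hypot(a.x - b.x, a.y - b.y);

function classifyGesture(lm: Point[], pinchRatio = 0.25): Gesture {
  if (lm.length < 21) return "Unknown";

  // Wrist-to-middle-knuckle distance serves as a rough hand size, so the
  // pinch rule does not depend on how far the hand is from the camera.
  const handSize = distance(lm[0], lm[9]);
  if (handSize === 0) return "Unknown";

  // Pinch: thumb tip is close to the index fingertip.
  if (distance(lm[TIPS.thumb], lm[TIPS.index]) / handSize < pinchRatio) return "Pinch";

  // A finger counts as extended when its tip is above its lower joint
  // (smaller y). This only holds for an upright hand.
  const fingers = ["index", "middle", "ring", "pinky"] as const;
  const extended = fingers.map((f) => lm[TIPS[f]].y < lm[PIPS[f]].y);
  const count = extended.filter(Boolean).length;

  if (count === 4) return "Open hand";
  if (count === 0) return "Fist";
  if (extended[0] && count === 1) return "Pointing";
  return "Unknown";
}
```

Pinch is checked first because a pinching hand often has its other fingers extended and would otherwise be reported as an open hand.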
Step-by-step pipeline
How hand pose detection works
This pipeline highlights the flow from webcam input to landmark points and gesture cues.
Webcam frame input
A webcam provides live video frames for real-time processing.
Frame preprocessing
The browser prepares the frame for the model (resize/format as needed).
Hand region detection
The model looks for hand areas in the frame.
Landmark prediction
It predicts key points on the hand, like wrist and finger joints.
Confidence scoring
Each landmark can include a confidence score (how sure the model is).
Skeleton drawing
When points are confident enough, the demo draws a hand skeleton.
Coordinates & table
Coordinates update continuously so students can inspect the numbers.
Gesture understanding
Simple gesture examples can be estimated from relative landmark positions.
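The first steps of this pipeline can be sketched as one function per frame. This is a simplified illustration under stated assumptions: the detector is passed in as a plain function (the `Detector` type is hypothetical and stands in for whatever the hand model exposes), it is synchronous here although real models usually return a promise, and the frame is described only by its non-zero width and height.

```typescript
interface Frame { width: number; height: number; }
interface Keypoint { name: string; x: number; y: number; score: number; }
interface Hand { handedness: "Left" | "Right"; keypoints: Keypoint[]; }

// Hypothetical detector signature; the real model API may differ.
type Detector = (input: Frame) => Hand[];

// Frame preprocessing: shrink the frame to fit the model input,
// keeping the aspect ratio and never upscaling.
function fitInside(frame: Frame, maxSide: number): Frame {
  const scale = Math.min(1, maxSide / Math.max(frame.width, frame.height));
  return {
    width: Math.round(frame.width * scale),
    height: Math.round(frame.height * scale),
  };
}

// One pass: preprocess, detect, keep confident keypoints, and map their
// coordinates back to the original frame so an overlay lines up with the video.
function processFrame(frame: Frame, detect: Detector, threshold: number, maxSide = 256): Hand[] {
  const input = fitInside(frame, maxSide);
  const sx = frame.width / input.width;
  const sy = frame.height / input.height;

  return detect(input).map((hand) => ({
    handedness: hand.handedness,
    keypoints: hand.keypoints
      .filter((kp) => kp.score > threshold)
      .map((kp) => ({ ...kp, x: kp.x * sx, y: kp.y * sy })),
  }));
}
```

The returned hands are what the later steps (skeleton drawing, the coordinates table, gesture cues) would consume.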
Landmark coordinates
Coordinates will appear after detection
Start the camera and show your hand in good lighting.
Hand landmark map
Common landmark groups
Pose detection predicts key points on your hand. By connecting landmark groups, we can draw a skeleton that shows hand pose.
Wrist
The base landmark, where the hand meets the forearm. It helps anchor pose and motion.
Thumb joints
The thumb joints help model grip and pointing angles.
Index finger joints
Index landmarks describe pointing and fine finger movement.
Middle finger joints
Middle landmarks help describe overall hand posture.
Ring finger joints
Ring landmarks contribute to hand shape and curl.
Pinky finger joints
Pinky landmarks help complete the hand pose estimate.
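To show how these groups turn into a skeleton, here is a small sketch that chains each finger's landmarks back to the wrist. It assumes the 21-point index layout common to several hand models (wrist at 0, thumb 1 to 4, index 5 to 8, and so on). The page does not say which model it uses, so treat the indices as an assumption. The edge list is a simplified tree, not any library's official connection list.

```typescript
// Assumed landmark indices: wrist = 0, then four points per finger.
const WRIST = 0;
const FINGER_CHAINS: number[][] = [
  [1, 2, 3, 4],     // thumb
  [5, 6, 7, 8],     // index
  [9, 10, 11, 12],  // middle
  [13, 14, 15, 16], // ring
  [17, 18, 19, 20], // pinky
];

// Build [from, to] index pairs: wrist -> finger base -> ... -> fingertip.
function skeletonEdges(): Array<[number, number]> {
  const edges: Array<[number, number]> = [];
  for (const chain of FINGER_CHAINS) {
    let previous = WRIST;
    for (const index of chain) {
      edges.push([previous, index]);
      previous = index;
    }
  }
  return edges;
}

// Keep only edges whose two endpoints are both above the confidence threshold.
function visibleEdges(scores: number[], threshold: number): Array<[number, number]> {
  return skeletonEdges().filter(([a, b]) => scores[a] > threshold && scores[b] > threshold);
}
```

A drawing routine would then loop over `visibleEdges(...)` and draw one line segment per pair on the canvas overlay.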
Real-time AI
Why it updates continuously
Your webcam provides many frames per second. The model runs inference repeatedly on those frames, then the canvas overlay redraws with the latest hand landmarks. Faster devices usually update more smoothly; lighting, distance from the camera, occlusion, and device performance all affect how accurate and stable the landmarks are.
Browser-based AI helps keep frames local: this demo does not upload video to a server.
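A performance readout such as "frames processed" can be derived from frame timestamps alone. The sketch below is one possible way to do it, not the demo's implementation: `createFpsMeter` is a hypothetical helper that counts frames and averages the frame rate over a sliding window. In a browser, the timestamp would typically come from `performance.now()` or the `requestAnimationFrame` callback argument.

```typescript
// Hypothetical helper: counts processed frames and estimates frames per
// second over a sliding time window (default one second).
function createFpsMeter(windowMs = 1000) {
  const stamps: number[] = [];
  let frames = 0;

  return {
    // Call once per processed frame with the current time in milliseconds.
    tick(nowMs: number): void {
      frames += 1;
      stamps.push(nowMs);
      // Drop timestamps that have fallen out of the window.
      while (stamps.length > 0 && nowMs - stamps[0] > windowMs) {
        stamps.shift();
      }
    },
    framesProcessed(): number {
      return frames;
    },
    // Average rate across the window; 0 until two frames have been seen.
    fps(): number {
      if (stamps.length < 2) return 0;
      const span = stamps[stamps.length - 1] - stamps[0];
      return span > 0 ? ((stamps.length - 1) * 1000) / span : 0;
    },
  };
}
```

Averaging over a window keeps the number steady enough to read, whereas a per-frame rate would jump around on every redraw.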
How it works
Hand landmarks, then skeleton lines, then (simple) gesture cues
Hand pose detection is a computer vision task where an AI model identifies important landmark points on a hand, such as the wrist, fingertips, and finger joints.
A landmark is a predicted key point. By connecting landmarks, we can draw a hand skeleton that represents the pose.
Gesture recognition can be built on top of landmark positions. This page focuses on landmark detection first; gesture cues shown here are simple rule-based educational examples.
The model is pre-trained and runs inference in the browser. It is not learning from your webcam during the demo.
Landmark coordinates explained
How to read the table
X coordinate
Horizontal position of the landmark in the video frame.
Y coordinate
Vertical position of the landmark in the video frame.
Z coordinate
Estimated depth or relative distance if provided by the model. If missing, the demo shows “--”.
Confidence threshold
Only keypoints with confidence above this value are drawn (and treated as “shown” in the table).
Real-time inference means predictions repeat across frames while the webcam is running.
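The table columns described above can be illustrated with a small formatting function. This is a sketch under assumptions: the `Keypoint` shape and the `toTableRow` name are hypothetical, values are rounded to two decimals, and a keypoint without a confidence score is treated as shown.

```typescript
interface Keypoint { name: string; x: number; y: number; z?: number; score?: number; }

interface TableRow {
  name: string;
  x: string;
  y: string;
  z: string;
  confidence: string;
  shown: boolean;
}

// Format a number for the table, or "--" when the model did not provide it.
const formatValue = (value?: number): string =>
  value === undefined || Number.isNaN(value) ? "--" : value.toFixed(2);

function toTableRow(kp: Keypoint, threshold: number): TableRow {
  return {
    name: kp.name,
    x: formatValue(kp.x),
    y: formatValue(kp.y),
    z: formatValue(kp.z),
    confidence: formatValue(kp.score),
    // Only keypoints strictly above the threshold count as shown.
    // Assumption: a keypoint with no score at all is not hidden.
    shown: kp.score === undefined || kp.score > threshold,
  };
}
```

For example, a wrist keypoint at x = 320, y = 240 with no depth value and a score of 0.9 would produce a row with "--" in the Z column and `shown` set to true at a threshold of 0.5.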
Privacy note
Keep the webcam local
Webcam frames are processed locally in your browser. This demo does not upload video to a server.
You can stop the camera at any time. The page does not identify you; it only estimates hand landmarks.
Limitations & ethics
Understand the risks before using tracking
- The model can lose tracking.
- Poor lighting can reduce accuracy.
- Fast movement can cause unstable landmarks.
- Hands partly outside the frame may not be detected.
- Gloves, occlusion, unusual angles, and overlapping hands reduce accuracy.
- Hand tracking should be used carefully in surveillance, identity-related, workplace, or high-stakes settings.
- This demo is for learning and teaching only.
Student learning outcomes
What you will learn
- Understand what hand pose detection means.
- Understand landmarks and skeleton connections.
- Understand real-time webcam inference.
- Understand how landmark coordinates can support gesture recognition.
- Understand the differences between classification, detection, landmark prediction, and pose tracking.
- Understand privacy considerations for webcam-based AI.
- Learn how pre-trained computer vision models can run in the browser.
Final reminder
The model is pre-trained and performs inference only. It does not learn from your webcam during this demo.