Gesture-Based Sterile Radiology Image Browser (Python + Deep Learning)
2025-10-20
TensorFlow
OpenCV
Gesture CNN
Image Processing
Python Flask
Deep Learning
A Python Flask and deep-learning solution that enables radiologists to browse and manipulate medical images using hand gestures, fully contactless and ideal for sterile environments such as operating theatres.
Gesture-Based Tool for Sterile Browsing of Radiology Images
A deep-learning powered medical tool that enables radiologists to browse images hands-free using real-time gesture recognition.
Download Project
Download the source code, trained model, and documentation.
Download GestureRadiologyBrowser.zip
View on GitHub
System Requirements
- Python 3.7 or above
- TensorFlow/Keras for gesture model loading
- OpenCV for camera-based gesture capture
- Flask for web interface
- A webcam (built-in or USB)
- GPU optional (for faster processing)
Key Features
- ✔ Hands-free control of radiology images
- ✔ Real-time gesture recognition using Deep Learning
- ✔ Flask web interface for uploading medical images
- ✔ Sterile browsing — ideal for surgery rooms / medical labs
- ✔ Supports gesture-based operations like zoom, rotate, blur, grayscale, and display
- ✔ Uses TensorFlow model (gesture.h5) for classification
- ✔ Modular Python code — easy to extend or integrate
How It Works
1. The user uploads a radiology image through the web interface.
2. The program loads the deep-learning gesture model (gesture.h5).
3. The webcam captures hand movements inside a designated ROI box.
4. The model predicts the gesture in real-time:
- ZERO → Hold / No action
- ONE → Show image (200×200)
- TWO → Apply Gaussian blur
- THREE → Rotate image (-45 degrees)
- FOUR → Zoom (400×400)
- FIVE → Convert to grayscale
5. The output is displayed in a separate OpenCV window.
6. Pressing the ESC key exits the detection loop.
Gesture Detection Flow
Step 1: Start Flask Web App
Step 2: User uploads image → saved to /uploads/
Step 3: Real-time video is captured using OpenCV
Step 4: ROI (Region of Interest) extracted for gesture detection
Step 5: Model predicts the gesture (ZERO–FIVE)
Step 6: Perform corresponding medical image operation
Step 7: Display the processed image
Step 8: Repeat until user presses ESC
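Steps 1 and 2 of the flow above can be sketched as a minimal Flask app that accepts an upload and saves it to the uploads folder. The route names, form field name, and the commented hook into the OpenCV loop are illustrative assumptions, not the project's exact API.

```python
import os
from flask import Flask, request
from werkzeug.utils import secure_filename

UPLOAD_DIR = "uploads"
app = Flask(__name__)
os.makedirs(UPLOAD_DIR, exist_ok=True)

@app.route("/", methods=["GET"])
def index():
    # Minimal upload form; the real project would render a template.
    return ('<form method="post" action="/upload" enctype="multipart/form-data">'
            '<input type="file" name="image"><input type="submit" value="Upload">'
            '</form>')

@app.route("/upload", methods=["POST"])
def upload():
    f = request.files["image"]
    path = os.path.join(UPLOAD_DIR, secure_filename(f.filename))
    f.save(path)
    # launch_gesture_loop(path)  # hypothetical hook that starts the OpenCV loop
    return f"Saved {f.filename}; gesture detection can now start."

if __name__ == "__main__":
    app.run(debug=True)
```

The OpenCV capture loop runs in its own window outside Flask's request cycle, so the upload handler only needs to persist the image and hand its path to the detection code.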
