How to Generate Accurate Segmentation Masks Using Object Detection and SAM2 Model
Segmentation masks are vital for precise object tracking and analysis, allowing pixel-level identification of objects. By leveraging a fine-tuned Ultralytics YOLO11 model alongside the Segment Anything 2 (SAM2) model, you can generate accurate segmentation masks without manual pixel-level annotation.
Hardware and Software Setup for This Demo
- CPU: Intel® Core™ i5-10400 @ 2.90 GHz for efficient processing.
- GPU: NVIDIA RTX 3050 for real-time inference.
- RAM and Storage: 64 GB RAM and a 1 TB hard disk for seamless performance.
- Model: Fine-tuned YOLO11 model for object detection.
- Dataset: Custom annotated dataset for maximum accuracy.
How to Generate Segmentation Masks
Step 1: Prepare the Model
Train or fine-tune a custom YOLO11 model, or use the Ultralytics Pretrained Models for object detection tasks.
Step 2: Auto Annotation with SAM2
Integrate the SAM2 model to convert bounding boxes into segmentation masks.
# Install the necessary library
# pip install ultralytics
from ultralytics.data.annotator import auto_annotate

# Automatically annotate images using YOLO11 and SAM2 models
auto_annotate(
    data="Path/to/images/directory",
    det_model="yolo11n.pt",
    sam_model="sam2_b.pt",
)
Step 3: Generate and Save Masks
Run the script to save the segmentation masks as .txt files in the images_auto_annotate_labels folder.
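Each generated .txt file uses the Ultralytics segmentation label format: one object per line, starting with an integer class ID followed by normalized x y polygon coordinates. A minimal sketch of parsing one such line (the sample values below are made up for illustration):

```python
def parse_segmentation_line(line: str, width: int, height: int):
    """Parse one YOLO-format segmentation label line into a class ID
    and a list of absolute (x, y) pixel coordinates."""
    data = line.strip().split()
    class_id = int(data[0])
    # Remaining values are normalized (x, y) pairs; scale to pixel space
    points = [
        (float(data[i]) * width, float(data[i + 1]) * height)
        for i in range(1, len(data), 2)
    ]
    return class_id, points

# Example: class 0 with a triangle, on a 100x200 image
class_id, points = parse_segmentation_line("0 0.1 0.1 0.9 0.1 0.5 0.9", 100, 200)
print(class_id, points)  # 0 [(10.0, 20.0), (90.0, 20.0), (50.0, 180.0)]
```

This is the same per-line structure the visualization script in Step 4 relies on.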
Step 4: Visualize the Results
Use the following script to overlay segmentation masks on images.
import os

import cv2
import numpy as np
from ultralytics.utils.plotting import colors

# Define folder paths
image_folder = "images_directory"  # Path to your images directory
mask_folder = "images_auto_annotate_labels"  # Annotation masks directory
output_folder = "output_directory"  # Path to save output images
os.makedirs(output_folder, exist_ok=True)

# Process each image
for image_file in os.listdir(image_folder):
    image_path = os.path.join(image_folder, image_file)
    mask_file = os.path.join(mask_folder, os.path.splitext(image_file)[0] + ".txt")
    img = cv2.imread(image_path)  # Load the image
    if img is None or not os.path.exists(mask_file):
        continue  # Skip non-image files and images without annotations
    height, width, _ = img.shape

    with open(mask_file, "r") as f:  # Read the mask file
        lines = f.readlines()

    for line in lines:
        data = line.strip().split()
        color = colors(int(data[0]), True)  # Pick a color per class ID
        # Convert normalized polygon points to absolute pixel coordinates
        points = np.array(
            [(float(data[i]) * width, float(data[i + 1]) * height)
             for i in range(1, len(data), 2)],
            dtype=np.int32,
        ).reshape((-1, 1, 2))
        # Draw a semi-transparent filled polygon plus a solid outline
        overlay = img.copy()
        cv2.fillPoly(overlay, [points], color=color)
        alpha = 0.6
        cv2.addWeighted(overlay, alpha, img, 1 - alpha, 0, img)
        cv2.polylines(img, [points], isClosed=True, color=color, thickness=3)

    # Save the output
    output_path = os.path.join(output_folder, image_file)
    cv2.imwrite(output_path, img)
    print(f"Processed {image_file} and saved to {output_path}")

print("Processing complete.")
That's it! After completing Step 4, every image in the output folder will show its objects segmented with color-coded masks.
Real-World Applications
- Medical Imaging: Segment organs and anomalies in scans for diagnostics.
- Retail Analytics: Detect and segment customer activities or products.
- Robotics: Enable robots to identify objects in dynamic environments.
- Satellite Imagery: Analyze vegetation and urban areas for planning.
Explore More
Start building your object segmentation workflow today! 🚀