Custom Zero-Shot Object Detection using YOLO-World

Harish Siva Subramanian
Published in Level Up Coding · 3 min read · Apr 18, 2024


What is YOLO?

YOLO (You Only Look Once) is a family of object detection models known for real-time performance with high accuracy. Unlike two-stage detectors, which first generate region proposals and then classify them, YOLO divides the input image into a grid and predicts bounding boxes and class probabilities directly from the grid cells. This lets YOLO detect objects with a single forward pass of a neural network, making it well-suited for applications that need fast and accurate detection, such as autonomous driving, surveillance, and image understanding.
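To make the grid idea concrete, here is a minimal sketch of how an object's center can be mapped to the grid cell that predicts it. It assumes the original YOLO formulation with a 7x7 grid; the function name `responsible_cell` and the image size are hypothetical choices for illustration, and later YOLO versions refine this scheme.

```python
def responsible_cell(cx, cy, img_w, img_h, grid_size=7):
    """Return (row, col) of the grid cell that contains the box center.

    cx, cy are the box center in pixels; grid_size=7 is an assumption
    taken from the original YOLO setup, not from any specific model file.
    """
    if not (0 <= cx < img_w and 0 <= cy < img_h):
        raise ValueError("box center must lie inside the image")
    col = int(cx / img_w * grid_size)  # horizontal cell index
    row = int(cy / img_h * grid_size)  # vertical cell index
    return row, col


# Example: an object centered at (320, 100) in a 640x480 image
print(responsible_cell(cx=320, cy=100, img_w=640, img_h=480))  # (1, 3)
```

Each cell is then responsible for predicting the boxes and class scores of the objects whose centers fall inside it, which is why one forward pass is enough.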

In my previous article, I discussed how to train a YOLOv9 model using a local GPU. Here we will take a step back and see whether we can create custom detectors without any training.

Let’s start coding!

pip install ultralytics

Install the Ultralytics package with pip.

import ultralytics
from ultralytics import YOLOWorld

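Import the YOLOWorld class from Ultralytics. YOLO-World is an open-vocabulary object detection model: it detects objects described by text prompts, so new classes do not require retraining.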

# Initialize a YOLO-World model
model = YOLOWorld('yolov8l-world.pt') # or select yolov8s/m-world.pt for different sizes

# Execute inference with the YOLOv8l-world model on the specified image
results = model.predict('c.jpg', device='cpu', save=True)

This downloads the YOLO-World model weights the first time it runs. We then run inference on an image downloaded from the internet.
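Besides looking at the saved image, the detections can be inspected in code. The sketch below counts detections per class; `summarize_detections` and the `min_conf` value are hypothetical names and choices for illustration, and the sample values are made up rather than real model output. With Ultralytics, the inputs would come from `results[0].boxes.cls`, `results[0].boxes.conf`, and `results[0].names`.

```python
from collections import Counter


def summarize_detections(class_ids, confidences, names, min_conf=0.25):
    """Count detections per class label, skipping those below min_conf."""
    counts = Counter()
    for class_id, conf in zip(class_ids, confidences):
        if conf >= min_conf:
            counts[names[int(class_id)]] += 1
    return dict(counts)


# With Ultralytics, the inputs would come from the first result, e.g.:
#   boxes = results[0].boxes
#   summarize_detections(boxes.cls.tolist(), boxes.conf.tolist(), results[0].names)

# Stand-in values, invented for illustration (not real model output):
example_names = {0: "person", 1: "laptop"}
print(summarize_detections([0.0, 1.0, 1.0], [0.91, 0.80, 0.10], example_names))
# {'person': 1, 'laptop': 1}
```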

We can see that the objects “person” and “laptop” were detected, even though we did no training at all. Now we can use the same model with the classes restricted to just “laptop”. The code below does exactly that:

# Define custom classes
model.set_classes(["laptop"])

# Execute prediction for specified categories on an image
results = model.predict('c.jpg', device='cpu', save=True)

Now only the object “laptop” is detected. Next, let’s set the classes to just “laptop” and save the model. The saved model can later be used to detect only laptops in any other image.

from ultralytics import YOLO

# Initialize a YOLO-World model
model = YOLO('yolov8l-world.pt')

model.set_classes(["laptop"])

# Save the model with the defined offline vocabulary
model.save("custom_yolov8l.pt")

Now let’s test it on the image below!

Photo by Annie Spratt on Unsplash

from ultralytics import YOLO

# Load your custom model
model = YOLO('custom_yolov8l.pt')

# Run inference to detect your custom classes
results = model.predict('new.jpg', device='cpu', save=True)

All the predictions are saved in the “runs\detect\” folder. Note that a new predict folder is created for each run: predict, predict2, and so on.
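Since each run gets its own folder, it can be handy to locate the most recent one programmatically. This is a minimal sketch that assumes the predict, predict2, ... naming described above; `latest_predict_dir` is a hypothetical helper, not part of Ultralytics.

```python
import re
from pathlib import Path


def latest_predict_dir(base="runs/detect"):
    """Return the predict* folder with the highest run number, or None."""
    base_path = Path(base)
    if not base_path.is_dir():
        return None
    best, best_n = None, 0
    for path in base_path.iterdir():
        match = re.fullmatch(r"predict(\d*)", path.name)
        if path.is_dir() and match:
            n = int(match.group(1) or 1)  # plain "predict" counts as run 1
            if n > best_n:
                best, best_n = path, n
    return best


print(latest_predict_dir())  # None if nothing has been saved yet
```

Comparing the numeric suffix rather than sorting names alphabetically avoids placing predict10 before predict2.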

We can see this model now only detects the laptop when passed an image. This is an easy way to use a YOLO model for a custom class without any training.
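Why does this work without training? Roughly speaking, YOLO-World compares visual features of image regions with text embeddings of the class names, so changing the vocabulary only changes which text embeddings are compared. The toy sketch below illustrates that idea with hand-made 3-dimensional vectors; the vectors, the `best_label` helper, and the threshold are all invented for illustration and are not the library's actual implementation.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def best_label(region_vec, text_vecs, threshold=0.5):
    """Return the class name most similar to the region, or None if too weak."""
    label, score = max(
        ((name, cosine(region_vec, vec)) for name, vec in text_vecs.items()),
        key=lambda pair: pair[1],
    )
    return label if score >= threshold else None


# Toy embeddings, invented for illustration (real ones are learned and much larger)
vocabulary = {"laptop": [0.9, 0.1, 0.0], "person": [0.0, 0.2, 0.9]}
region = [0.8, 0.2, 0.1]  # pretend feature of an image region showing a laptop

print(best_label(region, vocabulary))                         # laptop
print(best_label(region, {"person": vocabulary["person"]}))   # None
```

With “laptop” in the vocabulary the region is labeled; with only “person” it is ignored, which mirrors what restricting the classes does.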

Thank you for reading!!


🚀 Passionate Data Scientist | Storyteller 📊✨ Sharing insights on Medium | 🌐 Tech Enthusiast | Connect with me! 🌟🔍