Semi-Supervised Learning | Pseudo Labeling Custom Dataset with YOLOv4

Lahiru Rathnayake
3 min read · Jan 13, 2022

Image by Author.

Introduction

Image classification is one of the most common computer vision problems: an algorithm processes an image and assigns it a class. Object detection extends this technique by combining classification with localization. In object detection methods, each object is localized by a bounding box, which is represented by four values given in the pixel coordinates of the image.
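
To make the "four values" concrete, here is a minimal sketch that converts a box given as pixel corners into the normalized center/width/height form commonly used with YOLO-style models. The function name and the sample numbers are illustrative assumptions, not code from the pipeline.

```python
def corners_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert pixel corner coordinates to normalized (cx, cy, w, h)."""
    cx = (xmin + xmax) / 2.0 / img_w   # box center x, as a fraction of image width
    cy = (ymin + ymax) / 2.0 / img_h   # box center y, as a fraction of image height
    w = (xmax - xmin) / img_w          # box width, as a fraction of image width
    h = (ymax - ymin) / img_h          # box height, as a fraction of image height
    return cx, cy, w, h


# Hypothetical box in a 400x500 pixel image.
box = corners_to_yolo(100, 50, 300, 250, img_w=400, img_h=500)
print(box)
```

Both forms carry the same information; the corner form is convenient for annotation tools, while the normalized form does not depend on the image resolution.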

If you are training an object detection model on custom data, annotating large amounts of data manually takes considerable human effort. For a large image dataset, labeling everything ourselves may take a long time and be logistically difficult. Pseudo labeling is a semi-supervised learning approach that helps deal with unlabeled data. It combines a small set of labeled data with unlabeled data to improve the model’s robustness. We first train the model on the labeled data, which gives us a partially trained model, and then use that model to annotate the unlabeled data. The labels produced by the partially trained model are called pseudo labels. In this post, we’ll walk through how to prepare a custom dataset for object detection, train on it with YOLOv4, and use the partially trained model for pseudo labeling.
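
The idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `pseudo_label` and `toy_predict` are hypothetical names, and `toy_predict` is a stand-in for a partially trained detector, not the actual YOLOv4 model.

```python
def pseudo_label(predict, unlabeled_images, conf_thresh=0.5):
    """Keep only the confident predictions and treat them as labels."""
    pseudo_labels = {}
    for image in unlabeled_images:
        confident = [d for d in predict(image) if d["score"] >= conf_thresh]
        if confident:
            pseudo_labels[image] = confident
    return pseudo_labels


def toy_predict(image):
    """Stand-in for a partially trained model (hard-coded fake outputs)."""
    fake_outputs = {
        "image101.jpg": [{"label": "cat", "score": 0.92},
                         {"label": "dog", "score": 0.31}],
        "image102.jpg": [{"label": "dog", "score": 0.40}],
    }
    return fake_outputs.get(image, [])


labels = pseudo_label(toy_predict, ["image101.jpg", "image102.jpg"])
print(labels)
```

Only the high-confidence "cat" detection survives here; low-confidence predictions are dropped so that they do not pollute the data used for retraining.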

Data Preparation

For training, we first need to annotate a small number of images from the dataset. As the annotation tool, you can use the Visual Object Tagging Tool (VoTT), an open-source annotation and labeling tool for image and video assets. After annotating, export the annotations in the VoTT CSV format and place the CSV file in the image folder.

├── Image_Folder
│ ├── image1.jpg
│ ├── image2.jpg
│ ├── image3.jpg
│ ├── ----------
│ ├── ----------
│ ├── image100.jpg
│ └── vott_annotation.csv
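
If you want to inspect the exported annotations before training, a short script along these lines may help. It assumes the CSV has the columns image, xmin, ymin, xmax, ymax and label; check the header of your own export, since this layout is an assumption here. The sample rows and the `load_annotations` helper are made up for illustration.

```python
import csv
import io
from collections import defaultdict

# Stand-in for the contents of vott_annotation.csv (assumed column names).
SAMPLE_CSV = '''"image","xmin","ymin","xmax","ymax","label"
"image1.jpg",34.5,20.0,120.0,200.25,"cat"
"image1.jpg",150.0,30.0,310.0,220.0,"dog"
"image2.jpg",12.0,8.0,96.0,64.0,"cat"
'''


def load_annotations(csv_file):
    """Group (box, label) pairs by image name."""
    boxes_by_image = defaultdict(list)
    for row in csv.DictReader(csv_file):
        box = tuple(float(row[k]) for k in ("xmin", "ymin", "xmax", "ymax"))
        boxes_by_image[row["image"]].append((box, row["label"]))
    return dict(boxes_by_image)


annotations = load_annotations(io.StringIO(SAMPLE_CSV))
for image, boxes in annotations.items():
    print(image, boxes)
```

To read the real file, pass `open("Image_Folder/vott_annotation.csv", newline="")` instead of the `io.StringIO` object.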

Training Process

Preface

The training process uses YOLOv4, a popular real-time object detection model (paper: YOLOv4: Optimal Speed and Accuracy of Object Detection). The semi-supervised learning pipeline is built as a wrapper around the excellent PyTorch implementation of YOLOv4 by Tianxiaomo.

Installation

1. Clone the SSL_Vision_Pipeline

git clone https://github.com/LahiRumesh/SSL_Vision_Pipeline.git

2. Create a new virtual environment and install the required packages

cd SSL_Vision_Pipeline
pip install -r requirements.txt

3. Download the weight file yolov4.conv.137.pth

Training

Use train_models.py for training. You can change the training parameters on the command line, or run python train_models.py -h to view a description of the arguments.

python train_models.py --model YOLOV4 --data_dir /home/data/Image_Folder --weights yolov4.conv.137.pth --validation 0.1 --epochs 80 --batch_size 8
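
The --validation 0.1 argument presumably holds out 10% of the annotated images for validation; that reading is an assumption, and the pipeline performs its own split. The sketch below only illustrates what such a split looks like, using a hypothetical `split_train_val` helper.

```python
import random


def split_train_val(image_names, val_fraction=0.1, seed=0):
    """Shuffle the image names reproducibly and hold out a validation share."""
    names = sorted(image_names)
    random.Random(seed).shuffle(names)
    n_val = int(round(len(names) * val_fraction))
    return names[n_val:], names[:n_val]


# Hypothetical file names matching the folder layout shown earlier.
images = [f"image{i}.jpg" for i in range(1, 101)]
train, val = split_train_val(images, val_fraction=0.1)
print(len(train), len(val))
```

With 100 annotated images and a fraction of 0.1, this leaves 90 images for training and 10 for validation.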

All weight checkpoints and data logs are saved in the data_models directory under the corresponding dataset name. During training, the pipeline calculates the mAP at different IoU thresholds, and the resulting table gives you an idea of the model's performance. The mAP graphs are also logged to wandb. If you are not logged in to your wandb account, log in before starting the training process.
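
To see why the mAP changes with the IoU threshold, it helps to compute an IoU by hand. The following is a minimal sketch with made-up boxes, not the evaluation code used by the pipeline.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (xmin, ymin, xmax, ymax) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


ground_truth = (10, 10, 110, 110)
prediction = (20, 20, 120, 120)
overlap = iou(ground_truth, prediction)

# The same prediction can be a hit at a loose threshold and a miss at a strict one.
hits = {t: overlap >= t for t in (0.5, 0.75)}
print(round(overlap, 3), hits)
```

Here the boxes overlap with an IoU of about 0.68, so the prediction counts as correct at a threshold of 0.5 but not at 0.75, which is why the mAP is usually lower at stricter thresholds.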

Pseudo Labeling

After training, you can use the saved checkpoint weight file to label the remaining unlabeled data. Use pseudo_label.py for the pseudo labeling, and check its arguments for the available options.

python pseudo_label.py --checkpoint checkpoint80.pth --class_file class.names --dir_name unlabeld_images --conf_thresh 0.5 --iou_thresh 0.2 
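
The command above sets --conf_thresh and --iou_thresh. How pseudo_label.py applies them internally is not shown here; as an assumption, the sketch below uses them in the way that is typical for detectors: detections below the confidence threshold are dropped, and non-maximum suppression (NMS) removes boxes that overlap a higher-scoring box by more than the IoU threshold. For brevity it ignores class labels, and all boxes and scores are made up.

```python
def iou(a, b):
    """Intersection over Union of two (xmin, ymin, xmax, ymax) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def filter_detections(detections, conf_thresh=0.5, iou_thresh=0.2):
    """Confidence filtering followed by greedy non-maximum suppression."""
    candidates = sorted(
        (d for d in detections if d[1] >= conf_thresh),
        key=lambda d: d[1],
        reverse=True,
    )
    kept = []
    for box, score in candidates:
        # Keep a box only if it does not overlap an already kept box too much.
        if all(iou(box, kept_box) <= iou_thresh for kept_box, _ in kept):
            kept.append((box, score))
    return kept


detections = [
    ((0, 0, 10, 10), 0.9),         # strongest box
    ((1, 1, 11, 11), 0.8),         # near-duplicate of the first box
    ((50, 50, 60, 60), 0.7),       # separate object
    ((100, 100, 110, 110), 0.3),   # below the confidence threshold
]
kept = filter_detections(detections)
print(kept)
```

Under this reading, a higher confidence threshold yields fewer but cleaner pseudo labels, while a lower IoU threshold suppresses overlapping boxes more aggressively.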

Afterward, an annotated CSV file will be saved in the image folder that you used for pseudo labeling. This data can be used to retrain the model and improve its accuracy.
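
One possible way to prepare the retraining data is to combine the manual annotations with the pseudo labels. The sketch below assumes both sets are available as rows keyed by an image column (an assumption about the CSV layout) and lets manual labels take priority; `merge_annotations` is a hypothetical helper, not part of the pipeline.

```python
def merge_annotations(manual_rows, pseudo_rows):
    """Combine both sets, keeping manual labels when an image has both."""
    manual_images = {row["image"] for row in manual_rows}
    extra = [row for row in pseudo_rows if row["image"] not in manual_images]
    return list(manual_rows) + extra


# Made-up rows for illustration.
manual = [
    {"image": "image1.jpg", "xmin": 34, "ymin": 20, "xmax": 120, "ymax": 200, "label": "cat"},
]
pseudo = [
    {"image": "image1.jpg", "xmin": 30, "ymin": 18, "xmax": 125, "ymax": 205, "label": "cat"},
    {"image": "image101.jpg", "xmin": 10, "ymin": 12, "xmax": 90, "ymax": 80, "label": "dog"},
]
merged = merge_annotations(manual, pseudo)
print([row["image"] for row in merged])
```

It is worth reviewing a sample of the pseudo labels before retraining, since errors made by the partially trained model would otherwise be learned as ground truth.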

Reference

  1. YOLOv4: Optimal Speed and Accuracy of Object Detection
  2. YOLOv4 by Tianxiaomo

Lahiru Rathnayake

AI researcher with a passion centered on Machine Learning and Computer Vision.