How to train YOLOv2 on a custom dataset

Update 1: I found a much better article on how to train YOLOv2 here

 

YOLOv2 is an open-source, state-of-the-art real-time object detector written in C on top of the darknet deep learning framework ( https://pjreddie.com/darknet/yolo/ ). A simple guide to reproducing the results from the YOLOv2 paper is provided on the author's blog.

For training on a custom dataset, clear instructions are given in the Windows port of YOLO at https://github.com/AlexeyAB/darknet .

Here I will show a hands-on approach to training a YOLOv2 detector. (If you cannot see the images clearly, please zoom in with the browser.)

Task

Detect and count 8 types of beverage bottles: pepsi, 7up, mirinda, dolina_olma, dolina_behi, dolina_olcha, dolina_limon, dolina_apelsin

Dataset Preparation

I use the PEPSI dataset, which contains around 150 images. It is small, but it should be enough for the example in this post.

I annotated the dataset using the Yolo_mark annotation tool. For a tutorial on the tool, see https://github.com/AlexeyAB/Yolo_mark .

Put all the class labels into the obj.names file, one label per line.

[Screenshot: contents of obj.names]
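Since the screenshot is hard to read, here is a minimal Python sketch that writes such a file. The label order below is an assumption taken from the task list above; whatever order you choose, line i (counting from 0) becomes class id i in the annotation files. The function name write_names is hypothetical.

```python
from pathlib import Path

# Class labels from the Task section. The order is an assumption:
# line i (0-based) of obj.names becomes class id i in the annotations.
CLASS_NAMES = [
    "pepsi", "7up", "mirinda", "dolina_olma", "dolina_behi",
    "dolina_olcha", "dolina_limon", "dolina_apelsin",
]


def write_names(path, names=CLASS_NAMES):
    """Write one label per line and return how many labels were written."""
    Path(path).write_text("\n".join(names) + "\n", encoding="utf-8")
    return len(names)

# Usage: write_names("obj.names")
```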

Then start the program and begin labeling.

[Screenshot: labeling an image in Yolo_mark]

 

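As a result of the annotation, each image gets a corresponding .txt file that contains its annotations in YOLO format: one line per object, of the form <class_id> <x_center> <y_center> <width> <height>, where the coordinates are normalized by the image width and height.

[Screenshot: example annotation .txt file]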
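To make the format concrete, here is a small Python sketch that converts a box given in pixel corners into one YOLO annotation line. Yolo_mark already does this for you; the function name to_yolo_line and the six-decimal formatting are my own choices for illustration.

```python
def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-space box to '<class_id> <x_center> <y_center> <width> <height>'.

    All four box values are divided by the image size, so they fall in [0, 1].
    """
    x_center = (x_min + x_max) / 2.0 / img_w
    y_center = (y_min + y_max) / 2.0 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# A 200x200 px box of class 0 in a 1000x800 image
print(to_yolo_line(0, 100, 200, 300, 400, 1000, 800))
# prints: 0 0.200000 0.375000 0.200000 0.250000
```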

 

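Next I moved all the *.txt files into a labels folder and renamed the img folder to images. darknet finds the label file for an image by replacing images with labels in the image path and changing the extension to .txt, so the folder names matter.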

So now my folder looks like this:

[Screenshot: dataset folder with the images and labels subfolders]

Since we renamed the img folder to images, we have to change the paths in train.txt accordingly.

[Screenshot: train.txt with the updated folder name]

One last step is to replace the relative image paths in train.txt with full paths, because darknet will later read this file from a different working directory.

[Screenshot: train.txt with full paths]
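Both train.txt changes (the renamed folder and the full paths) can be done by hand in a text editor, but a short script is less error-prone. This is only a sketch: the dataset root C:/datasets/pepsi is a hypothetical location, and I assume every image sits directly inside the images folder. Forward slashes are used in the output, which darknet on Windows accepts in my experience.

```python
from pathlib import Path


def rewrite_train_list(lines, dataset_root, image_dir="images"):
    """Return full image paths of the form <dataset_root>/<image_dir>/<file name>.

    Only the file name of each original entry is kept, so both 'img/1.jpg'
    and 'data\\img\\1.jpg' map into the renamed folder. Blank lines are skipped.
    """
    root = Path(dataset_root)
    result = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        name = Path(line.replace("\\", "/")).name
        result.append((root / image_dir / name).as_posix())
    return result


# Usage (hypothetical paths):
# old = Path("train.txt").read_text().splitlines()
# new = rewrite_train_list(old, "C:/datasets/pepsi")
# Path("train.txt").write_text("\n".join(new) + "\n")
```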

Training

Network CFG

  1. Copy yolo-voc.cfg from https://github.com/Jumabek/darknet/blob/master/cfg/yolo-voc.cfg and rename it to pepsi.cfg
  2. Change filters=125 in the last convolutional layer to filters=65, which is (5+8)*5. Here the first 5 corresponds to (x, y, w, h, objectness_score), 8 corresponds to the number of classes (I have 8 classes), and the last 5 corresponds to the number of bounding box predictions for each cell.
  3. Change classes=20 to classes=8
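The filters value in step 2 can be checked with a few lines of Python. This is just the arithmetic from the step above wrapped in a function; the name last_layer_filters is hypothetical, and the default of 5 boxes per cell matches yolo-voc.cfg.

```python
def last_layer_filters(num_classes, num_boxes=5, box_params=5):
    """filters = (box_params + num_classes) * num_boxes.

    box_params counts x, y, w, h and the objectness score.
    num_boxes is the number of bounding boxes predicted per grid cell.
    """
    return (box_params + num_classes) * num_boxes


print(last_layer_filters(20))  # original VOC config, prints 125
print(last_layer_filters(8))   # the 8 bottle classes, prints 65
```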

This is how our cfg file looks now:

[Screenshot: the edited pepsi.cfg]

I have a 4 GB GTX 1050 GPU on my laptop, so I set batch=64 and subdivisions=8. That way my GPU processes 64/8 = 8 images in one pass. If you have, say, 8 GB of GPU memory, you can set batch=64 and subdivisions=4 to take advantage of all of your GPU memory and speed up the training.
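As a quick check of these numbers, here is a tiny Python sketch. As far as I know darknet uses integer division here, so the sketch does the same; the function name images_per_pass is hypothetical.

```python
def images_per_pass(batch, subdivisions):
    """Number of images sent to the GPU at once (integer division)."""
    return batch // subdivisions


print(images_per_pass(64, 8))  # my 4 GB GPU, prints 8
print(images_per_pass(64, 4))  # the 8 GB example, prints 16
```

Lowering subdivisions makes each pass bigger, so if you run out of GPU memory, raise it again.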


Creating *.data and *.names files

  1. Copy the obj.names and obj.data files (which we created in the Dataset Preparation step with Yolo_mark) to C:\darknet\build\darknet\x64\data
  2. Rename obj.names to pepsi.names
  3. Rename obj.data to pepsi.data
  4. Fix the paths in pepsi.data so that they point to the right files
  5. Download the darknet19_448.conv.23 pre-trained weights from https://pjreddie.com/media/files/darknet19_448.conv.23 and put them into the C:\darknet\build\darknet\x64\backup folder
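For step 4, a darknet .data file is a short list of key = value lines: classes, train, valid, names and backup. The sketch below builds that text in Python. All paths in it are hypothetical examples, not the ones from my machine, and reusing train.txt as the validation list is only a placeholder; substitute your own locations.

```python
def make_data_file(classes, train, valid, names, backup):
    """Return the text of a darknet .data file with the five usual keys."""
    return (
        f"classes = {classes}\n"
        f"train = {train}\n"
        f"valid = {valid}\n"
        f"names = {names}\n"
        f"backup = {backup}\n"
    )


# Hypothetical paths, for illustration only
text = make_data_file(
    classes=8,
    train="C:/datasets/pepsi/train.txt",
    valid="C:/datasets/pepsi/train.txt",
    names="data/pepsi.names",
    backup="backup/",
)
print(text)
```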

Finally, start training

darknet.exe detector train data/pepsi.data cfg/pepsi.cfg backup\\darknet19_448.conv.23 >> pepsi.log

The training log is appended to the pepsi.log file, so you can monitor the loss, recall and other metrics by reading this file.

[Screenshot: training output in pepsi.log]
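If you want to track the loss without scrolling through the whole file, a small script can pull it out. This sketch assumes the per-iteration summary lines look like "1: 847.133850, 847.133850 avg, 0.000000 rate, 3.214312 seconds, 64 images", which is what my build prints; if your darknet version formats them differently, adjust the pattern. The function name parse_avg_loss is hypothetical.

```python
import re

# Assumed line shape: "<iteration>: <loss>, <avg loss> avg, ..."
ITER_LINE = re.compile(r"^\s*(\d+):\s*([\d.]+),\s*([\d.]+)\s+avg")


def parse_avg_loss(lines):
    """Return a list of (iteration, average_loss) pairs found in the log lines."""
    points = []
    for line in lines:
        match = ITER_LINE.match(line)
        if match:
            points.append((int(match.group(1)), float(match.group(3))))
    return points


# Usage:
# with open("pepsi.log") as log:
#     for iteration, avg_loss in parse_avg_loss(log):
#         print(iteration, avg_loss)
```

Lines that do not match the pattern (per-region statistics, layer listings, non-numeric losses) are simply skipped.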

Enjoy your cup of coffee and come back later 🙂

Important: I will try to write a tutorial on how to get the best out of YOLO training in another post. So, stay tuned!
