How to train YOLOv2 on a custom dataset

Update 1: I found a much better article on how to train YOLOv2 here


YOLOv2 is an open-source, state-of-the-art real-time object detector built on darknet, a deep learning framework written in C. A simple guide to reproducing the results of the YOLOv2 paper is provided on the author’s blog.

Instructions for training on a custom dataset are given with the Windows port of YOLO.

Here I will show a hands-on approach to training a YOLOv2 detector. (If you cannot see the images clearly, please zoom in your browser.)


Detect/Count 8 types of Beverage Bottles: pepsi, 7up, mirinda, dolina_olma, dolina_behi, dolina_olcha, dolina_limon, dolina_apelsin

Dataset Preparation

I use my PEPSI dataset, which contains around 150 images. It is small, but it should be enough for the example in this post.

I annotated the dataset using the YOLO-MARK annotation tool. For a tutorial on it, visit here.

Put all the class labels into the obj.names file, one label per line.
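As a minimal sketch, the text of the file can be built in Python from the eight classes listed earlier. The order of the labels is an assumption (it has to match the class ids used during annotation), and names_file_text is a hypothetical helper, not part of darknet or YOLO-MARK.

```python
# Class labels taken from this post. The order is an assumption: it must
# match the class ids that the annotation tool writes into the label files.
CLASS_NAMES = [
    "pepsi", "7up", "mirinda", "dolina_olma",
    "dolina_behi", "dolina_olcha", "dolina_limon", "dolina_apelsin",
]


def names_file_text(class_names):
    """Return the text of a *.names file: one label per line."""
    return "\n".join(class_names) + "\n"


# Print the text that would go into obj.names.
print(names_file_text(CLASS_NAMES), end="")
```

The line number of each label (starting from 0) is the class id that appears in the annotation files.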


Then start the program and begin labeling:



As a result of annotation, each image gets a corresponding .txt file that contains its annotations in YOLO format.
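For reference, each line of a YOLO-format label file describes one box as a class id followed by the box center, width and height, all normalized by the image size. The sketch below converts a pixel-space box into such a line; to_yolo_line and the example numbers are made up for illustration.

```python
def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-space box to one YOLO annotation line.

    Format: <class_id> <x_center> <y_center> <width> <height>,
    where the four box values are divided by the image width/height
    so that they fall in the range [0, 1].
    """
    x_center = (x_min + x_max) / 2.0 / img_w
    y_center = (y_min + y_max) / 2.0 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# A hypothetical box of class 0 in a 400x500 image.
print(to_yolo_line(0, 100, 50, 300, 450, 400, 500))
# prints: 0 0.500000 0.500000 0.500000 0.800000
```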



Next, I moved all the *.txt files into a labels folder and renamed the img folder to images.

So now my folder looks like this


Since we renamed the img folder to images, we now have to change train.txt accordingly.


One last step is to put full paths to the images in train.txt instead of relative paths, because darknet will later access this file from outside the dataset folder.
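Both changes to train.txt can be done in one go. The following is a rough sketch, assuming the file lists one image path per line; the folder C:\datasets\pepsi\images and the file names are hypothetical, so substitute your own locations.

```python
from pathlib import PureWindowsPath


def to_full_paths(lines, images_dir):
    """Replace each listed image path with a full path inside images_dir."""
    base = PureWindowsPath(images_dir)
    result = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip empty lines
        file_name = PureWindowsPath(line).name  # keep only the file name
        result.append(str(base / file_name))
    return result


# Hypothetical relative entries, as they might appear in train.txt.
relative = ["x64/Release/data/img/0001.jpg", "x64/Release/data/img/0002.jpg"]
for path in to_full_paths(relative, r"C:\datasets\pepsi\images"):
    print(path)
```

To apply it to a real file, read the lines of train.txt, pass them through to_full_paths, and write the result back.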



Network CFG

  1. Copy yolo-voc.cfg and rename the copy to pepsi.cfg.
  2. Change filters=125 in the last convolutional layer to filters=65, which is (5+8)*5. The first 5 corresponds to (x, y, w, h, objectness_score), the 8 is the number of classes (I have 8 classes), and the last 5 is the number of bounding-box predictions for each cell.
  3. Change classes=20 to classes=8.
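The filters value in step 2 follows directly from the number of classes, so it is easy to recompute for another dataset. A small sketch (last_layer_filters is a hypothetical helper name):

```python
def last_layer_filters(num_classes, num_boxes=5, box_params=5):
    """Filters for the last convolutional layer of a YOLOv2 cfg.

    box_params: x, y, w, h and objectness score (5 values per box).
    num_boxes:  bounding-box predictions per cell (5 in yolo-voc.cfg).
    """
    return (box_params + num_classes) * num_boxes


print(last_layer_filters(20))  # 125, the original yolo-voc.cfg value
print(last_layer_filters(8))   # 65, the value used for the 8 bottle classes
```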

This is how our cfg file looks now


I have a 4 GB GTX 1050 GPU on my laptop, so I set batch=64 and subdivisions=8. That way my GPU processes 64/8 = 8 images in one pass. If you have, say, 8 GB of GPU memory, you can set batch=64 and subdivisions=4 to use more of the GPU memory and speed up training.
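The arithmetic behind these two settings, as a tiny sketch (images_per_pass is a hypothetical helper; the integer division reflects my understanding of how the batch is split):

```python
def images_per_pass(batch, subdivisions):
    """Number of images the GPU processes at once for a given cfg."""
    return batch // subdivisions


print(images_per_pass(64, 8))  # 8 images per pass (4 GB GPU setting above)
print(images_per_pass(64, 4))  # 16 images per pass (8 GB GPU setting above)
```

A smaller subdivisions value means more images per pass, which needs more GPU memory.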


Creating *.data and *.names files

  1. Copy obj.names and the *.data file (both created with YOLO_MARK in the Dataset Preparation step) to C:\darknet\build\darknet\x64\data
  2. Rename obj.names to pepsi.names
  3. Rename the *.data file to pepsi.data
  4. Fix the paths in the renamed *.data file so that they point to the right files
  5. Download the darknet19_448.conv.23 pre-trained weights and put them into the C:\darknet\build\darknet\x64\backup folder
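As a sketch of step 4, the snippet below generates the text of a *.data file. The keys classes, train, valid, names and backup are the ones darknet reads from this file; the concrete paths are assumptions for illustration (in particular, train and valid should point to wherever your train.txt actually lives), and data_file_text is a hypothetical helper.

```python
def data_file_text(num_classes, train_list, valid_list, names_file, backup_dir):
    """Return the text of a darknet *.data file as key = value lines."""
    lines = [
        f"classes = {num_classes}",
        f"train = {train_list}",    # text file listing the training images
        f"valid = {valid_list}",    # text file listing the validation images
        f"names = {names_file}",    # class labels, one per line
        f"backup = {backup_dir}",   # folder where weights are saved
    ]
    return "\n".join(lines) + "\n"


# All paths below are assumed, not taken from the original screenshot.
print(data_file_text(8, "data/train.txt", "data/train.txt",
                     "data/pepsi.names", "backup/"), end="")
```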

Finally Start training

darknet.exe detector train data/ cfg/pepsi.cfg backup\\darknet19_448.conv.23 >> pepsi.log

The training log is saved in the pepsi.log file, so you can monitor loss, recall and other metrics by reading this file.
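To follow the loss without scrolling through the whole log, the per-iteration lines can be picked out with a regular expression. This is only a sketch: it assumes darknet prints lines of the form "<iteration>: <loss>, <average loss> avg, ...", the two sample lines are made up for illustration, and parse_losses is a hypothetical helper. Check a few lines of your own pepsi.log and adjust the pattern if they differ.

```python
import re

# Assumed shape of a per-iteration line: "<iter>: <loss>, <avg loss> avg, ..."
ITERATION_LINE = re.compile(r"^\s*(\d+):\s*([0-9.]+),\s*([0-9.]+)\s+avg")


def parse_losses(log_lines):
    """Return (iteration, loss, average_loss) for every matching line."""
    records = []
    for line in log_lines:
        match = ITERATION_LINE.match(line)
        if match:
            records.append((int(match.group(1)),
                            float(match.group(2)),
                            float(match.group(3))))
    return records


# Made-up sample lines, not real training output.
sample = [
    "Region Avg IOU: 0.31, Class: 0.12, Obj: 0.45, No Obj: 0.40, Avg Recall: 0.20,  count: 9",
    "1: 18.52, 18.52 avg, 0.0001 rate, 5.3 seconds, 64 images",
]
print(parse_losses(sample))
```

With a real log, pass the lines of the opened pepsi.log file to parse_losses instead of the sample list.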


Enjoy your cup of coffee and come back later 🙂

Important: I will try to write a tutorial on how to get the best out of YOLO training in another post. So, stay tuned!

