Update 1: I found a much better article on how to train YOLOv2 here
YOLOv2 is an open-source, state-of-the-art real-time object detector written in C on the darknet deep learning framework: https://pjreddie.com/darknet/yolo/ . A simple guide to reproducing the results in the YOLOv2 paper is provided on the author’s blog.
For training on a custom dataset, clear instructions are given in the Windows port of YOLO at https://github.com/AlexeyAB/darknet .
Here I will show a hands-on approach to training a YOLOv2 detector. (If you cannot see the images clearly, please zoom in with the browser.)
Detect/Count 8 types of Beverage Bottles: pepsi, 7up, mirinda, dolina_olma, dolina_behi, dolina_olcha, dolina_limon, dolina_apelsin
I use the PEPSI dataset, which contains around 150 images. Although it is small, it should be enough for the example in this post.
I annotated the dataset using the YOLO-MARK annotation tool. For a tutorial on it, see https://github.com/AlexeyAB/Yolo_mark .
Put all the class labels into the obj.names file, one label per line. So the content looks like this:
Then start the program and start labeling:
As a result of the annotation we will have a corresponding .txt file for each image, and each *.txt file contains the annotations in YOLO format.
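For reference, each line of a YOLO label file has the form "class_id x_center y_center width height", where the four box values are divided by the image width and height so they fall between 0 and 1. Here is a minimal Python sketch of that conversion. The function names, the example box and the 640x480 image size are made up for illustration and are not part of YOLO-MARK.

```python
def to_yolo_line(class_id, box, image_size):
    """Convert a pixel box (x_min, y_min, x_max, y_max) into one YOLO label line."""
    x_min, y_min, x_max, y_max = box
    img_w, img_h = image_size
    # YOLO stores the box center and size, normalized by the image dimensions.
    x_center = (x_min + x_max) / 2.0 / img_w
    y_center = (y_min + y_max) / 2.0 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


def from_yolo_line(line, image_size):
    """Convert one YOLO label line back into (class_id, pixel box)."""
    fields = line.split()
    class_id = int(fields[0])
    x_center, y_center, width, height = (float(v) for v in fields[1:5])
    img_w, img_h = image_size
    x_min = (x_center - width / 2.0) * img_w
    y_min = (y_center - height / 2.0) * img_h
    x_max = (x_center + width / 2.0) * img_w
    y_max = (y_center + height / 2.0) * img_h
    return class_id, (x_min, y_min, x_max, y_max)


# Hypothetical example: class 0 in a 640x480 image.
print(to_yolo_line(0, (100, 50, 300, 450), (640, 480)))
# prints: 0 0.312500 0.520833 0.312500 0.833333
```

The class_id is the zero-based line index of the label in obj.names, so the order of the names in that file matters.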
Next I moved all the *.txt files into a labels folder and renamed the img folder to images.
So now my folder looks like this:
Since we renamed the img folder to images, we now have to change train.txt accordingly.
One last step is to replace the relative paths with full paths to the images, because later darknet will access this file from outside this folder.
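Editing train.txt by hand gets tedious with many images, so here is a small Python sketch that does both changes at once: it keeps only the file name of each entry and prepends the full path of the images folder. The function name, the example file names and the folder path are my assumptions, so replace them with your own.

```python
from pathlib import PurePosixPath


def fix_train_list(lines, images_dir):
    """Return the image paths rewritten as full paths inside images_dir."""
    # Use forward slashes throughout so the result is the same on any OS.
    base = images_dir.replace("\\", "/").rstrip("/")
    fixed = []
    for line in lines:
        entry = line.strip().replace("\\", "/")
        if not entry:
            continue  # skip blank lines
        file_name = PurePosixPath(entry).name
        fixed.append(f"{base}/{file_name}")
    return fixed


def rewrite_train_file(train_txt, images_dir):
    """Rewrite train.txt in place (train_txt is the path to the list file)."""
    with open(train_txt, "r", encoding="utf-8") as f:
        fixed = fix_train_list(f.readlines(), images_dir)
    with open(train_txt, "w", encoding="utf-8") as f:
        f.write("\n".join(fixed) + "\n")


# Hypothetical entries, shown only to illustrate the transformation.
for path in fix_train_list(["data/img/pepsi_001.jpg"], "C:/darknet/build/darknet/x64/data/images"):
    print(path)
```

Back up train.txt before calling rewrite_train_file, since it overwrites the file.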
- Copy yolo-voc.cfg from https://github.com/Jumabek/darknet/blob/master/cfg/yolo-voc.cfg and rename it to pepsi.cfg
- Change filters=125 in the last convolutional layer to filters=65, which is (5+8)*5. Here the first 5 corresponds to (x, y, w, h, objectness_score), the 8 corresponds to the number of classes (in my case I have 8 classes), and the last 5 corresponds to the number of bounding box predictions for each cell.
- Change classes=20 to classes=8
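If you train with a different number of classes, the same formula gives the filters value for the last convolutional layer. A tiny Python sketch (the function name is mine, and the default of 5 boxes per cell matches the yolo-voc.cfg used here):

```python
def last_layer_filters(num_classes, boxes_per_cell=5):
    """filters = (x, y, w, h, objectness + one score per class) * boxes per cell."""
    return (5 + num_classes) * boxes_per_cell


print(last_layer_filters(20))  # 125, the original VOC setting
print(last_layer_filters(8))   # 65, the value used for the 8 bottle classes
```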
This is what our cfg file looks like now:
I have a 4 GB GTX 1050 GPU in my laptop, so I set batch=64 and subdivisions=8. That way my GPU processes 64/8 = 8 images in one pass. If you have 8 GB of GPU memory, you can set batch=64 and subdivisions=4 to take advantage of all of your GPU memory and speed up the training.
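The arithmetic behind these two settings, as a short Python sketch. The function name is hypothetical, and rejecting values that do not divide evenly is a choice of this sketch, not something taken from darknet itself.

```python
def images_per_pass(batch, subdivisions):
    """Number of images the GPU processes at once for a given batch/subdivisions pair."""
    if batch % subdivisions != 0:
        # Choice of this sketch: insist on an even split to keep the numbers simple.
        raise ValueError("batch should be divisible by subdivisions")
    return batch // subdivisions


print(images_per_pass(64, 8))  # 8 images per pass (my 4 GB setup)
print(images_per_pass(64, 4))  # 16 images per pass (the 8 GB example)
```

A smaller subdivisions value means more images per pass, so more GPU memory is used. If training runs out of memory, increase subdivisions.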
Creating *.data and *.names files
- Copy the obj.names and obj.data files (which we created in the Data Preparation step with YOLO_MARK) to C:\darknet\build\darknet\x64\data
- Rename obj.names to pepsi.names
- Rename obj.data to pepsi.data
- Fix the paths in pepsi.data to point to the right files, as follows
- Download the darknet19_448.conv.23 pre-trained weights from https://pjreddie.com/media/files/darknet19_448.conv.23 and put them into the C:\darknet\build\darknet\x64\backup folder
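For orientation, a darknet .data file is a short text file of key = value lines (classes, train, valid, names, backup). Below is a rough Python sketch that generates such a file. The helper name and every path in the example are my assumptions based on the folder layout in this post, and reusing the training list for valid is only a placeholder, so adjust all of them to your own setup.

```python
def render_data_file(num_classes, train_list, valid_list, names_file, backup_dir):
    """Build the text of a darknet .data file from the given settings."""
    lines = [
        f"classes = {num_classes}",
        f"train = {train_list}",    # text file listing the training images
        f"valid = {valid_list}",    # text file listing the validation images
        f"names = {names_file}",    # class labels, one per line
        f"backup = {backup_dir}",   # folder where darknet saves weights
    ]
    return "\n".join(lines) + "\n"


# Assumed example values, not the exact content of the author's pepsi.data.
content = render_data_file(
    num_classes=8,
    train_list="data/train.txt",
    valid_list="data/train.txt",
    names_file="data/pepsi.names",
    backup_dir="backup/",
)
print(content)
```

To create the file, write the returned text to data/pepsi.data, or simply type the same five lines in a text editor.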
Finally, start training:
darknet.exe detector train data/pepsi.data cfg/pepsi.cfg backup\\darknet19_448.conv.23 >> pepsi.log
The training log will be saved in the pepsi.log file, so you can monitor the loss, recall and other values by reading this file.
Enjoy your cup of coffee and come back later 🙂
Important: I will try to write a tutorial on how to get the best out of YOLO training in another post. So, stay tuned!
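To pull the loss values out of the log without scrolling through it, something like the Python sketch below may help. It assumes darknet prints one summary line per iteration that starts with the iteration number and a colon, followed by the current loss and the running average ending in the word "avg". Check a few lines of your own pepsi.log first, because the exact format can differ between darknet versions. The function names are mine.

```python
import re

# Assumed shape of a summary line: "<iteration>: <loss>, <avg loss> avg, ..."
ITERATION_LINE = re.compile(r"^\s*(\d+):\s*(\S+),\s*(\S+) avg")


def parse_losses(log_lines):
    """Return a list of (iteration, loss, average_loss) tuples found in the log."""
    records = []
    for line in log_lines:
        match = ITERATION_LINE.match(line)
        if not match:
            continue  # not a per-iteration summary line
        try:
            records.append(
                (int(match.group(1)), float(match.group(2)), float(match.group(3)))
            )
        except ValueError:
            continue  # numbers could not be parsed, so skip the line
    return records


def read_losses(log_path):
    """Parse a log file on disk, e.g. read_losses("pepsi.log")."""
    # errors="replace" tolerates odd bytes that console redirection may produce.
    with open(log_path, "r", errors="replace") as f:
        return parse_losses(f)
```

The last tuple returned by read_losses("pepsi.log") gives the most recent iteration and its average loss, which is usually the number worth watching.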