Document Type: Original Research Paper


1 Department of Computer Engineering, Amirkabir University of Technology

2 AmirKabir


Detecting and tracking multiple people in real-world crowded scenes is a challenging task. In this paper, we present an online tracking-by-detection approach for multiple people using a single camera. Objects are detected with deformable part models and a visual background extractor. The tracking phase combines person-specific support vector machine (SVM) classifiers, similarity scores, the Hungarian algorithm, and inter-object occlusion handling. Detections serve two purposes: they are used to train the person-specific classifiers, and they guide the trackers. For the latter, a similarity score is computed from these classifiers and spatial information, and the detections are assigned to the trackers with the Hungarian algorithm. Inter-object occlusion is handled by explicit occlusion reasoning. The proposed method requires no prior training and imposes no constraints on environmental conditions. In our evaluation, the proposed method either outperformed state-of-the-art approaches, by 10% and 15%, or achieved comparable performance.
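The data-association step described above can be illustrated with a minimal sketch. It is not the implementation from the paper: the abstract does not give the similarity formula, so the product of a classifier score and a Gaussian spatial term used here is an assumption, and the names `similarity`, `hungarian`, `associate`, `sigma`, and `min_similarity` are hypothetical. The sketch builds a tracker-by-detection similarity matrix, converts it to costs, solves the assignment problem with the Hungarian algorithm, and discards weak matches.

```python
import math


def similarity(classifier_score, tracker_pos, detection_pos, sigma=30.0):
    """Assumed score: classifier output weighted by a Gaussian on pixel distance."""
    dx = tracker_pos[0] - detection_pos[0]
    dy = tracker_pos[1] - detection_pos[1]
    return classifier_score * math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))


def hungarian(cost):
    """Minimum-cost assignment for an n x m cost matrix with n <= m.

    Returns a list `match` where match[i] is the column assigned to row i.
    """
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    u = [0.0] * (n + 1)   # row potentials (1-based)
    v = [0.0] * (m + 1)   # column potentials (1-based)
    p = [0] * (m + 1)     # p[j] = row matched to column j, 0 if column is free
    way = [0] * (m + 1)   # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            # Shift potentials so one more reduced cost becomes zero.
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip the matching along the augmenting path back to column 0.
        while j0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    match = [-1] * n
    for j in range(1, m + 1):
        if p[j]:
            match[p[j] - 1] = j - 1
    return match


def associate(sim, min_similarity=0.3):
    """Assign detections to trackers; sim[t][d] is the tracker/detection similarity.

    Returns sorted (tracker, detection) pairs whose similarity passes the
    (assumed) gating threshold `min_similarity`.
    """
    if not sim or not sim[0]:
        return []
    cost = [[1.0 - s for s in row] for row in sim]
    transposed = len(sim) > len(sim[0])
    if transposed:  # the solver above needs rows <= columns
        cost = [list(col) for col in zip(*cost)]
    pairs = []
    for r, c in enumerate(hungarian(cost)):
        t, d = (c, r) if transposed else (r, c)
        if sim[t][d] >= min_similarity:
            pairs.append((t, d))
    return sorted(pairs)


if __name__ == "__main__":
    # Two trackers, two detections: a greedy pick of 0.9 would force a 0.1 match.
    print(associate([[0.9, 0.8], [0.85, 0.1]]))
```

In this example a greedy strategy would pair tracker 0 with detection 0 (0.9) and leave tracker 1 with a 0.1 match, whereas the optimal assignment pairs tracker 0 with detection 1 and tracker 1 with detection 0. Trackers left without a pair after gating would fall back on their own prediction in a full system; that part is outside this sketch.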


