This page provides the tracking data and evaluation tools used in our tracking papers.

Our papers are based on a framework that associates detection responses into tracks, so detection performance may influence tracking performance. For fair comparisons of tracking performance, we provide our detection responses and tracking ground truths. Our quantitative results can be found in the papers. A minimal sketch of the association idea is given below.
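The following sketch is illustrative only and is not the method from our papers (which use learned affinities and CRF models); it shows the basic tracking-by-detection idea with a simple greedy, IoU-based association of per-frame detections into tracks. The box format (x, y, w, h) and the IoU threshold are assumptions.

def iou(a, b):
    """Intersection-over-union of two boxes given as (x, y, w, h)."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    iw = max(0.0, min(ax2, bx2) - max(a[0], b[0]))
    ih = max(0.0, min(ay2, by2) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def associate(frames, iou_thresh=0.5):
    """frames: list of per-frame detection lists.
    Returns a list of tracks, each a list of (frame_index, box) pairs."""
    tracks = []
    for t, detections in enumerate(frames):
        unmatched = list(detections)
        for track in tracks:
            last_t, last_box = track[-1]
            # Only extend tracks that were seen in the previous frame.
            if last_t != t - 1 or not unmatched:
                continue
            best = max(unmatched, key=lambda d: iou(last_box, d))
            if iou(last_box, best) >= iou_thresh:
                track.append((t, best))
                unmatched.remove(best)
        # Any detection left unmatched starts a new track.
        tracks.extend([[(t, d)] for d in unmatched])
    return tracks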

Please cite our corresponding papers when using any of the following data or ground truths.

Trecvid 2008 Data Set

Video source: http://www.itl.nist.gov/iad/mig/tests/trecvid/2008/
Papers to cite when using the data set:
Bo Yang, Chang Huang, and Ram Nevatia. Learning Affinities and Dependencies for Multi-Target Tracking using a CRF Model. In Proceedings of IEEE Computer Vision and Pattern Recognition (CVPR), pp. 1233-1240, Colorado Springs, USA, Jun. 2011.
Detection outputs and tracking ground truths for testing: Trecvid2008_detection.zip, Trecvid2008_tracking_groundtruth.zip.
Detection outputs and tracking ground truths for training: Trecvid2008_training_detection.zip, Trecvid2008_training_gt.zip.
Details: We use three of the original videos in the Trecvid 2008 data set, each about two hours long. From each video we cut six clips of 5000 frames each, producing 18 video clips in total. Half of them are used for training and the other half for testing. We rename the video clips according to the following table; a sketch of the frame-range cutting step follows the table.

Set      | Renamed video clip          | Corresponding video in Trecvid 2008 | Start frame No. | End frame No.
Test     | LGW_20071123_E1_CAM1.03.avi | LGW_20071123_E1_CAM1.mpeg           | 15000           | 19999
Test     | LGW_20071123_E1_CAM1.11.avi | LGW_20071123_E1_CAM1.mpeg           | 55000           | 59999
Test     | LGW_20071123_E1_CAM1.12.avi | LGW_20071123_E1_CAM1.mpeg           | 60000           | 64999
Test     | LGW_20071123_E1_CAM3.03.avi | LGW_20071123_E1_CAM3.mpeg           | 15000           | 19999
Test     | LGW_20071123_E1_CAM3.11.avi | LGW_20071123_E1_CAM3.mpeg           | 55000           | 59999
Test     | LGW_20071123_E1_CAM3.12.avi | LGW_20071123_E1_CAM3.mpeg           | 60000           | 64999
Test     | LGW_20071123_E1_CAM5.03.avi | LGW_20071123_E1_CAM5.mpeg           | 15000           | 19999
Test     | LGW_20071123_E1_CAM5.11.avi | LGW_20071123_E1_CAM5.mpeg           | 55000           | 59999
Test     | LGW_20071123_E1_CAM5.12.avi | LGW_20071123_E1_CAM5.mpeg           | 60000           | 64999
Training | LGW_20071123_E1_CAM1.08.avi | LGW_20071123_E1_CAM1.mpeg           | 40000           | 44999
Training | LGW_20071123_E1_CAM1.09.avi | LGW_20071123_E1_CAM1.mpeg           | 45000           | 49999
Training | LGW_20071123_E1_CAM1.10.avi | LGW_20071123_E1_CAM1.mpeg           | 50000           | 54999
Training | LGW_20071123_E1_CAM3.08.avi | LGW_20071123_E1_CAM3.mpeg           | 40000           | 44999
Training | LGW_20071123_E1_CAM3.09.avi | LGW_20071123_E1_CAM3.mpeg           | 45000           | 49999
Training | LGW_20071123_E1_CAM3.10.avi | LGW_20071123_E1_CAM3.mpeg           | 50000           | 54999
Training | LGW_20071123_E1_CAM5.08.avi | LGW_20071123_E1_CAM5.mpeg           | 40000           | 44999
Training | LGW_20071123_E1_CAM5.09.avi | LGW_20071123_E1_CAM5.mpeg           | 45000           | 49999
Training | LGW_20071123_E1_CAM5.10.avi | LGW_20071123_E1_CAM5.mpeg           | 50000           | 54999
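The following is a hypothetical sketch of the cutting step using OpenCV; the exact tooling we used is not specified here. The frame ranges come from the table above, and the codec choice ('XVID') is an assumption.

import cv2

def cut_clip(src_path, dst_path, start, end):
    """Copy frames start..end (inclusive, 0-based) of src_path into dst_path."""
    cap = cv2.VideoCapture(src_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    writer = None
    n = 0
    ok, frame = cap.read()
    # Read sequentially so frame numbering stays exact for MPEG input.
    while ok and n <= end:
        if n >= start:
            if writer is None:
                h, w = frame.shape[:2]
                fourcc = cv2.VideoWriter_fourcc(*"XVID")
                writer = cv2.VideoWriter(dst_path, fourcc, fps, (w, h))
            writer.write(frame)
        ok, frame = cap.read()
        n += 1
    cap.release()
    if writer is not None:
        writer.release()

# Example row from the table: frames 15000-19999 of CAM1.
cut_clip("LGW_20071123_E1_CAM1.mpeg",
         "LGW_20071123_E1_CAM1.03.avi", 15000, 19999)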

ETH Data Set

Video source: http://www.vision.ee.ethz.ch/~aess/dataset/
Papers to cite when using the data set:
Bo Yang and Ram Nevatia. An Online Learned CRF Model for Multi-Target Tracking. In Proceedings of IEEE Computer Vision and Pattern Recognition (CVPR), Providence, USA, Jun. 2012.
Detection outputs and tracking ground truths: ETH_detection.zip, ETH_tracking_groundtruth.zip.
Details: We use the BAHNHOF and SUNNY DAY sequences from the ETH data set; in each sequence, only the video captured by the left camera is used.

PETS 2009 Data Set

Video source: http://www.cvg.rdg.ac.uk/PETS2009/a.html#s2l1
Papers to cite when using the data set:
Bo Yang and Ram Nevatia. Multi-Target Tracking by Online Learning of Non-linear Motion Patterns and Robust Appearance Models. In Proceedings of IEEE Computer Vision and Pattern Recognition (CVPR), Providence, USA, Jun. 2012.
Detection outputs and tracking ground truths: PETS09.zip.
Details: In our experiments, we use the first 795 frames of Scenario S2.L1 in the data set, to allow comparison with other methods.

CAVIAR Data Set

Video source: http://homepages.inf.ed.ac.uk/rbf/CAVIARDATA1/
Papers to cite when using the data set:
Bo Yang and Ram Nevatia. Multi-Target Tracking by Online Learning of Non-linear Motion Patterns and Robust Appearance Models. In Proceedings of IEEE Computer Vision and Pattern Recognition (CVPR), Providence, USA, Jun. 2012.
Detection outputs and tracking ground truths: CAVIAR_detection.zip, CAVIAR_tracking_groundtruth.zip.
Details: In our experiments, we use 20 videos from the data set. Please refer to the detection and ground truth files for the video names.

TUD Data Set

Video source: http://www.d2.mpi-inf.mpg.de/node/428/
Papers to cite when using the data set:
Bo Yang and Ram Nevatia. An Online Learned CRF Model for Multi-Target Tracking. In Proceedings of IEEE Computer Vision and Pattern Recognition (CVPR), Providence, USA, Jun. 2012.
Detection outputs and tracking ground truths: TUD.zip.

Multi-Target Tracking Evaluation Tool

Our tracking evaluation is an automatic process. The executable package and sample evaluation data can be downloaded here.

The evaluation tool reports the following measurements:
- detection recall and precision;
- the numbers of ground truth detections, correct detections, false detections, and total detections;
- the numbers of ground truth tracks and generated tracks;
- the proportions of mostly tracked, partly tracked, and mostly lost trajectories;
- the number of fragments;
- the number of ID switches.
A simplified sketch of the trajectory-level measurements is given after the reference below.

For details of the above measurements, please refer to the paper:
Yuan Li, Chang Huang, and Ram Nevatia. Learning to Associate: HybridBoosted Multi-Target Tracker for Crowded Scene. In Proceedings of IEEE Computer Vision and Pattern Recognition (CVPR), Miami, USA, Jun. 2009.
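The following is a simplified sketch of the trajectory-level measurements following the definitions in Li et al. (CVPR 2009); it is not the code used by "Evaluation_5.exe". The input `matches` (a dict mapping each ground truth track to its per-frame matched result-track id, with None where unmatched) is assumed to come from a separate frame-level matching step.

def trajectory_stats(matches):
    """matches: dict gt_id -> list of matched result-track ids per frame,
    with None where the ground truth box is unmatched in that frame."""
    mt = pt = ml = frag = idsw = 0
    for per_frame in matches.values():
        covered = [rid is not None for rid in per_frame]
        ratio = sum(covered) / len(covered)
        if ratio >= 0.8:
            mt += 1        # mostly tracked: covered in >= 80% of frames
        elif ratio <= 0.2:
            ml += 1        # mostly lost: covered in <= 20% of frames
        else:
            pt += 1        # partly tracked: everything in between
        # Fragments: interruptions within the covered span
        # (number of covered runs minus one).
        runs, prev = 0, False
        for c in covered:
            if c and not prev:
                runs += 1
            prev = c
        frag += max(0, runs - 1)
        # ID switches: the matched result id changes across covered frames.
        ids = [rid for rid in per_frame if rid is not None]
        idsw += sum(1 for a, b in zip(ids, ids[1:]) if a != b)
    return {"MT": mt, "PT": pt, "ML": ml, "Frag": frag, "IDS": idsw}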

To use the evaluation tool, please follow these steps (a hypothetical driver script is sketched after the list):
1. Set the video names in “eval_list.xml”.
2. Put all ground truth XML files in the “GT” sub-directory.
3. Put all tracking result XML files in the “eval” sub-directory.
4. Run “Evaluation_5.exe”.
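The following is a hypothetical driver script for the steps above; the directory names (GT, eval) and the executable name come from the list, but the unpack location and input paths are assumptions. The contents of eval_list.xml are defined by the sample files in the package and are not reproduced here, so step 1 remains a manual edit.

import shutil
import subprocess
from pathlib import Path

tool_dir = Path("evaluation_package")   # assumed unpack location of the tool

# Steps 2 and 3: stage ground truth and tracking result XML files
# (the source paths here are examples).
(tool_dir / "GT").mkdir(exist_ok=True)
(tool_dir / "eval").mkdir(exist_ok=True)
for f in Path("my_ground_truths").glob("*.xml"):
    shutil.copy(f, tool_dir / "GT")
for f in Path("my_tracking_results").glob("*.xml"):
    shutil.copy(f, tool_dir / "eval")

# Step 4: run the tool from its own directory (Windows executable).
subprocess.run(["Evaluation_5.exe"], cwd=tool_dir, check=True)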

For the format of the tracking result and ground truth files, please refer to the sample files in the package. The evaluation tool generates an evaluation file for each tracking result file, as well as a summary file named “Eval_stat.txt” that contains the measurements for each video and for all videos in total.

Multi-Target Tracking Annotation Tool

The executable package of our annotation tool can be downloaded here.