Tracking with deep networks
Tracking is the process of locating a user-selected object across frames as it moves around a scene. It has a variety of uses, such as human-computer interaction, gesture recognition, driver-assistance systems, security monitoring, medical imaging, and agricultural automation. Tracking has been studied extensively over the last four decades, and many different tracking algorithms have been proposed. However, most of these trackers are limited to simple scenarios: no occlusion, no illumination or appearance change, and no complex object motion. On the other hand, we have examples of near-perfect trackers: humans and animals! The object-tracking performance of the human visual system is currently unsurpassed by engineered systems, so our research tries to take inspiration from, and reverse-engineer, the known principles of cortical processing during visual tracking.

Inspired by recent findings on shallow feature extractors in the visual cortex, we postulate that simple tracking processes are based on a shallow neural network that can quickly identify similarities between object features repeated in time. We propose an algorithm that tracks an object and extracts its motion based on the similarity between local features observed in subsequent frames. The local features are initially defined by a bounding box that delineates the object to track.
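As a rough illustration of this idea (a minimal sketch, not the paper's actual implementation), the snippet below scores a candidate patch by the fraction of pixels whose difference from the template falls below a threshold, then exhaustively scans a small window around the previous location. The function names, threshold value, and search radius are all illustrative assumptions.

```python
import numpy as np

def match_ratio(template, candidate, threshold=10.0):
    # Fraction of pixels whose absolute difference is below the
    # threshold -- more robust to outliers than a summed difference.
    diff = np.abs(template.astype(float) - candidate.astype(float))
    return float(np.mean(diff < threshold))

def track_step(prev_frame, frame, bbox, search_radius=8, threshold=10.0):
    # bbox = (x, y, w, h). Cut the template from the previous frame,
    # then search a window around the previous location in the new frame.
    x, y, w, h = bbox
    template = prev_frame[y:y + h, x:x + w]
    H, W = frame.shape[:2]
    best_score, best_xy = -1.0, (x, y)
    for dy in range(-search_radius, search_radius + 1):
        for dx in range(-search_radius, search_radius + 1):
            nx, ny = x + dx, y + dy
            if nx < 0 or ny < 0 or nx + w > W or ny + h > H:
                continue  # candidate would fall outside the frame
            score = match_ratio(template, frame[ny:ny + h, nx:nx + w], threshold)
            if score > best_score:
                best_score, best_xy = score, (nx, ny)
    return (best_xy[0], best_xy[1], w, h), best_score
```

Because the score counts matching pixels rather than summing errors, a few strongly mismatched pixels (e.g. a partial occlusion) do not dominate the result.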
The Similarity Matching Ratio (SMR) Tracker
The SMR tracker achieved state-of-the-art performance on the TLD [1] dataset, as presented in Table 2. See the SMR paper to learn more about it, and download the code to try it yourself!
Figure 1 shows snapshots from the videos, and Table 1 lists their properties. A detection is considered correct if its overlap with the ground-truth bounding box is larger than 25%.
Figure 1 : Snapshots from the sequences with the objects marked by the bounding box [1]
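The overlap criterion above can be computed as the intersection-over-union of the two bounding boxes (the post does not spell out the exact measure, so the standard PASCAL-style definition is assumed here):

```python
def overlap(box_a, box_b):
    # Boxes as (x, y, w, h). Returns intersection area divided by
    # union area (intersection-over-union), in [0, 1].
    xa, ya, wa, ha = box_a
    xb, yb, wb, hb = box_b
    ix = max(0, min(xa + wa, xb + wb) - max(xa, xb))
    iy = max(0, min(ya + ha, yb + hb) - max(ya, yb))
    inter = ix * iy
    union = wa * ha + wb * hb - inter
    return inter / union if union else 0.0

# A detection is accepted when overlap(detection, ground_truth) > 0.25.
```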
Videos of the SMR tracker on the TLD dataset
- David: https://www.youtube.com/watch?v=FiUbhmwtASM
- Jumping: https://www.youtube.com/watch?v=zkhv6cvK-cQ
- Pedestrian 1: https://www.youtube.com/watch?v=Pdt7wti2wVw
- Pedestrian 2: https://www.youtube.com/watch?v=nVhkO6ZT5sg
- Pedestrian 3: https://www.youtube.com/watch?time_continue=20&v=gcsLCIGYvcA
- Car: https://www.youtube.com/watch?v=1eIV1r3tShg
References
- Z. Kalal, J. Matas, and K. Mikolajczyk. P-N Learning: Bootstrapping Binary Classifiers by Structural Constraints. CVPR, 2010.
- Z. Kalal and K. Mikolajczyk. Forward-Backward Error: Automatic Detection of Tracking Failures. ICPR, 2010.
- J. Lim, D. Ross, R. Lin, and M. Yang. Incremental learning for visual tracking. NIPS, 2005.
- R. Collins, Y. Liu, and M. Leordeanu. Online selection of discriminative tracking features. PAMI, 27(10):1631–1643, 2005.
- S. Avidan. Ensemble tracking. PAMI, 29(2):261–271, 2007.
- B. Babenko, M.-H. Yang, and S. Belongie. Visual tracking with online multiple instance learning. CVPR, 2009.
NOTE: this is an old post from our research in 2012