Abstract:
Convolutional neural networks (CNNs) have been employed in visual tracking due to their rich levels of feature representation. While the learning capability of a CNN increases with its depth, spatial information is diluted in deeper layers, which hinders the network's ability to localize targets. To successfully manage this trade-off, we propose a novel residual-network-based gating CNN architecture for object tracking. Our deep model connects the front and bottom convolutional features with a gate layer. This new network learns discriminative features while reducing the loss of spatial information. The architecture is pre-trained to learn generic tracking characteristics. In online tracking, an efficient domain adaptation mechanism is used to accurately learn the target appearance with limited samples. Extensive evaluation performed on a publicly available benchmark dataset demonstrates that our proposed tracker outperforms state-of-the-art approaches.