Challenges of object detection

Objects that are not fully observed (occluded or truncated), scale variation, illumination changes.

Basic concepts

Each detection consists of a bounding box and a class label.


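Intersection over union (IoU): the area where a predicted box and a ground-truth box overlap, divided by the area of their union.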

See more at: Evaluation Metrics
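As a minimal sketch of the IoU definition, assuming boxes are given as (x_min, y_min, x_max, y_max) tuples (this corner format and the function name `iou` are illustrative choices, not taken from the original notes):

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Each box is assumed to be (x_min, y_min, x_max, y_max).
    """
    # Corners of the overlapping rectangle (if the boxes overlap at all).
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp to zero so disjoint boxes give an empty intersection.
    inter_w = max(0.0, ix_max - ix_min)
    inter_h = max(0.0, iy_max - iy_min)
    intersection = inter_w * inter_h

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection

    # Guard against degenerate zero-area boxes.
    return intersection / union if union > 0 else 0.0
```

For two 2x2 boxes offset by one unit in each direction, the overlap is 1 and the union is 7, so `iou((0, 0, 2, 2), (1, 1, 3, 3))` evaluates to 1/7.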

2D Object Detection Steps (inference)

Feature extractor: a computationally expensive stage. Its output feature map has lower width and height than the input image, but greater depth.
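To make the "lower width and height, greater depth" trade-off concrete, here is a rough shape calculation. It assumes a hypothetical extractor in which every stage halves the spatial resolution and sets a new channel count; the stage list is made up for illustration and is not the extractor from the original notes.

```python
def feature_map_shape(height, width, in_channels, stage_channels, stride=2):
    """Return (height, width, depth) after a stack of downsampling stages.

    Assumption: each stage divides height and width by `stride`
    (integer division) and outputs the given number of channels.
    """
    depth = in_channels
    for channels in stage_channels:
        height //= stride
        width //= stride
        depth = channels
    return height, width, depth


# Hypothetical five-stage extractor applied to a 224 x 224 RGB image.
shape = feature_map_shape(224, 224, 3, [64, 128, 256, 512, 512])
print(shape)  # (7, 7, 512)
```

The spatial size shrinks by a factor of 32 per side while the depth grows from 3 to 512.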


Prior bounding boxes (also called anchor boxes): assume a set of predefined boxes, then predict where the objects are and how large they are relative to those boxes.

Each box is described by its centroid location (where) and its box dimensions (size).


Every pixel in the feature map corresponds to the centroids of multiple anchor boxes in the original image.
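The mapping from feature-map pixels to anchor boxes can be sketched as follows. The stride, sizes, and aspect ratios are assumed values for illustration, and placing the centroid at the middle of each cell is one common convention rather than something the notes specify.

```python
import math


def generate_anchors(fmap_h, fmap_w, stride, sizes, ratios):
    """Return anchors as (cx, cy, w, h) in original-image pixels.

    Assumptions: `stride` is the downsampling factor of the feature
    extractor, `sizes` are anchor side lengths at ratio 1, and each
    ratio is width / height with the anchor area kept at size ** 2.
    """
    anchors = []
    for row in range(fmap_h):
        for col in range(fmap_w):
            # Centre of this feature-map cell, projected back to the image.
            cx = (col + 0.5) * stride
            cy = (row + 0.5) * stride
            # Several anchors share the same centroid.
            for size in sizes:
                for ratio in ratios:
                    w = size * math.sqrt(ratio)
                    h = size / math.sqrt(ratio)
                    anchors.append((cx, cy, w, h))
    return anchors


anchors = generate_anchors(2, 3, stride=16, sizes=[32, 64], ratios=[0.5, 1.0, 2.0])
print(len(anchors))  # 2 * 3 cells, 6 anchors per cell: 36
```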


Non-maximum suppression (NMS): remove redundant anchor boxes that overlap a higher-scoring box.
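A minimal sketch of greedy NMS, assuming boxes in (x_min, y_min, x_max, y_max) format and an IoU threshold of 0.5 (both are assumptions for illustration). The small IoU helper is included so the block runs on its own.

```python
def iou(box_a, box_b):
    """IoU of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Return indices of the boxes kept by greedy non-maximum suppression."""
    # Visit boxes from the highest score to the lowest.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Drop every remaining box that overlaps the kept box too much.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # [0, 2]
```

The second box overlaps the first with an IoU of 81/119, which is above the threshold, so it is suppressed; the third box does not overlap and is kept.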

Training the network


Minibatch selection of boxes during training is important, because negative (background) anchors far outnumber positive ones.

Hard negative anchor mining: keep only the negative anchors with the highest loss, to control training bias toward the background class.
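One way to sketch hard negative mining is to keep all positive anchors plus only the highest-loss negatives, up to a fixed negative-to-positive ratio. The 3:1 default below is an assumption borrowed from common practice (for example SSD), not a value stated in these notes, and the function name is hypothetical.

```python
def hard_negative_mining(losses, is_positive, neg_pos_ratio=3):
    """Return a boolean mask of anchors to include in the minibatch loss.

    losses:       per-anchor classification loss
    is_positive:  per-anchor flag, True if matched to a ground-truth box
    Assumption: at most `neg_pos_ratio` negatives are kept per positive,
    so a batch with no positives keeps no negatives.
    """
    num_pos = sum(is_positive)
    max_neg = neg_pos_ratio * num_pos

    # Rank negatives by loss: the "hardest" ones are the most confidently wrong.
    neg_indices = [i for i, pos in enumerate(is_positive) if not pos]
    neg_indices.sort(key=lambda i: losses[i], reverse=True)
    hard_negatives = set(neg_indices[:max_neg])

    return [pos or (i in hard_negatives) for i, pos in enumerate(is_positive)]


losses = [0.1, 2.0, 0.5, 0.05, 1.5, 0.3]
is_positive = [True, False, False, False, False, False]
print(hard_negative_mining(losses, is_positive, neg_pos_ratio=2))
# [True, True, False, False, True, False]
```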


Loss functions: for classification we have cross entropy, for regression we have
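The two loss terms can be sketched as below. The classification part is standard cross entropy over class logits. The notes do not say which regression loss is meant, so the smooth L1 loss here is purely an assumption, chosen because Faster R-CNN (reference 2 in the supplementary list) uses it for box regression.

```python
import math


def cross_entropy(logits, target_index):
    """Cross-entropy loss for one anchor, given raw class logits."""
    # Log-sum-exp with the maximum subtracted for numerical stability.
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    return log_sum_exp - logits[target_index]


def smooth_l1(predicted, target, beta=1.0):
    """Smooth L1 loss summed over box parameters (ASSUMED regression loss).

    Quadratic for small errors (|diff| < beta), linear for large ones.
    """
    total = 0.0
    for p, t in zip(predicted, target):
        diff = abs(p - t)
        if diff < beta:
            total += 0.5 * diff * diff / beta
        else:
            total += diff - 0.5 * beta
    return total


# Two equally likely classes give a loss of log(2).
print(cross_entropy([0.0, 0.0], 0))
# One small error (0.5) and one large error (3.0): 0.125 + 2.5
print(smooth_l1([0.5, 3.0], [0.0, 0.0]))  # 2.625
```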

Example Feature extractor

Supplementary material

  1. Everingham, M., Van Gool, L., Williams, C. K., Winn, J., & Zisserman, A. (2010). The pascal visual object classes (voc) challenge. International journal of computer vision, 88(2), 303-338.
  2. Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster r-cnn: Towards real-time object detection with region proposal networks. In Advances in neural information processing systems (pp. 91-99).
  3. Redmon, J., et al. (2016). You only look once: Unified, real-time object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition.
  4. (Optional) Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C. Y., & Berg, A. C. (2016, October). SSD: Single shot multibox detector. In European conference on computer vision.

Origin: Dr. Chris Lu (Homepage)
Translate + Edit: YangSier (Homepage)
