{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 10.3 Combining Object Detection and Semantic Segmentation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
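    "The goal of object detection is to take an image and use bounding boxes to locate the main objects in it. The input is a whole image; the output is a set of bounding boxes and a label for the object inside each box."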
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![](images/RCNN.jpeg)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
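    "The best-known algorithm for producing these bounding boxes is R-CNN. R-CNN relies on Selective Search, which looks at the image through windows of different sizes. At each size, it groups adjacent pixels by texture, color, or intensity to propose regions that may contain an object. Its individual components were later improved, moving toward a pipeline that is learned end to end:"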
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "R-CNN: https://arxiv.org/abs/1311.2524\n",
    "\n",
    "Fast R-CNN: https://arxiv.org/abs/1504.08083\n",
    "\n",
    "Faster R-CNN: https://arxiv.org/abs/1506.01497\n",
    "\n",
    "Mask R-CNN: https://arxiv.org/abs/1703.06870"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Of these, Mask R-CNN extends Faster R-CNN to pixel-level image segmentation."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![MaskRCNN](images/MaskRCNN.jpeg)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
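    "Mask R-CNN performs segmentation by adding a branch to Faster R-CNN (strictly speaking this is instance segmentation, since each detected object gets its own mask). The new branch outputs a binary mask that indicates whether each pixel is part of the object. This branch is a fully convolutional network applied on top of the CNN feature map:"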
   ]
  },
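  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the idea of a binary mask concrete, here is a minimal sketch, not the actual Mask R-CNN code. It assumes the mask branch produces one raw score per pixel for a region, applies a per-pixel sigmoid, and thresholds at 0.5. The function name `binarize_mask` and the score values are hypothetical and chosen only for illustration.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "def binarize_mask(logits, threshold=0.5):\n",
    "    # Per-pixel sigmoid turns raw scores into probabilities in (0, 1).\n",
    "    probs = 1.0 / (1.0 + np.exp(-logits))\n",
    "    # A pixel counts as part of the object when its probability exceeds the threshold.\n",
    "    return (probs > threshold).astype(np.uint8)\n",
    "\n",
    "# Hypothetical 3x3 score map for one region of interest (made-up values).\n",
    "logits = np.array([[-2.0, 0.5, 3.0],\n",
    "                   [-1.0, 2.0, 4.0],\n",
    "                   [-3.0, -0.5, 1.0]])\n",
    "mask = binarize_mask(logits)\n",
    "print(mask)\n",
    "```\n",
    "\n",
    "Pixels with a positive score end up as 1 (object) and the rest as 0 (background), which is the kind of per-pixel yes/no output described above."
   ]
  },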
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![](images/Mask.jpeg)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
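    "In the original Faster R-CNN, the regions selected by RoIPool are slightly misaligned with the corresponding regions of the original image, whereas image segmentation needs pixel-level precision. The authors therefore adjusted RoIPool so that it aligns more precisely with the image; the result is RoIAlign:"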
   ]
  },
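  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The difference can be sketched on a toy feature map. This is a simplified illustration under stated assumptions, not the paper's implementation: it assumes a feature map downsampled by a stride of 16, compares a single sample point rather than a full pooled region, and uses a hypothetical helper `bilinear_sample`. RoIPool-style indexing drops the fractional part of the coordinate, while RoIAlign-style sampling keeps it and interpolates between the four neighbouring cells.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "def bilinear_sample(feature, y, x):\n",
    "    # Sample a 2D feature map at a fractional (y, x) location.\n",
    "    h, w = feature.shape\n",
    "    y0, x0 = int(np.floor(y)), int(np.floor(x))\n",
    "    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)\n",
    "    dy, dx = y - y0, x - x0\n",
    "    top = (1 - dx) * feature[y0, x0] + dx * feature[y0, x1]\n",
    "    bottom = (1 - dx) * feature[y1, x0] + dx * feature[y1, x1]\n",
    "    return (1 - dy) * top + dy * bottom\n",
    "\n",
    "stride = 16  # assumed downsampling factor between image and feature map\n",
    "feature = np.arange(16, dtype=float).reshape(4, 4)  # toy feature map\n",
    "x_img, y_img = 40.0, 24.0  # a point in image coordinates (made up)\n",
    "\n",
    "# RoIPool-style: quantize to an integer cell, discarding the fractional part.\n",
    "pooled = feature[int(y_img // stride), int(x_img // stride)]\n",
    "# RoIAlign-style: keep the fractional position and interpolate.\n",
    "aligned = bilinear_sample(feature, y_img / stride, x_img / stride)\n",
    "print(pooled, aligned)\n",
    "```\n",
    "\n",
    "The image point maps to feature coordinates (1.5, 2.5). The quantized lookup reads cell (1, 2) and returns 6.0, while the interpolated lookup returns 8.5, so the half-cell offset that quantization throws away is exactly what RoIAlign preserves."
   ]
  },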
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![](images/RoiAlign.jpeg)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}