Could you explain how instance segmentation works?

I'm a beginner in deep learning and already know about semantic segmentation. What is instance segmentation, and how does it work?

If you are familiar with semantic segmentation, you already know that it labels every pixel with a class, so all objects of the same class end up in one region and the individual instances inside that class are not distinguished.

Instance segmentation does the same thing, except that it also distinguishes the individual instances within each class. If it finds people, it separates every person as a different instance of the class "person" instead of merging them into one region. That is what instance segmentation does.
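To make the difference concrete, here is a minimal sketch in plain Python. The tiny 4x6 masks are made-up data, and the names `semantic_mask`, `instance_mask`, and `distinct_labels` are only illustrative. It assumes the common convention that 0 means background.

```python
# Toy 4x6 "image" containing two people (hypothetical data).
# Semantic mask: every pixel gets a class id (0 = background, 1 = person).
semantic_mask = [
    [0, 1, 1, 0, 1, 1],
    [0, 1, 1, 0, 1, 1],
    [0, 1, 1, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
]

# Instance mask: every pixel gets an instance id (0 = background),
# so the two people get different ids even though they share a class.
instance_mask = [
    [0, 1, 1, 0, 2, 2],
    [0, 1, 1, 0, 2, 2],
    [0, 1, 1, 0, 2, 2],
    [0, 0, 0, 0, 0, 0],
]

# Class of each instance id (both instances belong to the same class).
instance_class = {1: "person", 2: "person"}


def distinct_labels(mask):
    """Return the sorted non-background labels that appear in a mask."""
    return sorted({value for row in mask for value in row if value != 0})


semantic_labels = distinct_labels(semantic_mask)  # one label for the whole class
instance_labels = distinct_labels(instance_mask)  # one label per person
```

The semantic mask contains a single non-background label for both people, while the instance mask contains one label per person, with the class stored separately.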

To understand how it works, you first need to know what object detection is. Object detection is a separate task that finds a rectangle around every object in the image. A rectangle is just a vector with 4 elements, such as (x, y, w, h): a position plus a width and a height.
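As a small illustration of the (x, y, w, h) representation, the sketch below converts a box to corner coordinates and crops that region out of an image stored as a list of rows. It assumes (x, y) is the top-left corner of the box; some tools use the box center instead. The helper names `xywh_to_corners` and `crop` are hypothetical.

```python
def xywh_to_corners(box):
    """Convert (x, y, w, h) to (x1, y1, x2, y2).

    Assumes (x, y) is the top-left corner; x2 and y2 are exclusive.
    """
    x, y, w, h = box
    return x, y, x + w, y + h


def crop(image, box):
    """Return the part of the image (a list of rows) covered by the box."""
    x1, y1, x2, y2 = xywh_to_corners(box)
    return [row[x1:x2] for row in image[y1:y2]]


# Toy 4x6 image whose pixel value encodes its position: 10 * row + column.
image = [[10 * r + c for c in range(6)] for r in range(4)]

box = (1, 0, 2, 3)        # x=1, y=0, width=2, height=3
patch = crop(image, box)  # 3 rows of 2 pixels each
```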

Instance segmentation has 2 major parts: object detection (which includes classification) and semantic segmentation. In other words, it runs object detection first, then applies a segmentation model inside every rectangle (these rectangles are called bounding boxes) to decide which pixels in the box belong to the object. Here is what it looks like.
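The detect-then-segment idea can be sketched in a few lines of plain Python. This is only a toy under stated assumptions: `detect_objects` and `segment_patch` are hypothetical stand-ins for trained networks (the boxes are hard-coded and "foreground" simply means a pixel value above 0), and boxes use (x, y, w, h) with (x, y) as the top-left corner.

```python
def detect_objects(image):
    """Hypothetical detector stand-in: returns (class_name, (x, y, w, h)) pairs.

    A real detector would be a trained network; these boxes are hard-coded.
    """
    return [("person", (0, 0, 3, 3)), ("person", (3, 0, 3, 3))]


def segment_patch(patch):
    """Hypothetical per-box segmenter stand-in: foreground = pixel value > 0."""
    return [[1 if value > 0 else 0 for value in row] for row in patch]


def instance_segmentation(image):
    """Run detection, then segment inside each box and paste the result back."""
    height, width = len(image), len(image[0])
    instance_mask = [[0] * width for _ in range(height)]  # 0 = background
    instances = {}  # instance id -> class name

    detections = detect_objects(image)
    for instance_id, (class_name, (x, y, w, h)) in enumerate(detections, start=1):
        # Crop the bounding box and segment only inside it.
        patch = [row[x:x + w] for row in image[y:y + h]]
        patch_mask = segment_patch(patch)

        # Paste the box-local mask into the full-size mask under this id.
        # If boxes overlap, later instances overwrite earlier ones here.
        for dy, mask_row in enumerate(patch_mask):
            for dx, is_foreground in enumerate(mask_row):
                if is_foreground:
                    instance_mask[y + dy][x + dx] = instance_id
        instances[instance_id] = class_name

    return instance_mask, instances


# Toy image: two bright blobs on a dark background (hypothetical data).
image = [
    [0, 5, 5, 0, 7, 7],
    [0, 5, 5, 0, 7, 7],
    [0, 5, 5, 0, 7, 7],
    [0, 0, 0, 0, 0, 0],
]
mask, instances = instance_segmentation(image)
```

Each detected box gets its own id in the output mask, which is what separates one person from another even though both have the class "person".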

