Unsupervised Object Segmentation by Redrawing
Mickaël Chen1, Thierry Artières2,3 and Ludovic Denoyer4
1 Sorbonne Université, CNRS, LIP6, F-75005, Paris, France
2 Aix Marseille Univ, Université de Toulon, CNRS, LIS, Marseille, France
3 Ecole Centrale Marseille
4 Facebook Artificial Intelligence Research
We present ReDO (ReDrawing Objects): an unsupervised, data-driven object segmentation method for real images.
We assume that natural image generation is a composite process in which each object is generated independently. Object segmentation then amounts to discovering regions that can be redrawn without seeing the rest of the image.
Image Composition Model
We consider the following underlying generative process G that produces images in three steps.
1. Define the position of the different regions, i.e. the global structure of the image, by sampling n region masks.
$$M \sim p(M), \quad M \in \{0, 1\}^{n \times W \times H}, \quad \sum_{k=1}^{n} M^k_{x,y} = 1$$
2. Generate the contents of each region independently.
$$V^k \leftarrow G_k(M^k, z_k), \quad z_k \sim p(z) \ \text{for} \ k \in \{1, \dots, n\}$$
3. Aggregate the resulting regions into the final image.
$$G(M, z_1, \dots, z_n) = \sum_{k=1}^{n} M^k \odot V^k$$
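The three steps above can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the poster's implementation: `toy_region_generator` is a hypothetical stand-in for a learned generator G_k, and masks are sampled uniformly at random simply to obtain a valid partition.

```python
import numpy as np

def sample_masks(rng, n, width, height):
    """Step 1: sample a hard partition, each pixel belongs to exactly one region."""
    labels = rng.integers(0, n, size=(width, height))
    # One-hot encode along a new leading axis -> shape (n, W, H).
    return (np.arange(n)[:, None, None] == labels[None]).astype(np.float64)

def toy_region_generator(mask, z, channels=3):
    """Step 2 (hypothetical stand-in for G_k): a flat colour derived from z."""
    colour = np.tanh(z[:channels])  # shape (C,)
    return np.broadcast_to(colour[:, None, None], (channels,) + mask.shape).copy()

def compose(masks, zs):
    """Step 3: G(M, z_1..z_n) = sum_k M^k * V^k, with V^k = G_k(M^k, z_k)."""
    regions = [toy_region_generator(m, z) for m, z in zip(masks, zs)]
    return sum(m[None] * v for m, v in zip(masks, regions))

rng = np.random.default_rng(0)
n, W, H = 2, 4, 4
M = sample_masks(rng, n, W, H)        # (n, W, H), sums to 1 over regions
zs = rng.standard_normal((n, 8))      # one noise vector per region
image = compose(M, zs)                # (C, W, H)
```

Because the masks are one-hot per pixel, every pixel of `image` takes its value from exactly one region's content.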
Towards Learning To Segment
We replace step 1 by a segmentation function F that produces a mask given an input image I ∈ R^{C×W×H}.
1. Obtain the mask using a segmentation function F.
$$M \leftarrow F(I), \quad M \in [0, 1]^{n \times W \times H}, \quad \sum_{k=1}^{n} M^k_{x,y} = 1$$
2. Generate the contents of each region independently.
$$V^k \leftarrow G_k(M^k, z_k), \quad z_k \sim p(z) \ \text{for} \ k \in \{1, \dots, n\}$$
3. Aggregate the resulting regions into the final image.
$$G_F(I, z_1, \dots, z_n) = \sum_{k=1}^{n} M^k \odot V^k$$
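One common way to satisfy the constraint that the masks lie in [0, 1] and sum to 1 at every pixel is a softmax over the region axis. The sketch below assumes that choice; the poster states only the constraint, and `toy_segmenter` is a hypothetical per-pixel linear map standing in for the learned F.

```python
import numpy as np

def softmax_over_regions(logits):
    """Normalise (n, W, H) logits so the n masks sum to 1 at every pixel."""
    shifted = logits - logits.max(axis=0, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=0, keepdims=True)

def toy_segmenter(image, weights):
    """Hypothetical stand-in for F: maps C channels to n region logits per pixel."""
    logits = np.einsum("kc,cwh->kwh", weights, image)
    return softmax_over_regions(logits)

rng = np.random.default_rng(0)
C, W, H, n = 3, 4, 4, 2
I = rng.standard_normal((C, W, H))
weights = rng.standard_normal((n, C))
M = toy_segmenter(I, weights)         # soft masks, shape (n, W, H)
```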
We can train this model end-to-end using an adversarial loss to match the distribution of generated images to the dataset distribution. However, without further constraints it would naturally converge to trivial and uninformative solutions.
Conservation of Information
Problem 1: Mapping all pixels to one region is a trivial but valid solution.
Figure 1: In this example, the input is ignored by the segmentation function and region 2 is responsible for the whole image. The model has collapsed into a standard GAN.
Solution: We add a learned function δ that tries to reconstruct the noise vectors z_k from the generated image.
Figure 2: As region 1 does not contribute, z1 cannot be retrieved from the generated image.
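The argument behind Figure 2 can be checked on a toy example. In the sketch below, `toy_generator` is a hypothetical stand-in for the region generators. When region 1 is empty, the composed image does not depend on z1, so no function of the image can recover it; with a non-trivial partition the image does change with z1.

```python
import numpy as np

def toy_generator(z, shape):
    """Hypothetical stand-in for G_k: a flat colour per channel, derived from z."""
    C, W, H = shape
    return np.broadcast_to(np.tanh(z[:C])[:, None, None], shape)

def compose(masks, zs, shape):
    """Sum of mask-weighted region contents."""
    return sum(m[None] * toy_generator(z, shape) for m, z in zip(masks, zs))

shape = (3, 4, 4)
rng = np.random.default_rng(0)
z2 = rng.standard_normal(8)
z1_a, z1_b = rng.standard_normal(8), rng.standard_normal(8)

# Collapsed segmentation: region 1 is empty, region 2 covers every pixel.
collapsed = np.stack([np.zeros(shape[1:]), np.ones(shape[1:])])
img_a = compose(collapsed, [z1_a, z2], shape)
img_b = compose(collapsed, [z1_b, z2], shape)
# Identical images: any estimator delta(image) gives the same guess for both
# z1_a and z1_b, so it pays a reconstruction penalty on at least one of them.
collapsed_ignores_z1 = np.allclose(img_a, img_b)

# Non-trivial segmentation: region 1 covers the left half of the image.
left = np.zeros(shape[1:])
left[:, :2] = 1.0
halves = np.stack([left, 1.0 - left])
img_c = compose(halves, [z1_a, z2], shape)
img_d = compose(halves, [z1_b, z2], shape)
halves_depend_on_z1 = not np.allclose(img_c, img_d)
```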
Redrawing a Single Region
Problem 2: The segmentation function can ignore the input.
Figure 3: In this example, the segmentation function chooses regions that are meaningless w.r.t. the input. The generator can still generate a perfectly fine image.
Solution: We tie the output to the input by only regenerating one region at a time, keeping the rest of the image the same.
Figure 4: Given a wrong segmentation, the generator cannot produce a consistent image.
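Redrawing one region while keeping the rest can be written as a blend between the generated content and the input. The sketch below is one reading of the sentence above, using the fact that the masks sum to 1 so the untouched part is (1 - M^i) ⊙ I. The function name `redraw_region` and the constant content `V` are illustrative assumptions.

```python
import numpy as np

def redraw_region(image, masks, i, new_content):
    """Replace region i of `image` with `new_content`, keeping all other pixels.

    image:       (C, W, H) input I
    masks:       (n, W, H) masks M = F(I), summing to 1 over the first axis
    new_content: (C, W, H) output V^i of the region generator G_i
    """
    m = masks[i][None]  # (1, W, H), broadcast over channels
    return m * new_content + (1.0 - m) * image

C, W, H = 3, 4, 4
I = np.zeros((C, W, H))
masks = np.zeros((2, W, H))
masks[0, :, :2] = 1.0        # region 0: first two columns of the last axis
masks[1, :, 2:] = 1.0        # region 1: remaining columns
V = np.ones((C, W, H))       # hypothetical generated content

out = redraw_region(I, masks, 0, V)
```

Pixels outside region 0 are copied from the input, so a mask that cuts across an object forces the generator to match content it cannot see, which is the inconsistency Figure 4 describes.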
Learning the full model for object segmentation
Figure 5: The ReDO pipeline. Learned neural networks are represented in bold colored lines. (Diagram labels: input I; f; inferred masks M1 and M2; z1; G1; generated region; generated image; ẑ1; D; real or fake?)
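Objective functions: We use the hinge version of the adversarial loss.
$$\max_{G_F, \delta} \mathcal{L}_G = \mathbb{E}_{I \sim p_{data},\, i \sim U(n),\, z_i \sim p(z)} \left[ D(G_F(I, z_i, i)) - \lambda_z \left\| \delta_i(G_F(I, z_i, i)) - z_i \right\|_2 \right]$$
$$\max_{D} \mathcal{L}_D = \mathbb{E}_{I \sim p_{data}} \left[ \min(0, -1 + D(I)) \right] + \mathbb{E}_{I \sim p_{data},\, i \sim U(n),\, z_i \sim p(z)} \left[ \min(0, -1 - D(G_F(I, z_i, i))) \right]$$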
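As a rough sketch, the two objectives can be evaluated on a batch of precomputed network outputs as follows. The arrays stand in for D(I), D(G_F(I, z_i, i)) and δ_i(G_F(I, z_i, i)), and the value of `lambda_z` is an arbitrary placeholder, not a setting taken from the poster. Both quantities are maximised.

```python
import numpy as np

def generator_objective(d_fake, z_hat, z, lambda_z):
    """L_G: mean of D(fake) - lambda_z * ||z_hat - z||_2 over the batch."""
    recon = np.linalg.norm(z_hat - z, axis=1)
    return np.mean(d_fake - lambda_z * recon)

def discriminator_objective(d_real, d_fake):
    """L_D: hinge terms min(0, -1 + D(real)) and min(0, -1 - D(fake))."""
    real_term = np.mean(np.minimum(0.0, -1.0 + d_real))
    fake_term = np.mean(np.minimum(0.0, -1.0 - d_fake))
    return real_term + fake_term

# Hypothetical discriminator scores and noise reconstructions for a batch of 2.
d_real = np.array([2.0, 0.5])
d_fake = np.array([-2.0, 0.0])
z = np.zeros((2, 4))
z_hat = np.array([[3.0, 4.0, 0.0, 0.0],
                  [0.0, 0.0, 0.0, 0.0]])

l_d = discriminator_objective(d_real, d_fake)
l_g = generator_objective(d_fake, z_hat, z, lambda_z=1.0)
```

The hinge saturates at 0, so the discriminator gains nothing from pushing real scores above 1 or fake scores below -1.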
Preprint
https://arxiv.org/abs/1905.13539
Code and Pretrained
https://github.com/mickaelChen/ReDO
Experiments
We evaluate ReDO on 3 datasets of real images.
• Without supervision, ReDO discovers meaningful object masks, and the noise vectors z code for specific textures.
• ReDO's performance is comparable to supervised baselines trained with about 50-100 labelled datapoints.
• Preliminary experiments indicate that ReDO can work on datasets with multiple objects or multiple classes without using labels.
ReDO and supervised baselines
[Plots: test set accuracy (top row) and test set IoU (bottom row) as a function of the number of training samples (log scale), for the LFW, CUB and Flowers datasets. Legend: supervised; ours (unsupervised).]
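The poster does not spell out how the two reported metrics are computed. The sketch below uses the usual definitions for binary foreground masks, pixel accuracy and intersection over union, as an assumption.

```python
import numpy as np

def pixel_accuracy(pred, target):
    """Fraction of pixels whose predicted label matches the ground truth."""
    return float(np.mean(pred == target))

def iou(pred, target):
    """Intersection over union of the foreground (non-zero) pixels."""
    pred, target = pred.astype(bool), target.astype(bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as a perfect match
    return float(np.logical_and(pred, target).sum() / union)

pred = np.array([[1, 1, 0, 0],
                 [1, 1, 0, 0]])
target = np.array([[1, 0, 0, 0],
                   [1, 1, 1, 0]])

acc = pixel_accuracy(pred, target)   # 6 of 8 pixels agree
score = iou(pred, target)            # intersection 3, union 5
```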
Generated Samples
Dataset with 2 Categories
Datasets with 2 Objects
Additional masks