Leveraging Social Media with Computer Vision

Informing Recommendations in Fashion and Retail

TJ Torres, Data Scientist, Stitch Fix

Big Data Applications in Fashion MeetUp, 10/2016

Data Labs

Styling Algorithms Research

MOTIVATION

Why Recommendations?

Inventory Scaling: Infeasible from an efficiency perspective to look through all inventory as it scales.

Human Ability: Stylists can’t keep all products in their memories while trying to locate the best items for each client.

Business Success: Aid stylists in making the best decisions to better please our clients.

MOTIVATION

Our goal at Stitch Fix

[Diagram: Total Inventory → Recommendation Algo → Filtered Items → Stylists → Final Items Sent (items 1 to 5)]

COMPUTER VISION

Cold Start Problem

New Clients, New Clothing

No or sparse purchasing information, so how can we supplement this?

Perception

Fashion can be difficult to describe via text/categorization.

Many times it’s easier to show what you like.

TURN TO IMAGES

• Style/fashion is primarily visual.

• We wish to use images for modeling purposes.

• Heuristics for how we process image data are unknown or quite complex.

• We don’t want to have to hand-craft image features.

• Turn to deep learning to learn the feature extraction.

OUTLINE

1. Brief Introduction to NNs

2. Deep Learning for Fashion Imagery

3. Recommendations and Social Media

4. Results

5. Conclusions

NEURAL NETWORKS

http://www.wired.com/2013/02/three-awesome-tools-scientists-may-use-to-map-your-brain-in-the-future/

http://googleresearch.blogspot.com/2015/06/inceptionism-going-deeper-into-neural.html

Whoa, dude!

Gatys et al.: https://arxiv.org/abs/1508.06576

INTRO TO NEURAL NETS

[Diagram: feed-forward network with nodes 1 to 4 in layer 1 (input), nodes 5 and 6 in layer 2, and layer 3 (output)]

Begin with input:

$$f^{(l)}_i(x) = \tanh\left(\sum_j W^{(l)}_{ij}\, x^{(l-1)}_j + b^{(l)}\right)$$

Transform data repeatedly with non-linear function.

$$x^{\text{out}} = f^{(n)} \circ \cdots \circ f^{(1)}(x)$$

Calculate loss function and update weights

$$L(x^{\text{out}}, y) = \overbrace{\frac{1}{m}\sum_{k=1}^{m}\left(x^{\text{out}}_k - y_k\right)^2}^{\text{MSE}}$$

$$W^{(l)*}_{ij} = W^{(l)}_{ij} - \alpha\,\frac{\partial L}{\partial W^{(l)}_{ij}}$$

$$\frac{\partial L}{\partial W^{(l)}_{ij}} = \left(\frac{\partial L}{\partial x^{\text{out}}}\right)\left(\frac{\partial x^{\text{out}}}{\partial f^{(n-1)}}\right)\cdots\left(\frac{\partial f^{(l)}}{\partial W^{(l)}_{ij}}\right)$$
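As a rough illustration of the steps on these slides, the NumPy sketch below runs a forward pass through tanh layers, computes the MSE loss, and applies a gradient-descent update obtained with the chain rule. The layer sizes (4 inputs and 2 hidden units as in the diagram, plus an assumed single output), the per-unit bias vectors, the learning rate `alpha`, and all numeric values are assumptions chosen for illustration, not details from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed layer sizes: 4 input nodes, 2 hidden nodes, 1 output node.
sizes = [4, 2, 1]
weights = [rng.normal(scale=0.5, size=(n_out, n_in))
           for n_in, n_out in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n_out) for n_out in sizes[1:]]


def forward(x, weights, biases):
    """Apply tanh(W x + b) layer by layer; return every layer's activations."""
    activations = [x]
    for W, b in zip(weights, biases):
        activations.append(np.tanh(W @ activations[-1] + b))
    return activations


def mse(x_out, y):
    """Mean squared error between the network output and the target."""
    return np.mean((x_out - y) ** 2)


def gradients(activations, y, weights):
    """Backpropagate the MSE loss; return (dL/dW, dL/db) for each layer."""
    x_out = activations[-1]
    delta = 2.0 * (x_out - y) / x_out.size  # dL/dx_out
    grads_W, grads_b = [], []
    for layer in reversed(range(len(weights))):
        # tanh'(z) = 1 - tanh(z)^2, and activations[layer + 1] = tanh(z)
        delta = delta * (1.0 - activations[layer + 1] ** 2)
        grads_W.insert(0, np.outer(delta, activations[layer]))
        grads_b.insert(0, delta)
        delta = weights[layer].T @ delta  # pass the gradient to the previous layer
    return grads_W, grads_b


def sgd_step(x, y, weights, biases, alpha=0.05):
    """One gradient-descent update: W <- W - alpha * dL/dW (same for b)."""
    acts = forward(x, weights, biases)
    grads_W, grads_b = gradients(acts, y, weights)
    new_weights = [W - alpha * g for W, g in zip(weights, grads_W)]
    new_biases = [b - alpha * g for b, g in zip(biases, grads_b)]
    return new_weights, new_biases


# Made-up training example.
x = np.array([0.5, -1.0, 0.25, 2.0])
y = np.array([0.3])

loss_before = mse(forward(x, weights, biases)[-1], y)
for _ in range(200):
    weights, biases = sgd_step(x, y, weights, biases)
loss_after = mse(forward(x, weights, biases)[-1], y)
print(f"loss before: {loss_before:.6f}, after: {loss_after:.6f}")
```

In practice a deep learning framework computes these gradients automatically; the manual version is shown only to make the chain rule explicit.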

RECS AND SOCIAL MEDIA

Clients share a Pinterest board to visually indicate their fashion tastes.

Match pinned images to our own styles.

Strategies

• Attribute extraction and matching.

• Visual feature similarity.

• Metric learning.

• …or some combination.

VISUAL FEATURES

Use features extracted from a pre-trained network.

Compare image features with the metric of your choice: cosine, Euclidean, etc.
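A minimal sketch of this kind of feature comparison, assuming the features have already been extracted into fixed-length vectors. The random vectors below stand in for real network features, and the function names, the 128-dimensional feature size, and `k=5` are hypothetical choices for illustration.

```python
import numpy as np


def cosine_similarity(query, items):
    """Cosine similarity between one query vector and each row of items."""
    query = query / np.linalg.norm(query)
    items = items / np.linalg.norm(items, axis=1, keepdims=True)
    return items @ query


def euclidean_distance(query, items):
    """Euclidean distance between one query vector and each row of items."""
    return np.linalg.norm(items - query, axis=1)


def top_k(query, items, k=5, metric="cosine"):
    """Return the indices of the k items closest to the query."""
    if metric == "cosine":
        order = np.argsort(-cosine_similarity(query, items))  # higher is closer
    elif metric == "euclidean":
        order = np.argsort(euclidean_distance(query, items))  # lower is closer
    else:
        raise ValueError(f"unknown metric: {metric}")
    return order[:k]


rng = np.random.default_rng(0)
# Stand-in for features of 100 inventory images (assumed 128-dimensional).
inventory_features = rng.normal(size=(100, 128))
# Stand-in for a pinned image: a slightly perturbed copy of inventory item 42.
query_features = inventory_features[42] + 0.01 * rng.normal(size=128)

print(top_k(query_features, inventory_features, k=5, metric="cosine"))
print(top_k(query_features, inventory_features, k=5, metric="euclidean"))
```

Because the query is a near-copy of item 42, that item should rank first under both metrics; with real images the two metrics can disagree, which is one reason to compare them.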

EXAMPLES

[Figures: query image with top 5 results]

CHALLENGES

[Figure: query image with top 5 results]

Sometimes things don’t work out so well…

We need a system that can compare images across separate domains, such as pinned images and our own inventory photos.

METRIC LEARNING

New Metric as Objective

[Diagram: anchor, positive, and negative images]

Triplet or Contrastive Loss

https://arxiv.org/abs/1404.4661

$$L_{\text{triplet}}(a, p, n) = \frac{1}{N}\sum_{i=1}^{N} \max\left\{ d(f(a_i), f(p_i)) - d(f(a_i), f(n_i)) + m,\ 0 \right\}$$

METRIC LEARNING

https://arxiv.org/abs/1511.05939

[Figure: positive and negative examples relative to the margin m, before and after training]

Learn an embedding that obeys the similarity constraints.

similarity score = d(query, inventory)

EXAMPLES

CONCLUSIONS

1. Social media images can help make better recommendations.

a) Alleviate cold start.

b) Provide new features/data for recommendations.

2. Cross-domain image matching can be difficult, but is made easier with deep learning.

3. There’s enormous potential moving forward with this type of work.

a) Attribute labeling and trend tracking.

b) Predictive models for purchasing probability.

ATTRIBUTE LABELING

GENERATIVE FASHION