Clustering and Regression


Recap
•  Be cautious.
•  Data may not be in one blob; we may need to separate the data into groups.

Clustering
CS 498 Probability & Statistics
Zicheng Liao

Clustering methods

What is clustering?
•  “Grouping”
   –  A fundamental part of signal processing
•  “Unsupervised classification”
•  Assign the same label to data points that are close to each other

Why?

We live in a universe full of clusters

Two (types of) clustering methods

•  Agglomerative/Divisive clustering

•  K-means

Agglomerative/Divisive clustering

Agglomerative clustering (bottom up)
Hierarchical cluster tree
Divisive clustering (top down)

Algorithm

Agglomerative clustering: an example
•  “Merge clusters bottom up to form a hierarchical cluster tree”

Animation from Georg Berber, www.mit.edu/~georg/papers/lecture6.ppt
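The merge loop itself can be sketched in a few lines of MATLAB. This is a naive single-link version for illustration only (it assumes the Statistics and Machine Learning Toolbox for pdist2, and makes no claim to linkage-level efficiency):

% Naive bottom-up (single-link) agglomerative clustering sketch
X = rand(6, 2);                          % 6 points on a plane
clusters = num2cell(1:size(X, 1));       % start: every point is its own cluster
while numel(clusters) > 1
    best = inf; bi = 0; bj = 0;
    for i = 1:numel(clusters)
        for j = i+1:numel(clusters)
            % single-link distance: minimum pairwise distance between the clusters
            d = min(min(pdist2(X(clusters{i}, :), X(clusters{j}, :))));
            if d < best, best = d; bi = i; bj = j; end
        end
    end
    fprintf('merge clusters %d and %d at distance %.3f\n', bi, bj, best);
    clusters{bi} = [clusters{bi}, clusters{bj}];   % merge the two closest clusters
    clusters(bj) = [];
end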

Dendrogram

>> X = rand(6, 2);    % create 6 points on a plane
>> Z = linkage(X);    % Z encodes a tree of hierarchical clusters
>> dendrogram(Z);     % visualize Z as a dendrogram

[Dendrogram figure; the vertical axis is distance]

Distance measure

Inter-cluster distance
•  Treat each data point as a single cluster
•  Only need to define inter-cluster distance
   –  Distance between one set of points and another set of points
•  3 popular inter-cluster distances
   –  Single-link
   –  Complete-link
   –  Averaged-link

Single-link
•  Minimum of all pairwise distances between points from the two clusters
•  Tends to produce long, loose clusters

Complete-link
•  Maximum of all pairwise distances between points from the two clusters
•  Tends to produce tight clusters

Averaged-link
•  Average of all pairwise distances between points from the two clusters
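Written as formulas (this is the standard formalization of the three linkages; the slides' own notation is not in the extracted text), with d(a, b) the distance between individual points:

\[
d_{\text{single}}(A,B) = \min_{a \in A,\, b \in B} d(a,b), \qquad
d_{\text{complete}}(A,B) = \max_{a \in A,\, b \in B} d(a,b), \qquad
d_{\text{averaged}}(A,B) = \frac{1}{|A|\,|B|} \sum_{a \in A} \sum_{b \in B} d(a,b).
\]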

How many clusters are there?
•  Intrinsically hard to know
•  The dendrogram gives insight into this
•  Choose a threshold to split the dendrogram into clusters (see the snippet below)
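A minimal sketch of cutting the tree at a distance threshold, continuing the earlier MATLAB session (the threshold value 0.7 is purely illustrative):

>> T = cluster(Z, 'cutoff', 0.7, 'criterion', 'distance');   % cluster label per point
>> scatter(X(:,1), X(:,2), 36, T, 'filled');                 % color points by cluster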


An example

do_agglomerative.m

Divisive clustering
•  “Recursively split a cluster into smaller clusters”
•  It’s hard to choose where to split: a combinatorial problem
•  Can be easier when the data has special structure (e.g., a pixel grid)

K-means
•  Partition data into clusters such that:
   –  Clusters are tight (distance to the cluster center is small)
   –  Every data point is closer to its own cluster center than to all other cluster centers (a Voronoi diagram)

[Figures excerpted from Wikipedia]

Formulation
•  Choose cluster assignments and cluster centers to minimize the total squared distance from each point to its cluster center
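One standard way to write this objective (the symbols here are assumptions, since the slide's own equation is not in the extracted text): with data points x_i, clusters C_1, ..., C_k and cluster centers c_1, ..., c_k,

\[
\Phi(\{C_j\}, \{c_j\}) \;=\; \sum_{j=1}^{k} \; \sum_{x_i \in C_j} \lVert x_i - c_j \rVert^{2} .
\]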

K-means algorithm
•  Initialize k cluster centers (e.g., randomly)
•  Repeat until the assignments stop changing:
   –  Assign each point to the closest cluster center
   –  Update each cluster center to the mean of its assigned points
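A minimal MATLAB sketch of these two alternating steps (Lloyd's algorithm). The toy data, iteration cap, and variable names are illustrative; pdist2 assumes the Statistics and Machine Learning Toolbox:

X = rand(200, 2);                          % toy data: 200 points in the plane
k = 3;
centers = X(randperm(size(X, 1), k), :);   % randomly pick k points as initial centers
for iter = 1:100
    D = pdist2(X, centers);                % distance from every point to every center
    [~, labels] = min(D, [], 2);           % assignment step: closest center
    newCenters = centers;
    for j = 1:k
        members = X(labels == j, :);
        if ~isempty(members)
            newCenters(j, :) = mean(members, 1);   % update step: cluster mean
        end
    end
    if isequal(newCenters, centers), break; end    % no change: converged
    centers = newCenters;
end

MATLAB's built-in kmeans function (same toolbox) implements this procedure, with options for smarter initialization and multiple restarts.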

Illustration

1.  Randomly initialize 3 cluster centers (circles)
2.  Assign each point to the closest cluster center
3.  Update the cluster centers
4.  Re-iterate from step 2

[Figures excerpted from Wikipedia]

Example

do_Kmeans.m (shows step-by-step updates and the effect of the number of clusters)
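The script itself is not reproduced here; a rough stand-in using the built-in kmeans function, reusing the toy data X from the sketch above and looping over the cluster counts asked about on the next slide (do_Kmeans.m may differ):

figure;
for k = [2 3 5]
    [labels, centers] = kmeans(X, k);           % requires Statistics and ML Toolbox
    subplot(1, 3, find(k == [2 3 5]));
    gscatter(X(:, 1), X(:, 2), labels);         % color points by cluster label
    hold on;
    plot(centers(:, 1), centers(:, 2), 'kx', 'MarkerSize', 12);
    title(sprintf('k = %d', k));
end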

Discussion
•  K = 2?  K = 3?  K = 5?

Discussion
•  Converges to a local minimum, which can give counterintuitive clusterings

[Figures excerpted from Wikipedia]

Discussion
•  Favors spherical clusters
•  Poor results for long/loose/stretched clusters

[Figure: input data (color indicates true labels) vs. K-means results]

Discussion
•  The cost is guaranteed to decrease at every step
   –  Assigning each point to the closest cluster center minimizes the cost for the current cluster centers
   –  Choosing the mean of each cluster as the new cluster center minimizes the squared distance for the current assignment
•  Terminates after finitely many iterations (there are only finitely many possible assignments, and the cost decreases until none changes)
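One line of reasoning behind the second sub-point: for a fixed cluster C, the center c minimizing the within-cluster squared distance is the mean, since setting the gradient to zero gives

\[
\nabla_{c} \sum_{x \in C} \lVert x - c \rVert^{2}
\;=\; -2 \sum_{x \in C} (x - c) \;=\; 0
\quad\Longrightarrow\quad
c \;=\; \frac{1}{|C|} \sum_{x \in C} x .
\]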

Summary
•  Clustering as grouping “similar” data together
•  A world full of clusters/patterns
•  Two algorithms
   –  Agglomerative/divisive clustering: hierarchical cluster tree
   –  K-means: vector quantization

Regression
CS 498 Probability & Statistics
Zicheng Liao

Example-I
•  Predict the stock price at time t+1

[Figure: stock price vs. time]

Example-II
•  Fill in missing pixels in an image: inpainting

Example-III
•  Discover relationships in data

[Figure: amount of hormone in devices from 3 production lots vs. time in service]

Linear regression
•  Model the dependent variable as a linear function of the explanatory variables, plus noise
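Written out, under the Gaussian-noise assumption used on the next slide (the symbols are assumptions matching the usual presentation, since the slide's equation is not in the extracted text):

\[
y \;=\; x^{T}\beta + \xi, \qquad \xi \sim N(0, \sigma^{2}),
\]

where y is the dependent variable, x the vector of explanatory variables, β the parameter vector, and ξ zero-mean Gaussian noise.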

Parameter estimation
•  MLE of the linear model with Gaussian noise
   [Least squares, Carl F. Gauss, 1809]
•  Likelihood function: maximizing it is equivalent to minimizing the squared error
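A sketch of why the two coincide, assuming the model above with i.i.d. Gaussian noise on N examples (x_i, y_i):

\[
\log L(\beta)
\;=\; \sum_{i=1}^{N} \log \frac{1}{\sqrt{2\pi}\,\sigma}
      \exp\!\Big( -\frac{(y_i - x_i^{T}\beta)^{2}}{2\sigma^{2}} \Big)
\;=\; \text{const} \;-\; \frac{1}{2\sigma^{2}} \sum_{i=1}^{N} (y_i - x_i^{T}\beta)^{2},
\]

so maximizing the likelihood over β is the same as minimizing the sum of squared errors.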

Parameter estimation
•  Closed-form solution
   –  Cost function: the sum of squared errors
   –  Setting its gradient to zero gives the normal equation
   –  (Expensive to compute the matrix inverse in high dimension)
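In the standard matrix form (with X holding one example per row and y the vector of responses; again the notation is an assumption, since the slide's equations were not extracted):

\[
E(\beta) \;=\; \lVert y - X\beta \rVert^{2},
\qquad
X^{T}X\,\beta \;=\; X^{T}y
\;\;\Longrightarrow\;\;
\hat{\beta} \;=\; (X^{T}X)^{-1} X^{T} y .
\]

In MATLAB, the backslash operator solves the least-squares problem without forming the inverse explicitly:

>> beta = X \ y;    % least-squares estimate of the parameters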

Gradient descent
•  http://openclassroom.stanford.edu/MainFolder/VideoPage.php?course=MachineLearning&video=02.5-LinearRegressionI-GradientDescentForLinearRegression&speed=100 (Andrew Ng)
•  With a suitable step size, converges to the global minimum, because the least-squares cost is convex
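A minimal MATLAB sketch of batch gradient descent on the least-squares cost, given the data matrix X and response vector y from the normal-equation discussion above; the step size and iteration count are illustrative and would need tuning for real data:

alpha = 0.01;                            % step size (assumed; tune per problem)
beta = zeros(size(X, 2), 1);             % start from the zero vector
for iter = 1:1000
    grad = -2 * X' * (y - X * beta);     % gradient of ||y - X*beta||^2
    beta = beta - alpha * grad;          % move against the gradient
end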

Example

do_regression.m  

Interpreting a regression
•  Zero-mean residual
•  Zero correlation between the residual and the prediction
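In symbols, a standard statement of these two properties for the least-squares fit (assuming the model includes an intercept/constant column):

\[
e \;=\; y - X\hat{\beta},
\qquad
X^{T}e \;=\; X^{T}y - X^{T}X (X^{T}X)^{-1} X^{T}y \;=\; 0,
\]

so the residual is orthogonal to every column of X; the constant column gives the zero mean, and orthogonality to the fitted values gives the zero correlation.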

Interpreting the residual

How good is a fit?
•  R-squared measure
   –  The percentage of variance explained by the regression
   –  Used in hypothesis tests for model selection
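One common form of the measure (notation assumed; for a least-squares fit with an intercept the two expressions agree):

\[
R^{2} \;=\; 1 - \frac{\sum_i (y_i - \hat{y}_i)^{2}}{\sum_i (y_i - \bar{y})^{2}}
\;=\; \frac{\operatorname{var}(\hat{y})}{\operatorname{var}(y)},
\]

i.e., the fraction of the variance of y captured by the fitted values.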

Regularized linear regression
•  Cost
•  Closed-form solution
•  Gradient descent
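For the usual L2 (ridge) regularizer these take the following standard forms, with λ > 0 the regularization weight (notation assumed, since the slide's equations were not extracted):

\[
E(\beta) \;=\; \lVert y - X\beta \rVert^{2} + \lambda \lVert \beta \rVert^{2},
\qquad
\hat{\beta} \;=\; (X^{T}X + \lambda I)^{-1} X^{T} y,
\qquad
\nabla E(\beta) \;=\; -2X^{T}(y - X\beta) + 2\lambda\beta .
\]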

Why regularization?
•  Handles small eigenvalues
   –  Adding the regularizer keeps the matrix being inverted well-conditioned, so we avoid dividing by very small values

Why regularization?
•  Avoids over-fitting
   –  An over-fit model is hard to generalize to new data
   –  Smaller parameters make a simpler model, which is less prone to over-fitting

L1 regularization (Lasso)
•  Penalize the sum of absolute values of the parameters instead of the sum of squares
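The usual form of the Lasso cost (notation assumed):

\[
E(\beta) \;=\; \lVert y - X\beta \rVert^{2} + \lambda \sum_j \lvert \beta_j \rvert ,
\]

which has no closed-form solution, but tends to drive some coefficients exactly to zero, giving sparse models.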

How does it work?

Summary
•  Linear regression: least squares / MLE under Gaussian noise
•  Interpreting the fit: residuals and the R-squared measure
•  Regularization: L2 (ridge) and L1 (Lasso)
