Database Marketing
Cluster Analysis
N. Kumar, Asst. Professor of
Marketing
Agenda
Discussion of the first Assignment
Motivation for conducting Cluster Analysis
Benefit Segmentation
Cluster Analysis
Basic Concepts
Hierarchical/Non-Hierarchical Clustering
Implementation in SAS and interpreting the output
Voter Profiling
What are the different voting segments out there?
What do they want to hear, i.e., which issues do they care about?
What should I say?
Ad Campaign
How many customer segments are there?
How many do I want to target?
How should I target – what message should
I communicate to each segment?
Promotional Strategies
Coupon Drops – who should they be
targeted at?
Catalog Example – should the catalog be accompanied by a $5 coupon, a $10 coupon, or no coupon?
What is Cluster Analysis?
Cluster Analysis is a technique for
combining observations into groups or
clusters such that:
Each group is homogeneous with respect to
certain characteristics (that you specify)
Each group is different from the other groups
with respect to the same characteristics
Data
Consumer   Income ($1000s)   Education (years)
1          5                 5
2          6                 6
3          15                14
4          16                15
5          25                20
6          30                19
Geometrical View of Cluster Analysis
[Figure: scatter plot of the six consumers, with Income and Education as the axes]
Similarity Measures
Why are consumers 1 and 2 similar?
$\text{Distance}(1,2) = (5-6)^2 + (5-6)^2$
More generally, if there are $p$ variables:
$\text{Distance}(i,j) = \sum_{k=1}^{p} (x_{ik} - x_{jk})^2$
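The distance formula above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original deck: the names `consumers`, `squared_distance` and `distance_matrix` are hypothetical, and C5 and C6 take the education values (20 and 19) used in the later cluster tables.

```python
# Consumers as (income in $1000s, education in years).
# Assumption: C5 = (25, 20) and C6 = (30, 19), as in the later cluster tables.
consumers = {
    "C1": (5, 5), "C2": (6, 6), "C3": (15, 14),
    "C4": (16, 15), "C5": (25, 20), "C6": (30, 19),
}

def squared_distance(a, b):
    """Sum of squared differences over all p variables."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def distance_matrix(points):
    """Nested dict holding the squared distance for every pair of labels."""
    return {i: {j: squared_distance(pi, pj) for j, pj in points.items()}
            for i, pi in points.items()}

matrix = distance_matrix(consumers)
print(matrix["C1"]["C2"])  # 2
print(matrix["C1"]["C3"])  # 181
```

Note that the measure is the squared Euclidean distance; no square root is taken, which does not change which pairs are closest.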
Similarity Matrix
      C1    C2    C3    C4    C5    C6
C1    0     2     181   221   625   821
C2    2     0     145   181   557   745
C3    181   145   0     2     136   250
C4    221   181   2     0     106   212
C5    625   557   136   106   0     26
C6    821   745   250   212   26    0
Clustering Techniques
Hierarchical Clustering
Non-Hierarchical Clustering
Hierarchical Clustering
Distance(1,2) = 2 = Distance(3,4)
Say, we group 1 and 2 together and leave
the others as is
How do we compute the distance between a
group that has two (or more) members and
the others?
Hierarchical Clustering Algorithms
Centroid Method
Nearest-Neighbor or Single-Linkage
Farthest-Neighbor or Complete-Linkage
Average-Linkage
Ward’s Method
Centroid Method
Each group is replaced by an average
consumer
Cluster 1 – average income = 5.5 and
average education = 5.5
Data for Five Clusters
Cluster   Members   Income   Education
1         C1&C2     5.5      5.5
2         C3        15       14
3         C4        16       15
4         C5        25       20
5         C6        30       19
Similarity Matrix
         C1&C2   C3    C4    C5   C6
C1&C2    0
C3       162.5   0
C4       200.5   2     0
C5       590.5   136   106   0
C6       782.5   250   212   26   0
Data for Four Clusters
Cluster   Members   Income   Education
1         C1&C2     5.5      5.5
2         C3&C4     15.5     14.5
3         C5        25       20
4         C6        30       19
Similarity Matrix
         C1&C2   C3&C4   C5   C6
C1&C2    0
C3&C4    181     0
C5       590.5   120.5   0
C6       782.5   230.5   26   0
Data for Three Clusters
Cluster   Members   Income   Education
1         C1&C2     5.5      5.5
2         C3&C4     15.5     14.5
3         C5&C6     27.5     19.5
Similarity Matrix
         C1&C2   C3&C4   C5&C6
C1&C2    0
C3&C4    181     0
C5&C6    680     169     0
Dendrogram for the Data
[Figure: dendrogram of consumers C1 through C6]
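The centroid-method walk-through above (five, then four, then three clusters) can be reproduced with a short sketch. It assumes the six consumers with C5 = (25, 20) and C6 = (30, 19), as in the cluster tables; the function names are hypothetical, and ties such as Distance(1,2) = Distance(3,4) are broken by taking the first pair encountered.

```python
from itertools import combinations

# Assumption: consumer data as in the cluster tables (income, education).
consumers = {
    "C1": (5, 5), "C2": (6, 6), "C3": (15, 14),
    "C4": (16, 15), "C5": (25, 20), "C6": (30, 19),
}

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(members):
    """The 'average consumer' of a cluster: the mean of each variable."""
    pts = [consumers[m] for m in members]
    return tuple(sum(v) / len(pts) for v in zip(*pts))

def centroid_gap(pair):
    """Squared distance between the centroids of two clusters."""
    return squared_distance(centroid(pair[0]), centroid(pair[1]))

def centroid_clustering(labels):
    """Repeatedly merge the two clusters whose centroids are closest."""
    clusters = [(label,) for label in labels]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=centroid_gap)
        merges.append((a, b, centroid_gap((a, b))))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges

merges = centroid_clustering(list(consumers))
for a, b, d in merges:
    print(a, "+", b, "at distance", d)
```

Under these assumptions the sketch merges C1 with C2 and C3 with C4 at distance 2, then C5 with C6 at 26, and then joins the C3&C4 and C5&C6 clusters at 169, which matches the three-cluster similarity matrix. The recorded merge distances are the heights a dendrogram would be drawn at.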
Single Linkage
The first cluster is formed in the same fashion
The distance between Cluster 1 (customers 1 and 2) and customer 3 is the minimum of Distance(1,3) = 181 and Distance(2,3) = 145, i.e., 145
Similarity Matrix
         C1&C2   C3    C4    C5   C6
C1&C2    0
C3       145     0
C4       181     2     0
C5       557     136   106   0
C6       745     250   212   26   0
Complete Linkage
The distance between Cluster 1 (customers 1 and 2) and customer 3 is the maximum of Distance(1,3) = 181 and Distance(2,3) = 145, i.e., 181
Similarity Matrix
         C1&C2   C3    C4    C5   C6
C1&C2    0
C3       181     0
C4       221     2     0
C5       625     136   106   0
C6       821     250   212   26   0
Average Linkage
The distance between Cluster 1 (customers 1 and 2) and customer 3 is the average of Distance(1,3) = 181 and Distance(2,3) = 145, i.e., 163
Similarity Matrix
         C1&C2   C3    C4    C5   C6
C1&C2    0
C3       163     0
C4       201     2     0
C5       591     136   106   0
C6       783     250   212   26   0
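The three linkage rules differ only in how the member-to-member distances are combined. The sketch below, with hypothetical names and the consumer data assumed as in the cluster tables (C5 = (25, 20), C6 = (30, 19)), recomputes the first column of each of the three matrices above.

```python
# Assumption: consumer data as in the cluster tables (income, education).
consumers = {
    "C1": (5, 5), "C2": (6, 6), "C3": (15, 14),
    "C4": (16, 15), "C5": (25, 20), "C6": (30, 19),
}

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Each rule reduces the list of pairwise distances to a single number.
LINKAGE_RULES = {
    "single": min,                                  # nearest neighbor
    "complete": max,                                # farthest neighbor
    "average": lambda ds: sum(ds) / len(ds),        # mean of all pairs
}

def linkage_distance(cluster_a, cluster_b, rule):
    """Combine all member-to-member distances with the chosen rule."""
    pairwise = [squared_distance(consumers[a], consumers[b])
                for a in cluster_a for b in cluster_b]
    return LINKAGE_RULES[rule](pairwise)

cluster_1 = ("C1", "C2")
columns = {
    rule: [linkage_distance(cluster_1, (other,), rule)
           for other in ("C3", "C4", "C5", "C6")]
    for rule in LINKAGE_RULES
}
for rule, column in columns.items():
    print(rule, column)
```

Distances between the remaining single consumers are unchanged by the choice of rule, which is why the three matrices differ only in the C1&C2 column.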
Ward’s Method
Does not compute distance between clusters
Forms clusters by maximizing within-cluster homogeneity, i.e., minimizing the error sum of squares (ESS)
ESS for a cluster with two observations (say, C1 and C2) = $(5-5.5)^2 + (6-5.5)^2 + (5-5.5)^2 + (6-5.5)^2 = 1$
Ward’s Method
Solution   CL1     CL2   CL3   CL4   CL5   ESS
1          C1,C2   C3    C4    C5    C6    1
2          C1,C3   C2    C4    C5    C6    90.5
3          C1,C4   C2    C3    C5    C6    110.5
4          C1,C5   C2    C3    C4    C6    312.5
5          C1,C6   C2    C3    C4    C5    410.5
6          C2,C3   C1    C4    C5    C6    72.5
7          C2,C4   C1    C3    C5    C6    90.5
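The ESS column above can be checked with a small sketch that evaluates every possible first merge. The names `ess` and `first_ward_step` are hypothetical, and the consumer data is assumed as in the cluster tables (C5 = (25, 20), C6 = (30, 19)). Single-member clusters contribute an ESS of zero, so the total ESS of a five-cluster solution is the ESS of its one merged pair.

```python
from itertools import combinations

# Assumption: consumer data as in the cluster tables (income, education).
consumers = {
    "C1": (5, 5), "C2": (6, 6), "C3": (15, 14),
    "C4": (16, 15), "C5": (25, 20), "C6": (30, 19),
}

def ess(members):
    """Error sum of squares: squared deviations from the cluster mean,
    summed over all members and all variables."""
    pts = [consumers[m] for m in members]
    means = [sum(v) / len(pts) for v in zip(*pts)]
    return sum((x - m) ** 2 for p in pts for x, m in zip(p, means))

def first_ward_step(labels):
    """ESS of each five-cluster solution formed by merging one pair."""
    return {pair: ess(pair) for pair in combinations(labels, 2)}

candidates = first_ward_step(list(consumers))
best = min(candidates, key=candidates.get)  # ties go to the first pair
print(best, candidates[best])
print(candidates[("C1", "C3")])  # 90.5
```

With six consumers there are 15 candidate pairs; the table lists the first seven. The pair with the smallest ESS is merged first, and C1,C2 ties with C3,C4 at an ESS of 1.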
Non-Hierarchical Clustering
Data are grouped into K clusters
Requires a priori knowledge of K
Basic Steps in Non-Hierarchical Clustering
1. Select K initial cluster centroids
2. Assign each observation to the cluster to which it is closest
3. Reassign or reallocate each observation to one of the K clusters according to a pre-determined stopping rule
4. Stop if there is no reallocation
Approaches differ in Step 1 and/or Step 3
Algorithm I
Selects the first K observations as the initial cluster centers
Initial Cluster Centroids
Variable    CL1   CL2   CL3
Income      5     6     15
Education   5     6     14
Initial Assignment
Consumer   Distance from C1   Distance from C2   Distance from C3   Assigned to CL
C1         0                  2                  181                1
C2         2                  0                  145                2
C3         181                145                0                  3
C4         221                181                2                  3
C5         625                557                136                3
C6         821                745                250                3
New Cluster Centroids
Variable    CL1   CL2   CL3
Income      5     6     21.5
Education   5     6     17
Distance Matrix
Consumer   Distance from CL1   Distance from CL2   Distance from CL3   Previous Assignment   Current Assignment
C1         0                   2                   416.25              1                     1
C2         2                   0                   361.25              2                     2
C3         181                 145                 51.25               3                     3
C4         221                 181                 34.25               3                     3
C5         625                 557                 21.25               3                     3
C6         821                 745                 76.25               3                     3
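The Algorithm I walk-through above (first K observations as seeds, assign, recompute centroids, stop when nothing moves) can be sketched as follows. The function names are hypothetical, the consumer data is assumed as in the cluster tables (C5 = (25, 20), C6 = (30, 19)), and the sketch assumes no cluster ever becomes empty.

```python
# Assumption: consumer data as in the cluster tables (income, education).
consumers = {
    "C1": (5, 5), "C2": (6, 6), "C3": (15, 14),
    "C4": (16, 15), "C5": (25, 20), "C6": (30, 19),
}

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(points, centroids):
    """1-based index of the closest centroid for each observation."""
    return {label: 1 + min(range(len(centroids)),
                           key=lambda k: squared_distance(p, centroids[k]))
            for label, p in points.items()}

def update(points, assignment, k):
    """Centroid of each cluster = mean of its members (assumed non-empty)."""
    centroids = []
    for cluster in range(1, k + 1):
        members = [points[l] for l, c in assignment.items() if c == cluster]
        centroids.append(tuple(sum(v) / len(members) for v in zip(*members)))
    return centroids

def algorithm_one(points, k):
    """Seed with the first k observations, then iterate until stable."""
    centroids = [points[label] for label in list(points)[:k]]
    assignment = assign(points, centroids)
    while True:
        centroids = update(points, assignment, k)
        new_assignment = assign(points, centroids)
        if new_assignment == assignment:      # no reallocation: stop
            return assignment, centroids
        assignment = new_assignment

assignment, centroids = algorithm_one(consumers, 3)
print(assignment)
print(centroids)
```

On this data the first reassignment pass changes nothing, so the procedure stops with C1 and C2 as single-member clusters and C3 through C6 together, with a third centroid of (21.5, 17.0).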
Algorithm II
Differs from Algorithm I in how the initial seeds are modified
As before, the first K observations are selected as the initial cluster seeds
The candidate for replacement is one of the two seeds that are closest to each other
An observation qualifies to replace one of the two candidates if the distance between those two seeds is less than the distance between the observation and its closest seed
Algorithm II …contd.
C1, C2 and C3 are the initial seeds
The smallest distance between the seeds is between C1 and C2: Distance(C1,C2) = 2
Observation C4 does not qualify as a replacement, as Distance(C1,C2) = 2 is not less than Distance(C4, nearest seed C3) = 2
Observation C5 does qualify as a replacement, as Distance(C1,C2) = 2 < Distance(C5, nearest seed C3) = 136: replace C2 with C5
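The seed-replacement rule can be sketched as below. The slides do not say which of the two close seeds is dropped, so the sketch assumes a tie-break: drop the candidate that lies nearer to the remaining seeds. That assumption happens to reproduce the slide's choice of C2. All function names are hypothetical, and the consumer data is assumed as in the cluster tables (C5 = (25, 20), C6 = (30, 19)).

```python
from itertools import combinations

# Assumption: consumer data as in the cluster tables (income, education).
consumers = {
    "C1": (5, 5), "C2": (6, 6), "C3": (15, 14),
    "C4": (16, 15), "C5": (25, 20), "C6": (30, 19),
}

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def select_seeds(points, k):
    """Start from the first k observations (k >= 2) and let later
    observations replace one of the two closest seeds when they qualify."""
    labels = list(points)
    seeds = labels[:k]

    def gap(s, t):
        return squared_distance(points[s], points[t])

    for obs in labels[k:]:
        # The two seeds closest to each other are the replacement candidates.
        s1, s2 = min(combinations(seeds, 2), key=lambda pair: gap(*pair))
        nearest = min(gap(obs, s) for s in seeds)
        if gap(s1, s2) < nearest:
            # Assumed tie-break (not stated in the slides): drop the
            # candidate that is nearer to the remaining seeds.
            others = [s for s in seeds if s not in (s1, s2)]
            drop = min((s1, s2),
                       key=lambda s: min((gap(s, o) for o in others), default=0))
            seeds[seeds.index(drop)] = obs
    # Report the seeds in data order.
    return sorted(seeds, key=labels.index)

seeds = select_seeds(consumers, 3)
print(seeds)  # ['C1', 'C3', 'C5']
```

C4 is rejected because its distance to the nearest seed equals, rather than exceeds, the C1 to C2 distance; C6 is rejected because by then the closest seeds are C3 and C5 (136 apart) and C6 is only 26 from C5.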
Initial Assignment
Consumer   Distance from seed C1   Distance from seed C3   Distance from seed C5   Assigned to CL
C1         0                       181                     625                     1
C2         2                       145                     557                     1
C3         181                     0                       136                     2
C4         221                     2                       106                     2
C5         625                     136                     0                       3
C6         821                     250                     26                      3
New Cluster Centroids
Variable    CL1   CL2    CL3
Income      5.5   15.5   27.5
Education   5.5   14.5   19.5
Distance Matrix
Consumer   Distance from CL1   Distance from CL2   Distance from CL3   Previous Assignment   Current Assignment
C1         0.5                 200.5               716.5               1                     1
C2         0.5                 162.5               644.5               1                     1
C3         162.5               0.5                 186.5               2                     2
C4         200.5               0.5                 152.5               2                     2
C5         590.5               120.5               6.5                 3                     3
C6         782.5               230.5               6.5                 3                     3
Hierarchical vs. Non-Hierarchical Clustering
Hierarchical clustering does not require a priori knowledge of the number of clusters
Assignments are static: once an observation joins a cluster, it is not reassigned
Use hierarchical clustering for exploratory purposes
Non-hierarchical methods can be viewed as complementary to, rather than competing with, hierarchical methods
Voter Profiling
A survey of voters' concerns may help us group voters with similar concerns; perhaps they all live in a certain area?
Target ads/mailings with customized messages
Ad Campaign
Use attitudinal data to segment customers
Target message appropriately
Promotional Strategies
Use transaction data to group customers by how prone they are to purchasing the product on deal
Give a stronger incentive to the price-sensitive segment