The module has the following objectives:

1. Discuss the logic behind cluster analysis

2. Explain the K-means approach to clustering

3. Decide on the number of clusters underlying the data

4. Interpret clustering output for decision making

5. Perform a cluster analysis with R/Shiny

*MBA program*

Peter Ebbes

Topic

Interpreting K-means output

Lecture

topic 2, K-means clustering

topic 3, Mini-case clustering solution and findings

Purpose

Practice interpreting K-means output, including fit statistics from cross-validation.

Question 1

Before you get started with any fancy-schmancy data analytic approach, it is always a good idea to imagine what the data table could look like. On scratch paper, sketch what the data table could look like for this context.

Question 2

Question 3

Topic

How many clusters?

Lecture

topic 2, K-means clustering

Purpose

Practice identifying how many clusters underlie the data and interpreting the cluster solution.
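One common way to approach the "how many clusters?" question is to compare the total within-cluster sum of squares (WSS) across candidate values of K and look for the point where adding another cluster stops paying off. The sketch below is only an illustration under stated assumptions: it is written in plain Python rather than the R/Shiny tool used in this module, the toy data are made up, and the deterministic farthest-first starting centers are a simplification chosen for reproducibility (K-means is usually run from several random starts). The names `kmeans`, `farthest_first_centers`, and `wss_by_k` are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first_centers(points, k):
    """Deterministic start (an assumption of this sketch): begin with the
    first point, then repeatedly add the point farthest from its nearest
    already-chosen center."""
    centers = [points[0]]
    while len(centers) < k:
        nxt = max(points,
                  key=lambda p: min(squared_distance(p, c) for c in centers))
        centers.append(nxt)
    return centers


def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)),
                key=lambda j: squared_distance(p, centers[j]))
            for p in points]


def kmeans(points, k, max_iter=100):
    """Basic K-means: alternate assignment and center update until stable.
    Returns (labels, centers, total within-cluster sum of squares)."""
    centers = farthest_first_centers(points, k)
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                # New center is the coordinate-wise mean of the members.
                new_centers.append(
                    tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                # Keep the old center if a cluster ends up empty.
                new_centers.append(centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    labels = assign(points, centers)
    wss = sum(squared_distance(p, centers[lab])
              for p, lab in zip(points, labels))
    return labels, centers, wss


# Made-up toy data: three visibly separated groups of four points each.
points = [
    (1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (1.1, 1.3),
    (5.0, 5.0), (5.2, 4.8), (4.9, 5.3), (5.1, 5.1),
    (9.0, 1.0), (9.2, 1.2), (8.8, 0.9), (9.1, 0.8),
]

# WSS for K = 1..5; the "elbow" is where the decrease flattens out.
wss_by_k = {k: kmeans(points, k)[2] for k in range(1, 6)}
for k, wss in wss_by_k.items():
    print(f"K = {k}: total within-cluster sum of squares = {wss:.3f}")
```

On these toy data the WSS falls sharply up to K = 3 and only marginally afterwards, which is the pattern that would suggest three clusters. With real data the elbow is often less clear-cut, so the final choice should also rest on whether the resulting clusters can be given a meaningful interpretation.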

Scenario

Question 1

Question 2

Question 3

Topic

Five multiple choice practice questions

Lecture

Module 9, all topics

Purpose

Test your knowledge about the subjects of this module.

1. A researcher fits a regression with a large set of independent variables. He celebrates because the R-square is 98%. Yet, a few months later, the researcher is disappointed because the model did not work at all when predicting new observations. He had checked the VIFs (all close to 1) and he was not extrapolating beyond the data. What may be going on?

2. A financial analyst is running a cluster analysis on a large set of stock returns. Which of the following statements is not true?

3. Which of the following statements about cluster analysis is not true?

4. K-means clustering...

5. Cross-validation...