How to Implement Hierarchical Clustering for Direct Marketing Campaigns— with Python Code | by Zoumana Keita | Aug, 2023

Understand the ins and outs of hierarchical clustering, and how it applies to marketing campaign analysis in the banking industry.

Photo by Frederick Warren on Unsplash

Imagine being a Data Scientist at a leading financial institution, and your task is to assist your team in categorizing existing clients into distinct profiles:low , average , medium and platinum for loan approval.

But, here is the catch:

There is no such historical label attached to these customers, so how do you proceed with the creation of these categories?

This is where clustering can help, an unsupervised machine-learning technique to group unlabeled data into similar categories.

Multiple clustering techniques exist, but this tutorial will focus more on the hierarchical clustering approach.

It starts by providing an overview of what hierarchical clustering is, before walking you through a step-by-step implementation in Python using the popular Scipy library.

Hierarchical clustering is a technique for grouping data into a tree of clusters called dendrograms, representing the hierarchical relationship between the underlying clusters.

The hierarchical clustering algorithm relies on distance measures to form clusters, and it typically involves the following main steps:

Four main steps of the hierarchical clustering (Image by Author)
  • Computation of the distance matrix containing the distance between each pair of data points using a particular distance metric such as Euclidean distance, Manhattan distance, or cosine similarity
  • Merge the two clusters that are the closest in distance
  • Update the distance matrix with regard to the new clusters
  • Repeat steps 1, 2, and 3 until all the…

Source link

Leave a Comment