Project 05 · Banking & Financial Services

XYZ Bank — Digital
Customer Journey
Analysis

A five-phase end-to-end analytics project examining where 2,000 newly onboarded customers dropped off across XYZ Bank's digital platform in FY 2024, and what Ksh 1.58 billion in revenue at risk means for the business.

Python MySQL Tableau Excel EDA Customer Analytics
2,000 Registered Customers
73.6% Activation Rate
19.6% Active by Year-End
Ksh 1.58B Revenue at Risk
Interactive Dashboard

Executive Dashboard, Tableau Public

Built for boardroom presentation. Four KPI cards, a customer journey funnel, activation rates by channel and segment, retention breakdown, and revenue impact — all in one interactive view.

XYZ Bank Digital Customer Journey Dashboard – Tableau
The Problem

Strong Registration. Weak Retention.

XYZ Bank's digital banking platform recorded strong onboarding growth in FY 2024, with 2,000 new customers registered across Mobile App, USSD, Agent, and Branch channels. But monthly active user numbers were not keeping pace — raising urgent questions at senior management level about where customers were dropping off and why.

This project was commissioned to answer five core business questions:

Where exactly in the digital journey are customers dropping off — before or after activation?

Which customer segments are most at risk of churn, and how different are their behaviours?

Is this a channel problem, or a platform-wide retention problem affecting all onboarding methods?

What is the financial value at risk from customers who churned or went dormant?

Which channels activate the highest proportion of registered customers?

What interventions would have the highest impact on reversing the retention decline?

Key Findings

Six Numbers That Tell the Story

The analysis uncovered a systemic retention failure spanning every segment and every channel — with a single overriding conclusion: this is not a channel problem, it is a platform-level engagement problem.

# Finding Detail Severity
1 Activation Gap 26.4% of registered customers (528) never made a single transaction after signing up. High
2 Retention Crisis Only 19.6% of all registered customers were active by year-end — a critical engagement failure. Critical
3 Churn Dominates 45% of activated customers churned — the single biggest drop-off in the entire journey. Critical
4 High-Risk Segments SME had the highest churn rate (48.9%), followed by Mass Affluent (47.1%). Critical
5 Platform-Wide Problem Churn rates were consistent across all channels (43–47%) — confirming this is not a channel issue. High
6 Revenue at Risk Ksh 1.19B at risk from churn + Ksh 394M from dormant customers = Ksh 1.58B total exposure. Critical
The Analysis

Funnel, Segments, Channels.

Three analytical lenses across the same 2,000-customer dataset — each revealing a different dimension of the retention failure.

Digital Customer Funnel – Registered to Active
01
Customer Journey Funnel
Of 2,000 registered customers, 1,472 activated (73.6%). Of those, only 392 remained active — while 662 churned and 418 went dormant. The funnel visualises where the most value is being lost.
Segment Retention Analysis – Churn by Customer Segment
02
Segment Retention Analysis
SME customers churn at 48.9%, almost 3 percentage points above the 45.9% average. Mass Affluent follows at 47.1%. Mass Market, despite the highest volume, shows the most stable retention at 43.3%.
Channel Retention Analysis – Active Rate by Registration Channel
03
Channel Retention Analysis
USSD produces the highest active rate (29.5%), while Mobile App surprisingly underperforms at 25.0%. Branch activation sits lowest at 56.4%. The near-identical churn profile across channels confirms a platform-level, not channel-level, problem.
04
The Core Insight

Churn rates of 43–47% across all four channels — Mobile App, Branch, Agent, and USSD — is the clearest possible signal that the problem is not in how customers onboard. It is in what the platform offers them after they do.

→ This is a product engagement problem, not a channel problem.

Methodology

Five-Phase End-to-End Pipeline

This project was built from raw data generation through to boardroom-ready reporting — replicating the full lifecycle of a real banking analytics engagement.

01

Data Generation

Designed a realistic 2,000-row digital banking dataset with intentional data quality issues to simulate real-world conditions — including duplicate rows, impossible dates, negative values, casing inconsistencies, blank strings, and ETL sync failures.

02

Data Cleaning — MySQL + Jupyter Notebook

Connected Python to MySQL via SQLAlchemy and PyMySQL. Conducted a structured audit across all columns before applying any fixes. Each step followed a preview → apply → verify pattern. Nine distinct issue types resolved across 2,012 raw rows.

MySQL Python SQLAlchemy PyMySQL Jupyter
03

Exploratory Analysis — Python

Connected directly to MySQL via SQLAlchemy. Analysis conducted in Pandas with visualisations in Matplotlib and Seaborn. Five analytical areas: activation, funnel, segment retention, channel retention, and revenue impact.

Pandas Matplotlib Seaborn
04

Dashboard — Tableau Public

Built an executive-ready interactive dashboard with 4 views and 4 KPI cards. Published to Tableau Public. Covers the customer journey funnel, activation rates by channel and segment, retention breakdown, and revenue at risk.

Tableau Public
05

Executive Brief — Excel

Produced a one-page boardroom-ready executive brief covering business context, key findings, recommendations, and next steps. Formatted to professional banking reporting standards.

Microsoft Excel
Data Quality

9 Issues Found. 2,012 → 2,000 Rows.

Every issue was documented, previewed, applied, and verified before moving on — following a structured audit methodology across all 14 columns.

# Issue Records Action Taken
1Duplicate rows12Removed using ROW_ID deduplication
2Impossible dates (first transaction before registration)25Set to NULL
3Negative total_transactions8Corrected using ABS()
4Segment casing inconsistencies37Standardised using TRIM() + UPPER/LOWER
5Blank status values10Converted to NULL
6Blank region values15Converted to NULL
7Blank age_band values20Relabelled as 'Unknown'
8Value/transaction mismatch (ETL failure)40Flagged as 'REVIEW' for Finance
9VARCHAR date columns2 colsConverted to DATE type

Final dataset: 2,000 rows × 14 columns. Zero rows lost to cleaning.

Recommendations

Four Actions with Measurable Impact

Each recommendation is tied to a specific finding and includes a measurable target to enable post-implementation tracking.

01

Fix Branch Onboarding

Implement a 72-hour post-registration follow-up workflow for Branch-onboarded customers. Branch has the lowest activation rate (56.4%) — a structured nudge at the right moment can close that gap.

Target: Activation 56.4% → 70%
02

SME Retention Programme

Launch tailored digital banking features addressing bulk payments, payroll, and supplier management. SME churns at 48.9% — the highest of any segment — yet likely has the highest revenue potential per customer.

Target: Reduce SME churn below 40%
03

Dormant Reactivation Campaign

Target 418 dormant activated customers with personalised push notifications and transaction fee waivers. These customers proved they can activate — they need a compelling reason to return.

Revenue at risk recovered: Ksh 394M
04

Platform UX Review

Commission a product review focused on post-activation engagement. Consistent churn across all channels (43–47%) is the clearest possible signal of a platform-level problem, not a channel-level one.

Platform-level retention programme
Tools & Technologies

Full Stack Analytics

Python (Pandas, Matplotlib, Seaborn)Exploratory analysis and visualisation
MySQLData storage and structured cleaning
SQLAlchemy + PyMySQLPython–MySQL connection layer
Jupyter NotebookDocumented, reproducible analysis
Tableau PublicInteractive executive dashboard
Microsoft ExcelExecutive brief and reporting
View Dashboard ↗ GitHub ↗ ← All Projects