Attrition Predictor

ℹ️ Desktop only: the app is not adapted for small screens.

Overview

Purpose

This machine-learning-based application predicts early employee attrition within the first 30, 60, or 90 days after a new project assignment.

Think of the app as a proof of concept. It shows how attrition could be predicted, but the predictions only become useful once you feed it your actual workforce data.

Industry Context

The app is designed for a large outsourcing contact center, where each client (also referred to as a project) is supported by a group of dedicated support employees (agents).
Attrition (sometimes called “early churn”) is any case where an agent voluntarily leaves a staffing assignment or the company.
Attrition covers both new hires and agents reassigned internally from another project.

⚠️ Responsible use

This model is designed for workforce planning, for example deciding whether to slightly over-staff a new project or to allocate extra onboarding support.
It should not be used as an automated “hire / no-hire” filter for individual candidates.

Training Dataset

ℹ️ Data Disclaimer

The data used in this version of the app is synthetic and was generated for demonstration purposes. Although it simulates reality fairly well, it does not represent any particular company.

Features

Feature	Description	Type	Values
`gender`	Employee’s gender.	numerical	female: `0` male: `1`
`fte`	Full-time equivalent, grouped into categories (full-timers vs. part-timers).	ordinal	< 0.9 FTE: `1` 0.9 FTE: `2` 1 FTE: `3`
`language`	Primary language skill.	categorical	list of languages
`country`	Country of employee’s residence.	categorical	list of countries
`employment_days`	Number of days since the employee’s first day at the company.	numerical	continuous: integer
`project_date`	Date the last project was assigned to the employee.	date → numerical	`month_sin` `month_cos` `quart_sin` `quart_cos`
`new_employee`	“New” if hired fewer than 10 days ago.	boolean	`0` / `1`
`industry`	Client’s industry.	categorical	E-commerce: `ecommerce` Consumer Electronics: `manufacturer` Video Gaming: `gaming` Mobile Gaming: `mob_gaming`
`size`	Project size by number of agents	ordinal	XS (<10 agents): `1` S (10–20 agents): `2` M (20–30 agents): `3` L (30–60 agents): `4` XL (60–120 agents): `5` XXL (>120 agents): `6`
`channels`	Number of support channels (grouped).	ordinal	single channel: `1` two channels: `2` multi-channel: `3`
`phone`	Whether a phone line is among the support channels.	boolean	`0` / `1`
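
The `project_date` row lists four derived columns. One common way to obtain columns like these is cyclical encoding, which maps the month and the quarter onto a circle so that December and January end up next to each other. The sketch below works under that assumption: the function name `encode_project_date`, the zero-based indices, and the periods (12 for months, 4 for quarters) are illustrative and are not confirmed by the app itself.

```python
import math
from datetime import date


def encode_project_date(d: date) -> dict:
    """Map a date to cyclical month and quarter features (assumed formula)."""
    month_idx = d.month - 1        # 0..11
    quarter_idx = month_idx // 3   # 0..3
    month_angle = 2 * math.pi * month_idx / 12
    quarter_angle = 2 * math.pi * quarter_idx / 4
    return {
        "month_sin": math.sin(month_angle),
        "month_cos": math.cos(month_angle),
        "quart_sin": math.sin(quarter_angle),
        "quart_cos": math.cos(quarter_angle),
    }


# Hypothetical assignment date, used only for illustration.
features = encode_project_date(date(2022, 1, 15))
```

With this encoding, each date becomes a point on the unit circle, so the model sees the December-to-January step as the same size as any other month-to-month step.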

Descriptive analysis

Time series analysis

Attrition over time

Monthly attrition seasonality

Quarterly attrition seasonality

Attrition by country

Top 3 countries attrition distribution (days)

Top 3 languages attrition distribution (days)

Attrition per FTE category (full-timer vs. part-timer)

Attrition by industry

Attrition by project size (in number of agents)

Top 15 features by importance

Labels

30-days attrition: Employee left voluntarily within 30 days after the last project assignment.
60-days attrition: Employee left voluntarily within 60 days after the last project assignment (includes 30-day attrition).
90-days attrition: Employee left voluntarily within 90 days after the last project assignment (includes both 30- and 60-day attrition).
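
Because the three labels are nested, they can all be derived from a single number: the days between the last project assignment and the leave date. The following is a minimal sketch of that derivation. The function name `attrition_labels`, the key names, and the inclusive boundary (leaving on day 30 counts as 30-day attrition) are assumptions made for illustration.

```python
from datetime import date
from typing import Optional


def attrition_labels(assigned: date, left: Optional[date]) -> dict:
    """Return the nested 30/60/90-day labels for one assignment.

    `left` is the voluntary leave date, or None if the agent stayed.
    """
    if left is None:
        return {"attrition_30": 0, "attrition_60": 0, "attrition_90": 0}
    days = (left - assigned).days
    # A shorter window being 1 implies every longer window is 1 as well.
    return {f"attrition_{w}": int(0 <= days <= w) for w in (30, 60, 90)}


# Hypothetical example: the agent leaves 45 days after assignment.
labels = attrition_labels(date(2022, 1, 10), date(2022, 2, 24))
# labels == {"attrition_30": 0, "attrition_60": 1, "attrition_90": 1}
```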

Machine Learning Model

After comparing several linear and non-linear classifiers, a random forest with 1,000 trees (`n_estimators`) and an unlimited number of leaves was chosen. Because the data is highly imbalanced (attrition and non-attrition labels are split roughly 10% / 90%), the SMOTE-ENN method (a combination of SMOTE and Edited Nearest Neighbours) was applied to the training data set.
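
A setup like this could be expressed roughly as follows. This is a sketch, not the app's actual training code: it assumes the imbalanced-learn package for SMOTE-ENN (the text names the method but not the library), and `RF_PARAMS`, `build_pipeline`, and `random_state=42` are illustrative names and values.

```python
# Hyperparameters taken from the description of the model.
RF_PARAMS = {
    "n_estimators": 1000,    # number of trees
    "max_leaf_nodes": None,  # scikit-learn's setting for an unlimited number of leaves
}


def build_pipeline(random_state: int = 42):
    """Build a resample-then-classify pipeline (sketch).

    Third-party imports are deferred to call time so that this module can be
    imported without scikit-learn or imbalanced-learn installed.
    """
    from imblearn.combine import SMOTEENN
    from imblearn.pipeline import Pipeline
    from sklearn.ensemble import RandomForestClassifier

    return Pipeline(
        steps=[
            ("resample", SMOTEENN(random_state=random_state)),
            ("model", RandomForestClassifier(random_state=random_state, **RF_PARAMS)),
        ]
    )
```

Calling `build_pipeline().fit(X_train, y_train)` would resample only the data passed to `fit`. The imbalanced-learn pipeline skips resampling steps at prediction time, so the test set keeps its original class balance.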

The model was tuned for the best possible recall: the share of actual attrition cases that the model recognizes. Precision was deliberately sacrificed, so the model catches most true attrition events at the cost of additional false alarms.
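
To make the trade-off concrete, recall and precision can be computed directly from confusion-matrix counts. The counts below are invented for illustration and are not the app's results.

```python
def recall(tp: int, fn: int) -> float:
    """Share of actual attrition cases that the model caught."""
    return tp / (tp + fn)


def precision(tp: int, fp: int) -> float:
    """Share of predicted attrition cases that were real."""
    return tp / (tp + fp)


# Hypothetical counts: 100 real attrition cases, 90 of them caught,
# plus 45 false alarms among agents who actually stayed.
tp, fn, fp = 90, 10, 45

model_recall = recall(tp, fn)        # 90 / 100
model_precision = precision(tp, fp)  # 90 / 135
```

In this made-up case the model misses only one attrition event in ten, while one in three of its alerts is a false alarm. That is the kind of balance a recall-first model accepts.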

Model Performance

Metric	30-days	60-days	90-days
Accuracy	0.970	0.938	0.900
Precision	0.683	0.657	0.618
Recall	0.915	0.870	0.831
F1-score	0.783	0.749	0.708
AUC	0.944	0.908	0.872
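
The F1-score is the harmonic mean of precision and recall, so the F1 row can be cross-checked against the two rows above it. The sketch below does that with the rounded values from the table. The variable names are illustrative, and small differences in the third decimal place are expected because the inputs are already rounded.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) per horizon, copied from the table.
reported = {
    "30-days": (0.683, 0.915, 0.783),
    "60-days": (0.657, 0.870, 0.749),
    "90-days": (0.618, 0.831, 0.708),
}

for horizon, (p, r, f1_reported) in reported.items():
    f1 = f1_score(p, r)
    print(f"{horizon}: computed {f1:.4f}, reported {f1_reported:.3f}")

# Largest gap between a recomputed and a reported F1 value.
max_gap = max(abs(f1_score(p, r) - f1) for p, r, f1 in reported.values())
```

For all three horizons the recomputed value is within about 0.001 of the reported one.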

Confusion Matrices

30-days attrition

60-days attrition

90-days attrition

Tech Details

App Architecture

Tech Stack

🐍 Python 3.10
🧪 Flask 2.2.2
🤖 scikit-learn 1.1.2
🔣 NumPy 1.23

Versions

v1.0.0 (01 Dec 2012)