Attrition Predictor
Application to predict employee attrition in an outsourced contact-center environment.

🔗 attrition-pred-app.roboteria.io
ℹ️ Desktop only: the app is not adapted for small screens.
Overview
Purpose
This machine-learning-based application predicts early employee attrition within the first 30, 60, or 90 days after a new project assignment.
Think of the app as a proof of concept: it shows how attrition could be predicted, but you will need to feed it your actual workforce data before the predictions become useful.
Industry Context
- The app is designed for a large outsourcing contact center, where each client (also referred to as a project) is supported by a group of dedicated support employees (agents).
- Attrition (sometimes called “early churn”) is any case where an agent voluntarily leaves a staffing assignment or the company.
- Attrition covers both new hires and agents reassigned internally from another project.
⚠️ Responsible use
This model is designed for workforce planning, for example deciding whether to slightly over-staff a new project or to allocate extra onboarding support.
It should not be used as an automated “hire / no-hire” filter for individual candidates.
Training Dataset
ℹ️ Data Disclaimer
The data used for this version of the app was generated for demonstration purposes. Although it simulates reality fairly well, it does not describe any particular company.
Features
Feature | Description | Type | Values |
---|---|---|---|
gender | Employee’s gender. | binary | female: 0; male: 1 |
fte | Full-time equivalent, grouped into categories (full-timers vs. part-timers). | ordinal | < 0.9 fte: 1; 0.9 fte: 2; 1 fte: 3 |
language | Primary language skill. | categorical | list of languages |
country | Country of the employee’s residence. | categorical | list of countries |
employment_days | Number of days since the employee’s first day at the company. | numerical | integer |
project_date | Date the latest project was assigned to the employee. | date → numerical | month_sin; month_cos; quart_sin; quart_cos |
new_employee | Whether the employee is “new”, i.e. hired fewer than 10 days ago. | boolean | 0 / 1 |
industry | Client’s industry. | categorical | E-commerce: ecommerce; Consumer Electronics: manufacturer; Video Gaming: gaming; Mobile Gaming: mob_gaming |
size | Project size by number of agents. | ordinal | XS (<10 agents): 1; S (10–20 agents): 2; M (20–30 agents): 3; L (30–60 agents): 4; XL (60–120 agents): 5; XXL (>120 agents): 6 |
channels | Number of support channels (grouped). | ordinal | single channel: 1; two channels: 2; multi-channel: 3 |
phone | Whether a phone line is among the support channels. | boolean | 0 / 1 |
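The sine/cosine columns listed for project_date suggest a cyclical encoding of the assignment month and quarter. Below is a minimal sketch of how such columns could be derived. It assumes the common 2π · value / period formulation; the function name and the exact formula are illustrative and not taken from the app's source.

```python
import math
from datetime import date


def encode_project_date(d: date) -> dict:
    """Map a date to cyclical month/quarter features (assumed formulation)."""
    month_angle = 2 * math.pi * (d.month - 1) / 12   # 12 months per cycle
    quarter = (d.month - 1) // 3                     # quarter index 0..3
    quarter_angle = 2 * math.pi * quarter / 4        # 4 quarters per cycle
    return {
        "month_sin": math.sin(month_angle),
        "month_cos": math.cos(month_angle),
        "quart_sin": math.sin(quarter_angle),
        "quart_cos": math.cos(quarter_angle),
    }


features = encode_project_date(date(2022, 12, 1))
```

With this encoding December and January sit next to each other on the unit circle, which a plain 1 to 12 integer would not capture.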
Descriptive analysis
Time series analysis
Attrition over time

Monthly attrition seasonality

Quarterly attrition seasonality

Employee-related features
Attrition by country

Top 3 countries attrition distribution (days)

Top 3 languages attrition distribution (days)

Attrition per FTE category (full-timer vs. part-timer)

Project (client)-related features
Attrition by industry

Attrition by project size (in number of agents)

Top 15 features by importance

Labels
- 30-days attrition: the employee left voluntarily within 30 days after the last project assignment.
- 60-days attrition: the employee left voluntarily within 60 days after the last project assignment (includes 30-days attrition).
- 90-days attrition: the employee left voluntarily within 90 days after the last project assignment (includes both 30- and 60-days attrition).
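Because the three labels are nested, they can all be derived from the same two dates. A minimal sketch follows, assuming each record carries the assignment date and an optional voluntary-exit date. The function, argument, and key names are hypothetical, and "within N days" is read here as N days inclusive.

```python
from datetime import date
from typing import Optional

HORIZONS = (30, 60, 90)


def attrition_labels(assigned: date, left: Optional[date]) -> dict:
    """Return one 0/1 label per horizon; left=None means the agent stayed."""
    if left is None:
        return {f"attrition_{h}": 0 for h in HORIZONS}
    days = (left - assigned).days
    # A shorter horizon being 1 implies every longer horizon is 1 as well.
    return {f"attrition_{h}": int(0 <= days <= h) for h in HORIZONS}


# Agent assigned on 1 March and leaving on 15 April, i.e. after 45 days.
labels = attrition_labels(date(2022, 3, 1), date(2022, 4, 15))
```

In this example only the 60-days and 90-days labels are set, since the exit falls outside the 30-day window.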
Machine Learning Model
After several linear and non-linear classifiers were tried, a random forest with 1000 estimators (trees) and no limit on the number of leaves was chosen. Because the data is highly imbalanced (attrition / non-attrition labels are split 10% / 90%), the SMOTE-ENN method (a combination of SMOTE and Edited Nearest Neighbours) was applied to the training data set.
The model was optimised for recall, that is, its ability to recognise attrition among the actual attrition cases. Precision was deliberately traded off: the model should catch most true attrition events, even at the cost of additional false alarms.
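To make the trade-off concrete, here is a minimal sketch of how recall and precision are computed from confusion-matrix counts. The counts are made up for illustration and are not the app's results.

```python
def recall(tp: int, fn: int) -> float:
    """Share of actual attrition cases that the model flagged."""
    return tp / (tp + fn)


def precision(tp: int, fp: int) -> float:
    """Share of flagged cases that were actual attrition."""
    return tp / (tp + fp)


# Hypothetical counts: 100 actual leavers, 90 of them caught, plus 40 false alarms.
tp, fn, fp = 90, 10, 40
print(f"recall={recall(tp, fn):.2f}, precision={precision(tp, fp):.2f}")
```

For workforce planning a false alarm typically costs some unneeded over-staffing, while a missed leaver costs an unfilled seat, which is the reasoning behind favouring recall.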
Model Performance
Metric | 30-days | 60-days | 90-days |
---|---|---|---|
Accuracy | 0.970 | 0.938 | 0.900 |
Precision | 0.683 | 0.657 | 0.618 |
Recall | 0.915 | 0.870 | 0.831 |
F1-score | 0.783 | 0.749 | 0.708 |
AUC | 0.944 | 0.908 | 0.872 |
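As a sanity check, the F1-score row should be the harmonic mean of the precision and recall rows. The short sketch below recomputes it from the rounded table values, so small differences in the third decimal are expected.

```python
# Precision, recall, and reported F1 copied from the table above.
metrics = {
    "30-days": {"precision": 0.683, "recall": 0.915, "f1": 0.783},
    "60-days": {"precision": 0.657, "recall": 0.870, "f1": 0.749},
    "90-days": {"precision": 0.618, "recall": 0.831, "f1": 0.708},
}


def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


for horizon, m in metrics.items():
    recomputed = f1_from(m["precision"], m["recall"])
    print(f"{horizon}: reported {m['f1']:.3f}, recomputed {recomputed:.3f}")
```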
Confusion Matrices
30-days attrition

60-days attrition

90-days attrition

Tech Details
App Architecture

Tech Stack
- 🐍 Python 3.10
- 🧪 Flask 2.2.2
- 🤖 scikit-learn 1.1.2
- 🔣 NumPy 1.23
Versions
- v1.0.0 (01 Dec 2012)