Student dropout rarely announces itself
A higher education institution serving thousands of students across associate’s, bachelor’s, and master’s programmes faced a persistent challenge: knowing which students were most at risk of not returning the following term — and knowing it early enough to act.
Advisors relied largely on instinct and reactive outreach. By the time a student’s risk became obvious — missed assignments, unpaid balances, a formal withdrawal — the window for meaningful intervention had often already closed. The institution needed a way to surface warning signals earlier, automatically, and at scale.
“The goal wasn’t just prediction — it was getting the right information to the right advisor before a student quietly disappeared.”
An added architectural constraint: the institution ran on Salesforce, but its pool of analytics licences was much smaller than its Salesforce user base. Any solution would need to work for everyone, not just those with specialized analytics access.
A lean, iterative build across five phases
The engagement followed a design-led methodology — starting with deep business understanding before touching a single dataset — and moved through five structured phases over ten weeks.
From 185 variables down to 13 that matter
The team drew from both native Salesforce objects and external system extracts — pulling together academic records, financial aid data, engagement activity from the learning management system, and enrolment history into a unified dataset via CRM Analytics Recipes.
After rigorous analysis, 185 candidate features were narrowed to 40 for model training, then refined further to a core set of 13 organized into four categories. The table below shows four of these features, one from each category, with their model importance:
| Feature | Category | Model Importance |
|---|---|---|
| Programme of Study | Academic | 41.03% |
| Student Location | Demographic | 9.39% |
| Engagement Cluster | Engagement | 7.05% |
| Total Financial Aid Awarded | Financial | 5.06% |
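
As a rough illustration of ranking features by importance, the Python sketch below sorts the four importances reported in the table and keeps the top N. The function names `top_features` and `cumulative_share` are hypothetical, and how the team actually performed the selection is not described here, so this is an assumption-laden sketch rather than their method.

```python
from typing import Dict, List, Tuple

# Importances (in percent) as reported in the table above.
REPORTED_IMPORTANCE: Dict[str, float] = {
    "Programme of Study": 41.03,
    "Student Location": 9.39,
    "Engagement Cluster": 7.05,
    "Total Financial Aid Awarded": 5.06,
}


def top_features(importance: Dict[str, float], n: int) -> List[Tuple[str, float]]:
    """Return the n features with the highest importance, largest first."""
    ranked = sorted(importance.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


def cumulative_share(importance: Dict[str, float]) -> float:
    """Sum the importances, rounded to two decimals."""
    return round(sum(importance.values()), 2)
```

Calling `cumulative_share(REPORTED_IMPORTANCE)` shows that the four listed features together account for 62.53% of model importance.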
A risk score every advisor can see
Einstein Discovery generates a persistence likelihood score for each enrolled student at the start of every term. Rather than locking this insight behind a specialized analytics interface, the team engineered a write-back mechanism: scores are pushed directly to each student’s Contact record in Salesforce — accessible to all standard-licence users across the institution.
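
The write-back step can be pictured as turning a set of scores into one update per Contact record. The sketch below is a minimal illustration in Python and makes no Salesforce API calls: the custom field name `Persistence_Score__c`, the function name `build_contact_updates`, and the input shape are all assumptions, and the mechanism the team actually used to push scores is not detailed here.

```python
from typing import Dict, List


def build_contact_updates(
    scores: Dict[str, float],
    score_field: str = "Persistence_Score__c",  # hypothetical custom field name
) -> List[Dict[str, object]]:
    """Turn {contact_id: score} into one update record per Contact.

    Scores are expected as probabilities in [0, 1]. An out-of-range score
    raises an error before any update list is returned, so a bad prediction
    is never handed on to be written to a Contact record.
    """
    updates: List[Dict[str, object]] = []
    for contact_id, score in sorted(scores.items()):
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score out of range for {contact_id}: {score}")
        updates.append({"Id": contact_id, score_field: round(score, 4)})
    return updates
```

Keeping the payload construction separate from the actual write makes it easy to validate scores before anything reaches the records advisors see.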
Students are automatically segmented into three intervention tiers:
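
The tier names and cut-offs are not listed here, so the following Python sketch only illustrates threshold-based segmentation in general. The labels `tier_1` to `tier_3`, the cut-offs 0.5 and 0.8, and the function name `intervention_tier` are placeholders, not the institution's values.

```python
def intervention_tier(
    persistence_score: float,
    low_cutoff: float = 0.5,   # placeholder cut-off, not from the source
    high_cutoff: float = 0.8,  # placeholder cut-off, not from the source
) -> str:
    """Map a persistence likelihood in [0, 1] to one of three tiers."""
    if not 0.0 <= persistence_score <= 1.0:
        raise ValueError("persistence_score must be between 0 and 1")
    if persistence_score < low_cutoff:
        return "tier_1"  # lowest likelihood of persisting
    if persistence_score < high_cutoff:
        return "tier_2"
    return "tier_3"  # highest likelihood of persisting
```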
A centralized dashboard — designed specifically for student success staff — layers on comparative views by degree type and programme, with filters for department and cohort. Advisors see exactly where their caseload stands, in one place, without navigating multiple systems.
The write-back strategy worked around a real institutional constraint, the limited analytics licence pool. Predictive intelligence should not be locked behind a licence count. It belongs in the workflow every advisor already uses daily.
A path toward near-real-time intelligence
The current model operates on batch data refreshed at the start of each term. The technical roadmap points toward a future state that is significantly more dynamic — and more responsive to changes in student behaviour as they happen.
Honest about the gaps from the start
A key strength of this engagement was surfacing data limitations early rather than working around them. Two datasets were excluded from the initial model: grade detail records with a high proportion of null values, and withdrawal records with overlapping statuses. Excluding them preserved the model’s integrity rather than risking predictions polluted by unreliable inputs.
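
A screening check of this kind can be sketched in a few lines of Python. The 30% threshold, the column names in the usage note, and the function names are assumptions for illustration. The source does not state what null rate led to the exclusion.

```python
from typing import Dict, List, Optional, Sequence

Row = Dict[str, Optional[object]]


def null_rate(rows: Sequence[Row], column: str) -> float:
    """Fraction of rows where the column is absent or None."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if row.get(column) is None)
    return missing / len(rows)


def columns_over_threshold(
    rows: Sequence[Row],
    columns: Sequence[str],
    max_null_rate: float = 0.3,  # hypothetical threshold
) -> List[str]:
    """List the columns whose null rate exceeds the threshold."""
    return [c for c in columns if null_rate(rows, c) > max_null_rate]
```

For example, with four rows of which three have no `grade` value, `columns_over_threshold(rows, ["grade", "term"])` would flag `grade` as a candidate for exclusion while leaving a fully populated `term` column alone.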
The team also flagged where institutional behaviour itself created noise: a support ticketing system designed to log student interactions was inconsistently used across departments, limiting its predictive value. Meaningful social engagement data — peer connections, campus activity — simply was not being captured anywhere in a structured form.
These are not failures. They are a roadmap for what to build next — and being explicit about them builds trust with the stakeholders who need to act on the model’s outputs.