Emon Sarker

Most digital interventions for behavior change share a common failure mode: they work brilliantly for two weeks, then the user reverts to baseline. The notification becomes noise. The streak counter loses its motivational power. The gamification points feel hollow.

This is the persistence problem, and it's fundamentally a problem of extrinsic vs. intrinsic motivation.

The Decay Curve

Every behavior change intervention follows a predictable decay curve. Initial engagement is high — the novelty effect ensures this. But novelty is a depreciating asset. Within 14-21 days, most users have habituated to the stimulus and no longer respond to it.

The standard approach to this problem is escalation: more notifications, bigger rewards, social pressure. This works briefly, but it's treating symptoms. The underlying issue is that the system is creating dependency on external triggers rather than building genuine habit architecture.

A Different Framework

What if the system's goal wasn't to maximize engagement, but to minimize its own necessity?

We model this as a reinforcement learning problem where the agent's reward function includes a term for user autonomy. The agent receives positive reward when the user performs the target behavior without an intervention — and receives diminishing reward for behaviors that require a prompt.

This creates fascinating agent behavior:

The agent learns to time interventions strategically, favoring moments when the user is most likely to internalize the behavior
It learns to vary intervention types to prevent habituation
Most importantly, it learns to back off — reducing intervention frequency as the user develops self-sustaining habits

Preliminary Results

In a 90-day study, our RL-based system achieved 2.3x higher behavior persistence compared to rule-based alternatives. The key metric isn't day-1 engagement — it's day-90 adherence without any system intervention.

Ethical Implications

Building systems that influence human behavior carries profound responsibility. Our framework includes hard constraints: the agent can only optimize for outcomes the user has explicitly opted into, and it cannot use dark patterns or exploit psychological vulnerabilities.

The goal is not to control behavior. It's to scaffold the development of habits the user already wants to build.

On the Nature of Persistent Behavior Change

The Decay Curve

A Different Framework

Preliminary Results

Ethical Implications