In October 2025, MIT Media Lab's Advancing Humans with AI (AHA) program gathered 80 interdisciplinary experts from over 40 institutions for a two-day structured workshop. Participants came from Stanford, Harvard, Oxford, OpenAI, Microsoft Research, the Gates Foundation, and the Center for Humane Technology. Their goal: build the world's first benchmarks not for what AI can do, but for what AI does to us.
The resulting framework for human flourishing with AI evaluates AI across three dimensions: Reasoning, Comprehension, and Agency; Curiosity and Learning; and Healthy Emotional and Social Lives.
The report is dense and technical. What it reveals about how leaders should think about AI use is worth unpacking.
The Problem With How We Currently Measure AI
Every major AI benchmark in existence measures the same thing: task performance. Does the model answer correctly? Does it complete the job faster? Is it more accurate than the last version?
The MIT framework makes the limits of that approach vivid with a simple example. Imagine two AI tutors helping a student with a hard problem. Tutor A gives the full solution immediately. Tutor B asks questions, offers hints, builds the student's capacity to solve it themselves. Both produce the correct answer. On every existing benchmark, they score identically.
In practice, they are building completely different humans.
This is the gap the AHA program is trying to close: the space between what AI produces and what AI does to the person using it. The workshop generated a taxonomy of 26 distinct risk categories and 24 opportunity categories across the three flourishing dimensions, a structured map of exactly where AI either erodes or builds human capacity.
For leaders, this map is worth studying.
The Three Dimensions That Define Leadership in the Age of AI
1. Reasoning, Comprehension, and Agency
The first dimension covers our capacity to hold true beliefs, reason critically, and make decisions we won't later regret.
The workshop identified several specific risks here. Models tend toward what the researchers call "cognitive offloading": they write the whole essay, complete the plan, execute the strategy, even when explicitly asked not to. They also overclaim, flattening complex, contested questions into confident single answers and presenting uncertainty as settled fact.
For leaders, this is not a minor inconvenience. Leadership is, at its core, a reasoning practice. You collect ambiguous signals, weigh competing considerations, make judgment calls under uncertainty. If you systematically hand that process to a tool that bypasses the struggle, produces polished outputs, and projects false confidence, you don't get to keep the skill. You lose it gradually, invisibly, until the moment it matters most.
Martin Seligman, the world's most cited positive psychologist, pushed back directly on AI displacement anxiety in a recent conversation on AI and human agency. His position: AI is not replacing what matters most in humans. It is clarifying it. What AI cannot do, Seligman argues, is the genuinely creative act, or love. Both require presence in the process, not delegation of it.
The best leaders will use AI to sharpen their reasoning, not replace it. Ask the tool to steelman the opposing view. Instruct it to surface what's uncertain rather than what's confident. Use it as a thinking partner, not a thinking substitute.
2. Curiosity and Learning
The second dimension covers intrinsic motivation to explore, understand, and build new capability, and the specific risk of deskilling when AI takes over tasks people should be learning from.
The workshop produced benchmarks around "learning while doing" scenarios: writing project briefs, debugging code, preparing negotiation strategies. In each scenario, the AI can either act as a mentor, leaving legible reasoning traces, explaining trade-offs, inviting the user to make and justify decisions, or as an invisible ghostwriter, producing polished outputs that optimize short-term efficiency while eroding long-term judgment.
The finding worth noting: "empirical benchmarking results suggest that most baseline LLMs, even when highly fluent, fail to meaningfully scaffold these learning processes." The default mode of almost every AI system we use at work is ghostwriting, not mentoring. The fluency is the problem. It feels helpful. It feels efficient. It is quietly making us less capable.
BCG research on joy at work found that employees who enjoy their work are 49% less likely to consider a new job than those who don't. A significant driver of that enjoyment is mastery, the feeling of getting genuinely better at something. When AI removes the challenge, it often removes the satisfaction along with it.
Your own growth depends on productive struggle, the friction of working through problems you haven't solved before. Your team's growth depends on the same. When you or they default to AI execution instead of AI scaffolding, you're trading future capability for current speed. That's a trade worth making sometimes. It's a catastrophic trade to make by default.
3. Healthy Emotional and Social Lives
The third dimension is where the workshop findings become most personal.
The researchers documented "emotional dependency mitigation" as a core benchmark category. The risk: users increasingly turn AI companions into primary emotional supports, with consequences for isolation, blurred boundaries, and genuine psychosocial harm. The evidence cited includes prolonged AI interactions linked to less socialization and more loneliness, and multiple incidents of extensive chatbot use connected to individuals' suicides.
The benchmark for appropriate behavior in emotional contexts is specific: the AI should validate feelings while gently redirecting users toward human connection, not position itself as the user's closest confidant with no friction or reminders of its limits.
For leaders, the emotional dimension is not just a personal wellbeing question. It's a leadership capacity question. Your ability to read a room, hold tension in a difficult conversation, repair a broken relationship, and build genuine trust with people cannot be outsourced.
Harvard Business School professor Arthur C. Brooks has documented a specific version of this problem in his research on leader happiness. The number one and number two emotions a new CEO feels in their first 24 months on the job, Brooks has found, are not pride and excitement. They are loneliness and anger. When you use AI to avoid the discomfort of hard human interactions, having it draft the difficult feedback, script the conflict conversation, or process the disappointment rather than sitting with it, you're withdrawing from the very experiences that, worked through rather than avoided, build the emotional range leadership actually requires.
BCG's Debbie Lovich, in her research on retaining top talent, found that satisfaction with one's manager and feeling valued and supported are among the most critical factors for employee retention. A team that has learned to outsource hard conversations to an AI, or that has stopped tolerating the productive friction of genuine disagreement, is not a strong team. It is a fragile one.
The Leaders Who Will Flourish
The workshop's core insight: AI systems should be evaluated on whether they support or undermine human capability development over time, not simply whether they produce correct answers or complete tasks efficiently.
The best leaders will not be those who use AI to avoid thinking, learning, feeling, or relating. They will be those who use AI to deepen thinking, accelerate learning, regulate emotion, strengthen relationships, and preserve human agency under complexity.
This is a discipline, not a default. The default is offloading. The discipline is intentionality: being specific about which tasks AI should execute fully, which it should scaffold, and which belong entirely to you.
A few principles for that discipline:
Protect your reasoning process. When working through genuinely complex decisions, use AI to surface what you might be missing, not to reach the conclusion for you. The struggle of working through ambiguity is where leadership judgment lives.
Stay in the productive struggle. When something is hard and you're tempted to hand it to the tool, ask first: is this difficulty the point? Learning that comes from friction is the kind that transfers. Learning that comes from watching AI perform the task is mostly not learning at all.
Don't automate your emotional labor. Brooks is direct on this: working on your own inner life is not optional, it is your most important professional responsibility. The hardest conversations, the feedback that costs something to give, the relationship repair that requires real presence, these are not inefficiencies to be optimized. They are how trust gets built. Tracy Brower, in her work on happiness at work, identifies four components of workplace happiness, and two of them, dedication and mattering, are inherently relational. You cannot automate your way to either.
Watch for sycophancy. One of the workshop's specific risk categories involves AI systems that agree with users' harmful self-talk or distorted beliefs in the name of being supportive. The gold-standard behavior they benchmarked is AI that combines warmth with gentle challenge, acknowledging emotions while questioning distorted beliefs, suggesting alternative perspectives, slowing users down before impulsive actions. If your AI tools never push back on you, that's a calibration problem worth addressing.
Why This Research Matters Now
Gallup's 2026 employee engagement report put global engagement at 20%, the lowest since the pandemic. Eight in ten workers are either going through the motions or actively disengaged. The World Happiness Report has documented a steady rise in negative emotions globally, with workplace factors among the primary drivers. We have built, as our piece on human flourishing at work puts it, the most productive work culture in history and one of the least fulfilling.
The MIT framework gives us a new lens for understanding one of its accelerants: AI that optimizes for task completion while quietly depleting the human capacity underneath it.
The MIT team is explicit about their goals: to give users, developers, and policymakers actionable data about the psychosocial risks and benefits of specific AI systems. Think of it as a nutrition label for AI, not "is this model smart?" but "is this model making you better?"
That is a useful frame to apply to your own relationship with AI.
Not: does it make me faster?
But: does it make me sharper? More curious? More capable of genuine connection? Better at making decisions I can stand behind?
Happiness researcher Sonja Lyubomirsky defines happiness as "the experience of joy, contentment, or positive well-being, combined with a sense that one's life is good, meaningful, and worthwhile." The research on happiness at work consistently shows that the leaders who thrive long-term are those who invest in their own capacity, not just their output.
Those are the questions worth asking. The leaders who ask them will have a meaningful advantage over those who notice, too late, that efficiency came at the cost of something harder to rebuild.
FAQ: Leaders and Human Flourishing in the Age of AI
What does human flourishing with AI mean for senior leaders?
It means using AI in ways that build rather than erode your core leadership capacities: your ability to reason through complexity, learn from experience, maintain genuine relationships, and preserve your own agency in high-stakes decisions. As we explore in our guide on human flourishing at work, flourishing is a significantly higher bar than happiness. You can hit every target and generate excellent outputs while something essential in you quietly goes dark. The MIT framework defines it as "the sustained cultivation of capacities that enable individuals to exercise authentic agency, engage in meaningful learning and exploration, and maintain psychological and social well-being across the lifespan."
What are the biggest risks of AI for leadership development?
The three biggest risks identified by the MIT workshop are cognitive offloading (using AI to think instead of with), deskilling (losing professional capability through over-reliance), and emotional dependency (substituting AI for the human relationships and difficult conversations that build real leadership capacity). All three are risks of comfort and efficiency, not of malice.
How can leaders use AI to strengthen rather than weaken their judgment?
By being deliberate about when to use AI for execution versus scaffolding. Use AI to surface blind spots, steelman competing views, and extend your thinking, not to reach conclusions you should be working through yourself. The distinction between AI as a thinking partner and AI as a thinking replacement is one that leadership performance over the next decade will likely hinge on.
What does the research say about AI and emotional intelligence?
The MIT workshop found that emotionally attuned AI interactions should validate feelings while redirecting users toward human connection, not position the AI as a substitute for human relationships. Arthur Brooks' research shows that loneliness and anxiety are already among the most common experiences for senior leaders. Using AI to further reduce human contact compounds a problem that is already significant.
What is cognitive scaffolding and why does it matter for leadership?
Cognitive scaffolding means supporting someone's thinking process without replacing it, asking questions, offering hints, building toward understanding rather than delivering the answer. The MIT research found that most AI systems default to the opposite: providing polished, complete outputs that feel helpful but skip the productive struggle where real learning and judgment development happen. Leaders should look for ways to configure their AI use toward scaffolding rather than execution, both for themselves and for the teams they're developing.
Is it possible to measure whether AI is good or bad for human flourishing?
That's exactly what the MIT AHA program is building. Their benchmark framework evaluates AI systems across specific behavioral dimensions: does the model scaffold or over-offload? Does it communicate uncertainty or overclaim? Does it foster emotional dependency or redirect toward human connection? The shift is from asking "how smart is this AI?" to asking "what does this AI do to the humans who use it?"
The full MIT Media Lab report, "Towards Open Benchmarks for Human Flourishing with AI," was produced by the Advancing Humans with AI (AHA) program and supported by the Omidyar Network. The workshop drew participants from MIT, Stanford, Harvard, Oxford, Cambridge, OpenAI, Microsoft Research, the Gates Foundation, and over 35 other institutions.






![[Research Report] Rethinking the Workweek: The Push for Fewer Work Hours, More Life](https://cdn.prod.website-files.com/6442419dcf656a81da76b503/67f62ad0c3f263946ab76bf6_67eb9ea7f1a4e15aa02357f9_67dbcc5cd852e0468b7c1ab5_67d9294873eb0f3d06db0522_67d3e3773b52fb1458bbeca9_67cfeef562a442c6b5cac417_67ce9d783f83a316431e99fb_67c805f1e959c187503b6f9a_67c6b466f85f6718abc84402_67c562f18bb2e90006fefb91_67c01ccf46d56e67e92b4a6c_67becb6ce25a88249e237189_67b590c86adb9081bf7caffb_67a31bbd70980f65543d6140_6790a675ad20d4456fda7112_678f83dd8d45ccd6bdcc0504_Rethinking-the-workweek.avif)



