When Measuring Impact Gets in the Way of Actually Having Impact

Teams hit every metric—green dashboards, perfect scores. But customers were frustrated and critical incidents kept happening. Here's why: people spent more time proving they had impact than actually having it. Measurement systems don't just fail, they twist the work they measure.

While working on quality efforts across Azure, I noticed something odd.

Teams hit their metrics. Scorecards were green. Service reliability numbers looked great. The dashboards told a beautiful story.

But our largest enterprise customers were frustrated. Critical incidents kept hitting the same services. Teams spent more time explaining their numbers than fixing problems.

Here's what was happening: Microsoft's focus on measuring "impact" for performance reviews meant people spent more time proving they had impact than having it.

How We Got Here

I watched the pattern repeat across Azure organizations. A team would start strong, focused on solving actual problems. Then someone would ask: "How do we measure this?"

Everything changed. The team shifted from asking, "How can we help customers?" to "How can we show our impact?" New ideas slowed. Risk-taking stopped. People spent more time measuring their impact and displaying it on dashboards than working on things that actually improved services for customers.

Here's what I learned: measurement systems don't just fail; they twist the work they measure. And it happens in predictable ways.

The Three Ways Measurement Breaks

First, people focus on the metric, not the goal. When deployment success became the key metric, teams started deploying smaller, safer changes to boost their scores. They hit 98% deployment success while shipping fewer features that customers wanted.

Second, what gets measured gets managed, and what isn't measured gets ignored. One team obsessed over response times while ignoring error rates. Response times improved 15%. Error rates doubled. Users got fast failures instead of slow successes.

Third, the metric becomes the mission. I sat in meetings where program managers defended broken numbers because "the metric is green." The scorecard mattered more than the outcome.

This isn't poor judgment. It's how humans work. Show people a scorecard and they'll find ways to win it, even when winning hurts the business.

What Works

Through my work across Azure, I've seen what works when teams get measurement right:

1. Measure Inputs, Not Just Outputs

This is the classic idea of leading vs lagging indicators. Kaplan and Norton call them "what drives results" and "what you want to achieve."

Instead of tracking deployment success (output), we measured deployment practices (input). Did teams run automated tests? Did they use feature flags? Did they have rollback procedures?

We could control these things. Teams couldn't game them without improving their work. When deployment practices improved, deployment success followed naturally.

The approach: For every output metric you care about, identify 2-3 input metrics that drive it. Track those instead.
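A minimal sketch of what that tracking can look like, assuming a hypothetical deployment record and practice checks (none of these field names come from real Azure tooling): count how often each practice is followed, and report deployment success only as context, not as the target.

```python
from dataclasses import dataclass

@dataclass
class Deployment:
    """One deployment record; the fields are illustrative, not a real schema."""
    ran_automated_tests: bool
    used_feature_flags: bool
    had_rollback_procedure: bool
    succeeded: bool  # output metric, reported for context but not targeted

def practice_rates(deployments: list[Deployment]) -> dict[str, float]:
    """Input metrics: how often each practice was actually followed."""
    n = len(deployments)
    return {
        "automated_tests": sum(d.ran_automated_tests for d in deployments) / n,
        "feature_flags": sum(d.used_feature_flags for d in deployments) / n,
        "rollback_procedure": sum(d.had_rollback_procedure for d in deployments) / n,
        "deployment_success": sum(d.succeeded for d in deployments) / n,  # context only
    }

if __name__ == "__main__":
    history = [
        Deployment(True, True, True, True),
        Deployment(True, False, True, True),
        Deployment(False, False, False, False),
    ]
    for name, rate in practice_rates(history).items():
        print(f"{name}: {rate:.0%}")
```

The point of the shape: the practice rates are things a team can move tomorrow, while the success rate is the thing you hope follows.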

2. Set Expiration Dates on Every Metric

No metric should live forever. Every metric needs a review date, usually six months out. If a metric isn't driving the behavior you want, kill it.

This solves the metric creep problem. Teams can't just add new measures on top of old ones. They have to choose what matters most.

The rule: If you can't explain why a metric exists and how it drives better outcomes, stop measuring it.
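One lightweight way to enforce that rule, sketched here with hypothetical metric names: keep a small registry where every metric carries a stated rationale and a review date, and anything past its date or missing a rationale is surfaced as a candidate to kill.

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class Metric:
    name: str
    rationale: str   # why it exists and what behavior it should drive
    review_by: date  # expiration date; every metric gets one

# Hypothetical registry; a real one would live next to the team's dashboards.
REGISTRY = [
    Metric("deployment_practice_score", "Drives safer rollout habits", date(2026, 3, 1)),
    Metric("ticket_close_rate", "", date(2024, 1, 1)),
]

def new_metric(name: str, rationale: str, today: date) -> Metric:
    """Every new metric starts with a roughly six-month expiration by default."""
    return Metric(name, rationale, today + timedelta(days=182))

def metrics_to_review(registry: list[Metric], today: date) -> list[Metric]:
    """Flag metrics past their review date or lacking a stated rationale."""
    return [m for m in registry if today > m.review_by or not m.rationale.strip()]

if __name__ == "__main__":
    for m in metrics_to_review(REGISTRY, date.today()):
        print(f"Review or kill: {m.name}")
```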

3. Separate Measurement from Punishment

Missing a target should never trigger consequences for individuals. Metrics are for learning, not evaluation. This kills the gaming behavior overnight.

Teams start reporting problems instead of hiding them. They use metrics to spot issues early instead of covering them up at review time.

The approach: When someone misses a metric, the first question should be "What did we learn?" not "Who's responsible?"

A Framework That Works

Based on the research and what I've observed work in practice, here's a measurement approach that addresses these problems:

Focus on Input Metrics: Things teams control directly

  • Code review completion rates
  • Test coverage percentages
  • Feature flag usage
  • Documentation updates

Monitor Output Metrics: Things customers experience

  • System reliability
  • Response times
  • Error rates
  • Customer satisfaction

Review Process:

  • Review how inputs correlate with outputs (a sketch follows this list)
  • Kill metrics that don't drive the behavior you want
  • Keep the total number of metrics small (3-5 per team max)
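For the correlation step above, a rough sketch with made-up weekly numbers; it uses Pearson correlation from Python's standard library (3.10+) as a screening signal only, since correlation alone does not show that an input causes the output.

```python
from statistics import correlation  # Pearson r; available in Python 3.10+

# Hypothetical weekly series: input metrics (things teams control) alongside
# one output metric (what customers experience).
inputs = {
    "code_review_completion": [0.70, 0.75, 0.80, 0.85, 0.90, 0.92],
    "test_coverage":          [0.55, 0.56, 0.55, 0.57, 0.56, 0.55],
}
reliability = [0.990, 0.992, 0.994, 0.995, 0.997, 0.998]  # output metric

# An input that barely moves with the output is a candidate to drop.
for name, series in inputs.items():
    r = correlation(series, reliability)
    verdict = "keep" if abs(r) >= 0.5 else "reconsider"
    print(f"{name}: r={r:+.2f} -> {verdict}")
```

A handful of points like this is nowhere near enough for statistical confidence; the value is in forcing the question "does this input actually move the output?" at every review.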

Why This Works

The research backs up what we learned in practice. Robert Austin at Harvard showed that partial measurement is worse than no measurement: it pushes effort toward measured areas while neglecting everything else.

McKinsey found that good measurement needs three things: clear goals, coaching instead of tracking, and fair compensation. Amazon's Weekly Business Review process focuses on controllable inputs, not just hoped-for outputs.

Andrew Likierman at London Business School identified the five traps that kill measurement programs: measuring against yourself instead of competition, looking backward instead of forward, putting shareholders over operations, enabling gaming, and linking pay to metrics.

But here's what the research misses: execution matters more than theory. You can have the perfect measurement approach and still fail if people don't trust it.

What You Can Do Tomorrow

Start small. Pick one output metric your team cares about. Identify 2-3 input metrics that drive it. Track those for a month.

Kill something. Find one metric your team tracks that doesn't change behavior. Stop measuring it. See what happens.

Ask different questions. Instead of "Did we hit our numbers?" ask "What do our numbers tell us?" Instead of "Who missed their target?" ask "What should we do differently?"

Focus on inputs you control. If your team can't directly influence a metric through their daily work, don't use it to evaluate performance.

Make it safe to report problems. The fastest way to improve metrics is to make it safe for people to share bad news early.

The Reality Check

Here's the hard truth: Microsoft never changed this approach. Despite the published research on better measurement practices, the culture still obsesses over "measuring impact" for performance reviews.

Managers still ask people to prove their impact with metrics. Teams still spend more time on dashboards than customer problems. The broken measurement system continues because it serves performance evaluation, not improvement.

This isn't unique to Microsoft. Most large companies fall into the same trap. They know measurement can be harmful, but they can't resist using it to rank people.

The approaches I described work. I've seen them succeed in smaller teams and projects. But changing company-wide measurement culture? That's a different problem entirely.

The Bottom Line

Measurement isn't broken because we're doing it wrong. It's broken because we're asking it to do too much.

Metrics can't tell you if your team is successful. They can only tell you if specific things are happening. Use them to spot patterns, identify problems, and track improvements. Don't use them to define success.

The most important work often can't be measured until long after it's done. That's not a problem with the work. It's a feature of solving problems that matter.

Measure what helps you improve. Ignore everything else. Your customers will thank you—even if your performance review doesn't.