PR Analytics

7 Pull Request Metrics That Actually Predict Team Velocity

Stop measuring the wrong things. These 7 metrics correlate with higher deployment frequency, lower change failure rates, and faster feature delivery. Backed by DevOps Research and Assessment (DORA) data.

Most teams track PR metrics that do not matter: number of PRs opened, lines of code changed, number of commits, approval counts.

These are all vanity metrics. They tell you your team is busy. They do not tell you if your team is shipping.

The metrics below actually correlate with outcomes: faster deployments, fewer production incidents, higher throughput.

The 7 Metrics

1. Time to First Review

How long does a PR sit before anyone looks at it?

Elite: < 2 hours
High: < 4 hours
Bottleneck: > 24 hours

Why It Matters:

Long wait times kill momentum. A developer who waits 2 days for first review context-switches to other work, making rework expensive when feedback finally arrives.

What Good Looks Like:

Median < 4 hours. P90 < 1 day. If P90 is consistently high, you have a review capacity problem.
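
To check where a repo actually sits against these thresholds, here is a minimal sketch using the GitHub REST API from Python. The owner/repo values and the GITHUB_TOKEN environment variable are placeholders, pagination and error handling are omitted, and only formal reviews (not standalone comments) count as the first review.

  # Sketch: median and P90 time-to-first-review, in hours, for recent PRs.
  import os
  import statistics
  from datetime import datetime, timezone

  import requests

  OWNER, REPO = "your-org", "your-repo"  # placeholders
  API = "https://api.github.com"
  HEADERS = {"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}"}

  def parse(ts: str) -> datetime:
      return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)

  waits_hours = []
  prs = requests.get(f"{API}/repos/{OWNER}/{REPO}/pulls",
                     params={"state": "all", "per_page": 100}, headers=HEADERS).json()
  for pr in prs:
      reviews = requests.get(f"{API}/repos/{OWNER}/{REPO}/pulls/{pr['number']}/reviews",
                             headers=HEADERS).json()
      submitted = [parse(r["submitted_at"]) for r in reviews if r.get("submitted_at")]
      if submitted:
          waits_hours.append((min(submitted) - parse(pr["created_at"])).total_seconds() / 3600)

  if len(waits_hours) >= 2:
      print(f"P50: {statistics.median(waits_hours):.1f}h")
      print(f"P90: {statistics.quantiles(waits_hours, n=10)[-1]:.1f}h")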

How to Fix:

  • Rotate review on-call duty
  • Set team SLA for first review (4 hours)
  • Make reviewing part of performance evaluations

2. Time to Merge (Cycle Time)

From PR open to merge, how long does it take? This is the ultimate measure of your delivery pipeline.

Elite: < 1 day
High: < 2 days
Red Flag: > 5 days

Why It Matters:

Long cycle time = slow velocity. If each PR takes 10 days to merge and a developer works on one at a time, they ship roughly 3 changes per month. At 1 day per PR, the same developer ships 20+ per month.

What Good Looks Like:

Median < 2 days. P90 < 5 days. Distribution should be tight, not bimodal.
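
A quick way to check both the percentiles and the shape of the distribution is to bucket merge times by whole days: one tight cluster is healthy, two separated clusters suggest bimodality (for example, small PRs merging fast while large ones stall). The durations below are hypothetical sample data.

  # Sketch: cycle-time percentiles plus a coarse histogram to spot a bimodal shape.
  import statistics
  from collections import Counter

  merge_times_days = [0.5, 0.8, 1.2, 1.9, 2.4, 3.0, 6.5, 7.2, 8.0]  # hypothetical

  print(f"P50 = {statistics.median(merge_times_days):.1f} days")
  print(f"P90 = {statistics.quantiles(merge_times_days, n=10)[-1]:.1f} days")

  # Bucket by whole days; two separated clusters indicate a bimodal distribution.
  histogram = Counter(int(days) for days in merge_times_days)
  for day in sorted(histogram):
      print(f"{day}-{day + 1} days: {'#' * histogram[day]}")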

Common Bottlenecks:

  • Multiple rounds of feedback (poor requirements)
  • Large PRs that need deep review
  • Waiting for stakeholder approval

3. PR Size Distribution

How big are your PRs? Small PRs get reviewed faster, have fewer defects, and deploy more often.

Ideal: < 250 lines
Acceptable: 250-500 lines
Too Large: > 500 lines

Why It Matters:

Large PRs sit longer, accumulate more review feedback, and merge slower. A 1,000-line PR takes 4x longer to review than four 250-line PRs.

What to Track:

Percent of PRs over 500 lines. This should be < 10%.
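
A small sketch of that calculation, assuming PR size is measured as additions plus deletions (the sample sizes are made up):

  # Sketch: PR size distribution and the share of PRs over 500 changed lines.
  pr_sizes = [80, 120, 240, 310, 450, 520, 900, 1400]  # lines changed per PR (hypothetical)

  buckets = {"< 250": 0, "250-500": 0, "> 500": 0}
  for size in pr_sizes:
      if size < 250:
          buckets["< 250"] += 1
      elif size <= 500:
          buckets["250-500"] += 1
      else:
          buckets["> 500"] += 1

  oversized_pct = 100 * buckets["> 500"] / len(pr_sizes)
  print(buckets)
  print(f"PRs over 500 lines: {oversized_pct:.0f}% (target: < 10%)")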

How to Improve:

  • Break features into smaller PRs
  • Use feature flags to ship incrementally
  • Refactor in separate PRs

4. Review Depth

How thorough are your reviews? Track comments per 100 lines of code.

Too Light: < 2 comments per 100 lines
Healthy: 2-10 comments per 100 lines
Too Heavy: > 15 comments per 100 lines

Why It Matters:

Rubber-stamp approvals miss bugs. Overly pedantic reviews waste time. The goal is substantive feedback, not approval theater.

What Good Looks Like:

2-10 comments per 100 lines. Mix of blocking (bugs, security) and non-blocking (style, suggestions).
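
A sketch of the calculation, using made-up counts of review comments and changed lines per PR:

  # Sketch: review depth as comments per 100 changed lines, plus the rubber-stamp rate.
  prs = [
      {"comments": 0, "lines": 120},
      {"comments": 4, "lines": 300},
      {"comments": 12, "lines": 450},
      {"comments": 1, "lines": 900},
  ]

  for pr in prs:
      depth = 100 * pr["comments"] / pr["lines"]
      label = "too light" if depth < 2 else "too heavy" if depth > 15 else "healthy"
      print(f"{pr['comments']} comments / {pr['lines']} lines -> {depth:.1f} per 100 ({label})")

  rubber_stamp_pct = 100 * sum(pr["comments"] == 0 for pr in prs) / len(prs)
  print(f"PRs with zero review comments: {rubber_stamp_pct:.0f}%")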

Red Flags:

  • 90% of PRs approved with zero comments
  • 5+ rounds of feedback on small PRs

5. Deployment Frequency (DORA)

How often are you shipping to production? This is the ultimate velocity metric.

Elite: Multiple times per day
High: Once per day to once per week
Medium: Once per week to once per month
Low: Less than once per month

Why It Matters:

High deployment frequency correlates with lower change failure rates, faster MTTR, and higher team performance (DORA research).

How to Track:

Count merges to production branch per week. Track trends over time. If frequency drops, diagnose why (CI failures? manual gates? batching?).
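
If you want a low-tech approximation before wiring up any tooling, counting merge commits on the production branch per ISO week from the git history is usually close enough. The sketch below assumes the production branch is named main and that merges to it stand in for deployments, which is a simplification if you deploy via tags or a separate pipeline.

  # Sketch: merge commits landing on the production branch, grouped by ISO week.
  import subprocess
  from collections import Counter
  from datetime import datetime

  log = subprocess.run(
      ["git", "log", "--merges", "--first-parent", "main",
       "--since=90 days ago", "--format=%cI"],
      capture_output=True, text=True, check=True,
  ).stdout.split()

  per_week = Counter(datetime.fromisoformat(ts).strftime("%G-W%V") for ts in log)
  for week in sorted(per_week):
      print(f"{week}: {per_week[week]} merges")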

6. Review Distribution

What percentage of your team actively reviews PRs? Are reviews concentrated on a few people?

Healthy: > 70% of team reviewing
Bottleneck: < 40% of team reviewing

Why It Matters:

If only 2 people review all PRs, those 2 people are your bottleneck. When they are on vacation or overloaded, velocity drops to zero.

What to Track:

Number of reviewers vs. team size. Distribution of reviews across individuals. Gini coefficient if you want to be fancy.
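
Here is a small sketch of both the participation share and the Gini coefficient (0 means reviews are spread evenly, values near 1 mean one person does nearly all of them); the names and counts are hypothetical.

  # Sketch: review participation and Gini coefficient over reviews per person.
  reviews_per_person = {"ana": 42, "ben": 35, "chris": 6, "dee": 2, "eli": 0}

  team_size = len(reviews_per_person)
  active = sum(count > 0 for count in reviews_per_person.values())
  print(f"Reviewing: {active}/{team_size} ({100 * active / team_size:.0f}% of team)")

  def gini(counts: list[int]) -> float:
      # Mean absolute difference between all pairs, normalised by 2 * n * total.
      n, total = len(counts), sum(counts)
      if total == 0:
          return 0.0
      return sum(abs(a - b) for a in counts for b in counts) / (2 * n * total)

  print(f"Gini coefficient: {gini(list(reviews_per_person.values())):.2f}")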

How to Improve:

  • Rotate review duty across team
  • Set minimum review participation targets
  • Pair juniors with seniors for learning

7. Rework Rate

How often are merged PRs followed by immediate bug-fix PRs?

Good: < 10% rework
High Churn: > 25% rework

Why It Matters:

High rework means reviews are not catching bugs, or requirements are unclear. Either way, velocity suffers.

How to Track:

Tag PRs that touch the same files within 48 hours of a merge as potential rework, then review the patterns.
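
A sketch of that 48-hour heuristic, using made-up PR records; in practice the merge times and changed files would come from your Git hosting API.

  # Sketch: flag a PR as potential rework if it touches any file changed by a PR
  # merged in the previous 48 hours.
  from datetime import datetime, timedelta

  merged_prs = [  # hypothetical, ordered by merge time
      {"number": 101, "merged_at": datetime(2024, 5, 1, 10), "files": {"api/auth.py"}},
      {"number": 102, "merged_at": datetime(2024, 5, 2, 9), "files": {"api/auth.py"}},
      {"number": 103, "merged_at": datetime(2024, 5, 6, 15), "files": {"web/cart.ts"}},
  ]

  WINDOW = timedelta(hours=48)
  rework = []
  for i, pr in enumerate(merged_prs):
      for earlier in merged_prs[:i]:
          in_window = pr["merged_at"] - earlier["merged_at"] <= WINDOW
          if in_window and pr["files"] & earlier["files"]:
              rework.append(pr["number"])
              break

  print(f"Rework rate: {100 * len(rework) / len(merged_prs):.0f}% (PRs: {rework})")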

Root Causes:

  • Rubber-stamp reviews (not catching bugs)
  • Unclear requirements (feature misunderstanding)
  • Inadequate testing (manual QA missing edge cases)

How to Actually Use These Metrics

1. Start with Time to Merge

If this is high (> 5 days), drill into why. Is it time to first review? PR size? Review depth? Fix the bottleneck.

2. Track Trends, Not Snapshots

One bad week means nothing. If median time to merge goes from 2 days to 5 days over a quarter, you have a problem.
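
One way to make the trend visible is a weekly median, sketched below with made-up (merge date, days to merge) pairs:

  # Sketch: weekly median time-to-merge, to watch the trend rather than a snapshot.
  import statistics
  from collections import defaultdict
  from datetime import datetime

  merged = [  # hypothetical (merged_at, days_to_merge) pairs
      (datetime(2024, 4, 1), 1.5), (datetime(2024, 4, 3), 2.0),
      (datetime(2024, 4, 10), 2.5), (datetime(2024, 4, 12), 3.5),
      (datetime(2024, 4, 17), 4.0), (datetime(2024, 4, 19), 5.5),
  ]

  by_week = defaultdict(list)
  for merged_at, days in merged:
      by_week[merged_at.strftime("%G-W%V")].append(days)

  for week in sorted(by_week):
      print(f"{week}: median {statistics.median(by_week[week]):.1f} days to merge")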

3. Compare Within Team, Not Across Teams

Frontend PRs tend to be larger than backend PRs. Do not compare them. Track each team against its own baseline.

4. Use P50 and P90, Not Averages

Median (P50) tells you typical performance. P90 tells you worst case. If P90 is 3x higher than P50, you have outliers to investigate.

5. Do Not Gamify

If you tie bonuses to time to merge, developers will merge PRs without review. Use metrics for diagnosis, not performance reviews.

PRPulse Tracks All of This Automatically

No manual spreadsheets. No Jira integration. Just connect GitHub and see:

  • Time to first review and time to merge (P50, P90, trends)
  • PR size distribution across repos and teams
  • Review depth (comments per 100 lines)
  • Deployment frequency via production branch tracking
  • Review distribution (who is reviewing, who is the bottleneck)
  • Rework rate (PRs followed by immediate fixes)
  • Individual contributor patterns and team velocity trends

Connect GitHub → Select repos → See metrics within an hour