The Bug Backlog That Finally Broke Me
The number that broke me was 212. That’s how many open issues we had in Linear when I finally stopped pretending we had a “manageable backlog.” I spent an entire Monday just reading through old bug reports, and by noon I realized I couldn’t even remember what half of them referred to anymore. The context was gone. The developers who filed them had left. The features they affected had been refactored twice. That backlog wasn’t a to-do list; it was a graveyard with a ticket-tracking system slapped on top.
Every engineering team I’ve worked on has told itself the same lie: “We’ll fix it next sprint.” I’ve said it. I’ve nodded along when others said it. The mechanics of why it never happens are pretty straightforward: new features get estimated, assigned, and shipped because they’re visible to stakeholders. Bug fixes are invisible until they’re catastrophic. So they keep getting pushed to sprint N+1, which never arrives. The honest version of “we’ll fix it later” is “we’ve decided this bug is acceptable forever unless a customer screams loud enough.” Most teams just don’t want to say that out loud.
Zero bugs doesn’t mean your software has no defects. It means your team has made a deliberate agreement about what lives in the backlog and what gets fixed before the next commit merges. The version I’ve seen actually work is a hard rule: no bug older than two weeks stays in the backlog. If it’s not worth fixing in two weeks, you either close it with an explicit “won’t fix” label or you write up the known limitation in your docs. That forces honesty. Teams stop filing bugs as a form of CYA and start being selective about what actually needs tracking. The moment you treat bug filing as a commitment rather than a note-to-self, the backlog number drops fast.
The discipline part is the enforcement mechanism. I switched our team to a policy where any bug filed needed a reproducible case and a severity tag before it could sit in the backlog more than 48 hours. No repro? Closed immediately with a comment asking for more detail. That alone killed about 30% of the phantom tickets we’d accumulated: bugs that were actually user confusion, environment-specific one-offs, or already-fixed issues that nobody had closed. The friction of documentation filters out the noise without losing the signal.
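The two time limits above (48 hours to get a repro and severity, two weeks to fix or explicitly close) can be sketched as a small triage function. This is only an illustration: the BugTicket shape, the action names, and the triage function are hypothetical and do not correspond to Linear’s API.

```typescript
// Hypothetical ticket shape; field names are illustrative, not Linear's API.
interface BugTicket {
  id: string;
  filedAt: Date;
  hasRepro: boolean;
  severity?: "low" | "medium" | "high" | "critical";
}

type TriageAction = "keep" | "close-needs-detail" | "fix-or-wontfix";

const HOUR_MS = 60 * 60 * 1000;
const TRIAGE_WINDOW_MS = 48 * HOUR_MS; // deadline for a repro and a severity tag
const MAX_AGE_MS = 14 * 24 * HOUR_MS; // two-week backlog limit

function triage(ticket: BugTicket, now: Date): TriageAction {
  const ageMs = now.getTime() - ticket.filedAt.getTime();
  const documented = ticket.hasRepro && ticket.severity !== undefined;

  // Undocumented tickets are closed once the 48-hour window has passed.
  if (!documented && ageMs > TRIAGE_WINDOW_MS) return "close-needs-detail";
  // Documented tickets must be fixed or explicitly closed after two weeks.
  if (ageMs > MAX_AGE_MS) return "fix-or-wontfix";
  return "keep";
}
```

Running something like this over an export of open issues gives a triage meeting a concrete list instead of a scroll through the whole backlog.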
One thing that genuinely shifted our hit rate on catching bugs before they ever became tickets was leaning harder on static analysis and AI-assisted review during the PR stage. Tools that flag potential null reference errors, unhandled promise rejections, or logic branches that look wrong before a human even reviews the diff: that’s where the real value is. If you’re evaluating that layer of your toolchain, the Best AI Coding Tools in 2026 (thorough Guide) covers the options worth actually testing, with the kind of specificity that lets you compare them honestly rather than just reading marketing copy.
Step 1: Stop New Bugs From Entering the Codebase
The most underrated insight I’ve had after years of shipping code: a bug that gets caught in a pre-commit hook costs literally nothing to fix. You just fix the code before it exists anywhere else. Once it merges, you’re paying for it in PR reviews, QA cycles, staging incidents, and eventually a postmortem. The entire game is moving the catch point as far left as possible.
Linting That Actually Blocks Bad Code
Most teams use ESLint for formatting opinions: tabs vs spaces, semicolons, line length. That’s mostly useless. The config I actually care about flags things like unused variables, floating promises (no-floating-promises catches async bugs before they silently swallow errors in production), and common React hook mistakes. Here’s the .eslintrc.json I use on Node/React projects that has caught real bugs before they merged:
{
  "parser": "@typescript-eslint/parser",
  "plugins": ["@typescript-eslint", "react-hooks"],
  "extends": [
    "eslint:recommended",
    "plugin:@typescript-eslint/recommended-type-checked"
  ],
  "parserOptions": {
    "project": "./tsconfig.json"
  },
  "rules": {
    "@typescript-eslint/no-floating-promises": "error", // caught 3 silent async failures in month 1
    "@typescript-eslint/no-explicit-any": "error", // forces you to actually type things
    "@typescript-eslint/no-unused-vars": ["error", { "argsIgnorePattern": "^_" }],
    "no-console": ["warn", { "allow": ["warn", "error"] }],
    "react-hooks/rules-of-hooks": "error",
    "react-hooks/exhaustive-deps": "warn", // warn, not error: the rule misses edge cases
    "eqeqeq": ["error", "always"] // == vs === has bitten every JS dev once
  }
}
The key is recommended-type-checked instead of plain recommended. The type-checked variant uses your TypeScript type information to lint β so it catches things like calling .then() on a value that isn’t actually a Promise, which is a runtime bug, not a style issue. It’s slower (it has to run the TS compiler) but worth it.
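To make the no-floating-promises case concrete, here is a minimal sketch of the pattern the rule flags and one way to satisfy it. The saveAuditLog and handleRequest functions are hypothetical stand-ins, not code from a real project.

```typescript
// Hypothetical async operation that can reject.
async function saveAuditLog(entry: string): Promise<void> {
  if (entry.length === 0) {
    throw new Error("empty audit entry");
  }
}

// Flagged by @typescript-eslint/no-floating-promises: the promise is neither
// awaited nor given a rejection handler, so the caller never sees a failure.
function handleRequestUnsafe(entry: string): void {
  saveAuditLog(entry);
}

// Accepted: the promise is awaited and the rejection is handled explicitly.
async function handleRequest(entry: string): Promise<"saved" | "failed"> {
  try {
    await saveAuditLog(entry);
    return "saved";
  } catch {
    return "failed";
  }
}
```

The unsafe version compiles without complaint under plain tsc, which is why this class of bug needs the type-aware lint rule rather than the compiler alone.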
Husky v9 Pre-Commit Setup
Husky v9 changed the config format from v8, so a lot of Stack Overflow answers are wrong now. Here’s the actual install flow:
# Install Husky v9 and lint-staged
npm install --save-dev husky lint-staged
# Initialize husky (creates .husky/ directory)
npx husky init
# The init command creates .husky/pre-commit; replace its content:
echo "npx lint-staged" > .husky/pre-commit
Then add this to package.json. Don’t put lint-staged config in a separate file; keeping it here makes it visible to anyone reading the project setup:
{
  "lint-staged": {
    "*.{ts,tsx}": [
      "eslint --fix --max-warnings=0",
      "prettier --write",
      "bash -c 'tsc --noEmit'"
    ],
    "*.{js,json,css}": [
      "prettier --write"
    ]
  },
  "scripts": {
    "prepare": "husky"
  }
}
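The --max-warnings=0 flag is the part people skip and then regret. Without it, lint warnings accumulate silently and the hook becomes theatre. prepare running husky means every new developer who clones the repo and runs npm install gets the hooks automatically, with no README step to forget. Two caveats: package.json is strict JSON, so it cannot contain comments, and lint-staged appends the staged file paths to every command. tsc ignores tsconfig.json when it is handed explicit files, so the type-check has to run against the whole project (wrapping it in bash -c is a common workaround).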
TypeScript Strict Mode: What Actually Breaks
Adding "strict": true to tsconfig.json is a single line change that will probably generate 50 to 300 type errors in any codebase that wasn’t built with it from day one. That’s not a reason to avoid it; that’s the point. Those errors are real problems:
{
  "compilerOptions": {
    "strict": true, // enables strictNullChecks, noImplicitAny, strictFunctionTypes, etc.
    "noUncheckedIndexedAccess": true, // arr[0] is T | undefined, not T: honest about array bounds
    "exactOptionalPropertyTypes": true // {x?: string} accepts a string or an omitted key, not an explicit undefined
  }
}
The thing that caught me off guard the first time was that strict: true is actually a shorthand for about eight separate flags. The one that hurts most is strictNullChecks: suddenly every function that returns User | null requires a null check before you access properties on it. That’s painful for an hour and then permanently better. I also add noUncheckedIndexedAccess separately because it’s not included in strict but it catches real array-out-of-bounds assumptions. On a mid-size codebase, expect to spend half a day clearing the initial errors, but you’ll find at least two actual bugs hiding in there, not just type annotation gaps.
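A short sketch of what those two flags force you to write. The User shape and both functions are hypothetical; the comments mark where the compiler would object if the checks were removed.

```typescript
interface User {
  id: string;
  email: string;
}

const users: User[] = [{ id: "u1", email: "alex@example.com" }];

function findUser(id: string): User | null {
  return users.find((u) => u.id === id) ?? null;
}

function emailDomain(id: string): string | null {
  const user = findUser(id);
  // With strictNullChecks, reading user.email before this check is a compile error.
  if (user === null) return null;

  // With noUncheckedIndexedAccess, split(...)[1] has type string | undefined,
  // so the missing-"@" case has to be handled instead of assumed away.
  const domain = user.email.split("@")[1];
  return domain ?? null;
}
```

Neither check is clever; the value is that the compiler refuses to let you forget them.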
Step 2: Static Analysis That Runs in CI, Not Just Locally
The “it works on my machine” problem is usually framed as an environment issue: different Node versions, missing env vars, OS path separators. But a huge chunk of it is actually static analysis failing silently. Your teammate pushed code with a null dereference that your linter would catch, except the linter only runs in their editor, they’ve ignored the squiggly lines, and now the bug ships. Pushing static analysis into CI means it becomes a gate, not a suggestion.
Running SonarQube Community Edition Locally First
Before you wire anything into CI, run it locally so you understand what you’re actually deploying. The exact Docker command:
# Needs at least 4GB RAM allocated to Docker
docker run -d \
--name sonarqube \
-p 9000:9000 \
-v sonarqube_data:/opt/sonarqube/data \
-v sonarqube_extensions:/opt/sonarqube/extensions \
-v sonarqube_logs:/opt/sonarqube/logs \
sonarqube:10.4-community
Open http://localhost:9000. The default credentials are admin/admin, and it forces a password change on first login. Create a project manually, generate a token, then run your first scan:
# For a Node.js project; sonar-scanner needs to be installed separately
sonar-scanner \
-Dsonar.projectKey=my-app \
-Dsonar.sources=src \
-Dsonar.host.url=http://localhost:9000 \
-Dsonar.token=sqp_yourGeneratedTokenHere \
-Dsonar.javascript.lcov.reportPaths=coverage/lcov.info
The thing that caught me off guard: SonarQube won’t block anything by default. You have to configure a Quality Gate. Go to Quality Gates → Sonar way, and verify it has “New Issues” conditions set. The default “Sonar way” gate does include this, but check that your project is actually assigned to it: new projects sometimes inherit a permissive custom gate someone created months ago.
GitHub Actions Workflow That Actually Blocks Merges
This is the workflow I use. The key is sonar.qualitygate.wait=true. Without it, the scanner exits 0 regardless of gate status and your merge protection is theater.
# .github/workflows/sonar.yml
name: SonarQube Analysis
on:
  pull_request:
    branches: [main, develop]
jobs:
  sonar:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0 # Full history needed for blame data and new-code detection
      - name: Set up Node 20
        uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      - name: Install and test
        run: |
          npm ci
          npm run test -- --coverage --coverageReporters=lcov
      - name: SonarQube Scan
        uses: SonarSource/sonarqube-scan-action@master
        env:
          SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }}
          SONAR_HOST_URL: ${{ secrets.SONAR_HOST_URL }}
        with:
          args: >
            -Dsonar.projectKey=my-app
            -Dsonar.javascript.lcov.reportPaths=coverage/lcov.info
            -Dsonar.qualitygate.wait=true
In GitHub repo settings, go to Branch Protection and require the sonar status check to pass before merging. Now a failing gate actually blocks the PR. Store SONAR_TOKEN and SONAR_HOST_URL as repository secrets. If you’re self-hosting SonarQube on a private network, you’ll need a GitHub Actions runner that can reach it, or expose it through a tunnel. SonarCloud (their hosted version) sidesteps this entirely but costs money past the free open-source tier.
That One Rule That Flags Everything Incorrectly
The rules typescript:S6606 (prefer nullish coalescing) and javascript:S3358 (no nested ternaries) generate an embarrassing number of false positives on codebases that intentionally use patterns SonarQube misreads as bugs. The proper suppression is inline, not project-wide disabling:
// Intentional fallback to empty string; null/undefined are both valid states here
const label = getValue() || ''; // NOSONAR typescript:S6606
One caveat: as far as I know, // NOSONAR suppresses every rule on that line whether or not a rule key follows it, so it can hide real issues introduced later. The rule key in the comment is there for human readers; always include it so the next person knows which finding was waved through. If you’re suppressing the same rule more than five times across a codebase, that’s a signal to either disable the rule project-wide in sonar-project.properties or, more honestly, evaluate whether your code pattern is actually the problem.
When SonarQube Is Overkill: CodeClimate as a Practical Alternative
SonarQube Community needs a server running somewhere, a PostgreSQL database for production setups, and regular version maintenance. For a team under five people or an open-source project, that overhead is real. CodeClimate’s free tier covers open-source repos with no server to manage: you connect the GitHub repo and it runs on their infrastructure. The trade-off is visibility: you get maintainability grades and duplication detection, but the free tier doesn’t give you security hotspot detection or the granular custom rules SonarQube does.
In practice, CodeClimate catches the things that matter most for small teams: duplicated code blocks, overly complex functions (it uses cognitive complexity scoring), and files that keep accumulating changes without being refactored. The GitHub PR decoration works well: it posts inline comments on the diff rather than making you visit a separate dashboard. If you want CodeClimate on a private repo, pricing starts at $16/month per seat. My honest take: use CodeClimate when your team hasn’t done static analysis before and you want zero friction onboarding. Move to SonarQube when you need security rules, custom quality profiles, or you’re in a regulated industry where you need audit trails of your code quality gates.
Step 3: Write Tests That Actually Catch Bugs (Not Just Pad Coverage)
The most dangerous number in your test suite is 87% coverage. It feels good. It looks green on the CI dashboard. But I’ve watched 87%-covered codebases ship critical bugs every single sprint because the tests were written to hit a number, not to prevent failures. Coverage tells you which lines got executed during a test run, nothing more. A line covered by a test that doesn’t assert anything is worse than no test at all, because it actively misleads you.
What I measure instead: assertion density (how many meaningful checks per test), branch coverage specifically (not line coverage, since a line can execute while leaving half its conditional branches untested), and mutation survival rate (more on that below). These three together tell you whether your tests actually break when code breaks. Coverage percentage tells you almost none of that.
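The gap between line coverage and branch coverage is easiest to see on a one-line function. A minimal sketch, using a hypothetical shippingCost helper:

```typescript
// Hypothetical pricing helper: a single line with several reachable paths.
function shippingCost(subtotal: number, isMember: boolean): number {
  return isMember || subtotal >= 50 ? 0 : 5.99;
}

// This one call executes every line of shippingCost, so line coverage
// reports 100% even though only one path was exercised.
const lineCoverageOnly = shippingCost(60, false);

// Branch coverage would flag these paths as untested until they are added.
const paidShipping = shippingCost(10, false);
const memberShipping = shippingCost(10, true);
```

A suite containing only the first call looks fully covered by line count while never checking that anyone is actually charged for shipping.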
The test pyramid I actually follow
I write a lot of unit tests with Jest, a moderate number of integration tests with Supertest, and basically no E2E tests in most projects. That’s intentional. E2E tests with Playwright or Cypress are expensive to maintain: flaky network conditions, timing issues, and UI changes kill them faster than you can fix them. For API-heavy work, a Supertest integration test that spins up the actual Express app and hits real routes with a test database covers 80% of what an E2E test would, in a fraction of the setup time. Here’s what a Supertest integration test actually looks like:
// tests/integration/orders.test.ts
import request from 'supertest';
import { app } from '../../src/app';
import { db } from '../../src/db';
beforeEach(async () => {
  // wipe and reseed; don't share state between tests or you'll chase ghosts
  await db.migrate.rollback();
  await db.migrate.latest();
  await db.seed.run();
});
afterAll(() => db.destroy());
test('POST /orders returns 400 when inventory is exhausted', async () => {
  const res = await request(app)
    .post('/orders')
    .set('Authorization', 'Bearer test-token')
    .send({ productId: 'WIDGET-001', quantity: 9999 });
  expect(res.status).toBe(400);
  expect(res.body.code).toBe('INSUFFICIENT_INVENTORY');
  // asserting the shape of the error, not just the status code
});
Bug-driven development: write the test first, then fix the bug
Every time a bug hits production, before touching the fix, I write a failing test that reproduces it. Not after the fix, before. This sounds obvious but most teams skip it because there’s pressure to ship the fix fast. The problem is you then have no guarantee the bug stays fixed after the next refactor. The habit I’ve locked in: the bug report becomes a test name. Literally. test('order total rounds down to zero when currency is JPY and amount is less than 1 yen'). That test lives forever. It’s caught the same class of rounding bug three times in a codebase I maintain, from three different contributors who had no idea about the original incident.
Failing the build when coverage drops
I enforce a coverage floor with Jest’s coverageThreshold option in the config file, not as a --coverageThreshold CLI argument someone can forget to pass. Here’s the actual config:
// jest.config.ts
export default {
  collectCoverageFrom: ['src/**/*.ts', '!src/**/*.d.ts'],
  coverageThreshold: {
    global: {
      branches: 75, // branches matter more than lines
      functions: 80,
      lines: 80,
      statements: 80,
    },
    // a glob applies the threshold to each matching file individually,
    // which catches new files with zero coverage sneaking in
    './src/services/**/*.ts': {
      branches: 85,
      lines: 90,
    },
  },
};
The numbers above aren’t magic; they’re the floor of what my team had when I introduced this, rounded down 5% to avoid immediate breakage. The point is you ratchet them up over time, never down. I run npx jest --coverage on every PR. If the branch coverage on a new file is 0%, the build dies. That conversation is easier to have before merge than after.
Mutation testing with Stryker will genuinely disturb you
The first time I ran Stryker on a project with 85% line coverage, the mutation score came back at 41%. That means 59% of the mutations Stryker made (flipping > to >=, removing a return statement, changing && to ||) survived the test suite without any test failing. The code was changed to be wrong and the tests still passed. That’s the real number. Here’s the minimum config to run it:
// stryker.config.json
{
  "testRunner": "jest",
  "reporters": ["html", "clear-text", "progress"],
  "coverageAnalysis": "perTest",
  "mutate": ["src/**/*.ts", "!src/**/*.test.ts"],
  "thresholds": {
    "high": 80,
    "low": 60,
    "break": 50
  }
}
Run it with npx stryker run. Budget 10-20 minutes for a mid-sized codebase; it’s slow because it re-runs tests for every mutant it generates. With break set to 50, the run exits non-zero when the mutation score falls below 50%, which is what fails CI. The HTML report shows you exactly which mutations survived and where. I don’t chase a 100% mutation score; the diminishing returns past ~75% aren’t worth it. But running it once per major feature branch has caught more real logic bugs in my code than any code review I’ve ever gotten.
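What “a mutant survived” means can be shown by hand, without any tooling. The sketch below writes out one mutant manually (a > in place of >=) next to a hypothetical function; a mutation testing tool generates variants like this automatically.

```typescript
// Hypothetical rule: orders of 100 or more qualify for free shipping.
function qualifiesForFreeShipping(total: number): boolean {
  return total >= 100;
}

// A hand-written copy of the kind of mutant a mutation tool generates:
// the boundary operator >= is flipped to >.
function mutant(total: number): boolean {
  return total > 100;
}

// Weak test inputs: the original and the mutant agree on both,
// so a suite that only checks these lets the mutant survive.
const weakInputs = [150, 50];
const mutantSurvives = weakInputs.every(
  (total) => qualifiesForFreeShipping(total) === mutant(total),
);

// Boundary input: the original and the mutant disagree at exactly 100,
// so a test asserting on this value kills the mutant.
const killedAtBoundary = qualifiesForFreeShipping(100) !== mutant(100);
```

Both weak inputs would show up as covered lines, which is exactly why coverage and mutation score can diverge so sharply.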
Step 4: Code Review as a Bug Filter, Not a Style Fight
The number one sign that a team’s code review process is broken: the PR thread has 15 comments about semicolons and one comment about a null pointer that ships to production. I’ve been on both ends of that. Once I automated formatting entirely, review quality jumped immediately, not because people got smarter, but because the cognitive budget stopped getting wasted on trivia.
Kill Formatting Debates With Automation
Prettier + ESLint with --fix should run in CI and commit back to the branch. No arguing, no “I prefer 2 spaces” threads. Here’s the exact GitHub Actions workflow I use:
name: Auto Format
on: [pull_request]
jobs:
  format:
    runs-on: ubuntu-latest
    permissions:
      contents: write
    steps:
      - uses: actions/checkout@v4
        with:
          ref: ${{ github.head_ref }}
          token: ${{ secrets.GITHUB_TOKEN }}
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      - run: npm ci
      # --fix mutates files in place; ESLint handles logic rules, Prettier handles aesthetics
      - run: npx eslint . --fix && npx prettier --write .
      - uses: stefanzweifel/git-auto-commit-action@v5
        with:
          commit_message: "chore: auto-format [skip ci]"
The [skip ci] tag prevents an infinite loop of the bot triggering itself. One gotcha: if your ESLint config has rules that conflict with Prettier (common with older configs), eslint-config-prettier disables the overlapping rules. Add it as the last item in your extends array or the formatting commit will thrash on every push.
What I Actually Look For Now
With formatting off the table, I focus on four things during review. First: unhandled async paths, meaning await calls without try/catch where the caller has no fallback. Second: conditional branches with no test coverage. I’ll literally check the PR’s diff against the test file and ask “what executes this else branch?” Third: data that arrives from outside the system (API responses, form inputs, third-party webhooks) and whether the code trusts that shape blindly. Fourth: missing authorization checks, especially on new endpoints where the author tested happy-path with their own admin account and didn’t realize a regular user could also hit that route.
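For the third item, the thing to look for is whether a runtime check sits between the outside world and the typed code. A minimal sketch of such a check, assuming a hypothetical PaymentEvent webhook payload (the field names are invented for illustration):

```typescript
// Hypothetical webhook payload; the field names are assumptions.
interface PaymentEvent {
  orderId: string;
  amountCents: number;
}

// Type guard: verifies the shape at runtime instead of trusting a cast.
function isPaymentEvent(value: unknown): value is PaymentEvent {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return (
    typeof record.orderId === "string" &&
    typeof record.amountCents === "number" &&
    Number.isInteger(record.amountCents) &&
    record.amountCents >= 0
  );
}

function parsePaymentEvent(rawBody: string): PaymentEvent | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawBody);
  } catch {
    return null; // malformed JSON is an expected unhappy path, not a crash
  }
  return isPaymentEvent(parsed) ? parsed : null;
}
```

In review, a bare `JSON.parse(body) as PaymentEvent` with no guard like this is the pattern worth a comment.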
The PR Template That Forces Honest Thinking
I added one required section to our PR template that changed behavior more than any process rule: “What can go wrong here?” Authors have to answer it before requesting review. Blank is not accepted. This does something subtle: it shifts the author’s mental posture from “I wrote code, approve it” to “I need to think like a reviewer.” Half the time, filling out that section makes the author catch their own bug before I even look at it. Here’s the template:
<!-- .github/pull_request_template.md -->
## What does this change do?
## What can go wrong here?
<!-- Required. If nothing can go wrong, explain why. -->
## How was this tested?
<!-- List specific scenarios, not just "ran locally" -->
## Does this touch auth, payments, or data deletion?
<!-- If yes, tag @security-review -->
The Real Cost of “It’s Just a Small PR”
Two examples I watched personally. A two-line PR changed a default parameter from limit=100 to limit=undefined on a database query. The dev tested it locally against a small dataset and it was fine. In production with 800K rows, it took down the API. Nobody reviewed it because “it’s two lines.” The second: a one-line change to a JWT validation helper extracted the userId field without checking if the token had actually been verified first. The logic path that skipped verification was only reachable from a legacy endpoint nobody tested. Both PRs took under five minutes to write. Both caused incidents that took hours to fix. The math on “we don’t need to review this” never actually works out.
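One way to make the first incident structurally harder to repeat is to resolve the limit in a single place that never returns “unbounded.” A minimal sketch; resolveLimit and the MAX_LIMIT ceiling are hypothetical, and the default of 100 mirrors the value from the example above.

```typescript
const DEFAULT_LIMIT = 100;
const MAX_LIMIT = 1000; // assumed ceiling, chosen for illustration

// Always returns a bounded positive integer, whatever the caller passes.
function resolveLimit(requested?: number): number {
  if (requested === undefined || !Number.isFinite(requested) || requested < 1) {
    return DEFAULT_LIMIT;
  }
  return Math.min(Math.floor(requested), MAX_LIMIT);
}
```

With this in front of the query, changing a default to undefined falls back to 100 rows instead of the whole table.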
Step 5: When a Bug Escapes to Production, Kill It Permanently
The Post-Mortem Format That Actually Changes Behavior
Most post-mortems I’ve seen get filed, forgotten, and the same class of bug ships again six weeks later. The format matters. I stopped writing blame narratives and started writing process indictments. The document has one job: identify which step in your pipeline failed, not which human failed. I keep mine to three sections: Timeline (what happened and when, UTC timestamps), Process gaps (where the system let this through), and Concrete changes (specific PRs, lint rules, or checklist items that will close the gap). No “we should be more careful” action items. Those are worthless. If the action item doesn’t have a GitHub issue number attached within 24 hours, it doesn’t exist.
Three Questions That Cut to the Root
For every production bug, I ask three questions in order, and I don’t move to the next until I have a real answer, not a vague one.
- How did it enter? Was this a wrong assumption during implementation, a misread spec, a library upgrade that changed behavior silently? The entry point tells you whether you have a communication problem or a technical problem.
- How did it survive review? Code review catches a lot, but not everything. If a bug made it through, either the diff was too large, the reviewer didn’t have context, or nobody was looking at this code path. That’s a process failure, not a reviewer failure.
- How did tests miss it? This is the most uncomfortable question because the honest answer is usually “we never thought to test that path.” A missing test is a gap in your mental model, not just a gap in coverage percentage.
Going through this sequence for a recent bug I shipped: a null check on a user profile field was missing because the field was added three sprints ago and the test fixtures were never updated. The entry point was a stale fixture. It survived review because the diff was buried in a 400-line PR. Tests missed it because we mocked the database response and never tested with a real null. Three separate gaps, all fixable independently.
Write the Regression Test Before You Write the Fix
This is non-negotiable for me now. Before I touch the bug, I write a test that reproduces it and watch it fail. Then I write the fix and watch it pass. This sounds obvious but most developers I’ve paired with skip it β they fix it, then write a test around the fix, which often doesn’t actually cover the failure scenario. The order matters because writing the test first forces you to understand the exact input that caused the failure. You can’t write a reproducing test without understanding the root cause. Here’s what that looks like in practice:
// Write this FIRST, confirm it fails with the actual bug
it('returns empty array when user has no profile field', () => {
  const user = { id: 'u1', name: 'Alex' }; // no .profile
  expect(getUserTags(user)).toEqual([]);
  // this will throw "Cannot read properties of undefined" before the fix
});
// Only THEN go write getUserTags defensively
This habit also gives you a permanent record in the codebase. Six months from now, some new developer will see that test, see the comment, and understand why that defensive check exists. Comments rot; tests run on every commit.
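For completeness, here is one way the defensive getUserTags from that test could look once the failing test exists. The Profile and User shapes are assumptions, since the original only shows the test input.

```typescript
// Hypothetical shapes; only id and name appear in the original test.
interface Profile {
  tags?: string[];
}

interface User {
  id: string;
  name: string;
  profile?: Profile | null;
}

// Optional chaining covers a missing or null profile; ?? covers missing tags.
function getUserTags(user: User): string[] {
  return user.profile?.tags ?? [];
}
```

The fix is one line, which is typical: the expensive part was finding the input that broke it, and the test now records that input.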
Tracking Escaped Bugs Without Turning It Into a Witch Hunt
I track “escaped bugs” β bugs that reached production users β as a sprint-level metric in Linear. Each bug gets tagged escaped-bug and linked to the sprint it shipped in. At the end of each sprint, I look at the count across the team, not per person. The number goes in the same retrospective doc as velocity and cycle time. It’s a health metric, the same way you’d track error rate on a service dashboard.
The trap is letting this become a performance signal for individuals. The moment someone feels like their escaped bug count is being watched, they stop being honest in post-mortems and start being defensive. I’ve seen this kill psychological safety on two teams. The fix is to always present the metric in aggregate, never break it down by author in any shared view, and make the explicit team norm that the metric measures our process maturity, not individual skill. A team that ships ten features and has two escaped bugs is doing better process work than a team that ships three features and has zero, because they’re taking on more surface area of risk and still containing it reasonably well.
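The “aggregate only” rule can be enforced in whatever script produces the retro numbers. A minimal sketch, assuming a hypothetical export of issues tagged escaped-bug; the EscapedBug shape is invented and is not Linear’s API.

```typescript
// Hypothetical export of tagged issues; not Linear's actual data model.
interface EscapedBug {
  issueId: string;
  sprint: string;
  author: string;
}

// Counts escaped bugs per sprint. The author field is deliberately never
// read, so the output cannot be broken down by person.
function escapedBugsPerSprint(bugs: EscapedBug[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const bug of bugs) {
    counts.set(bug.sprint, (counts.get(bug.sprint) ?? 0) + 1);
  }
  return counts;
}
```

Keeping the per-author breakdown out of the code path entirely is a stronger guarantee than a team agreement not to look at it.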
The Tools I Have Running Right Now (Honest Stack)
The thing that surprised me most about stabilizing bug counts wasn’t the testing framework I chose; it was enforcing standards at commit time. You can have the world’s best ESLint config and it means nothing if developers push unchecked code at 11pm. So I’ll walk through exactly what I have running, what I dropped, and the specific settings that actually make a difference.
ESLint + Prettier in VS Code
I run ESLint 8 with TypeScript support and Prettier 3 as a formatter (not as an ESLint plugin; that approach causes slowdowns). The key is making VS Code fix on save, not just highlight. Without auto-fix, developers dismiss the squiggles and ship anyway.
// .vscode/settings.json
{
  "editor.formatOnSave": true,
  "editor.defaultFormatter": "esbenp.prettier-vscode",
  "editor.codeActionsOnSave": {
    "source.fixAll.eslint": "explicit"
  },
  "eslint.validate": ["javascript", "typescript", "typescriptreact"],
  // run eslint from project root, not node_modules lookup
  "eslint.workingDirectories": [{ "mode": "auto" }]
}
// .eslintrc.json: the rules I actually enforce
{
  "parser": "@typescript-eslint/parser",
  "parserOptions": { "project": "./tsconfig.json" },
  "extends": ["eslint:recommended", "plugin:@typescript-eslint/strict"],
  "rules": {
    "no-console": "warn",
    "@typescript-eslint/no-explicit-any": "error",
    // null safety comes from "strict": true in tsconfig.json; strictNullChecks is not an ESLint rule
    // this one catches 30% of runtime bugs I used to see in PRs (needs parserOptions.project)
    "@typescript-eslint/no-floating-promises": "error"
  }
}
Husky v9 + lint-staged
Husky v9 changed its config format: it no longer uses a .huskyrc file. Hooks now live in .husky/pre-commit as plain shell scripts. If you’re upgrading from v8, the migration will silently break your hooks if you don’t check this. I lost two days to that.
# .husky/pre-commit
#!/usr/bin/env sh
npx lint-staged
// package.json β lint-staged config
{
  "lint-staged": {
    "*.{ts,tsx}": [
      "eslint --fix --max-warnings=0",
      "prettier --write"
    ],
    "*.{json,md,yaml}": [
      "prettier --write"
    ]
  }
}
The --max-warnings=0 flag is the one most teams skip. It means warnings block commits, not just errors. Without it, warning debt accumulates until someone inherits a codebase with 400 suppressed warnings and no idea which ones matter.
GitHub Actions CI Workflow
I run four jobs sequentially: lint, type-check, test, Sonar. Sequentially on purpose: Sonar costs scan minutes and there’s no point running it if types are broken. Here’s the actual workflow file:
# .github/workflows/ci.yml
name: CI
on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      - run: npm ci
      - run: npm run lint -- --max-warnings=0
  type-check:
    needs: lint
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      - run: npm ci
      - run: npx tsc --noEmit
  test:
    needs: type-check
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      - run: npm ci
      - run: npm test -- --coverage --coverageThreshold='{"global":{"lines":80}}'
      - uses: actions/upload-artifact@v4
        with:
          name: coverage
          path: coverage/
  sonar:
    needs: test
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0 # Sonar needs full git history for blame data
      - uses: actions/download-artifact@v4
        with:
          name: coverage
          path: coverage/ # restore files under coverage/ so lcov.info is where Sonar expects it
      - uses: SonarSource/sonarcloud-github-action@master
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }}
The fetch-depth: 0 on the Sonar step is non-obvious. Shallow clones (the GitHub Actions default) cause SonarCloud to misattribute blame and show incorrect new-code detection. I had a PR pass Sonar clean for three weeks before I realized it was scanning against a shallow baseline. Full history fetch adds ~8 seconds, completely worth it.
Jest + Stryker (Only Where It Counts)
I don’t run Stryker across the whole codebase. Mutation testing on a large codebase takes 20+ minutes and produces noise in UI components where the mutations don’t reflect real failure modes. I scope it to critical business logic modules (payment processing, auth, data transformations) using the mutate config key:
// stryker.config.mjs
export default {
  packageManager: 'npm',
  testRunner: 'jest',
  // only mutate the modules where a bug actually costs money
  mutate: [
    'src/billing/**/*.ts',
    'src/auth/**/*.ts',
    '!src/**/*.test.ts'
  ],
  thresholds: {
    high: 80,
    low: 70,
    // CI fails below this mutation score
    break: 65
  },
  reporters: ['html', 'clear-text', 'progress']
};
Stryker will tell you things Jest coverage won’t. A line can be “covered” by a test that doesn’t actually assert the right behavior. Stryker mutates your conditionals (flipping > to >=, removing return statements) and checks if your tests catch it. My billing module had 92% Jest coverage and a 58% mutation score when I first ran this. That 34-point gap represented real untested behavior.
Linear for Bug Tracking
I switched from Jira to Linear about 18 months ago and the friction difference is real. In Jira, creating a bug took ~7 clicks and a form with 12 mandatory fields half the team had configured wrong. Developers stopped filing bugs and started DMing instead. That’s how bugs disappear from your backlog without being fixed.
My Linear setup uses three specific things to keep the backlog honest. First, a Bug label with a dedicated triage cycle (every Monday, 30 minutes). Second, a “No Status Bugs” saved view that surfaces anything filed without an assignee or priority β these get triaged or closed within 48 hours, no exceptions. Third, Linear’s GitHub integration links PRs to issues automatically when you include the issue ID in the branch name (fix/BUG-123-null-pointer-checkout), so you can see which bugs actually got fixed vs. which ones got orphaned.
#!/usr/bin/env sh
# .husky/commit-msg: enforces the branch naming convention
# Rejects the commit unless the branch starts with <type>/<TEAM>-<number>,
# where type is one of feat, fix, chore, docs (e.g. fix/ENG-123-description)
BRANCH=$(git rev-parse --abbrev-ref HEAD)
if ! echo "$BRANCH" | grep -qE '^(feat|fix|chore|docs)/[A-Z]+-[0-9]+'; then
  echo "Branch must reference a Linear issue: fix/ENG-123-description" >&2
  exit 1
fi
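The 48-hour rule behind the “No Status Bugs” view is simple enough to sketch as a filter. This is only an illustration of the rule: the Issue shape below is a simplified, hypothetical structure and not Linear’s actual API schema, and the sample issues are made up.

```typescript
// Hypothetical, simplified issue shape; not Linear's real data model.
interface Issue {
  id: string;
  labels: string[];
  assignee: string | null;
  priority: number | null; // null means no priority was set
  createdAt: Date;
}

const TRIAGE_WINDOW_MS = 48 * 60 * 60 * 1000;

// Bugs that are missing an assignee or a priority.
function untriagedBugs(issues: Issue[]): Issue[] {
  return issues.filter(
    (issue) =>
      issue.labels.some((label) => label === "Bug") &&
      (issue.assignee === null || issue.priority === null)
  );
}

// Untriaged bugs that have been sitting longer than the 48-hour window.
function overdueForTriage(issues: Issue[], now: Date): Issue[] {
  return untriagedBugs(issues).filter(
    (issue) => now.getTime() - issue.createdAt.getTime() > TRIAGE_WINDOW_MS
  );
}

// Made-up sample data.
const now = new Date("2024-01-10T12:00:00Z");
const sampleIssues: Issue[] = [
  { id: "ENG-1", labels: ["Bug"], assignee: null, priority: 2, createdAt: new Date("2024-01-07T12:00:00Z") },
  { id: "ENG-2", labels: ["Bug"], assignee: "sam", priority: 1, createdAt: new Date("2024-01-01T12:00:00Z") },
  { id: "ENG-3", labels: ["Bug"], assignee: "sam", priority: null, createdAt: new Date("2024-01-10T00:00:00Z") },
  { id: "ENG-4", labels: ["Feature"], assignee: null, priority: null, createdAt: new Date("2024-01-01T12:00:00Z") },
];

console.log(untriagedBugs(sampleIssues).map((issue) => issue.id)); // ENG-1 and ENG-3
console.log(overdueForTriage(sampleIssues, now).map((issue) => issue.id)); // ENG-1 only
```

Passing `now` in as an argument keeps the rule deterministic, so the policy itself can be unit tested.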
Why I Dropped Jira
The real breaking point wasn’t the UI or the price ($7.75/user/month on the Standard plan, which adds up). It was the admin overhead. Every time I wanted a new workflow state or custom field, I needed someone with admin access, a 20-minute config session, and a Confluence page explaining what the field meant. Linear lets me add a label or change a workflow state in about 4 seconds, inline, without leaving the ticket I’m looking at. For a team trying to move fast on quality, that friction difference compounds across hundreds of bug reports per quarter. Jira is fine for very large orgs with dedicated project managers maintaining it. For a 4 to 12 person engineering team trying to keep bugs visible and moving, it creates more process debt than it prevents.
What This Actually Looks Like Day to Day
The most disorienting part of shifting toward zero-bug culture isn’t the tooling. It’s how differently your mornings start. Before I write a single line of code, I open the CI dashboard. Not Slack. Not email. If something red is sitting there from an overnight deployment or a scheduled job, that takes priority over whatever I planned. The rule I set for myself: don’t open an editor until the dashboard is green or I’ve triaged every failure. Takes maybe 5 minutes on a quiet day, sometimes 20 minutes when something actually broke. Either way, it reorients your brain from “feature mode” to “system health mode” before you’ve even made your second coffee.
The PR checklist I run before hitting “ready for review” isn’t written down anywhere. It’s internalized after shipping enough bugs that hurt. I ask myself four questions:
- Did I test the unhappy path explicitly? Not just “it works when inputs are valid” but what happens on null, on empty array, on 401, on network timeout.
- Is there a test that would catch this regression if someone touched this file in 3 months? If the answer is no, I write it before marking ready.
- Did I read my own diff like a stranger would? I close the PR, wait 10 minutes, reopen it. The stuff that looked obvious at 11pm looks weird at 9am.
- Is this change observable? Logging, metrics, something. If it fails silently in production I won’t know for days.
That last one catches the most bugs. Silent failures compound fast.
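Here is a rough sketch of what “observable” can mean in code, contrasted with the silent version. The in-memory counter and log array are stand-ins for whatever metrics client and logger you actually use, and parseQuantity is a made-up example function.

```typescript
// Hypothetical in-memory stand-ins for a metrics client and a logger.
const failureCounts: Record<string, number> = {};
const logLines: string[] = [];

function recordFailure(operation: string, error: unknown): void {
  failureCounts[operation] = (failureCounts[operation] ?? 0) + 1;
  const message = error instanceof Error ? error.message : String(error);
  logLines.push(`[error] ${operation}: ${message}`);
}

// Made-up example operation: throws on input that is not a non-negative number.
function parseQuantity(raw: string): number {
  const quantity = Number(raw);
  if (isNaN(quantity) || quantity < 0) {
    throw new Error(`invalid quantity: ${raw}`);
  }
  return quantity;
}

// Anti-pattern: the failure disappears and the caller sees a plausible default.
function parseQuantitySilently(raw: string): number {
  try {
    return parseQuantity(raw);
  } catch {
    return 0;
  }
}

// Runs an operation; on failure, records it and rethrows instead of swallowing.
function observed<T>(operation: string, run: () => T): T {
  try {
    return run();
  } catch (error) {
    recordFailure(operation, error);
    throw error;
  }
}

console.log(parseQuantitySilently("abc")); // 0, and nothing anywhere says it failed

try {
  observed("checkout.parseQuantity", () => parseQuantity("abc"));
} catch {
  // The caller still decides how to recover; the failure is already recorded.
}
console.log(failureCounts["checkout.parseQuantity"]); // 1
console.log(logLines[0]); // [error] checkout.parseQuantity: invalid quantity: abc
```

The wrapper doesn’t change how the error is handled. It only guarantees the failure leaves a trace before anyone gets a chance to swallow it.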
Sprint planning is where most teams silently sabotage themselves. Features get points, bugs get “we’ll handle it,” and six sprints later the backlog looks like a graveyard. The rule I use: bugs that affect users in production get scheduled in the current sprint, not the next one. Not as stretch goals, but as actual committed work. Bugs that are caught pre-production go into a dedicated bug slot I reserve (roughly 20% of sprint capacity) before any feature work gets estimated. If that slot fills up, features slip, not the bug fixes. This feels harsh until you realize that, in my experience, fixing a bug costs something like 10x more after it’s been sitting in the backlog for two months, because context evaporates and the code it touched has since been refactored twice.
# A dead-simple way to track this in a team retro
# Each sprint, answer these two questions:
bugs_shipped_to_production: 2 # caught by users, not tests
bugs_caught_pre_production: 8 # caught by CI, review, staging
# Target: ratio should shift over time.
# If bugs_shipped_to_production stays flat, your process isn't working.
# If both numbers drop, your test coverage is actually improving.
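If you want a single number to watch from those two counts, one option is the share of bugs that escaped to production. This is a sketch of that arithmetic, not something the retro format requires. The field names mirror the two questions above, the first sprint uses the counts from the example, and the second sprint is made up.

```typescript
interface SprintBugCounts {
  sprint: string;
  bugsShippedToProduction: number; // caught by users, not tests
  bugsCaughtPreProduction: number; // caught by CI, review, staging
}

// Share of all bugs found in a sprint that reached production (0 to 1).
function escapeRate(counts: SprintBugCounts): number {
  const total = counts.bugsShippedToProduction + counts.bugsCaughtPreProduction;
  return total === 0 ? 0 : counts.bugsShippedToProduction / total;
}

const sprintHistory: SprintBugCounts[] = [
  { sprint: "sprint-1", bugsShippedToProduction: 2, bugsCaughtPreProduction: 8 },
  { sprint: "sprint-2", bugsShippedToProduction: 1, bugsCaughtPreProduction: 9 }, // made-up follow-up sprint
];

sprintHistory.forEach((counts) => {
  const percent = (escapeRate(counts) * 100).toFixed(0);
  console.log(`${counts.sprint}: ${percent}% of bugs escaped to production`);
});
// sprint-1: 20% of bugs escaped to production
// sprint-2: 10% of bugs escaped to production
```

A falling percentage means more bugs are being stopped before users see them, even in a sprint where the raw totals happen to go up.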
The honest answer, and I’d rather say this plainly than bury it, is that zero bugs is not a destination. You will ship a bug next week. So will I. The thing that changes with this process isn’t the bug count hitting zero, it’s the shape of the codebase six months from now. Bugs stop clustering. The same file stops breaking every sprint. On-call rotations get boring, which is exactly what you want. New developers can make changes without fear because there’s a test suite that actually catches things. The difference isn’t perfection. It’s that the feedback loops tighten so much that bugs surface in hours instead of weeks, and get fixed before they compound into something architectural that takes a full quarter to unwind.