Does using GitHub Copilot actually make you a worse programmer over time?

Honest answer: it depends entirely on how you use it. If you're accepting suggestions without understanding them, specific skills like reading unfamiliar code, debugging from first principles, and API surface knowledge do fade — I've experienced this directly. If you treat it like a calculator (fast for arithmetic, but you still know arithmetic) and deliberately practice without it, it doesn't have to. The risk is real but it's not inevitable.

How do I actually turn off Copilot in VS Code when I want to practice without it?

Fastest way: click the Copilot icon in the VS Code status bar (bottom right) and toggle 'Disable Globally' or 'Disable for this Workspace'. For a permanent per-project setup, add `{ "github.copilot.enable": { "*": false } }` to your `.vscode/settings.json`. You can also run `code --disable-extension GitHub.copilot` from the terminal to launch VS Code entirely without it for a session.

Is there any evidence that AI tools hurt junior developers more than seniors?

I won't cite made-up statistics here, but the mechanism is straightforward: senior developers using AI are accelerating work on top of an existing mental model. Juniors using AI may be shipping code without building that model in the first place. The difference between understanding a pattern and being able to recognize and produce it without assistance is real and it shows up in debugging and architecture work later. Whether that matters depends on what kind of work you want to be able to do independently.

What's the aviation concept of 'automation surprise' and does it apply to coding?

Automation surprise is when a pilot doesn't understand what the autopilot is doing and is suddenly surprised by an unexpected state — because they've been passively monitoring rather than actively flying. The coding equivalent is when you can't reason about what a block of AI-generated code does until you run it, or when you hit an edge case the AI didn't anticipate and you have no mental model to fall back on. The fix in aviation was structured manual flying requirements. The fix for developers is deliberate periodic practice without the tools, not eliminating them.

AI Coding Tools Are Making Us Dependent — Aviation Figured This Out 40 Years Ago

The Moment I Realized I’d Forgotten How to Write a Regex

Six months into daily Copilot use, I hit a 2-hour train journey (no signal, not a flight, but the metaphor holds) and needed to write a regex to parse some log timestamps. Basic stuff — capturing named groups, optional milliseconds, timezone offset. I sat there for a solid four minutes staring at a blank editor. Then I wrote something wrong. Then I wrote something else wrong. A junior dev I’d been mentoring the week before would have shipped that pattern faster than me. That stung.

The thing that caught me off guard wasn’t the rust — it was how fast the rust set in. Six months of typing // parse ISO 8601 with optional ms and tz offset and watching Copilot nail it on the first suggestion had completely eroded the muscle memory. My brain had outsourced the pattern-building to something that wasn’t there anymore. I couldn’t even remember whether lookaheads use ?= or ?: without mentally running through examples. I knew I’d known this. The knowledge wasn’t gone, it was just… deeply buried.

Aviation worked through exactly this failure mode and it almost killed people first. The transition to glass cockpits and fly-by-wire in the 80s and 90s created pilots who were phenomenal at managing automated systems and genuinely dangerous when those systems failed. Air France 447 is the case that gets cited most — a crew that couldn’t interpret raw instrument data when the automation dropped out. The industry’s response wasn’t to remove automation. It was structured, mandatory manual flying hours, recurrency checks on raw skills, and what they call “hand flying” requirements baked into standard operating procedures. The automation stayed. The skill maintenance became non-negotiable.

My regex problem was the same failure mode, smaller stakes. And the fix isn’t dramatic. I now keep a scratch.md file where I deliberately write the first pass of anything pattern-related, SQL window functions, bit manipulation, whatever I’ve been leaning on Copilot for — without autocomplete. I use VS Code’s "editor.inlineSuggest.enabled": false setting in a dedicated workspace profile I call manual-mode.code-workspace. I do this for maybe 20% of my work. That’s it. The point isn’t suffering, it’s recurrency.

// manual-mode.code-workspace settings snippet
{
  "settings": {
    "editor.inlineSuggest.enabled": false,
    "github.copilot.editor.enableAutoCompletions": false,
    "editor.quickSuggestions": {
      "other": false,
      "comments": false,
      "strings": false
    }
  }
}

This isn’t a “ban AI tools” take. Copilot genuinely makes me faster on the 80% of code that’s boilerplate, glue logic, and things I’ve written a hundred times. The argument is about the 20% that matters — the novel problems, the debugging sessions where the AI is confidently wrong, the offline moments, the code reviews where you need to read regex someone else wrote and know immediately if it’s broken. For a broader look at tools shaping how small dev teams actually work, the Essential SaaS Tools for Small Business in 2026 guide covers the ecosystem these assistants live inside. But the skill-atrophy question is separate from the tool-selection question, and confusing the two is how you end up with a team that can only code with the autocomplete on.

The Aviation Parallel That Actually Holds Up

The Gimli Glider incident in 1983 isn’t a metaphor I’m retrofitting onto AI tools — it’s a fully documented accident where Air Canada Flight 143 ran out of fuel at 41,000 feet because the crew trusted automated fuel calculation software that was feeding it incorrect data. The root cause wasn’t a software bug in isolation. The crew had stopped doing the manual cross-check. They’d flown with the automation long enough that the manual verification felt redundant. The plane glided 65 miles to an emergency landing at a decommissioned airstrip in Gimli, Manitoba. Nobody died, but the NTSB and Transport Canada reports are very specific: automation dependency degraded the crew’s independent verification habit.

The FAA and ICAO didn’t respond by pulling autopilot from cockpits. They built a framework called Automation Surprise Training — structured scenarios where pilots encounter unexpected automation behavior and have to recognize it, override it, and fly manually under pressure. ICAO mandated minimum raw-data flying hours (no flight director, no autopilot) as part of recurrent proficiency checks. The ATP certification process in the US requires demonstrating specific manual skill thresholds, not just logging total hours. The entire regulatory response was: keep the automation, but build deliberate friction back into skill maintenance.

📚 Related Reading
Top AI Debugging Tools That Explain Errors in Plain English

The concept they formalized is called skill fade — measurable degradation of manual competency that occurs not from lack of training but from lack of practice under the automation’s shadow. Aviation researchers found it’s not linear either. You don’t slowly get worse. You maintain a false plateau where you feel competent, then hit a sudden cliff when the automation fails and you haven’t actually hand-flown in eight months. That cliff behavior is what makes it dangerous. I’ve seen the same thing in dev teams that adopted GitHub Copilot hard in early 2023 — juniors who could scaffold a React component in seconds but genuinely couldn’t explain why their useEffect dependency array was wrong when the autocomplete led them into a stale closure.

What aviation built to counter this maps almost surgically onto software development contexts:

Mandatory manual hours — Pilots log a specific number of approaches and landings without automation per review period. The equivalent for developers would be: write one non-trivial function per week without touching Copilot or Claude. Not as a punishment — as a calibration check.
Proficiency events — Simulator sessions designed specifically to expose automation gaps, not just reinforce confident behavior. Code reviews where someone has to explain every line they didn’t write from scratch qualify here.
Mode awareness training — Pilots learn exactly what the automation is doing and why, not just how to operate it. If you don’t understand what the AI generated, you haven’t cleared this bar.

The thing aviation explicitly did not do is worth repeating: they didn’t ban autopilot, they didn’t add warnings to autopilot interfaces, and they didn’t restrict autopilot to senior pilots. They separated the question of “is this tool useful” from “does using this tool atrophy a skill we need when it fails.” Those are two different questions and conflating them is where most of the AI tools debate goes wrong. A 787 captain who uses autothrottle 98% of the time but passes a manual raw-data ILS approach check every six months is exactly the model worth copying. The goal isn’t less automation — it’s maintained competency alongside it.

How AI Coding Tools Actually Erode Specific Skills (Not Just ‘Thinking’)

The erosion isn’t “you forget how to code.” That’s the strawman people argue against. The actual damage is subtler: you lose speed on first principles. Specific, measurable speed. I noticed it when our CI went down for two hours and I had to work in a plain vim session on a remote box with no internet. Problems that used to take me 10 minutes took 35. That gap is the skill debt.

Pattern Recognition Gets Replaced by Acceptance

Copilot and similar tools are trained to produce working code, not semantically clean code. When you have a list of users and need the active ones’ emails, Copilot usually suggests a for loop with a push. It works. You accept it. But the developer instinct that says “this is a filter + map — two words, zero mutations” atrophies from disuse. I caught myself last month writing a loop that accumulated into an array before realizing I could write users.filter(u => u.active).map(u => u.email) — and the uncomfortable part was that I only noticed because a coworker flagged it in review, not because my eye caught it first. That used to be automatic.

Stack Trace Fluency Is a Real Skill and I’ve Lost Some of It

I used to read a Python traceback like reading a sentence. The frame that mattered would jump out visually. Now my muscle memory is: copy error, open Claude, paste. I’ve done it so many times that the manual path feels effortful. I ran an informal test on myself: timed how long it took me to find the root cause in a Django stack trace manually vs. two years ago when I had no AI tools. My honest estimate is I’m 40% slower. The information is all still there — I haven’t forgotten what a traceback is — but the pattern-matching shortcut that made me fast has degraded from underuse. It’s like how you can still read a map but GPS has made you slower at it.

API Surface Knowledge Is Quietly Disappearing

Ask me what methods exist on a JavaScript Set. I’ll hesitate. Ask me what Array.prototype has beyond the five I use constantly. I’ll describe what I want and let Copilot complete it. This sounds fine until you’re in a situation where the tool is wrong — and you have no independent frame of reference to know it’s wrong. I accepted a suggestion last year that called .flat() in a context targeting Node 10, which doesn’t have it. I didn’t catch it because I hadn’t internalized the method’s compatibility story — I’d only ever seen it via autocomplete. The suggestion was confidently wrong and I had no tripwire to catch it.

# What I used to know cold:
str.strip()        # removes leading/trailing whitespace
str.lstrip()       # left only
str.rstrip()       # right only
str.removeprefix() # Python 3.9+ — I now routinely forget this exists
                   # and describe "remove a specific prefix" to the AI
                   # instead of recalling the method name directly

The Junior Developer Problem Is Structural, Not Anecdotal

I’ve mentored three junior devs in the last 18 months who started their careers with Copilot and ChatGPT already normalized. The thing that strikes me isn’t that they can’t write code — they can ship features fine. The problem surfaces during debugging and architecture discussions. They don’t have the internal library of “I’ve seen this go wrong before.” Mental models for things like why mutating shared state in a loop causes bugs, or why a recursive solution will stack-overflow on large input, get built through pain — through writing the bug, seeing it fail, fixing it manually. If the AI absorbs that feedback loop before the lesson lands, the model never forms. Aviation calls this “automation dependency.” The FAA has documented cases where pilots who trained heavily on glass cockpit simulators struggled with partial-panel failures because they never built the underlying instrument scan habit. We’re building a generation of developers with the same gap: competent under normal conditions, fragile when the scaffold disappears.

The fix isn’t to stop using these tools. It’s to deliberately create scaffold-free sessions. I block off time — usually debugging sessions on low-stakes tickets — where I close the AI tab and force the manual read. It feels inefficient. That discomfort is the training.

The Tools I Actually Use Daily and What Each One Takes From You

The sneakiest damage Copilot did to me wasn’t the bad suggestions — it was the good ones. When the autocomplete is wrong, you catch it and think through the correct answer. When it’s right, you just Tab and move on. After a few months of that pattern, I noticed I’d stopped forming the next line in my head before looking at what Copilot suggested. The muscle of anticipating your own next move atrophies silently. You don’t get an error message. You just gradually become a reviewer of generated code rather than a writer of it.

Cursor is a different beast. The codebase-aware chat is genuinely useful — I’ve thrown a 12-file refactor at it and gotten back a plan that was 80% correct and saved me 45 minutes of diagramming. The trap is “explain this function”. I used that feature probably 30 times in one week when I was onboarding to a new repo. Fast, accurate, helpful. Also completely bypassed the reading-code-slowly habit that’s how you actually build intuition for a codebase. Two months later I realized I had shallow familiarity with a lot of files and deep familiarity with almost none. The feature works; the habit it creates doesn’t.

My Claude setup is where I’ve tried to be most deliberate about the risk profile. I run it via API with a system prompt that forces it to give me options with trade-offs rather than a single answer:

You are a senior systems architect. When asked about design decisions,
always present 2-3 distinct approaches. For each, specify:
- what it optimizes for
- what it trades away
- under what scale or constraints it breaks down
Never recommend a single answer without this framing.

This slows it down in a useful way. The risk with Claude on architecture isn’t that it gives you wrong syntax — it’s that it gives you confident-sounding judgment on decisions that should cost you real thinking. I’ve caught myself accepting “use an event-driven approach here” without asking why that matches my constraints. The system prompt is a forcing function to keep the judgment on my side of the conversation.

ChatGPT for rubber duck debugging has the lowest skill-drain risk of anything in my stack, and I think it’s entirely because of the tab-switching friction. You have to stop, formulate the problem in words, open a browser, paste context. That process of articulating the problem clearly enough to ask about it is most of the debugging work anyway. By the time I’ve written a good ChatGPT prompt, I’ve solved it myself maybe 40% of the time. Compare that to Copilot where there’s zero friction between “I’m confused” and “accepted a suggestion” — that gap is where your thinking either happens or doesn’t.

Tabnine self-hosted is worth bringing up specifically for teams in air-gapped or regulated environments (finance, defense, HIPAA-adjacent). The offline constraint means the model is smaller, the suggestions are less magical, and you know it’s less magical — so you don’t defer to it the way you would Copilot. I’ve talked to engineers running Tabnine on-prem who described their relationship with it as more like a fancy snippet expander than an AI pair programmer. That’s actually healthy. The model ceiling forces you to stay in the driver’s seat by default rather than by discipline. If your team can’t do the discipline voluntarily, the infrastructure constraint does the work for you.

What ‘Instrument Currency’ Looks Like for Developers

The aviation rule that keeps haunting me: instrument-rated pilots can’t just have the rating and fly in clouds whenever they want. Under FAR 61.57(c), you must log actual instrument time or use a simulator every 6 months or you lose currency. Miss the window, and you’re grounded in IMC even though you passed the check ride. The FAA doesn’t care that you flew a lot last year. The skill window closes fast.

I’ve been running a direct analog for about eight months now: one no-AI day per sprint. Not a full week of asceticism — just one day where I pick a real task from the board and work it with Copilot disabled and no Claude tab open. The thing that surprised me the first time I tried this wasn’t that it was slow. It was that I’d forgotten how to hold a problem in my head. I kept reaching for the tab that wasn’t there. That reflex told me something uncomfortable about where my head had gone.

Disabling Copilot is dead simple, and I’d recommend doing it the fast way so you’re not tempted to skip the whole exercise because the friction is too high. The status bar icon is the right move:

# CLI approach — works but slow if you toggle often
code --disable-extension GitHub.copilot

# Better: click the Copilot icon in the VS Code status bar
# It toggles globally in one click — no restart needed
# You can also scope it per-language if you only want to drill specific skills

I keep a keyboard shortcut bound to workbench.action.toggleSidebarVisibility and manually switched my status bar toggle to muscle memory. The point is to make the “instrument conditions” day feel like flipping a switch, not an ordeal. If disabling it takes four menus, you’ll rationalize skipping it.

Advent of Code and LeetCode without AI aren’t about interview prep — I genuinely don’t care about that use case anymore. What they force is sustained single-threaded problem decomposition. Day 7 of AoC 2023 (the camel poker hand problem) took me 90 minutes without assistance. That’s not a win or a loss, it’s a measurement. I can feel when I’m losing the ability to keep a recursive state machine in working memory without externalizing it to a chat window. Doing three or four of these problems a month without AI keeps that ceiling from dropping. Pick problems that are just outside comfortable — easy ones don’t build currency, they just confirm you can still type.

Code review is the manual skill I see people outsourcing the fastest, and it’s the most damaging one to lose. Asking an AI to summarize a PR before you read it is the equivalent of a pilot having a co-pilot describe the instruments instead of reading them directly. Your pattern recognition for code smells, for architectural drift, for “this author doesn’t understand how the ORM generates this query” — that comes from hours of reading real diffs with your own eyes. I now have a personal rule: I read the diff first, form an opinion, write at least one comment, and only then I might use AI to check if I missed something. Never the other way around. The sequence matters as much as whether you use it at all.

A Practical Proficiency Framework Stolen Directly from Pilot Training

Pilots have three distinct levels of regulatory requirement: currency (can you legally fly today?), proficiency (are you actually good right now?), and the full competency check (can you handle emergencies and edge cases under pressure?). I borrowed this framework almost verbatim after reading about how commercial aviation handles automation dependency — because the problem maps perfectly onto what happens when developers lean on AI assistants for months without any structured practice.

Tier 1 — Currency: Weekly from-scratch function

This is your minimum viable practice. Once a week, pick something non-trivial — a binary search variant, a rate limiter, a small parser — and write it completely without suggestions. I mean actually disable autocomplete, not just ignore the suggestions. The specific VS Code config that kills Copilot inline suggestions entirely:

// settings.json
{
  "github.copilot.enable": {
    "*": false,
    "typescript": false,
    "python": false
  }
}

You can toggle this per workspace or globally. The "*": false key covers every language not explicitly listed, so you don’t have to enumerate them all. Put this in your workspace .vscode/settings.json if you want it scoped to one repo, or in your user settings if you want the weekly practice session to apply everywhere. The discomfort you feel staring at a blank function body for 30 seconds? That’s the diagnostic. Currency isn’t about writing perfect code — it’s about confirming you still can.

Tier 2 — Proficiency: Monthly raw debugging

Once a month, when a real production issue hits, give yourself a hard rule: no AI chat until you’ve exhausted your own investigation. That means reading actual stack traces, checking metrics, forming a hypothesis, and testing it. This matters because AI assistants are genuinely great at debugging — which is exactly why skipping this tier destroys your proficiency faster than anything else. The thing that caught me off guard was how quickly I’d forgotten the mental model of reading a flame graph without narration. I’d been pasting profiler output into Claude and asking “what’s slow here?” for so long that my own pattern recognition had atrophied. The monthly solo debug session is how you keep that muscle.

Tier 3 — Competency Check: Quarterly full-day with AI disabled

This is the equivalent of a simulator check ride. Block a full day on your calendar, pick a greenfield feature — something real, not a toy project — and disable everything: Copilot, ChatGPT, Claude, even docs search if you’re being strict. The goal isn’t to measure output velocity. The goal is to measure your discomfort level as a diagnostic signal. High discomfort at hour two when you can’t remember the exact API signature for Promise.allSettled? Fine, that’s currency stuff. High discomfort because you genuinely don’t know how to architect the data flow without asking an AI to sketch it? That’s a competency gap worth knowing about.

Logging this without it becoming a chore

I keep a single file at ~/dotfiles/practice-log.md and the format is one line per session, no elaboration required:

# practice-log.md
2025-06-02 [T1] wrote LRU cache from scratch — slow on the doubly-linked list bookkeeping
2025-06-10 [T1] small recursive descent parser — felt fine, ~25 min
2025-07-01 [T2] traced memory leak in Node.js worker — found it in 40 min solo, heap snapshot readable
2025-07-15 [T1] implemented debounce + throttle without looking — confident
2025-09-03 [T3] full day greenfield, auth + RBAC — uncomfortable around hour 3 on DB schema design decisions

Three fields: date, tier, one honest sentence. The whole point is that you can scan three months of this file in 60 seconds and see whether your discomfort is trending up or down. If Tier 1 entries are consistently saying “slow” or “had to look up basics,” that’s your signal to pull back on AI assistance for a couple of weeks. If the Tier 3 entries go from “uncomfortable at hour 2” to “uncomfortable at hour 6,” you’re building real robustness. Version control it with your dotfiles and you get a timestamped history automatically — no app, no subscription, no friction.

When AI Tools Actually Make You Better, Not Worse

The aviation analogy that keeps coming up in discussions about AI coding tools — autopilot making pilots worse — breaks down in one important place: autopilot handles tasks pilots have already mastered. The more accurate framing is whether you’re using AI to skip learning something you’ll need, or to offload something you’ve already internalized and genuinely don’t need to re-derive every time. That distinction changes everything about how you should use these tools.

Dockerfile syntax is the clearest example I can give of legitimate offloading. I’ve written maybe 200 Dockerfiles over the years. I know the mental model: layer caching order matters, multi-stage builds keep images small, non-root users reduce attack surface. What I don’t know off the top of my head every time is whether COPY --chown takes uid:gid or a username, or what the exact syntax for HEALTHCHECK interval flags looks like. Letting Copilot fill that in isn’t deskilling — it’s the same as using a reference card. Same goes for GitHub Actions YAML and Terraform provider blocks. The required_providers block format changes between Terraform versions, the AWS provider arguments shift, and none of that is worth memorizing when the understanding of why you need it is already there.

# Copilot-suggested block I'd normally have to look up:
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
  required_version = ">= 1.6.0"
}

The cross-language situation is where I’ve gotten the most genuine value. My day job is TypeScript. I occasionally need to write Go or Rust — maybe once a month for a specific service, or when contributing to an open-source project. Without AI assistance, I’d write Go that looks like TypeScript: verbose error ignoring, wrong idioms for goroutines, slice operations that technically work but no Go developer would write. AI assistance in these moments doesn’t prevent me from learning Go — it actively steers me toward correct Go patterns while I’m writing it. I’ll accept a suggestion, notice it uses errors.As instead of a type assertion, and actually look that up. That’s better than shipping bad-habit code and moving on.

Documentation generation has a precise best-practice that most people get backwards. The wrong move is asking AI to generate JSDoc before you’ve written and tested the function — you end up with comments that describe what the AI imagined the function would do. The right move is writing the function, testing it against real inputs, then asking Claude to draft the JSDoc. At that point the cognitive work is done. You’ve already reasoned through the parameters, the return types, the side effects. The AI is just formatting your understanding into the right shape, and you can spot errors in the output because you already know what’s true.

// After writing and testing this function, I paste it to Claude with:
// "Write JSDoc for this. The second parameter accepts null to mean 'no timeout'."

/**
 * Fetches a resource with optional timeout enforcement.
 * @param {string} url - The endpoint to fetch.
 * @param {number | null} timeoutMs - Milliseconds before aborting, or null to wait indefinitely.
 * @returns {Promise} Resolves with the fetch Response on success.
 * @throws {DOMException} If the request is aborted due to timeout.
 */
async function fetchWithTimeout(url, timeoutMs) { ... }

The edge-case prompt is the sleeper feature I underused for the first six months. After writing an implementation — not before — I’ll paste it and ask: “What edge cases am I not handling here?” The results are genuinely useful because the AI is reviewing code that already reflects your intent, not speculating about what you might write. I’ve caught off-by-one errors in pagination logic, missed empty-array cases in reduce calls, and once a subtle race condition in a retry handler, all from this single follow-up question. What you’re doing is using AI as a second pass of code review where you already understand the domain well enough to evaluate whether each suggestion is real or noise. That’s use, not a crutch.

The Junior Dev Problem Is Real and Nobody Is Talking About It Honestly

The 1,500-hour rule for commercial pilots didn’t come from nowhere. It came from crash investigations. The FAA raised the requirement from 250 hours after the 2009 Colgan Air 3407 crash, where a first officer with 774 hours froze at the controls during a stall. The investigation found she had passed all her checkrides but had accumulated those hours in ways that didn’t build the kind of deep, automatic response patterns the job demands. Hours in a logbook and hours of genuine skill-building are not the same thing, and aviation learned this the hard way.

I keep thinking about that when I watch junior devs work with Copilot. A dev who’s been using AI autocomplete since their first week may have merged hundreds of PRs, but the mental reps they’ve accumulated are fundamentally different from someone who fought through those same problems manually. They’ve seen solutions, but they haven’t necessarily constructed them. There’s a difference between recognizing a correct answer and being able to generate one under pressure — in an interview, during an incident at 2am, or when the AI suggestion is subtly wrong and you need to know why before you accept it.

I changed how I mentor because of this. My current rule: before a junior accepts any AI suggestion that’s longer than two lines, they have to explain it to me. Not summarize it — explain the reasoning. Why does this approach work? What problem does this specific line solve? What would happen if we removed that condition? If they can’t answer, we close the AI panel and work through it by hand. This isn’t punishment. It’s the same as a flight instructor covering up instruments to force a student to fly by feel. The point is to force the mental model to form, not to make the task harder for its own sake.

What I’ve noticed: juniors who accept AI suggestions they can’t explain tend to hit a wall around the 18-month mark. They’re productive on familiar patterns but visibly uncomfortable when the problem shape changes. The AI trained them to pattern-match their way to shipping, which works until the pattern doesn’t exist yet. Meanwhile, devs who were forced to understand what they were accepting — even when it slowed them down early — tend to handle novel problems better because they’ve actually internalized why certain approaches exist. They have the thing that aviation calls airmanship: judgment that runs underneath the procedure, not instead of it.

None of this is an argument against AI tooling. I use Copilot and Claude Code daily and I’m not going back. But there’s a real difference between a senior dev using AI to accelerate work they already understand and a junior dev using AI as a substitute for developing that understanding in the first place. Shipping code and building intuition are genuinely separate activities. You can do a lot of the first while doing almost none of the second, and the gap won’t show up on a sprint board — it’ll show up the first time the system behaves in a way nobody expected and someone needs to actually think.

Honest Assessment: Which Tools Are Highest and Lowest Risk for Skill Development

The most dangerous tool isn’t the most powerful one — it’s the one you don’t notice using. That’s why I’d rank inline autocomplete at the top of the skill-atrophy risk chart, not because it’s bad software (Copilot and Cursor’s tab completion are genuinely impressive), but because the seamlessness is architecturally hostile to learning. You type three characters, a grey suggestion appears, you hit Tab. No decision happened. No retrieval from memory. No struggle. The research on skill acquisition is consistent here: the struggle is where encoding happens. When you eliminate the struggle completely, you eliminate the learning, and you won’t feel it happening because the code still works and you’re still shipping.

Chat-based tools like Claude and ChatGPT sit in a safer middle zone, and the reason is friction. You have to articulate what you’re trying to do. That act of forming a prompt — even a sloppy one — forces you to at least partially understand the problem. I’ve caught my own misconceptions just by trying to write a coherent question. There’s also a natural pause between asking and applying: you read the response, you evaluate it, you adapt it to your context. That’s not the same cognitive engagement as writing the code yourself, but it’s not zero either. The risk goes up when you start copy-pasting full implementations without reading them, which is a discipline problem more than a tool problem.

AI-assisted code review tools — CodeRabbit, Sourcery, and similar — are where I see the most sustainable workflow. The code came from your brain first. The AI is reacting to what you built. When CodeRabbit flags a potential null dereference or Sourcery suggests a refactor, you’re in the position of evaluating a suggestion against code you already understand. That’s close to how a good human code reviewer functions. You might disagree, and that disagreement is itself a learning event. The skill risk is low because the core act of design and implementation already happened before the AI touched anything.

The genuinely low-risk use case I keep coming back to: using AI to generate tests after you’ve already written the implementation. By the time you paste your function into Claude and ask for edge case tests, the architecture decisions are done, the logic is yours, the design thinking already happened. You’re outsourcing the mechanical work of writing twenty variations of similar assertions, not the thinking. If anything, watching the AI generate tests you didn’t think of is useful — it surfaces edge cases and teaches you what you were missing. That’s the loop working in your favor.

One practical note on pricing that’ll save you embarrassment: every tool mentioned here has changed its pricing tier structure at least once in the past year. Copilot moved from flat $10/month individual to a model with free tier limitations and higher enterprise tiers. Cursor shifted plans. CodeRabbit has a free open-source tier and a paid one. I’m not going to quote specific numbers here because whatever I write will be wrong by the time you read it. Go directly to each tool’s pricing page. Don’t trust this article. Don’t trust any article on this, including the ones ranking “best AI tools of 2025” — those numbers are almost always stale within two quarters.

The aviation parallel that keeps coming back to me: pilots log “hand-flying hours” separately because they know autopilot makes them worse at the stick. I do the same thing with coding — I keep a hard boundary between “work with AI” and “work without AI”, and the tooling setup is what makes that boundary stick. Without it, the AI just creeps back in.

The simplest layer is VS Code’s workspace settings. You can drop a .vscode/settings.json in any project folder and it will override your global Copilot config for that workspace only. This is the actual config I use:

{
  "github.copilot.enable": {
    "*": false,        // off for all file types in this workspace
    "markdown": false,
    "javascript": false,
    "typescript": false,
    "python": false
  },
  "github.copilot.inlineSuggest.enable": false,
  "editor.inlineSuggest.enabled": false  // also kills the visual ghost text even if copilot fires
}

The editor.inlineSuggest.enabled line is the one that actually matters — I added it after noticing the ghost text was still appearing sometimes, which I think was a Copilot Chat side-channel. Killing it at the editor level removes the temptation entirely. You still get IntelliSense for types and imports, which is fine; that’s language server territory, not AI completion.

The VS Code setting handles local enforcement, but the repo-level setting is what makes it airtight. My practice repo is a private GitHub repo called no-assist — it’s where I implement data structures from scratch, work through exercises, and rebuild things I rely on at work (recently: a basic LRU cache, a debounce function, and a simple state machine). To disable Copilot at the repo level so it can never be re-enabled accidentally, go to your GitHub account: Settings → Copilot → Configure → Policies → then scroll to the repository exclusion list and add the repo name. The path is slightly buried — it’s under your personal account settings, not the repo settings page itself. Once it’s there, Copilot will not activate in that repo regardless of what any local config says, for any collaborator using your account.

A few things I’ve learned from maintaining this setup for about eight months:

Name the repo something boring. no-assist or practice is better than learning-journey — the boring name means you open it with the right mindset, not a performance one.
Keep it private. The moment it’s public you start optimizing for how the code looks rather than whether you can write it yourself.
Commit even the broken attempts. The git history of a practice repo is where the actual learning lives. I can look back at how I fumbled through implementing a trie three months ago and see exactly where my mental model was wrong.
Don’t use this repo for Advent of Code if you’re going to share your scores. The external pressure will make you want the AI back.

One honest caveat: this setup doesn’t stop you from opening a browser tab and asking ChatGPT. The tooling only removes the path of least resistance — the inline ghost text that you accept without thinking. The harder discipline is deciding that when you’re in the no-assist folder, you also close the AI chat tabs. I use a separate browser profile for practice sessions. That’s maybe overkill, but I found that having a tab open with Claude ready to go was functionally the same as having Copilot on — the temptation threshold was just slightly higher.

Disclaimer: This article is for informational purposes only. The views and opinions expressed are those of the author(s) and do not necessarily reflect the official policy or position of Sonic Rocket or its affiliates. Always consult with a certified professional before making any financial or technical decisions based on this content.

Written by Eric Woo

Lead AI Engineer & SaaS Strategist

Eric is a seasoned software architect specializing in LLM orchestration and autonomous agent systems. With over 15 years in Silicon Valley, he now focuses on scaling AI-first applications.