DOS Programming in 2026: Set Up a Real Dev Environment, Break the 640KB Wall, and Write a Working TSR

## Why Anyone Is Still Writing DOS Code in 2026 (And Why It’s a Legitimate Technical Exercise)

DOS is not dead. That sentence reads like a joke until you’ve stared at a SCADA terminal running DOS 6.22 that controls a $4 million CNC lathe, and your job is to fix the memory fault without taking the production line down.

Industrial controllers, point-of-sale systems in certain manufacturing verticals, and embedded measurement equipment from the late 90s and early 2000s are still running real-mode x86 DOS in production environments. These aren’t museum pieces — they’re active workloads with no business case for replacement. The hardware works. The software works. Until it doesn’t, and someone has to open Turbo C.

The FreeDOS project has maintained active development and a measurable contributor base for over two decades precisely because this maintenance reality exists. People are filing bugs and writing patches against a real-world need, not sentiment.

Here’s the thesis, stated plainly: DOS development is a direct training ground for ARM Cortex-M embedded work. When you write a DOS TSR (Terminate and Stay Resident) program that has to coexist with a hostile interrupt chain inside 640KB of conventional memory, you are solving the exact class of problems that Cortex-M developers hit every day — fixed memory ceilings, interrupt latency budgets, no allocator to bail you out, no OS scheduler to hide your timing mistakes. The mental model transfers almost one-to-one.

That’s the reason to read this. Not nostalgia. Not retro-computing aesthetics. Constraint-driven engineering is a skill that atrophies fast when you spend your days writing Python against a cloud API, and DOS is one of the sharpest whetstones available because the constraints are absolute and the feedback is immediate. You either fit in 640KB or the program doesn’t run. There is no negotiating with the memory map.

## Choosing Your Environment: DOSBox-X vs. FreeDOS in a VM vs. Real Hardware

This is a decision, not a preference. Pick the wrong environment early and you will waste days debugging behavior that does not exist on your actual target.

---

### Path 1: DOSBox-X

Use this if you are learning the toolchain, experimenting with memory models, or writing code that will run on a broad DOS target rather than a specific machine.

DOSBox-X gives you configurable 8086/286/386 emulation, adjustable EMS and XMS, and a machine state you can snapshot and roll back. You can fake a 640KB conventional memory ceiling, tune the emulated CPU speed through the cycles setting, and swap between machine types in a config file. For toolchain work (getting Open Watcom or NASM to produce real MZ executables and COM files) this is the right starting point. No hardware to source, no disk imaging, no IRQ surprises on boot.

The limitation is real: DOSBox-X is not cycle-exact at the interrupt level. It approximates INT timing well enough for most software, but edge cases diverge. If your INT handler depends on a specific relationship between the PIT (8253/8254 timer) firing and the CPU acknowledging the interrupt, you may see clean results in DOSBox-X that fail on hardware.

**Do not use DOSBox-X when:**
- Your target is a machine with OEM-customized INT vectors in ROM (certain Compaq or Tandy BIOSes had non-standard behavior that emulators do not replicate)
- You need to validate actual EMS behavior against a physical LIM 4.0 card. The emulated EMS page frame and a real expanded memory board do not behave identically under edge-case page switching
- Your production target is a known, fixed hardware configuration you can actually obtain

---

### Path 2: FreeDOS in QEMU or VirtualBox

This sits between emulation and real hardware in ways that matter.

FreeDOS under QEMU gives you a much closer approximation of real DOS behavior: actual FAT filesystem semantics, a real memory manager stack (HIMEM, EMM386 or JEMM), and a boot sequence that exercises actual INT 19h/INT 1Ah behavior. TSRs that hook interrupt vectors and stay resident behave more predictably here because the memory manager chain is real software, not an approximation.

QEMU specifically is worth using over VirtualBox for low-level DOS work. VirtualBox does more opaque abstraction at the hardware level. QEMU with `-machine pc` and appropriate CPU flags gives you observable, adjustable emulation. You can attach GDB to a running QEMU instance and step through real-mode code at the instruction level — this is genuinely useful when debugging a TSR that corrupts the interrupt vector table.

For TSR development and memory manager interaction testing, this is probably the right default environment in 2026. It costs nothing, runs on any Linux or macOS host, and behaves close enough to physical DOS that most bugs surface here rather than waiting until real hardware.

**Where FreeDOS/QEMU still falls short:** IRQ timing is still virtualized. If you are hooking INT 13h and need to validate that your handler does not stall during actual disk seek latency, you need physical media. QEMU’s disk I/O does not simulate seek times — it is functionally instant compared to a spinning IDE drive.

---

### Path 3: Real x86 Hardware

This is necessary, not optional, in one specific scenario: you are deploying to a known hardware target and need to validate against that exact machine.

A TSR that hooks INT 13h (disk interrupt) or INT 09h (keyboard interrupt) will behave differently against a real Award BIOS than against QEMU’s BIOS implementation. The real BIOS may issue EOI (End of Interrupt) to the 8259 PIC at a different point in the handler chain. Your hook might work perfectly in emulation and silently corrupt keyboard state on physical hardware because the timing assumption was wrong.

Same problem with INT 09h specifically: keyboard scan codes arrive from the 8042 controller with real hardware debounce and buffering behavior. Emulators simplify this.

**The procurement reality in 2026:** sourcing a working ISA-bus machine is not straightforward. Pentium-era (Socket 7, early Socket 370) machines with ISA slots still exist on eBay and at estate sales, but expect to spend time on capacitor issues, dead CMOS batteries, and IDE drives that are either dead or unreliable. A working late-486 or early Pentium with 8MB of RAM, a functional ISA slot, and a reliable IDE drive is a find worth keeping. Budget for a CF-to-IDE adapter and a CF card — it is more reliable than a 30-year-old spinning drive and eliminates one class of flaky behavior.

If your target is a specific embedded DOS controller or industrial PC from the 1990s, there is no substitute. No emulator will replicate a proprietary BIOS that remaps INT vectors into an on-board ROM extension. Get the hardware.

---

### The Decision in One Pass

Are you learning toolchain / building a general DOS app?
→ DOSBox-X. Fast iteration, easy config, good enough.

Are you writing a TSR or testing memory manager interaction?
→ FreeDOS in QEMU. Real software stack, debuggable, no hardware cost.

Are you validating against a specific production machine,
or hooking hardware-level interrupts (INT 09h, INT 13h)?
→ Real hardware. There is no shortcut here.

One note on toolchain parity: whatever environment you choose, build from a consistent, pinned toolchain. Different NASM or compiler versions can produce subtly different output, which makes behavior hard to compare across environments. Pin your assembler and compiler versions in a Makefile and do not rely on whatever happens to be in the emulator’s guest OS PATH.

## DOS Dev Environment Setup in 2026: OpenWatcom, DOSBox-X, and a Legal Toolchain

Before writing a single line of code, you hit a licensing problem that most tutorials quietly ignore.

**Turbo C 2.0 and MASM 5.x are not legally obtainable in 2026.** Borland went through several corporate deaths, and the IP ended up places where nobody is issuing licenses or hosting downloads. Every “free download” link you find for these tools is piracy, full stop. If you’re doing this professionally, or just don’t want the headache, don’t go that route.

Here’s what you can actually use:

- **Turbo C++ 1.01**: Borland released this as freeware before the company collapsed completely. It’s legitimate, it targets real-mode DOS, and it’s fine for experimentation and nostalgia. Don’t build anything serious with it; the optimizer is weak and the runtime library has sharp edges.
- **OpenWatcom C/C++**: This is the recommended path for any new work. Open-source, actively maintained, produces genuine real-mode DOS `.EXE` files, and the `WCL` command makes compilation straightforward. The project lives on GitHub under the `open-watcom` organization.
- **DJGPP**: GCC-based toolchain targeting DOS, but it produces 32-bit protected-mode executables that need a DPMI host (like CWSDPMI) rather than real-mode programs. Useful if you need 32-bit protected mode, but that’s a different architecture than classic real-mode DOS work.
- **NASM**: The free, open-source assembler that replaces MASM for standalone assembly modules and `.COM` files. Its syntax differs from MASM’s in a few places, and the NASM manual documents the differences.

---

### Setting Up DOSBox-X

DOSBox-X is the right emulator for this work. The mainline DOSBox project is mostly aimed at games and hasn’t kept pace with development-oriented features. DOSBox-X supports configurable memory models, EMS/XMS emulation, and a more accurate INT 21h implementation.

**Step 1: Install DOSBox-X.**

Grab the latest release from the official GitHub releases page at `github.com/joncampbell123/dosbox-x`. There are pre-built binaries for Windows, macOS, and Linux. On macOS you can also pull it via Homebrew: `brew install dosbox-x`.

**Step 2: Create your host directory.**

```bash
mkdir -p ~/dosdev/src
mkdir -p ~/dosdev/tools
```

Unpack the OpenWatcom installer into `~/dosdev/tools/watcom`. The directory you want in your DOS `PATH` is the `BINW` subdirectory inside that — it contains `WCL.EXE`, `WCC.EXE`, `WLINK.EXE`, and the rest.

**Step 3: Configure DOSBox-X.**

Find your `dosbox-x.conf` file (on Linux it’s typically `~/.config/dosbox-x/dosbox-x.conf`; on macOS it’s in `~/Library/Preferences/DOSBox-X/`). Add or edit the `[autoexec]` section at the bottom:

```ini
[autoexec]
MOUNT C ~/dosdev
C:
SET PATH=C:\TOOLS\WATCOM\BINW;%PATH%
SET WATCOM=C:\TOOLS\WATCOM
SET INCLUDE=C:\TOOLS\WATCOM\H
SET LIB=C:\TOOLS\WATCOM\LIB286\DOS
```

The `WATCOM` and `INCLUDE` environment variables are not optional — `WCL` uses them to locate the standard library headers and the C runtime. Skip them and you get cryptic “file not found” errors during compilation that don’t point to the actual problem.

If you prefer to mount manually each session, skip the autoexec and run this at the DOSBox-X `Z:\>` prompt:

```
MOUNT C ~/dosdev
C:
```

---

### Hello World: The Actual Commands

Create `C:\SRC\HELLO.C`. In DOSBox-X you can use the `EDIT` command if it’s available, but realistically you’ll write the file from your host OS in `~/dosdev/src/hello.c` with whatever editor you prefer.

```c
#include <stdio.h>

int main(void) {
    printf("Hello from real-mode DOS\n");
    return 0;
}
```

At the DOSBox-X prompt:

```
C:\SRC> WCL -bcl=dos HELLO.C
```

**What `-bcl=dos` does:** `-bcl=<os>` tells WCL to compile and link for the named target system. The `dos` target selects 16-bit 8086-compatible code, the DOS C runtime, and a standard MZ `.EXE` header that DOS loads directly without any extender. Stating the target explicitly matters most when the same command runs from a non-DOS host build of OpenWatcom, where the default target follows the host and the result may not run under plain DOS at all.

You should see output like:

```
WCL Version 1.9
hello.c
WLINK Version 1.9
```

Then:

```
C:\SRC> HELLO.EXE
Hello from real-mode DOS
```

If `WCL` isn’t found, your `PATH` isn’t set — go back to the autoexec configuration. If you get linker errors about missing `cstart.obj`, your `LIB` environment variable is pointing at the wrong directory; double-check that you’re targeting the `LIB286\DOS` path and not the 386 or OS/2 variants.

---

### A Note on What the Emulator Is Actually Doing

When DOSBox-X runs your program, it’s emulating an 8086/286/386 processor in real mode with 1MB of addressable physical memory, the same constraint every real-mode DOS program has faced since 1981. The first 640KB of that space is conventional memory available to DOS programs. The remaining 384KB (from 640KB to 1MB) is the Upper Memory Area: reserved for video buffers, ROM BIOS, and adapter memory. Memory managers could not change this map. What they could do was work around it: HIMEM.SYS enables the A20 address line so code can reach just above the 1MB line, and EMM386 (or a physical EMS card) bank-switches pages of additional memory into unused parts of the UMA on demand. That addressing architecture is exactly why the 640KB ceiling existed and why breaking it required the contortions covered in the next section.

## The 640KB Wall: EMS, XMS, and What HIMEM.SYS Actually Did

The 640KB limit is one of the most misunderstood constraints in computing history. People blame DOS. DOS didn’t do this.

The 8086 has a 20-bit address bus, which gives you exactly 1MB of addressable space (0x00000 through 0xFFFFF). IBM’s original PC design carved that 1MB up with a fixed memory map: the bottom 640KB (0x00000–0x9FFFF) goes to application RAM, and the top 384KB (0xA0000–0xFFFFF) is reserved for hardware. Video buffer at 0xA0000. Adapter ROM at 0xC0000. System BIOS at 0xF0000. That’s it. No negotiation, no dynamic allocation. The hardware map is burned into the PC architecture itself. Every IBM PC-compatible machine ever made honors this layout.
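The map is easier to internalize with the arithmetic written out. The following is a minimal host-side sketch in portable C, not DOS code: the names `real_mode_linear`, `classify`, and the `pc_region` enum are made up for illustration, and the region boundaries are the ones listed in the paragraph above.

```c
#include <stdint.h>

/* Regions of the original IBM PC memory map, as described in the text. */
enum pc_region {
    REGION_CONVENTIONAL,  /* 0x00000-0x9FFFF: application RAM */
    REGION_VIDEO,         /* 0xA0000-0xBFFFF: video buffers */
    REGION_ADAPTER_ROM,   /* 0xC0000-0xEFFFF: adapter ROM space */
    REGION_SYSTEM_BIOS    /* 0xF0000-0xFFFFF: system BIOS */
};

/* Real-mode address formation on an 8086: segment * 16 + offset,
   truncated to the 20 address lines the chip actually has. */
uint32_t real_mode_linear(uint16_t seg, uint16_t off)
{
    return (((uint32_t)seg << 4) + off) & 0xFFFFFu;
}

/* Map a 20-bit linear address to the region it falls in. */
enum pc_region classify(uint32_t linear)
{
    if (linear < 0xA0000u) return REGION_CONVENTIONAL;
    if (linear < 0xC0000u) return REGION_VIDEO;
    if (linear < 0xF0000u) return REGION_ADAPTER_ROM;
    return REGION_SYSTEM_BIOS;
}
```

For example, the color text buffer at B800:0000 resolves to linear 0xB8000, which `classify` reports as video space, while anything up to 9FFF:000F is still conventional RAM.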

So when people say “DOS can only use 640KB,” they’re describing a symptom, not a cause. A different OS running on the same hardware in real mode hits the exact same wall.

---

### EMS: Bank Switching the Hard Way

The first serious workaround was EMS — the Expanded Memory Specification, LIM version 4.0 being the one that mattered. The concept is brutally simple: carve out a 64KB window somewhere in the upper 384KB (the “page frame”), and use that window as a viewport into a much larger pool of memory sitting on a dedicated EMS card.

The larger pool is physically present on the card, up to 32MB under LIM 4.0. But to actually read or write any of it, your application calls INT 67h to map 16KB logical pages in and out of the page frame. You pick which logical pages occupy the frame’s four 16KB slots, you do your work, you remap. Every single access to expanded memory requires this explicit bookkeeping.
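The bookkeeping is mostly division and remainder. Here is a rough sketch of the address math an application has to do before each INT 67h mapping call. It is host-side C that models only the arithmetic; `ems_locate`, `ems_frame_offset`, and the struct are hypothetical names, and no EMS driver is involved.

```c
#include <stdint.h>

#define EMS_PAGE_SIZE 0x4000u   /* one logical page is 16KB */

struct ems_location {
    uint16_t logical_page;  /* which 16KB page of the allocated pool */
    uint16_t page_offset;   /* byte offset inside that page */
};

/* Split a byte offset into the expanded-memory pool into page + offset.
   With a 32MB pool there are at most 2048 pages, so 16 bits suffice. */
struct ems_location ems_locate(uint32_t pool_offset)
{
    struct ems_location loc;
    loc.logical_page = (uint16_t)(pool_offset / EMS_PAGE_SIZE);
    loc.page_offset  = (uint16_t)(pool_offset % EMS_PAGE_SIZE);
    return loc;
}

/* Offset within the 64KB page frame segment once the logical page has
   been mapped into physical slot 0..3 of the frame. */
uint16_t ems_frame_offset(unsigned slot, uint16_t page_offset)
{
    return (uint16_t)(slot * EMS_PAGE_SIZE + page_offset);
}
```

Byte 100000 of the pool lives in logical page 6 at offset 1696. If that page is mapped into slot 2, the program reads it at offset 0x86A0 of the page frame segment.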

EMM386.EXE later emulated an EMS card in software using the 386’s paging hardware and virtual 8086 mode, which meant you no longer needed a physical EMS board, just extended memory and EMM386. But the programming model stayed the same: manually managed page swaps, INT 67h, no transparency.

This was the dominant approach for database engines and spreadsheet software through the late 1980s. Lotus 1-2-3 was the poster child — it would use EMS to hold large spreadsheet data while keeping the engine itself in conventional memory. It worked, but it required every application to be explicitly written around the page-swap model. You didn’t just malloc() and get EMS. You wrote a memory manager.

The practical failure mode: if you get your page mapping wrong — referencing a logical page that isn’t currently mapped into the frame — you corrupt memory or crash, with no protection mechanism to catch you. Real mode, no segfault, no signal. You just start writing garbage somewhere.

---

### XMS: The A20 Line and HIMEM.SYS

The 8086 has only 20 address lines, so a segment:offset pair that sums past 0xFFFFF wraps around: FFFF:0010 lands on 0x00000, not 0x100000. Some early software came to depend on that wraparound. The 286 and later processors have more address lines, so to keep that software working IBM added the A20 gate, a piece of logic (originally driven through the keyboard controller, of all things) that controls whether address line 20 is live or forced low. With A20 gated off, a 286+ machine in real mode wraps just like an 8086. With A20 on and the CPU in protected mode, the hardware can address memory above 1MB.

HIMEM.SYS manages this. It controls the A20 gate and implements the XMS API, which programs reach through a far-call entry point obtained via INT 2Fh (AX=4300h to detect the driver, AX=4310h to get its address). That API lets programs allocate extended memory blocks (EMBs) and perform block-move copies between conventional memory and extended memory. You can’t just point a segment register at extended memory and start reading: real-mode segment:offset addressing tops out just past 1MB, and you’d need a protected-mode switch to directly address anything beyond that. Instead you copy chunks down into conventional memory, process them, and copy results back up.
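The block-move call takes a small parameter structure rather than registers. The sketch below shows that layout as I understand it from the XMS specification (function 0Bh), filled in for a copy from an EMB down to a conventional-memory buffer. It runs on any host and never calls the driver; `xms_copy_down` and `far_ptr_as_xms_offset` are illustrative helper names, not part of any library.

```c
#include <stdint.h>

/* Parameter block for the XMS "move extended memory block" call.
   Packed because the driver expects exactly this 16-byte layout. */
#pragma pack(push, 1)
struct xms_move {
    uint32_t length;      /* bytes to copy; the spec requires an even count */
    uint16_t src_handle;  /* EMB handle, or 0 for conventional memory */
    uint32_t src_offset;  /* offset into the EMB, or seg:off if handle is 0 */
    uint16_t dst_handle;
    uint32_t dst_offset;
};
#pragma pack(pop)

/* With handle 0 the 32-bit offset field holds a real-mode far pointer:
   segment in the high word, offset in the low word. */
uint32_t far_ptr_as_xms_offset(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 16) | off;
}

/* Describe a copy of `length` bytes from an EMB into a conventional buffer.
   The caller is assumed to pass an even length. */
struct xms_move xms_copy_down(uint16_t emb_handle, uint32_t emb_offset,
                              uint16_t buf_seg, uint16_t buf_off,
                              uint32_t length)
{
    struct xms_move m;
    m.length     = length;
    m.src_handle = emb_handle;
    m.src_offset = emb_offset;
    m.dst_handle = 0;  /* destination is conventional memory */
    m.dst_offset = far_ptr_as_xms_offset(buf_seg, buf_off);
    return m;
}
```

In a real-mode program you would point DS:SI at this structure and far-call the driver entry point with AH=0Bh. Swapping the source and destination fields gives the copy back up.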

HIMEM.SYS also manages the High Memory Area: the first 64KB (minus 16 bytes) above 1MB (0x100000–0x10FFEF). With A20 enabled, this region is directly addressable in real mode using segment 0xFFFF, which is why `DOS=HIGH` in CONFIG.SYS lets DOS load most of its kernel there and free up tens of kilobytes of conventional memory.
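Both the wraparound and the HMA fall out of one addition. This is a small model of it in portable C (a sketch of the address arithmetic only, with `physical_address` as a made-up name):

```c
#include <stdbool.h>
#include <stdint.h>

/* Physical address a real-mode segment:offset pair reaches, depending on
   whether the A20 gate lets address line 20 through. */
uint32_t physical_address(uint16_t seg, uint16_t off, bool a20_enabled)
{
    uint32_t linear = ((uint32_t)seg << 4) + off;  /* at most 0x10FFEF */
    if (!a20_enabled)
        linear &= 0xFFFFFu;  /* line 20 forced low: wrap into the first 1MB */
    return linear;
}
```

With A20 off, FFFF:0010 resolves to 0x00000, exactly as on an 8086. With A20 on, the same pair resolves to 0x100000, and FFFF:FFFF reaches 0x10FFEF, the last byte of the HMA.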

DPMI (DOS Protected Mode Interface) came later and changed the game properly. It gave programs a clean protected-mode environment under DOS — memory protection, 32-bit addressing, proper descriptor tables. DJGPP targets DPMI. When you compile a C program with DJGPP and run it, a stub calls into a DPMI host (typically CWSDPMI), which sets up a protected-mode environment, and your 32-bit code runs with full access to extended memory. No page swapping, no INT 67h, no block copies. You call malloc() and it works like you expect.

---

### What This Means If You’re Writing DOS Code in 2026

If you’re targeting OpenWatcom in the small or medium memory model, you are in pure real mode with 16-bit segments. Small model: code and data each fit in one 64KB segment. Medium model: multiple code segments, one data segment. Either way, your total addressable conventional memory is the pool below 640KB, and the actual usable ceiling — after DOS overhead, drivers, and TSRs — is often closer to 580–600KB on a loaded system.

For anything that needs to process large data sets, this matters immediately. A 200KB lookup table blows your data segment. A decompression buffer won’t fit alongside your program state.

Your options:

**Stay real-mode, use EMS/XMS manually.** From OpenWatcom you can drive XMS block moves or EMS page mapping through the driver interfaces directly, with a thin wrapper or a little inline assembly. You move data to and from extended memory explicitly. This works, but you’re writing a memory manager on top of your actual program. Expect bugs. The failure mode when you get the offsets wrong is silent memory corruption.

**Target DPMI via DJGPP.** Compile for 32-bit protected mode. Your program links with a DPMI stub, and at runtime the stub negotiates with a DPMI host for a protected-mode environment. Inside that environment, your C code runs as a normal 32-bit process with flat memory addressing. This is the right call for any program that handles non-trivial data.

The real constraint here: DPMI executables require a DPMI host. In FreeDOS, CWSDPMI ships as the standard host and loads automatically when a DPMI program runs. On a clean embedded DOS setup — a stripped ROM-DOS or a minimal MS-DOS image with no extended memory manager loaded — there is no DPMI host. Your DJGPP executable will fail at startup with an error about no DPMI host being available. If you’re targeting embedded hardware where you control the boot environment, CWSDPMI can be loaded from AUTOEXEC.BAT or bundled with the executable as a loader. But you have to plan for this explicitly. It doesn’t just work.

The decision tree is simple: if you’re targeting a known FreeDOS or modern DOS-compatible environment and need real memory, use DJGPP. If you’re targeting an unknown or constrained embedded environment where you can’t guarantee a DPMI host, stay real-mode with OpenWatcom and manage your memory ceiling manually from the start.

## Write a Working TSR in Under 50 Lines of NASM

A TSR is DOS’s answer to background processing. The mechanism is simple enough to hold in your head: your program hooks one or more interrupt vectors in the IVT (Interrupt Vector Table, a 1KB table at address 0000:0000 containing 256 4-byte far pointers), then exits through INT 27h or INT 21h/AH=31h, which tells DOS to keep that memory allocated after the program terminates. From that point forward, every time the hooked interrupt fires, your handler runs, sitting in the interrupt chain much the way a Linux kernel module sits in a syscall path, or an RTOS ISR sits in a hardware vector table.

This isn’t emulation or abstraction. Your code is *in* the IVT chain. When the hardware fires IRQ1 (keyboard), the CPU looks up the 4-byte vector at 0000:0024, jumps there, and if your handler is installed, that’s your code running. You chain to the original handler, do your work, and return with IRET. Miss a step and the machine hangs with no error message.
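To make the IVT layout concrete, here is a short host-side sketch in C of where a vector lives and how its four bytes decode. The names `ivt_entry_address`, `ivt_decode`, and `struct ivt_entry` are illustrative only, and the sample bytes in the usage note are just an example value.

```c
#include <stdint.h>

/* One IVT entry as it sits in memory: offset word first, then segment. */
struct ivt_entry {
    uint16_t offset;
    uint16_t segment;
};

/* Address (within segment 0) of the entry for a given vector number. */
uint16_t ivt_entry_address(uint8_t vector)
{
    return (uint16_t)(vector * 4u);
}

/* Decode the four little-endian bytes of an entry into a far pointer. */
struct ivt_entry ivt_decode(const uint8_t raw[4])
{
    struct ivt_entry e;
    e.offset  = (uint16_t)(raw[0] | (raw[1] << 8));
    e.segment = (uint16_t)(raw[2] | (raw[3] << 8));
    return e;
}
```

Vector 09h sits at 0000:0024 and vector 21h at 0000:0084. The bytes `87 E9 00 F0` decode to the far pointer F000:E987, which is the address the CPU jumps to when that vector fires.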

---

### The Skeleton

Build this with NASM. The complete toolchain command:

```
nasm -f bin tsr.asm -o tsr.com
```

Flat binary, runs directly under DOSBox-X or bare DOS. No linker, no runtime, no licensing friction.

```nasm
; tsr.asm: minimal INT 09h hook TSR
; Build: nasm -f bin tsr.asm -o tsr.com
; Runs as a COM file: CS=DS=ES=SS (the PSP segment), IP starts at 0x100

        org 0x100               ; COM files load at offset 0x100

; ── ENTRY POINT ─────────────────────────────────────────────────────────────
start:
        jmp     init_code       ; skip the resident part, run the installer

; ── INTERRUPT HANDLER (RESIDENT) ────────────────────────────────────────────
; Everything between here and init_code stays in memory after INT 27h.
handler:
        ; Preserve ALL registers. DOS has no exception handler.
        ; If you skip even one register the interrupted program's state is
        ; silently trashed. 'pusha' saves AX/CX/DX/BX/SP/BP/SI/DI (80186+).
        ; Segment registers must be saved manually.
        pusha                   ; save all general-purpose registers
        push    ds
        push    es              ; save segment registers

        ; Chain to the original INT 09h handler FIRST.
        ; This preserves the BIOS keyboard processing (scan code to ASCII,
        ; updating the BIOS keyboard buffer). Skipping it breaks keyboard input.
        ; The old handler ends in IRET, so push FLAGS before the far call to
        ; build the same stack frame a real interrupt would have built.
        pushf                   ; simulate the CPU's interrupt-call mechanism
        call    far [cs:orig_off]   ; far pointer in memory: offset, then segment
                                    ; 'cs:' override because DS still belongs
                                    ; to whatever program was interrupted

        ; Custom action goes here.
        ; DANGER ZONE: Do NOT call INT 21h here without first checking the
        ; InDOS flag (a byte whose address INT 21h/AH=34h returns during init).
        ; If InDOS > 0, DOS is in a non-reentrant state and calling it again
        ; causes intermittent data corruption: the kind that only shows up
        ; under load, not during your test session.
        ; Safe actions: poke directly into video memory (0xB800:xxxx),
        ; toggle a flag in your own data, or read hardware ports directly.

        ; [your custom action here: keep it short and safe]

        ; Restore registers and return from interrupt.
        ; IRET pops IP, CS, and FLAGS, which the CPU pushed on entry.
        pop     es
        pop     ds
        popa                    ; restore all general-purpose registers
        iret                    ; return from interrupt, restoring flags

; ── RESIDENT DATA ────────────────────────────────────────────────────────────
; These variables are part of the resident image. They must live BEFORE
; init_code so INT 27h keeps them in memory.
orig_off dw 0                   ; original INT 09h handler offset
orig_seg dw 0                   ; original INT 09h handler segment

; ── END OF RESIDENT PORTION ──────────────────────────────────────────────────
; Everything from here down is initialization code. It runs once and its
; memory is released when the program goes resident.
init_code:
        ; Step 1: Read the original INT 09h vector from the IVT.
        ; The IVT lives at segment 0. INT 09h is vector 9, so its entry
        ; is at 9*4 = 36 = 0x24. Each entry is [offset, segment].
        xor     ax, ax
        mov     es, ax                  ; ES = 0 (point at IVT segment)
        mov     ax, [es:0x24]           ; grab original handler offset
        mov     [orig_off], ax
        mov     ax, [es:0x26]           ; grab original handler segment
        mov     [orig_seg], ax

        ; Step 2: Install our handler into the IVT.
        ; CLI/STI brackets the write so a hardware interrupt can't catch us
        ; with a half-written vector, which would cause a wild jump.
        cli
        mov     word [es:0x24], handler ; our handler offset
        mov     [es:0x26], cs           ; our handler's segment is current CS
        sti

        ; Step 3: Go resident.
        ; INT 27h takes DX = resident size in BYTES, counted from the start
        ; of the PSP (CS:0). No paragraph conversion here: that is only for
        ; INT 21h/AH=31h, which takes DX in 16-byte paragraphs instead.
        mov     dx, init_code           ; offset of first non-resident byte
        int     0x27                    ; terminate and stay resident
```

---

### Three Ways This Kills Your Machine

**Register preservation.** `pusha` covers the general-purpose registers. Segment registers are not included — you must `push ds` / `push es` manually, and if your handler touches `fs` or `gs` on a 386+ target, those too. Forget one and you silently corrupt the calling program’s state. DOS has no exception handler to catch this. The machine does not print an error; it just behaves wrong, or hangs on the next keypress, or corrupts a file. You will spend an hour blaming something else.

**Calling INT 21h from inside an ISR.** DOS is not reentrant. If an INT 21h call is already in progress when your keyboard handler fires and you issue another INT 21h, you’ve re-entered DOS through the side door. The INT 21h/AH=34h call (during initialization, *not* inside the handler) returns a far pointer to the `InDOS` flag — a one-byte counter that is non-zero whenever DOS is in a critical section. Beginner TSR tutorials skip this entirely. The bug it produces is the worst kind: works fine during isolated testing, corrupts data intermittently under real load.
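The usual way around this is to defer: the interrupt handler only records that work is wanted, and a later safe point checks InDOS before touching DOS. Below is a minimal sketch of that pattern in portable C, with the flag modeled as an ordinary byte so it runs on any host. `tsr_state`, `tsr_request_work`, and `tsr_may_call_dos` are hypothetical names, and production TSRs typically check more than this one flag.

```c
#include <stdbool.h>
#include <stdint.h>

struct tsr_state {
    volatile const uint8_t *indos;  /* address saved at install time (AH=34h) */
    volatile bool work_pending;     /* set by the ISR, cleared when serviced */
};

/* Called from the hooked interrupt: note the request, never call DOS. */
void tsr_request_work(struct tsr_state *s)
{
    s->work_pending = true;
}

/* Called from a later safe point, such as a timer tick.
   Returns true only when work is pending AND DOS is not busy. */
bool tsr_may_call_dos(struct tsr_state *s)
{
    if (!s->work_pending)
        return false;
    if (*s->indos != 0)
        return false;        /* DOS is mid-call: try again next tick */
    s->work_pending = false; /* claim the work before doing it */
    return true;
}
```

The request survives any number of ticks while DOS is busy and is serviced exactly once when the flag drops to zero.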

**Size calculation.** INT 27h takes DX as the size of the resident portion in *bytes*, measured from the start of the PSP (CS:0 in a COM file), and DOS converts that to paragraphs internally. INT 21h/AH=31h takes the *paragraph* count directly in DX. Mix these up and you either leave too little resident (your handler code gets overwritten by the next program DOS loads, producing a crash that may not happen for minutes) or you leave too much (you waste memory that 640KB DOS desperately needs). If you use AH=31h, the rounding trick of `add dx, 15` then `shr dx, 4` ensures you never truncate to the paragraph below.
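The rounding is worth seeing with numbers. This is a tiny sketch in C of the paragraph conversion for the AH=31h path, next to the truncating version that causes the bug. Both function names are made up for this example.

```c
#include <stdint.h>

/* Paragraphs needed to cover `end_offset` bytes, rounded up.
   Computed in 32 bits so offsets near 0xFFFF cannot overflow. */
uint16_t resident_paragraphs(uint16_t end_offset)
{
    return (uint16_t)(((uint32_t)end_offset + 15u) >> 4);
}

/* The buggy version: a plain shift drops any partial paragraph. */
uint16_t truncated_paragraphs(uint16_t end_offset)
{
    return (uint16_t)(end_offset >> 4);
}
```

A resident image ending at offset 0x0131 needs 0x14 paragraphs. The truncating version reports 0x13, which leaves the last byte of the image outside the protected block.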

---

### One More Practical Note on the Toolchain

NASM is free, actively maintained, and produces real-mode flat binaries without a linker step. `nasm -f bin` gives you a raw byte-for-byte image that DOSBox-X loads and executes as a COM file. No MASM license, no Turbo Assembler, no 16-bit linker archaeology. If you’re testing on real hardware, write the binary to a floppy image with `dd` or copy it directly — it runs on anything from an 8088 forward (with the caveat that `pusha` is 80186+; substitute individual `push` instructions for 8088 targets).

The entire chain from source to running TSR: one NASM command, one copy operation, one DOS prompt. The friction that stopped most people in 1988 is gone.

## DOS Patterns That Map Directly to Modern Embedded Development

The assumption most people bring to DOS programming is that it’s archaeology — something you study to understand where we came from, not where we’re going. That assumption is wrong. Every constraint DOS imposed on a developer in 1991 is still active on a Cortex-M4 in 2026. The environment changed. The physics didn’t.

DOS gave you no virtual memory, no protected mode (in real-mode DOS, anyway), no scheduler, no hardware abstraction layer worth the name. You owned the interrupt vector table. You owned every byte of the address space. You were responsible for not overwriting your own stack. If you shipped something that worked, you had reasoned correctly about memory layout, interrupt latency, and binary size — because the system had no mechanism to forgive reasoning errors. The program either worked or it locked up.

An STM32F4 running bare-metal C is the same situation with a different datasheet.

---

### The Comparison That Actually Holds Up

These aren’t loose analogies. Each pair in the table shares the same underlying mechanism, just expressed in different syntax and toolchain conventions.

| DOS Concept | Modern Embedded Equivalent | What They Actually Share |
|---|---|---|
| INT vector hooks via `setvect()` | ARM NVIC interrupt handlers via vector table | Both are function pointers at fixed memory addresses; priority, chaining, and re-entrancy concerns are identical |
| TSRs (Terminate and Stay Resident) | RTOS tasks or bare-metal background loops | Both run deferred work triggered by interrupt context, with explicit stack partitioning |
| `far` pointers and segment:offset arithmetic | `__far` qualifiers, `__attribute__((section(…)))`, IAR/Keil region attributes | Both encode a physical address space that exceeds a single flat pointer width |
| DOS memory models (tiny/small/compact/large/huge) | Linker scripts partitioning Flash, SRAM, CCM regions on STM32 | Both are static decisions about code and data locality made before the binary is linked |
| PIT (Programmable Interval Timer) | SysTick on Cortex-M | Both are countdown timers driving a periodic ISR; programming involves the same clock-divider math |

**INT hooks → NVIC handlers.** When you hooked an INT in DOS, you read the existing vector, stored it, wrote your handler’s address into the interrupt vector table at segment `0x0000`, and chained to the old handler at the end of your ISR. On ARM, the vector table lives at an address defined by VTOR (Vector Table Offset Register), and NVIC priority registers control preemption. The concept is identical: a table of function pointers, a priority system, and a chaining convention. DOS developers who burned time debugging a broken INT 08h hook — the timer tick — learned exactly the lessons that prevent priority inversion bugs in NVIC configurations today. The failure modes are structurally the same.
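Stripped of the hardware, the hook-and-chain convention is a table of function pointers. Here is a toy model of it in portable C. Everything in it (`vector_table`, `hook_vector`, `dispatch`, and the two sample handlers) is invented for illustration and stands in for neither the real IVT nor an NVIC vector table.

```c
#include <stddef.h>

typedef void (*isr_fn)(void);

#define VECTOR_COUNT 256
isr_fn vector_table[VECTOR_COUNT];   /* stand-in for the IVT */

/* Install a new handler and return the old one so it can be chained. */
isr_fn hook_vector(unsigned vector, isr_fn new_handler)
{
    isr_fn old = vector_table[vector];
    vector_table[vector] = new_handler;
    return old;
}

/* What the CPU does when the interrupt fires: call through the table. */
void dispatch(unsigned vector)
{
    if (vector_table[vector] != NULL)
        vector_table[vector]();
}

/* Sample handlers: the "BIOS" one, and a hook that chains to it. */
int bios_calls, hook_calls;
isr_fn previous_handler;

void bios_keyboard_isr(void) { bios_calls++; }

void my_keyboard_isr(void)
{
    hook_calls++;                 /* our own work */
    if (previous_handler != NULL)
        previous_handler();       /* chain to whoever was there before */
}
```

After `previous_handler = hook_vector(9, my_keyboard_isr)`, a single `dispatch(9)` runs both handlers. Forgetting the chain call is the model equivalent of a dead keyboard.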

**TSRs → background tasks.** A TSR allocated a resident block, hooked one or more interrupts, and returned control to DOS with `INT 27h` or `INT 21h/AH=31h`. From that point, it ran code only inside interrupt handlers or via a software interrupt interface exposed to the foreground. This is functionally identical to an RTOS task that blocks on a queue, or a bare-metal superloop that checks a flag set by an ISR. The discipline TSR authors developed around shared-data access — checking DOS’s InDOS flag, deferring work to a safe tick, never calling non-reentrant functions from interrupt context — maps directly to the rules every embedded developer learns about ISR-safe code. The vocabulary changed. The problem didn’t.

**Far pointers → memory region attributes.** In the large or huge DOS memory model, a pointer was 32 bits: a 16-bit segment plus a 16-bit offset, giving access to the full 1MB real-mode address space. Normalization was required when doing pointer arithmetic across segment boundaries. In IAR Embedded Workbench or Keil MDK targeting a Cortex-M7 with tightly-coupled memory (TCM), you annotate a variable or function with `__attribute__((section(".dtcm")))` and write a linker script region to place it at the TCM base address. The linker handles the “normalization.” You still have to think about which address region you’re targeting and why. A developer who never worked with segmented memory often has to be taught explicitly that not all RAM on an STM32H7 has the same latency. A DOS developer already knows this in their bones.
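For readers who never had to normalize a far pointer, the operation looks roughly like this. It is a sketch in portable C with made-up names (`far_ptr`, `far_to_linear`, `far_normalize`) and assumes the result stays within the 1MB real-mode range.

```c
#include <stdint.h>

struct far_ptr {
    uint16_t segment;
    uint16_t offset;
};

/* The linear address a far pointer refers to. */
uint32_t far_to_linear(struct far_ptr p)
{
    return ((uint32_t)p.segment << 4) + p.offset;
}

/* Normalize: fold all but the low 4 bits of the offset into the segment,
   so every linear address has exactly one representation (offset 0..15). */
struct far_ptr far_normalize(struct far_ptr p)
{
    struct far_ptr n;
    n.segment = (uint16_t)(p.segment + (p.offset >> 4));
    n.offset  = (uint16_t)(p.offset & 0x000Fu);
    return n;
}
```

The pointer 1234:5678 normalizes to 179B:0008. Both forms name linear address 0x179B8, but only the normalized forms can be compared or incremented across a 64KB boundary without surprises.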

**Memory models → linker scripts.** The DOS memory model you chose at compile time (`-mt` for tiny and `-ml` for large in Turbo C, `/AT` and `/AL` in Microsoft C) determined the default pointer sizes the compiler emitted and where code and data would sit. A tiny model program had code and data in one 64KB segment. A large model program had separate far code and far data segments. You chose based on binary size and access patterns. On STM32, you write a linker script that explicitly places `.text` in Flash, `.data` and `.bss` in SRAM, hot interrupt handlers in ITCM, and DMA buffers in a non-cached region. These are the same decisions with the same trade-offs: access speed, size limits, and pointer overhead. The linker script is just the memory model dialog box made explicit.

**PIT → SysTick.** Programming the 8253/8254 PIT on a PC meant writing a mode byte to the control port (43h) and then a 16-bit divisor to the channel 0 data port (40h). The chip divided its 1.193182 MHz input clock by that divisor to produce your tick rate. The default BIOS tick was 18.2 Hz (divisor 0, which the counter treats as 65536). If you wanted 1000 Hz, you wrote divisor 1193, which is as close as an integer divisor gets. On Cortex-M, SysTick is a 24-bit countdown timer clocked from the processor or an external reference. You write a reload value to `SYST_RVR`, set the clock source and enable bits in `SYST_CSR`, and your SysTick ISR fires at `F_CPU / (SYST_RVR + 1)`. The math is the same. The register names changed. A developer who has done PIT programming reads the SysTick section of the ARM documentation and recognizes it immediately.
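
Side by side, the two calculations look like this. It is a sketch of the arithmetic only, with no port or register writes. The function names are hypothetical, and the 72 MHz clock used in the note below is an assumed example value rather than a property of any particular board.

```c
#include <assert.h>
#include <stdint.h>

#define PIT_INPUT_HZ 1193182u   /* 8253/8254 input clock in Hz */

/* Divisor for PIT channel 0, rounded to the nearest integer and clamped
   to what the 16-bit counter can express. A result of 65536 is written
   to the hardware as 0. */
uint32_t pit_divisor(uint32_t rate_hz)
{
    uint32_t divisor;
    assert(rate_hz > 0);
    divisor = (PIT_INPUT_HZ + rate_hz / 2) / rate_hz;
    if (divisor < 1u)
        divisor = 1u;
    if (divisor > 65536u)
        divisor = 65536u;
    return divisor;
}

/* Value for SYST_RVR. The counter takes reload + 1 clocks per interrupt,
   hence the subtraction. */
uint32_t systick_reload(uint32_t clock_hz, uint32_t rate_hz)
{
    uint32_t ticks;
    assert(rate_hz > 0);
    ticks = clock_hz / rate_hz;
    assert(ticks >= 1u && ticks <= 0x1000000u);   /* 24-bit counter */
    return ticks - 1u;
}
```

`pit_divisor(1000)` comes out as 1193, matching the figure above, and `systick_reload(72000000, 1000)` gives 71999 for the assumed 72 MHz core clock. Asking the PIT for anything slower than about 18.2 Hz clamps to 65536, which is the hardware floor.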

—

### What DOS Got Right That Desktop Development Abandoned

This is not a complaint about progress. It’s an observation about what falls out when you remove hard constraints from a development environment.

**Deterministic timing.** A DOS program running on bare hardware could guarantee timing to within a single PIT tick — typically 54.9ms at the default rate, or better if you reprogrammed the PIT. No scheduler preempted you between two lines of code. No kernel decided to run a background task during your timing-critical loop. This is why vintage games and MIDI sequencers were written for DOS rather than Windows 3.1: Windows introduced scheduler jitter that made sample-accurate audio impossible. The embedded world preserved this property because it has no choice. A general-purpose OS scheduler, by design, sacrifices determinism for throughput and fairness. That’s the right trade-off for a file server. It’s the wrong trade-off for a motor controller or a digital filter.

**Build times measured in seconds.** A typical DOS C program — a commercial application, not a toy — compiled and linked in under a minute, usually in seconds, on the hardware of the era. The program was small, the dependencies were explicit, and the toolchain had no plugin ecosystem to load. In 2026, a mid-size JavaScript application can take several minutes to build from scratch because the dependency graph involves hundreds of packages, each with its own compilation step, and the build tool has to resolve, transpile, and bundle all of it. A bare-metal embedded project built with make and arm-none-eabi-gcc still compiles in seconds for the same reason DOS programs did: the source is small, the dependencies are controlled, and nothing is being discovered at build time. The discipline that produced small DOS programs produces fast embedded builds.

**Binary size as a design constraint.** A commercial DOS application had to fit in 640KB of conventional memory along with its data, the DOS kernel, and any loaded TSRs. This forced developers to make explicit decisions about what code shipped. Dead code elimination wasn’t a compiler optimization you hoped for; it was a discipline you practiced manually. The result was software where every byte was owned and accounted for. Compare this to a modern Electron application, where the base runtime alone is measured in hundreds of megabytes before a single line of application code runs. The embedded world still enforces the old constraint: an STM32F103C8 has 64KB of Flash. You will know what is in it. You will remove what doesn’t need to be there. The habit of mind that let a DOS developer ship a full-featured application in 200KB is the same habit that lets an embedded developer fit a useful firmware image in 32KB.

The constraints produced discipline. The discipline is still valuable — not because the past was better, but because the physics that made those constraints real haven’t changed. RAM is still finite. Interrupts still have latency. Timers still divide clocks. DOS developers had to reason about all of it explicitly, and explicitly is still the right way to reason about it.

## Version Control and Workflow for DOS Code in 2026

Version control really was close to nonexistent in 1980s DOS shops. Most tracked changes through dated backup directories, if at all. A developer shipping a TSR in 1988 might have kept `MYAPPV2.ASM`, `MYAPPV3.ASM`, and `MYAPPV3B.ASM` in a single directory and called it a day. That was the reality.

A 2026 reader working on DOS code does not have that excuse.

**Use Git on the host machine. Full stop.** DOSBox-X mounts a host directory and makes it available as a DOS drive letter. The files sitting under that mount point are ordinary host filesystem files — Git sees them, `git diff` works on them, `git bisect` works on them. There is no special DOS-aware tooling required and no reason to invent any.

This is already standard practice in the DOS open-source world. The FreeDOS project manages a substantial codebase through Git on modern hosts. Chocolate Doom and similar source ports of DOS-era games treat that code the same way they would treat any C codebase: Git repository, pull requests, changelogs. The tooling gap that made version control impractical in 1988 simply does not exist anymore.

Practical setup is two commands:

    MOUNT C C:\Users\you\projects\myapp
    C:

On Linux or macOS:

    MOUNT C /home/you/projects/myapp
    C:

Your source lives at `/home/you/projects/myapp` on the host. You edit with whatever editor you prefer on the host — VS Code, Vim, anything. You run `git commit` on the host. You switch into DOSBox-X only to compile and test. The DOS environment is a build and execution target, not a development environment in the full sense.

Do not install a DOS-native version control tool. There were DOS ports of RCS and similar utilities. They are not worth your time. The filesystem is shared; use the host-side tooling.

**The one genuine friction point is line endings.**

Host Git on Linux or macOS defaults to LF. Some DOS assemblers and compilers are fine with LF-only source files. Others are not — particularly older Turbo Assembler and certain versions of Turbo C, which expect CRLF and will produce confusing parse errors or simply refuse to process a file if the line endings are wrong. The failure mode is annoying because the error message rarely says “wrong line endings.” You get something like `Error: unexpected end of file` on line 1, and you spend twenty minutes checking syntax before you think to check `xxd`.

Set this in your `.gitconfig` or directly in the repository’s `.git/config`:

    [core]
        autocrlf = false

Then add a `.gitattributes` file to the repository that specifies endings explicitly:

    *.asm text eol=crlf
    *.c text eol=crlf
    *.h text eol=crlf
    *.mak text eol=crlf

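This way the files in your working tree always have CRLF endings for those types, whatever platform the host is running. Git still stores them with LF endings inside the repository and converts on checkout and commit, but the conversion is now explicit, versioned with the code, and independent of each contributor’s `core.autocrlf` setting. If you are working across a mixed team, say someone editing source on Windows and someone else on Linux, explicit `.gitattributes` rules are the most reliable way to prevent silent corruption.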
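
When a DOS tool rejects a file for no visible reason, counting the line endings is quicker than reading a hex dump. The helper below is a hypothetical sketch, not part of Git or any DOS toolchain. It classifies each line feed in a buffer as part of a CRLF pair or as a bare LF. To use it on a real file, read the file in binary mode (`"rb"`) so the C runtime does not translate line endings before you see them.

```c
#include <assert.h>
#include <stddef.h>

/* Tally of line endings found in a buffer (hypothetical type). */
typedef struct {
    size_t crlf;      /* lines ending in CR LF */
    size_t bare_lf;   /* lines ending in LF with no preceding CR */
} eol_counts;

/* Scan len bytes and classify every LF by the byte that precedes it. */
eol_counts count_line_endings(const char *buf, size_t len)
{
    eol_counts c = {0, 0};
    size_t i;
    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++) {
        if (buf[i] != '\n')
            continue;
        if (i > 0 && buf[i - 1] == '\r')
            c.crlf++;
        else
            c.bare_lf++;
    }
    return c;
}
```

A source file that is safe for the stricter DOS tools should report zero bare LFs. Any nonzero count means the file was saved or checked out with the wrong endings.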

One additional edge: if you use a host-side editor that strips trailing whitespace or normalizes line endings on save, check its settings before you start. VS Code shows the current file’s line ending (LF or CRLF) in the status bar and preserves it when saving an existing file, so files checked out as CRLF stay CRLF. New files follow the `files.eol` setting instead, so set it to `\r\n` for the workspace so that newly created source files start out with CRLF too.

The short version: the workflow is host Git, host editor, DOSBox-X for compilation only, `autocrlf=false`, explicit `.gitattributes`. That eliminates every version control problem that made 1980s DOS development difficult to maintain, and it costs you about five minutes of setup.

## FAQ: DOS Development Questions That Have Real Search Volume

—

**Q: Can I legally download Turbo C today?**

A: Partially yes. Borland released Turbo C 2.01 and Turbo C++ 1.01 as free downloads through its “antique software” museum, so those two are clean to download and use. Borland C++ 3.1 and the other versions are different: they have not been released under a free license as of 2026, so copies floating around on abandonware sites are technically infringing regardless of how dead the product line is.

For new DOS development, the answer is OpenWatcom C/C++. It’s actively maintained, produces correct real-mode and protected-mode output, has a working DOS target, and carries a proper open-source license. Turbo C’s IDE is nostalgic but OpenWatcom’s optimizer is better anyway. Don’t start a new project on a 35-year-old compiler you can’t legally redistribute just because the blue screen is comforting.

—

**Q: What is the difference between DOSBox and DOSBox-X?**

A: DOSBox was built primarily for running games. It makes tradeoffs that favor compatibility with a specific slice of late-DOS gaming titles over accuracy to how DOS actually behaved across its full version history.

DOSBox-X is a fork specifically aimed at correctness and completeness. The practical differences for a developer:

- EMS and XMS emulation that actually behaves like the real hardware, not a close approximation
- Configurable CPU types so you can target 8086, 286, or 386 behavior separately
- CGA, EGA, and VGA emulation accurate enough that you’ll catch display bugs your code actually has rather than bugs the emulator papers over
- DOS version selection that matters when you’re testing whether your code runs on DOS 3.3 versus 6.22

If you’re writing software, DOSBox-X is the right tool. DOSBox is fine for playing Doom. These are different jobs.

—

**Q: Can I run real DOS programs I compiled on actual vintage hardware?**

A: Yes, with caveats worth understanding before you plug in that parallel port cable.

A real-mode COM file or small-model EXE compiled against standard DOS interrupts with no DPMI extensions will run on genuine 8086 or 286 hardware without modification, as long as the compiler or assembler was targeting the 8086 instruction set and did not emit 186, 286, or 386 opcodes. OpenWatcom and NASM both produce output that works here. The constraints are the same ones original DOS developers faced: stay in real mode, don’t assume extended memory management that requires a driver your target machine might not have loaded, and don’t rely on a CPU timing assumption that collapses on a 4.77 MHz 8088.

The workflow that makes sense: iterate fast under DOSBox-X where you can snapshot state, reboot instantly, and add debug output without caring about your hardware. Then, when IRQ timing or actual hardware interrupt latency matters — which it will if you’re writing a TSR or anything touching hardware directly — validate on the physical machine. DOSBox-X’s CPU timing emulation is good but it is still emulation. A real ISA bus does not behave identically to a software model of one.

—

**Q: What replaced MASM for free, legal x86 assembly development?**

A: NASM, the Netwide Assembler, is the standard answer. It uses Intel syntax, assembles flat binaries and COM files directly with `-f bin`, emits 16-bit OMF object files with `-f obj` that a DOS linker turns into MZ EXEs, is actively maintained, and has documentation that is genuinely readable. The community around it is large enough that most real-mode assembly questions have been answered somewhere.

FASM (Flat Assembler) is the other serious option. It’s self-hosting, extremely fast, and has its own syntax that some people prefer. Both produce correct real-mode output and both are free and open-source.

MASM itself is technically available through Visual Studio’s build tools, but you’d be pulling in a large toolchain to get one assembler, and Microsoft’s licensing around redistributing MASM-produced binaries in certain contexts has historically been murky enough to be annoying. For a clean, simple setup that produces DOS-compatible output without a legal question mark hanging over it, NASM is the default recommendation.

Written by Eric Woo

Lead AI Engineer & SaaS Strategist

Eric is a seasoned software architect specializing in LLM orchestration and autonomous agent systems. With over 15 years in Silicon Valley, he now focuses on scaling AI-first applications.