DMA vs Interrupt Data Transfer

Streaming 44.1 kHz stereo audio from an I2S codec to a DAC on an STM32F4 with one interrupt per sample fires 88,200 ISRs per second (44,100 samples × 2 channels), and the CPU can spend around 30% of its time entering ISRs, saving registers, and returning. Route the same I2S peripheral through a DMA stream and the CPU cost of moving the data drops to nearly zero: the DMA controller moves data between the I2S data register and SRAM while the CPU runs the audio DSP algorithm. That is the practical value of DMA, and it matters every time you push data rates above a few thousand transfers per second.

Side-by-side comparison

| Parameter | Interrupt Data Transfer | DMA |
| --- | --- | --- |
| CPU involvement per transfer | Full: ISR fires per byte or block; CPU saves context and runs the ISR | None during the transfer: DMA copies autonomously |
| CPU load at 1 Mbps UART | ~30–50% of CPU time spent on ISR overhead | ~2% (setup plus completion ISR only) |
| Latency to first byte | Low: ISR entry within 12 cycles (ARM Cortex-M4) | Slightly higher: DMA request, arbitration, and bus grant add ~3–5 cycles |
| Setup complexity | Configure NVIC priority, ISR function, volatile flag | Configure DMA stream, direction, burst size, memory/peripheral increment; enable DMAEN |
| Transfer size | Efficient for 1–8 bytes per event | Efficient for blocks from 32 bytes up to megabytes |
| Memory-to-memory | Not applicable | Possible: DMA can copy between two SRAM regions without the CPU |
| Circular mode | Not native: the transfer must be re-armed in the ISR | Native circular mode restarts automatically; well suited to ADC sampling buffers |
| Bus bandwidth impact | CPU performs every access over AHB/APB itself, so time spent in the ISR is lost to other work | DMA is a separate master on the AHB bus matrix; CPU and DMA can work in parallel (STM32 AHB matrix) |
| Typical use case | Low-rate GPIO events, single-byte SPI commands, keypad scans | ADC streaming (12-bit, 1 MSPS), UART above 115200 baud, I2S audio, SD card block writes |
| Example MCU | STM32F4: UART RXNE interrupt, 1 byte per ISR at 9600 baud | STM32F4: ADC1 via DMA2 Stream0 Ch0 into a 1024-sample SRAM buffer, with half-transfer and transfer-complete ISRs |

Key differences

An interrupt fires once per transfer event. Receiving a 256-byte packet over UART at 115200 baud means 256 ISR entries, 256 hardware context saves (eight registers stacked on Cortex-M4), and 256 returns: roughly 6000 clock cycles spent purely on entry and exit, which corresponds to about 24 cycles per ISR. DMA raises two interrupts (half-transfer and transfer-complete) for the same 256 bytes, about 50 cycles of overhead. On the STM32F4 the DMA controller is a separate master on the AHB bus matrix, so the CPU and DMA can transfer simultaneously when they target different slaves; the CPU only waits when both contend for the same memory or peripheral. DMA does not replace interrupts: the DMA completion ISR is still needed to process the received buffer. DMA removes the per-byte interrupt overhead while keeping the end-of-block notification.
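The cycle counts in the paragraph above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not a measurement: it assumes about 24 cycles of combined entry and exit overhead per ISR (the 12-cycle Cortex-M4 entry latency plus an assumed similar return cost), and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed cost of entering and leaving one ISR on a Cortex-M4, in cycles:
 * 12 cycles of entry latency plus an assumed 12 cycles to return. */
#define ISR_ENTRY_EXIT_CYCLES 24u

/* Total entry/exit overhead for a given number of ISR invocations. */
uint32_t isr_overhead_cycles(uint32_t isr_count)
{
    return isr_count * ISR_ENTRY_EXIT_CYCLES;
}

/* Per-byte interrupts: one ISR for every byte in the packet. */
uint32_t interrupt_overhead(uint32_t packet_bytes)
{
    return isr_overhead_cycles(packet_bytes);
}

/* DMA: one half-transfer ISR and one transfer-complete ISR per packet,
 * regardless of packet length. */
uint32_t dma_overhead(uint32_t packet_bytes)
{
    assert(packet_bytes > 0);   /* an empty packet needs no transfer */
    return isr_overhead_cycles(2u);
}
```

Calling `interrupt_overhead(256)` and `dma_overhead(256)` gives the two totals being compared. The ISR body itself is deliberately excluded, since only the fixed entry and exit cost differs between the two approaches.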

When to use Interrupt Data Transfer

Use interrupt-driven transfer for low-rate events (< 10 kHz), short bursts (< 8 bytes), or when the data must be processed byte by byte as it arrives. Example: an STM32L4 I2C slave at 400 kHz takes one interrupt per received byte. Each byte occupies 9 clock periods (8 data bits plus ACK), so the peak ISR rate is 400 kHz / 9 ≈ 44 kHz, and only for the 8 bytes of each frame. That is manageable without DMA.

When to use DMA

Use DMA for any peripheral streaming more than 32 bytes continuously or at rates above 100 kHz. Example: an STM32F407 samples its 12-bit ADC at 1 MSPS using DMA2 in circular mode into a 1024-sample double buffer. The CPU receives a half-transfer or transfer-complete interrupt every 512 samples (512 µs) and processes one half while DMA fills the other, losing no samples with < 1% CPU load spent on interrupt handling.
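The 512 µs figure for the ADC example follows directly from the buffer size and sample rate. A small sketch of that calculation is shown here; the function names and the 2 µs ISR cost used in the usage note are illustrative assumptions, and the load figure covers interrupt handling only, not the time spent processing samples.

```c
#include <assert.h>
#include <stdint.h>

/* Time between half-buffer interrupts, in microseconds, for a circular
 * buffer of buffer_samples filled at sample_rate_hz. */
uint32_t half_buffer_period_us(uint32_t buffer_samples, uint32_t sample_rate_hz)
{
    assert(sample_rate_hz > 0);
    uint64_t half = buffer_samples / 2u;
    return (uint32_t)((half * 1000000u) / sample_rate_hz);
}

/* CPU load from interrupt handling alone, in parts per million,
 * given the interrupt period and an assumed ISR duration. */
uint32_t isr_load_ppm(uint32_t period_us, uint32_t isr_cost_us)
{
    assert(period_us > 0);
    return (uint32_t)(((uint64_t)isr_cost_us * 1000000u) / period_us);
}
```

With a 1024-sample buffer at 1,000,000 samples per second, `half_buffer_period_us` returns 512. Assuming a 2 µs ISR, `isr_load_ppm(512, 2)` comes to about 3900 ppm, or roughly 0.4%, which is consistent with the sub-1% claim.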

Recommendation

For any data rate above 100 kbps or block sizes above 32 bytes, configure DMA — CPU load reduction and bus throughput are decisive. Use interrupts for low-rate, byte-by-byte events where DMA setup overhead outweighs savings. On STM32F4, always pair DMA with double-buffering for audio and ADC to process one half while DMA fills the other.
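The rule of thumb above can be written down as a tiny decision helper. This is only a sketch of the heuristic stated in the recommendation: the thresholds are the 100 kbps and 32-byte figures from the text, and the type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { XFER_INTERRUPT, XFER_DMA } xfer_mode_t;

/* Thresholds taken from the recommendation in the text. */
#define DMA_RATE_THRESHOLD_BPS    100000u  /* 100 kbps */
#define DMA_BLOCK_THRESHOLD_BYTES 32u

/* Pick DMA when either the data rate or the block size exceeds its
 * threshold; otherwise stay with per-event interrupts. */
xfer_mode_t choose_transfer_mode(uint32_t rate_bps, uint32_t block_bytes)
{
    assert(block_bytes > 0);   /* a transfer moves at least one byte */
    if (rate_bps > DMA_RATE_THRESHOLD_BPS ||
        block_bytes > DMA_BLOCK_THRESHOLD_BYTES) {
        return XFER_DMA;
    }
    return XFER_INTERRUPT;
}
```

A 9600 baud single-byte command stays on interrupts, while a 1 Mbps stream or a 512-byte block selects DMA. Real designs also weigh latency requirements and how many DMA streams are free, which this helper ignores.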

Exam tip: Examiners often ask students to calculate the CPU overhead of interrupt-driven UART receive at 115200 baud, 8N1, assuming each ISR takes 1 µs. With 10 bits on the wire per byte, the ISR fires 115200/10 = 11,520 times per second and consumes 11,520 µs of every second, which is 1.15% CPU overhead. At 1 Mbps the same calculation gives 100,000 ISRs per second and 10% overhead, which justifies DMA.
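The exam calculation can be checked in code. The sketch below assumes 8N1 framing (10 bits per byte) and takes the ISR duration as a parameter; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bits on the wire per byte for 8N1: 1 start + 8 data + 1 stop. */
#define BITS_PER_FRAME_8N1 10u

/* ISR invocations per second when every received byte raises an interrupt. */
uint32_t uart_isr_rate_hz(uint32_t baud)
{
    return baud / BITS_PER_FRAME_8N1;
}

/* CPU time spent in the ISR, in parts per million of each second,
 * given the ISR duration in nanoseconds. */
uint32_t uart_isr_load_ppm(uint32_t baud, uint32_t isr_ns)
{
    assert(isr_ns > 0);
    uint64_t busy_ns = (uint64_t)uart_isr_rate_hz(baud) * isr_ns;
    return (uint32_t)(busy_ns / 1000u);   /* ns per second -> ppm */
}
```

For 115200 baud and a 1000 ns ISR the result is 11,520 ppm (1.15%); for 1,000,000 baud it is 100,000 ppm (10%), matching the figures in the tip.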

Interview tip: An embedded systems interviewer at a hardware company will ask you to describe double-buffered DMA for ADC streaming. Explain that the SRAM buffer is treated as two halves (or two separate buffers): DMA fills half A while the CPU processes half B, and the half-transfer and transfer-complete interrupts tell the CPU when to switch. This avoids overrun as long as the CPU finishes processing each half before DMA wraps around to it, and the sample rate stays within SRAM bandwidth.
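The hand-off between the two halves can be illustrated with a host-side simulation. The sketch below does not touch real hardware: `sim_dma_write` is a hypothetical stand-in for the DMA controller, the two callbacks stand in for the half-transfer and transfer-complete ISRs, and the 8-sample buffer is chosen only to keep the example small. It runs sequentially, so it shows the switching logic but not the true parallelism.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BUF_SAMPLES  8u                  /* small buffer for illustration */
#define HALF_SAMPLES (BUF_SAMPLES / 2u)

static uint16_t adc_buf[BUF_SAMPLES];    /* circular DMA target buffer */
static size_t   dma_pos;                 /* next slot the simulated DMA writes */
static uint32_t processed_sum;           /* result accumulated by the CPU */
static uint32_t processed_count;         /* samples the CPU has consumed */

/* CPU-side work: consume one half while the other half is being filled. */
static void process_half(const uint16_t *half)
{
    for (size_t i = 0; i < HALF_SAMPLES; i++) {
        processed_sum += half[i];
        processed_count++;
    }
}

/* Stand-in for the half-transfer ISR: the first half is ready. */
static void on_half_transfer(void)     { process_half(&adc_buf[0]); }

/* Stand-in for the transfer-complete ISR: the second half is ready. */
static void on_transfer_complete(void) { process_half(&adc_buf[HALF_SAMPLES]); }

/* Simulate the DMA controller storing one sample in circular mode. */
void sim_dma_write(uint16_t sample)
{
    adc_buf[dma_pos++] = sample;
    if (dma_pos == HALF_SAMPLES) {
        on_half_transfer();
    } else if (dma_pos == BUF_SAMPLES) {
        dma_pos = 0;                     /* circular mode wraps automatically */
        on_transfer_complete();
    }
}

/* Feed samples with values 1..count and return the sum the CPU processed. */
uint32_t sim_run(uint32_t count)
{
    assert(count % HALF_SAMPLES == 0);   /* whole halves only, for simplicity */
    dma_pos = 0;
    processed_sum = 0;
    processed_count = 0;
    for (uint32_t i = 1; i <= count; i++) {
        sim_dma_write((uint16_t)i);
    }
    return processed_sum;
}

/* Number of samples consumed by the CPU in the last run. */
uint32_t sim_processed_count(void) { return processed_count; }
```

Running `sim_run(16)` pushes two full buffers through the simulation; every sample is processed exactly once, in order, without the writer ever stopping. On real hardware the same structure only holds if `process_half` completes within one half-buffer period.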
