Error Detection and Correction Interview Questions

Error detection and correction questions appear in technical interviews and written tests at TCS, Infosys, Wipro, Qualcomm, and Samsung for ECE/EEE students in digital systems, communication, and embedded roles. These questions are also common in GATE-preparation interviews. Expect them in the first or second technical round, often paired with data communication and digital logic design questions.

Interview questions & answers

Q1. What is the difference between error detection and error correction?

Error detection identifies that one or more bits in a received codeword have been corrupted, while error correction additionally determines which bits were corrupted and restores them to the correct value — correction requires more redundant bits than detection alone. A simple parity bit in a PS/2 keyboard interface detects single-bit errors but cannot correct them; a Hamming(7,4) code adds 3 redundant bits to 4 data bits and can both detect and correct any single-bit error. The trade-off is bandwidth efficiency: more powerful codes add more overhead bits.
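The Hamming-distance idea behind this trade-off fits in a few lines of Python (an illustrative sketch, not taken from any particular codec):

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two codewords differ."""
    return bin(a ^ b).count("1")

# two 7-bit codewords at distance 3: a code with minimum distance 3
# can detect any 2-bit error, or correct any 1-bit error
print(hamming_distance(0b0000000, 0b0000111))   # 3
```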

Follow-up: What is the Hamming distance between two codewords and how does it determine error detection/correction capability?

Q2. What is parity checking and what are its limitations?

Parity checking appends one extra bit to a data word to make the total number of 1s either even (even parity) or odd (odd parity), and the receiver checks whether the parity is preserved after transmission. In UART serial communication (like an Arduino Mega communicating at 9600 baud), even parity detects any single-bit error but cannot detect 2-bit errors (two flips cancel out) and cannot correct any error. The limitation is that parity only detects odd numbers of bit errors — even numbers of flipped bits produce valid parity and go undetected.
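A minimal even-parity sketch showing both the detection of a single flip and the blind spot for double flips (the 7-bit word is an arbitrary example):

```python
def even_parity_bit(data: int) -> int:
    """Parity bit that makes the total number of 1s even."""
    return bin(data).count("1") % 2

def check_even_parity(data: int, parity: int) -> bool:
    return (bin(data).count("1") + parity) % 2 == 0

word = 0b1011001                              # 4 ones -> parity bit 0
p = even_parity_bit(word)
assert check_even_parity(word, p)             # clean transmission passes
assert not check_even_parity(word ^ 0b1, p)   # single flip is detected
assert check_even_parity(word ^ 0b11, p)      # double flip slips through
```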

Follow-up: What is a two-dimensional parity scheme and how does it improve over single-byte parity?

Q3. What is the Hamming code and how does it correct single-bit errors?

Hamming code inserts redundant parity bits at positions that are powers of 2 (positions 1, 2, 4, 8, ...) within the data stream, where each parity bit covers a specific set of data bit positions; the syndrome (XOR of parity check results) directly gives the bit position in error as a binary number. In Hamming(7,4), 4 data bits and 3 parity bits are combined; if bit 5 flips, the syndrome evaluates to 101 binary = 5, pointing exactly to bit 5 for correction. This self-locating error property makes Hamming code the foundation of ECC RAM used in server-grade DDR4 memory modules.
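A sketch of Hamming(7,4) in Python reproducing the bit-5 example above (parity bits at positions 1, 2, 4; the helper names are my own):

```python
def hamming74_encode(d):
    """d: list of 4 data bits -> 7-bit codeword (positions 1-7).
    Data go to positions 3, 5, 6, 7; parity to positions 1, 2, 4."""
    c = [0] * 8                       # index 0 unused; 1-indexed positions
    c[3], c[5], c[6], c[7] = d
    for p in (1, 2, 4):               # parity bit p covers positions with bit p set
        c[p] = 0
        for i in range(1, 8):
            if i != p and (i & p):
                c[p] ^= c[i]
    return c[1:]

def hamming74_syndrome(code):
    """Return the 1-indexed error position, or 0 if no error is detected."""
    c = [0] + list(code)
    s = 0
    for p in (1, 2, 4):
        parity = 0
        for i in range(1, 8):
            if i & p:
                parity ^= c[i]
        if parity:
            s += p
    return s

cw = hamming74_encode([1, 0, 1, 1])
cw[4] ^= 1                            # flip position 5 (list index 4)
pos = hamming74_syndrome(cw)
print(pos)                            # 5: the syndrome locates the flip
cw[pos - 1] ^= 1                      # correct it
assert hamming74_syndrome(cw) == 0
```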

Follow-up: What is SEC-DED (Single Error Correct, Double Error Detect) and how is it an extension of Hamming code?

Q4. What is the minimum Hamming distance required for a code to detect d errors and correct t errors?

To detect d errors, minimum Hamming distance dmin ≥ d+1; to correct t errors, dmin ≥ 2t+1. A code with dmin = 3 (like Hamming code) can correct 1 error (2×1+1=3) or detect 2 errors (2+1=3). A BCH code used in flash memory controllers like the Samsung K9F1G08 NAND chip has dmin = 15, enabling correction of up to 7-bit errors in a 512-byte sector — necessary because NAND flash has higher raw bit error rates than DRAM.
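The two bounds can be wrapped in a tiny helper (illustrative only):

```python
def capability(dmin: int) -> dict:
    """Detecting d errors needs dmin >= d + 1; correcting t needs dmin >= 2t + 1."""
    return {"detect": dmin - 1, "correct": (dmin - 1) // 2}

print(capability(3))    # Hamming code: {'detect': 2, 'correct': 1}
print(capability(15))   # BCH with dmin = 15: {'detect': 14, 'correct': 7}
```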

Follow-up: Can a code simultaneously correct t errors and detect more than t errors? Explain.

Q5. What is a CRC (Cyclic Redundancy Check) and how does it work?

CRC treats the data block as a polynomial, divides it by a fixed generator polynomial using modulo-2 arithmetic (XOR, no carries), and appends the remainder as the CRC checksum; the receiver performs the same division and checks for zero remainder. CRC-32 using generator polynomial 0x04C11DB7 is used in Ethernet frames — a 32-bit CRC can detect all burst errors up to 32 bits long and virtually all longer burst errors. CRC is more powerful than simple checksum because it detects burst errors (common in noisy channels) that checksums miss.
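Modulo-2 long division can be sketched directly in Python; this toy uses a small degree-3 generator x³ + x + 1 rather than CRC-32, but the mechanics are identical:

```python
def crc_remainder(data: int, data_bits: int, poly: int, poly_bits: int) -> int:
    """Modulo-2 long division: shift the data left by deg(G) bits,
    then repeatedly XOR the generator under each leading 1."""
    crc_len = poly_bits - 1                  # degree of the generator
    reg = data << crc_len                    # data(x) * x^deg(G)
    for shift in range(data_bits - 1, -1, -1):
        if reg & (1 << (shift + crc_len)):   # leading bit of current window set?
            reg ^= poly << shift
    return reg                               # remainder fits in crc_len bits

# toy generator G(x) = x^3 + x + 1 (0b1011); 14-bit message
msg = 0b11010011101100
r = crc_remainder(msg, 14, 0b1011, 4)
print(bin(r))                                    # 0b100
sent = (msg << 3) | r                            # transmit data + CRC
assert crc_remainder(sent, 17, 0b1011, 4) == 0   # receiver sees zero remainder
```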

Follow-up: What is a burst error and why is CRC better than parity for detecting burst errors?

Q6. What is a generator polynomial in CRC and how is it selected?

The generator polynomial is a predefined binary polynomial used as the divisor in CRC computation; it is selected based on the error-detecting properties required — its degree equals the number of CRC bits and its structure determines which error patterns it can detect. CRC-CCITT uses G(x) = x¹⁶ + x¹² + x⁵ + 1 (0x1021), detecting all burst errors of length ≤ 16 and used in HDLC and X.25 protocols. Different standards use different polynomials: CRC-32 for Ethernet, CRC-16 for Modbus and USB data packets, CRC-8 for SMBus packet error checking and 1-Wire sensors like the DS18B20.

Follow-up: What property must the generator polynomial have to detect all odd-numbered bit errors?

Q7. What is a convolutional code and how does it differ from a block code?

A convolutional code encodes data continuously (the encoder has memory — output depends on current and previous input bits), while a block code encodes a fixed-size block independently with no memory between blocks. A rate-1/2 convolutional code with constraint length K=7 (the industry-standard code used in IEEE 802.11a Wi-Fi and satellite links; GSM speech coding uses a K=5 variant) produces 2 output bits for every 1 input bit, using the last 6 input bits as memory. Convolutional codes are decoded using the Viterbi algorithm, which finds the most likely transmitted sequence through a trellis; 4G LTE turbo codes build on convolutional constituent encoders with iterative decoding, while 5G replaced them with LDPC codes for data channels.
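A minimal rate-1/2 encoder sketch, scaled down to constraint length K=3 with generators 7 and 5 octal (a textbook toy, not the K=7 code used in practice):

```python
def conv_encode(bits, g1=0b111, g2=0b101):
    """Rate-1/2 convolutional encoder, K=3 (generators 7, 5 octal).
    The 3-bit register holds the current bit plus the last 2 inputs."""
    state = 0                                  # previous K-1 = 2 input bits
    out = []
    for b in bits:
        reg = (b << 2) | state                 # newest bit in the MSB
        out.append(bin(reg & g1).count("1") % 2)   # first output stream
        out.append(bin(reg & g2).count("1") % 2)   # second output stream
        state = (reg >> 1) & 0b11              # shift: drop the oldest bit
    return out

print(conv_encode([1, 0, 1, 1]))               # 8 output bits for 4 input bits
```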

Follow-up: What is the Viterbi algorithm and what computational structure (trellis) does it operate on?

Q8. What is interleaving in error correction and why is it used?

Interleaving reorders bits across multiple codewords before transmission so that a burst error in transmission affects different codewords when de-interleaved at the receiver, converting a burst error into isolated random errors that error correction codes can handle. In GSM voice transmission, the 456 coded bits of each 20 ms speech frame are spread across 8 TDMA bursts (each burst also carrying bits of the neighbouring frame); a 3-burst dropout (common in fading) only erases a fraction of any one speech frame's bits, within the correction capability of the channel code. Without interleaving, a burst would wipe out an entire codeword beyond correction.
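A block-interleaver sketch with integer stand-ins for coded bits, showing a burst spread into one erasure per codeword (the 4×3 geometry is arbitrary):

```python
def interleave(bits, depth, width):
    """Write row-by-row into a depth x width block, read column-by-column."""
    assert len(bits) == depth * width
    rows = [bits[i * width:(i + 1) * width] for i in range(depth)]
    return [rows[r][c] for c in range(width) for r in range(depth)]

def deinterleave(bits, depth, width):
    # the inverse operation: swap the roles of rows and columns
    return interleave(bits, width, depth)

data = list(range(12))              # stand-ins for 4 codewords of 3 symbols
tx = interleave(data, 4, 3)
tx[0:4] = ["X"] * 4                 # 4-symbol burst error on the channel
rx = deinterleave(tx, 4, 3)
print(rx)                           # burst spread: 1 erasure per codeword row
```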

Follow-up: What is the interleaving depth and how does it affect both error correction performance and latency?

Q9. What is the Reed-Solomon code and where is it used?

Reed-Solomon (RS) is a block code that operates on multi-bit symbols (typically bytes) rather than individual bits, making it extremely powerful against burst errors that corrupt multiple consecutive bits within a symbol. RS(255,223) used in the NASA deep space standard adds 32 redundant bytes to 223 data bytes per block, correcting up to 16 corrupted bytes regardless of how many bits within each byte are flipped. It is used in QR codes, DVDs (whose product code combines an RS(208,192) outer code with an RS(182,172) inner code), Blu-ray discs, and satellite communications.

Follow-up: What is the difference between the inner and outer codes in a concatenated coding scheme like that used in DVDs?

Q10. How does LDPC (Low-Density Parity-Check) code work and where is it used?

LDPC codes are linear block codes defined by sparse parity check matrices (most entries are zero), decoded iteratively using the belief propagation algorithm on a bipartite Tanner graph, achieving near-Shannon-limit performance. 5G NR uses LDPC codes for data channels (PDSCH/PUSCH) with code rates from 1/5 to 8/9, replacing the LTE turbo code because LDPC achieves better throughput at high code rates (needed for gigabit 5G). The sparse matrix structure enables highly parallel VLSI implementation, which is critical for the throughput requirements of 5G base stations.

Follow-up: What is the Shannon capacity theorem and what does 'near-Shannon-limit' performance mean for a channel code?

Q11. What is a checksum and how does it compare to CRC?

A checksum is computed by summing all data bytes (with or without carry) and transmitting the sum; the receiver recomputes the sum and compares. IPv4 header checksum uses 16-bit one's complement addition over the header bytes — if they match, the header is assumed intact. CRC is more powerful because it detects burst errors and any 2-bit error, while checksum may miss transpositions (swapping two bytes leaves sum unchanged) and certain burst patterns. CRC is therefore used for data payloads (Ethernet, ZIP files) while checksum is acceptable for headers with random error profiles.
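The transposition blind spot is easy to demonstrate; the checksum below is a simplified byte sum rather than the exact IPv4 one's-complement sum, paired with a standard CRC-16-CCITT implementation:

```python
def checksum16(data: bytes) -> int:
    """Simplified byte-sum checksum (not the exact IPv4 one's-complement sum)."""
    return sum(data) & 0xFFFF

def crc16_ccitt(data: bytes, poly: int = 0x1021, crc: int = 0xFFFF) -> int:
    """Bitwise CRC-16-CCITT (MSB first, initial value 0xFFFF)."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ poly if crc & 0x8000 else crc << 1) & 0xFFFF
    return crc

a, b = b"\x12\x34\x56", b"\x12\x56\x34"   # same bytes, two transposed
assert checksum16(a) == checksum16(b)      # the sum cannot see the swap
assert crc16_ccitt(a) != crc16_ccitt(b)    # the CRC does
```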

Follow-up: Why does a checksum fail to detect the transposition of two data bytes, and does CRC detect this?

Q12. What is the relationship between code rate and error correction capability?

Code rate R = k/n (data bits / total codeword bits); lower code rate means more redundancy and higher error correction power, but less bandwidth efficiency. A rate-1/2 convolutional code uses twice the bandwidth of uncoded transmission but provides substantial coding gain — in satellite communications, a rate-1/2 code with Viterbi decoding gives approximately 5dB coding gain, meaning the transmitter power can be reduced by a factor of 3 for the same bit error rate. The trade-off between code rate and correction power is central to every channel coding design decision.
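The dB-to-power-ratio arithmetic behind the "factor of 3" claim:

```python
# coding gain in dB converted to a linear power ratio
gain_db = 5.0
power_ratio = 10 ** (gain_db / 10)
print(round(power_ratio, 2))   # 3.16, i.e. roughly a factor of 3 in power
```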

Follow-up: What is coding gain and how is it measured in terms of Eb/N0?

Q13. How is the Hamming code used in ECC DRAM memory?

ECC DRAM uses an extended Hamming code (SEC-DED — single error correct, double error detect) where an extra overall parity bit is added to the standard Hamming code. In a 64-bit ECC DDR4 DIMM like Samsung's server modules, 8 additional bits (ECC bits on an extra ×8 chip) allow the memory controller to correct any single DRAM cell error and detect any double-bit error in real time without stopping operation. This is critical in servers running databases or virtual machines where a single bit flip in memory can cause data corruption or system crash.

Follow-up: What is the difference between hard error (permanent) and soft error (transient) in DRAM and which does ECC address?

Q14. What is the Hamming bound (sphere-packing bound) and what does it imply?

The Hamming bound states that for a code with n-bit codewords that can correct t errors, the total number of codewords × (volume of a Hamming ball of radius t) ≤ 2ⁿ. A code that meets this bound with equality is called a perfect code — the Hamming codes (such as Hamming(7,4)) and the Golay(23,12) code are the only non-trivial binary perfect codes. The bound implies there is a fundamental limit on how many codewords can coexist in the code space while maintaining minimum distance; note that this sphere-packing limit is distinct from the Shannon capacity limit against which capacity-approaching codes like LDPC are measured.
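The bound can be checked numerically for the two perfect codes named above:

```python
from math import comb

def hamming_ball(n: int, t: int) -> int:
    """Number of n-bit words within Hamming distance t of a given word."""
    return sum(comb(n, i) for i in range(t + 1))

# Hamming(7,4): 16 codewords, each owning a radius-1 ball of 8 words
assert 2**4 * hamming_ball(7, 1) == 2**7      # equality: a perfect code
# Golay(23,12): 4096 codewords, t = 3
assert 2**12 * hamming_ball(23, 3) == 2**23   # also meets the bound exactly
```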

Follow-up: What is a perfect code and why are there so few of them?

Q15. What is turbo coding and what made it revolutionary in channel coding?

Turbo codes use two recursive systematic convolutional encoders connected by an interleaver, with an iterative turbo decoder that passes soft information between two SISO (soft-in soft-out) decoders, achieving near-Shannon-limit performance for the first time when introduced in 1993. 3G UMTS CDMA systems use rate-1/3 turbo codes for voice and data channels, achieving Eb/N0 within 0.5dB of the Shannon limit at BER = 10⁻⁵. The revolutionary aspect was the iterative decoding concept — information passes between decoders multiple times, with each pass refining the estimate until convergence.

Follow-up: What is the 'turbo cliff' effect in turbo code performance and what causes it?

Common misconceptions

Misconception: A parity bit can detect and correct any single-bit error.

Correct: A single parity bit can only detect (not correct) single-bit errors — it indicates that an error occurred but cannot identify which bit flipped; Hamming code is needed for single-error correction.

Misconception: CRC and checksum provide the same error detection capability.

Correct: CRC is far more powerful than checksum — it detects all burst errors up to its generator polynomial degree and all odd-bit errors, while checksum misses transpositions and many burst patterns.

Misconception: A higher code rate means a more powerful error correction code.

Correct: A higher code rate means less redundancy and less error correction power; a lower code rate (more redundant bits) provides more powerful error correction at the cost of bandwidth efficiency.

Misconception: Hamming code can correct multiple bit errors in a single codeword.

Correct: Standard Hamming code (dmin=3) can only correct single-bit errors; for multi-bit correction, extended Hamming (SEC-DED), BCH, or Reed-Solomon codes are needed.

Quick one-liners

What is the minimum Hamming distance to correct t errors? dmin ≥ 2t + 1.
How many redundant bits does Hamming(7,4) have? 3 redundant bits for 4 data bits.
What does CRC detect better than parity? Burst errors — consecutive corrupted bits common in noisy channels.
What is the code rate of a rate-1/2 convolutional code? R = 1/2, producing 2 output bits for every 1 input bit.
What does the syndrome in Hamming code indicate? The binary value of the syndrome gives the exact bit position in error.
What channel coding standard does 5G NR use for data channels? LDPC (Low-Density Parity-Check) codes.
What is interleaving used for in error correction? Converting burst errors into random scattered errors that ECC codes can correct.
What is a perfect code? A code that meets the Hamming bound with equality; Hamming(7,4) is an example.
What memory type uses Hamming-based SEC-DED error correction? ECC DRAM (DDR4 ECC DIMM) in servers and workstations.
