Interview questions & answers
Q1. What is the fundamental difference between an FPGA and a CPLD?
An FPGA is organized as an array of configurable logic blocks (CLBs) joined by a programmable interconnect fabric, and its configuration is held in volatile SRAM that must be reloaded at every power-up. A CPLD uses macrocell-based logic in a sum-of-products architecture with non-volatile Flash or EEPROM configuration that is retained without power. A Xilinx XC7A35T Artix-7 FPGA provides 5,200 slices (20,800 six-input look-up tables, marketed as 33,280 logic cells) but needs a configuration bitstream loaded at startup, typically from an external SPI Flash. A Lattice ispMACH 4256ZE CPLD begins operating almost immediately on power-up because its configuration is stored on-chip in non-volatile memory, making it well suited to power sequencing controllers.
Follow-up: Why is a CPLD preferred over an FPGA for system power-on sequencing circuits?
Q2. What is a look-up table (LUT) in an FPGA and how does it implement logic?
A LUT is a small SRAM with N address inputs and one output. The SRAM is programmed with the truth table of an N-input combinational Boolean function, so the logic inputs act as address lines that select the pre-computed output value. A 6-input LUT in a Xilinx 7-series FPGA holds 64 SRAM bits and can implement any single-output function of up to 6 variables, such as a 4:1 multiplexer or a 6-input parity check. Multi-bit operations such as adders and comparators span several LUTs, usually together with the slice's dedicated carry chain. Because any single-output function of up to 6 variables maps to one LUT, LUT count is a primary measure of design size in synthesis utilization reports.
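The truth-table idea can be sketched in a few lines of Python. This is only an illustrative model, not vendor code: the helper names (make_lut_init, lut_read) and the 4:1 multiplexer example are assumptions chosen here, and the bit ordering of the packed table is a convention of this sketch.

```python
def make_lut_init(func, n_inputs=6):
    """Pack the truth table of func into an integer, one bit per input combination."""
    init = 0
    for address in range(2 ** n_inputs):
        # Bit i of the address is the value of logic input i.
        bits = [(address >> i) & 1 for i in range(n_inputs)]
        if func(*bits):
            init |= 1 << address
    return init


def lut_read(init, inputs):
    """Evaluate the LUT: the inputs form an address that selects one stored bit."""
    address = sum(bit << i for i, bit in enumerate(inputs))
    return (init >> address) & 1


def mux4(d0, d1, d2, d3, s0, s1):
    """4:1 multiplexer: four data inputs plus two select inputs = 6 variables."""
    return (d0, d1, d2, d3)[s0 | (s1 << 1)]


# 64 stored bits fully describe the 6-input function.
MUX4_INIT = make_lut_init(mux4)
```

Once the 64-bit table is built, evaluating the function is a single indexed read, which is why the delay of a LUT does not depend on how complicated the Boolean expression looks.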
Follow-up: How many LUTs does a 4-bit binary adder require in a 4-input LUT architecture?
Q3. What is the difference between synchronous and asynchronous reset in FPGA design?
A synchronous reset is sampled on the active clock edge and takes effect only at that edge, while an asynchronous reset takes effect immediately regardless of the clock state and drives the asynchronous clear input of the FPGA flip-flops. Xilinx 7-series slice flip-flops can be configured for either synchronous or asynchronous set/reset, but the flip-flops in a slice share their clock, clock-enable, and set/reset control signals, so a design with many distinct resets packs poorly and can run into timing closure difficulties. Synchronous reset is generally recommended in Xilinx FPGA design because it avoids the reset recovery and removal timing violations that can cause metastability when an asynchronous reset is released near a clock edge.
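The behavioral difference between the two reset styles can be shown with a toy Python model. It assumes an active-high reset and a single rising-edge flip-flop; the class name and method signature are hypothetical and exist only for this sketch, which ignores all analog timing effects.

```python
class ResettableFlipFlop:
    """Toy D flip-flop with either a synchronous or an asynchronous reset."""

    def __init__(self, async_reset):
        self.async_reset = async_reset
        self.q = 0

    def update(self, d, rst, rising_edge):
        """Apply the current inputs; rising_edge says whether a clock edge occurs now."""
        if self.async_reset and rst:
            # Asynchronous reset: clears the output immediately, no clock needed.
            self.q = 0
        elif rising_edge:
            # Synchronous behavior: reset (if any) is just another input sampled at the edge.
            self.q = 0 if rst else d
        return self.q
```

With async_reset=False, asserting rst between clock edges leaves q unchanged until the next edge; with async_reset=True, q drops to 0 as soon as rst is asserted.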
Follow-up: What is a reset recovery time violation and why is it dangerous in FPGA clock domains?
Q4. What is clock domain crossing (CDC) and how do you handle it in FPGA design?
CDC occurs when a signal generated in one clock domain is sampled by flip-flops in a different, asynchronous clock domain; if it is unsynchronized, metastability in the receiving flip-flop can propagate corrupted values into downstream logic. A dual-flop synchronizer, two back-to-back flip-flops clocked by the destination clock, is the minimum synchronization for a single-bit control signal, for example one crossing from a 50 MHz source domain to a 100 MHz destination domain. Multi-bit data crossing clock domains requires a handshake protocol or an asynchronous FIFO (like an 8-deep BRAM-based FIFO in a Xilinx design) so that all bits are captured from one consistent snapshot.
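A cycle-level Python sketch of the dual-flop synchronizer is given below. It only models the logical two-edge delay through the chain; real metastability is an analog effect that this model cannot represent, and the class name is an assumption made for illustration.

```python
class TwoFlopSynchronizer:
    """Two back-to-back flip-flops, both clocked by the destination clock."""

    def __init__(self):
        self.ff1 = 0  # first stage: the one that may go metastable in hardware
        self.ff2 = 0  # second stage: the output used by destination logic

    def clock(self, async_in):
        """Model one rising edge of the destination clock."""
        # Both stages update on the same edge, so ff2 takes the OLD value of ff1.
        self.ff2, self.ff1 = self.ff1, async_in
        return self.ff2


sync = TwoFlopSynchronizer()
outputs = [sync.clock(v) for v in [1, 1, 1, 0, 0]]  # -> [0, 1, 1, 1, 0]
```

The input change becomes visible at the output only on the second destination clock edge, which gives the first stage a full clock period to settle before its value is used.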
Follow-up: Why are two flip-flops used in a synchronizer instead of just one?
Q5. What are BRAM (Block RAM) resources in an FPGA and when do you use them versus distributed RAM?
BRAMs are dedicated hardened SRAM blocks (36 Kbit each in Xilinx 7-series, each usable as two independent 18 Kbit blocks) that operate as True Dual-Port or Simple Dual-Port memories with synchronous reads and optional output registers, while distributed RAM is built from LUT SRAM cells and suits small memories, typically 32 to 256 bits. A Xilinx XC7A35T has 50 36 Kbit BRAMs (1,800 Kbits total), used for large FIFOs, frame buffers, or coefficient lookup tables in DSP designs. Distributed RAM is chosen for small register files (under 256 bits) because it can be read asynchronously, avoiding the clock-cycle read latency of a BRAM, and does not spend a whole BRAM and its routing on a few bits.
Follow-up: What is the read latency difference between a BRAM and a distributed RAM implementation in a Xilinx 7-series device?
Q6. What is timing closure in FPGA design and what causes a setup time violation?
Timing closure means that every timing constraint (setup, hold, and related checks such as recovery and removal) is met on all constrained paths after place-and-route. A setup violation occurs when the launching flip-flop's clock-to-Q delay, plus the combinational logic and routing delay to the next flip-flop's D input, plus that flip-flop's setup time, exceeds the clock period adjusted for clock skew and uncertainty, so the data arrives too late to be captured reliably. In a Vivado timing report, a setup slack of −0.3 ns on a 100 MHz (10 ns period) path means the data arrives 0.3 ns later than its required arrival time, and the path must be pipelined, retimed, or otherwise shortened to fix it.
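The slack arithmetic can be worked through in Python. This is a simplified sketch: the function name is hypothetical, clock uncertainty is ignored, and the delay values are invented so that the result matches the −0.3 ns case above. Positive skew here means the capture clock arrives later than the launch clock.

```python
def setup_slack_ns(period, clk_to_q, logic_delay, setup_time, skew=0.0):
    """Setup slack = required arrival time minus actual arrival time (all in ns)."""
    required = period + skew - setup_time   # latest moment data may arrive
    arrival = clk_to_q + logic_delay        # when data actually reaches the D input
    return required - arrival


# Hypothetical 100 MHz path: 0.5 ns clock-to-Q, 9.7 ns logic + routing, 0.1 ns setup.
failing = setup_slack_ns(10.0, 0.5, 9.7, 0.1)       # about -0.3 ns: violation

# Pipelining splits the logic into two stages of roughly half the delay each.
pipelined = setup_slack_ns(10.0, 0.5, 4.85, 0.1)    # about +4.55 ns per stage
```

Inserting a register in the middle of the path costs one extra cycle of latency but halves the combinational delay that must fit inside each clock period.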
Follow-up: How does adding a pipeline register to a long combinational logic path fix a setup time violation?
Q7. What is an FPGA DSP slice and how does it accelerate arithmetic operations?
A DSP slice is a hardened arithmetic block in an FPGA containing a dedicated multiplier, pre-adder, and adder/accumulator that performs multiply-add operations such as A×B+C at a throughput of one result per clock cycle without consuming LUT resources. A Xilinx 7-series DSP48E1 slice provides a 25×18-bit signed multiplier and a 48-bit accumulator and, when fully pipelined, runs at several hundred MHz depending on device family and speed grade. A single slice can time-multiplex the five multiplications of a second-order IIR filter section, which would take hundreds of LUTs if built from fabric. Using DSP slices for FIR filter coefficient multiplication is essential in audio, radar, and SDR designs where dozens of multiply-accumulate operations must run at full clock rate.
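The multiply-accumulate pattern can be modeled in Python as below. This is a bit-width sketch only, using the 25×18 multiplier and 48-bit accumulator widths named above; the function names are assumptions, and the model says nothing about pipelining or clock rate.

```python
def wrap_signed(value, bits):
    """Wrap an integer into the signed two's-complement range of the given width."""
    mask = (1 << bits) - 1
    value &= mask
    return value - (1 << bits) if value >> (bits - 1) else value


def fir_mac(samples, coeffs):
    """Compute one FIR output as a chain of multiply-accumulate steps.

    Each loop iteration stands in for one MAC operation of the kind a DSP
    slice performs per clock cycle.
    """
    acc = 0
    for x, c in zip(samples, coeffs):
        assert -(1 << 24) <= c < (1 << 24), "coefficient must fit 25-bit signed"
        assert -(1 << 17) <= x < (1 << 17), "sample must fit 18-bit signed"
        acc = wrap_signed(acc + c * x, 48)  # accumulator is 48 bits wide
    return acc
```

For example, fir_mac([1, 2, 3], [4, 5, 6]) accumulates 4 + 10 + 18 = 32 in three MAC steps.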
Follow-up: What is resource cascading between DSP48E1 slices and how does it help implement wide multipliers?
Q8. What is partial reconfiguration in FPGAs and what applications use it?
Partial reconfiguration allows a portion of an FPGA's programmable fabric to be reprogrammed while the rest of the device continues operating: a partial bitstream is loaded into a designated reconfigurable partition without disturbing the static region. A Xilinx Zynq-7000 running a software radio can swap between different demodulators (BPSK, QPSK, 64-QAM) loaded into a reconfigurable region while the RF interface and processor remain active. Partial reconfiguration is used in space-borne instruments and cognitive radio systems where hardware functionality must adapt without a system restart.
Follow-up: What are the constraints on the interface between the static region and a partially reconfigurable region?
Q9. What is the difference between behavioral, structural, and RTL coding styles in HDL?
Behavioral HDL describes what a circuit does using algorithmic constructs (if/else, case, loops) without specifying an implementation. Structural HDL instantiates specific gates or component modules and wires them together explicitly. RTL (Register Transfer Level) describes data flow between registers using concurrent signal assignments and clocked processes, and is the style that synthesis tools translate directly to gates. Writing a 4-bit adder behaviorally as 'sum <= a + b' in VHDL synthesizes efficiently, while a structural version would instantiate four full-adder components with carry connections. Professional FPGA design uses RTL style for all synthesizable code and reserves non-synthesizable behavioral constructs, such as timed waits, for simulation testbenches.
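The contrast between the two adder descriptions can be mirrored in Python as an analogy (this is not HDL, and the function names are invented for the sketch). The structural version wires four one-bit full adders together through their carries; the behavioral version simply states the arithmetic.

```python
def full_adder(a, b, cin):
    """One-bit full adder from XOR/AND/OR gates; returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def adder4_structural(a, b):
    """Ripple-carry adder: four full_adder instances chained through the carry."""
    carry, total = 0, 0
    for i in range(4):
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        total |= s << i
    return total  # 4-bit sum; the final carry-out is discarded


def adder4_behavioral(a, b):
    """Counterpart of 'sum <= a + b', truncated to 4 bits."""
    return (a + b) & 0xF
```

Both functions return the same result for every pair of 4-bit inputs; the difference is in how much of the implementation the description fixes in advance.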
Follow-up: Why can behavioral VHDL constructs like WAIT FOR 10 ns not be synthesized into FPGA logic?
Q10. What is the purpose of the FPGA configuration DONE pin and why does it matter in system design?
The DONE pin goes high once the FPGA has successfully loaded its configuration bitstream and completed its startup sequence, signaling to the rest of the system that the device is configured and its outputs are valid. Connecting DONE to the enable input of downstream power supplies or to processor reset circuits ensures that other system components do not try to communicate with the FPGA before it is configured. On a Xilinx Artix-7 based PCB, wiring the DONE pin, with a pull-up resistor, to a microcontroller GPIO input lets the MCU poll FPGA readiness before initiating SPI communication.
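The MCU-side polling described above might look like the following Python sketch (for example on a MicroPython-class controller, though nothing here is specific to one). The read_done_pin callable is a hypothetical stand-in for whatever GPIO read function the platform provides, and the timeout values are arbitrary.

```python
import time


def wait_for_done(read_done_pin, timeout_s=1.0, poll_interval_s=0.01):
    """Poll the FPGA DONE signal until it reads high or the timeout expires.

    read_done_pin is a caller-supplied function returning a truthy value
    when the DONE pin is high. Returns True if the FPGA became ready.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if read_done_pin():
            return True
        time.sleep(poll_interval_s)
    return False
```

A caller would start SPI traffic only after wait_for_done returns True, and treat False as a configuration failure worth reporting.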
Follow-up: What happens to FPGA I/O pins during and immediately after configuration loading?
Q11. What is metastability in flip-flops and why is it more of a concern in FPGAs than ASICs?
Metastability occurs when a flip-flop's input data changes within the setup or hold time window, causing the output to hover at an indeterminate voltage for an unbounded (though usually very short) time before resolving to a valid logic level. In FPGAs, routing delays between flip-flops vary with place-and-route, so the effective settling margin is determined by the implementation, not just the device datasheet, making MTBF calculations more complex than in fixed-layout ASICs. Applying Xilinx's ASYNC_REG attribute to the flip-flops of a synchronizer chain tells the tools to place them close together, which minimizes the routing delay between them and so maximizes the time available for a metastable value to resolve.
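A commonly used first-order model for synchronizer reliability is MTBF = exp(t_r / tau) / (T0 × f_clk × f_data), where t_r is the time allowed for resolution and tau and T0 are device-dependent constants. The Python sketch below evaluates it; the function name is an assumption, and any parameter values passed in should come from vendor characterization data rather than from this example.

```python
import math


def synchronizer_mtbf_s(t_resolve, tau, t0, f_clk, f_data):
    """Mean time between failures (seconds) from the first-order metastability model.

    t_resolve: time available for the first flop to settle (s)
    tau:       device resolution time constant (s)
    t0:        device metastability window constant (s)
    f_clk:     destination clock frequency (Hz)
    f_data:    rate of asynchronous input transitions (Hz)
    """
    return math.exp(t_resolve / tau) / (t0 * f_clk * f_data)
```

Because t_resolve sits inside the exponential, every additional tau of settling time multiplies the MTBF by e, which is why keeping the routing between synchronizer flip-flops short has such a large effect.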
Follow-up: What is the MTBF formula for metastability and which parameters dominate it in a practical design?
Q12. How does an FPGA implement a state machine compared to a CPLD?
In an FPGA, state bits map to slice flip-flops, with the next-state logic synthesized into LUTs and optimized during implementation, while a CPLD maps state bits to macrocell flip-flops with sum-of-products next-state equations that fit directly into the AND-OR PLA structure. A 16-state FSM with binary state encoding in a Xilinx 7-series FPGA needs 4 state flip-flops and typically on the order of 8 LUTs for next-state and output logic, depending on the transitions. CPLDs are preferred for small state machines requiring guaranteed propagation delay from input to output because the PLA structure has predictable and flat timing.
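The flip-flop count for a given number of states follows directly from the encoding. A minimal Python sketch, with a hypothetical helper name, covering binary encoding and the one-hot alternative:

```python
def state_flip_flops(num_states, encoding="binary"):
    """Number of state flip-flops an FSM needs under the given encoding."""
    if num_states < 1:
        raise ValueError("an FSM needs at least one state")
    if encoding == "binary":
        # ceil(log2(num_states)), computed with integers; at least one bit.
        return max(1, (num_states - 1).bit_length())
    if encoding == "one-hot":
        # One flip-flop per state, exactly one of them set at a time.
        return num_states
    raise ValueError(f"unknown encoding: {encoding}")
```

For the 16-state example this gives 4 flip-flops with binary encoding and 16 with one-hot encoding.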
Follow-up: What is the advantage of one-hot encoding for state machines in FPGAs compared to binary encoding?
Q13. What is IOB (I/O Block) in an FPGA and what features does it provide?
The IOB is the programmable I/O cell at the periphery of an FPGA that includes configurable drive strength, slew rate control, pull-up/pull-down resistors, programmable input/output voltage standard selection, and optional input/output flip-flops that can be placed directly in the IOB for minimum I/O delay. A Xilinx Artix-7 IOB supports voltage standards from LVCMOS 1.2V to LVTTL 3.3V and differential standards like LVDS, all selectable in the constraint file. Using IOB flip-flops for DDR interfaces eliminates the routing delay from the IOB to the first fabric flip-flop, which is critical for timing closure on high-speed external memory interfaces.
Follow-up: What is the difference between using IOB flip-flops and fabric flip-flops for a high-speed external SRAM interface?
Q14. What is the difference between PLL and MMCM clock resources in Xilinx FPGAs?
A PLL (Phase-Locked Loop) in Xilinx 7-series devices generates up to six output clocks with programmable multiplication and division ratios and static phase offsets, while an MMCM (Mixed-Mode Clock Manager) adds a seventh output, fractional dividers, and fine-grained dynamic phase shifting in steps of 1/56 of the VCO period while the clock keeps running. The MMCM in a Zynq-7000 device is used in SDR applications to dynamically shift the ADC sampling clock phase to center the eye diagram without stopping the clock. Clock generation should go through a PLL or an MMCM and the dedicated clock buffers and routing rather than through fabric LUTs, because a clock routed through general fabric picks up unpredictable skew that can cause setup and hold violations across the device.
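The frequency and phase-step arithmetic can be checked with a short Python sketch. It assumes the usual relations f_VCO = f_in × M / D and f_out = f_VCO / O; the function name and the example settings are illustrative, and the sketch does not check that the VCO frequency falls in the device's legal range.

```python
def mmcm_clocks(f_in_mhz, mult, div, out_div):
    """Return (f_vco_mhz, f_out_mhz, fine_phase_step_ps) for the given settings."""
    f_vco = f_in_mhz * mult / div          # VCO frequency in MHz
    f_out = f_vco / out_div                # output clock frequency in MHz
    vco_period_ps = 1e6 / f_vco            # 1 / MHz = microseconds = 1e6 ps
    fine_step_ps = vco_period_ps / 56      # one fine phase-shift increment
    return f_vco, f_out, fine_step_ps


# Hypothetical settings: 100 MHz input, M = 10, D = 1, output divider = 5.
f_vco, f_out, step_ps = mmcm_clocks(100.0, 10, 1, 5)
```

With these settings the VCO runs at 1000 MHz, the output is 200 MHz, and each fine phase step is about 17.9 ps, small enough to sweep a sampling clock across a data eye in fine increments.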
Follow-up: What is jitter in a PLL output clock and how does it affect ADC sampling performance in an FPGA-based data acquisition system?
Q15. What is the purpose of a constraints file (XDC in Vivado) in FPGA implementation?
An XDC (Xilinx Design Constraints) file specifies timing constraints (create_clock, set_input_delay, set_output_delay) that tell the implementation tools what timing requirements must be met, and physical constraints (set_property PACKAGE_PIN, IOSTANDARD) that assign design ports to specific FPGA pins with defined I/O standards. Without a create_clock constraint on the 100 MHz system clock, Vivado treats the paths it clocks as unconstrained and may produce a placement and routing that violates setup time at the operating frequency. Incomplete XDC files are a common cause of working-in-simulation, failing-in-hardware bugs in student FPGA projects.
Follow-up: What is the difference between a false path constraint and a multicycle path constraint in timing analysis?
Common misconceptions
Misconception: An FPGA can implement any logic function without resource limits.
Correct: FPGAs have a finite number of LUTs, flip-flops, BRAMs, and DSP slices; designs that exceed these resources simply cannot be implemented on the chosen device.
Misconception: An HDL design that passes all of its simulation test cases is guaranteed to behave correctly in FPGA hardware.
Correct: Functional (RTL) simulation does not account for routing delays, metastability from real clock domain crossings, or I/O timing constraints; hardware testing is always required after simulation.
Misconception: CPLD and FPGA are just different sizes of the same architecture.
Correct: CPLDs use a non-volatile PLA-based macrocell architecture with fixed routing delays, while FPGAs use SRAM-based LUT arrays with a programmable interconnect fabric — they are fundamentally different architectures.
Misconception: Using asynchronous reset in FPGA designs is always safer than synchronous reset because it works even without a clock.
Correct: An asynchronous reset that is released close to a clock edge can violate reset recovery and removal times and trigger metastability; Xilinx design guidelines recommend synchronous reset, and where an asynchronous reset is unavoidable its release should be synchronized to the clock.