Interview questions & answers
Q1. What is SPI and how does it differ from UART?
SPI (Serial Peripheral Interface) is a synchronous, full-duplex serial protocol using four lines: SCLK (clock from master), MOSI (master out, slave in), MISO (master in, slave out), and CS/SS (chip select). The master provides the clock for every transfer, unlike UART, which is asynchronous and has no clock line. An STM32F4 SPI1 at 42 MHz transfers a 16-bit word in 16 clock cycles ≈ 380 ns, while UART at 115200 baud with 8N1 framing needs about 174 µs for the same 2 bytes (20 bit times), roughly 450× slower. The slave must support the chosen clock: an ADS1118 ADC, for example, is specified for a much lower SCLK (about 4 MHz maximum), so its 16-bit reading takes about 4 µs, still far faster than UART. Because the clock travels with the data, SPI needs no baud rate agreement and allows much higher data rates, making it the protocol of choice for high-speed sensors, flash memory, and displays.
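The timing comparison above can be checked with a few lines of arithmetic. This is a minimal sketch, not part of any vendor library: the function names are made up for illustration, and the UART figure assumes 8N1 framing (10 bit times per byte).

```c
#include <assert.h>
#include <stdint.h>

/* Time in nanoseconds to clock `bits` bits at `sck_hz`
 * (SPI moves one bit per SCK cycle). */
double spi_transfer_ns(uint32_t bits, uint32_t sck_hz)
{
    assert(sck_hz > 0);
    return (double)bits * 1e9 / (double)sck_hz;
}

/* Time in microseconds to send `bytes` bytes over UART, assuming 8N1
 * framing: start bit + 8 data bits + stop bit = 10 bit times per byte. */
double uart_transfer_us(uint32_t bytes, uint32_t baud)
{
    assert(baud > 0);
    return (double)bytes * 10.0 * 1e6 / (double)baud;
}
```

Calling spi_transfer_ns(16, 42000000) and uart_transfer_us(2, 115200) reproduces the two figures quoted in the answer; dividing one by the other (after converting units) gives the speed ratio.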
Follow-up: What is the maximum SPI clock speed achievable on STM32F4 SPI1, and what limits it?
Q2. What are CPOL and CPHA in SPI and what do the four modes mean?
CPOL (Clock Polarity) sets the idle state of the clock line — CPOL=0 means clock idles low, CPOL=1 idles high; CPHA (Clock Phase) sets when data is sampled — CPHA=0 samples on the first clock edge, CPHA=1 samples on the second edge. Mode 0 (CPOL=0, CPHA=0) is the most common: clock idle low, data sampled on rising edge — used by most sensors including the BME280 pressure sensor and W25Q128 flash. Mode 3 (CPOL=1, CPHA=1) uses clock idle high with sampling on rising edge — the SCK polarity is inverted vs Mode 0 but the data relationship to edges is the same as Mode 0, which confuses many engineers when checking oscilloscope traces.
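The relationship between CPOL, CPHA, and the sampling edge can be written down as a small lookup. The sketch below uses a hypothetical struct and function name; the mode number follows the common (CPOL << 1) | CPHA convention.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint8_t mode;             /* 0..3, conventionally (CPOL << 1) | CPHA */
    bool    idle_high;        /* clock idle level */
    bool    sample_on_rising; /* true if data is captured on a rising SCK edge */
} spi_mode_info;

spi_mode_info spi_describe_mode(uint8_t cpol, uint8_t cpha)
{
    assert(cpol <= 1 && cpha <= 1);
    spi_mode_info info;
    info.mode = (uint8_t)((cpol << 1) | cpha);
    info.idle_high = (cpol == 1);
    /* The first edge leaves the idle level: rising when CPOL=0, falling
     * when CPOL=1. CPHA=0 samples on that first edge, CPHA=1 on the
     * opposite one, so sampling is on a rising edge exactly when
     * CPOL == CPHA (Modes 0 and 3). */
    info.sample_on_rising = (cpol == cpha);
    return info;
}
```

Modes 0 and 3 both report sample_on_rising as true and differ only in idle_high, which is the point that tends to cause confusion on oscilloscope traces.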
Follow-up: How do you determine the correct SPI mode for a slave device if the datasheet shows a timing diagram but does not explicitly state CPOL and CPHA?
Q3. What is the chip select (CS) signal in SPI and why is it important?
The chip select (active low) signal enables a specific slave device to participate in an SPI transaction; all SPI slaves share MOSI, MISO, and SCLK, but each has its own CS line, so pulling one CS low while all others are high ensures only the selected slave drives MISO and responds to commands. On an STM32F4 SPI bus with a W25Q128 flash and an ADXL345 accelerometer sharing MOSI/MISO/SCLK, asserting PB0 (flash CS) low while PB1 (accelerometer CS) is high reads the flash without the accelerometer interfering. NSS hardware management on STM32 SPI can automatically control only a single CS, so multi-slave buses normally use software-controlled GPIO CS pins to address each slave independently.
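The rule that exactly one CS may be low at a time can be modeled on a host PC. This is a simplified sketch with invented names; it assumes every slave releases MISO (high-Z) when its CS is high, which must be confirmed per device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { MISO_HIGH_Z, MISO_DRIVEN, MISO_CONTENTION } miso_state;

/* cs_high[i] is the logic level on slave i's CS pin:
 * true = high (deselected), false = low (selected). */
miso_state miso_bus_state(const bool *cs_high, size_t num_slaves)
{
    assert(cs_high != NULL);
    size_t drivers = 0;
    for (size_t i = 0; i < num_slaves; i++) {
        if (!cs_high[i]) {
            drivers++;          /* a selected slave drives MISO */
        }
    }
    if (drivers == 0) {
        return MISO_HIGH_Z;     /* nobody selected: line floats */
    }
    return (drivers == 1) ? MISO_DRIVEN : MISO_CONTENTION;
}
```

With index 0 standing for the flash on PB0 and index 1 for the accelerometer on PB1, the levels {low, high} give MISO_DRIVEN, while {low, low} gives MISO_CONTENTION.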
Follow-up: What happens if two SPI slaves are simultaneously selected (both CS lines active) on the same bus?
Q4. Explain SPI full-duplex operation with a practical example.
In SPI full-duplex, every clock cycle simultaneously shifts one bit out on MOSI and one bit in from MISO, so a write and a read always happen together; to read from a slave, a dummy byte (0xFF or 0x00) is transmitted on MOSI while the real data comes back on MISO. Reading a 12-bit result from an MCP3204 ADC: the master sends a 3-byte command frame (start bit + channel select) while simultaneously receiving don't-care bits in the first part of the frame and then the 12 bits of ADC data in the last 12 clock cycles. This simultaneous exchange is why an SPI register read always takes one more byte than the data it returns: the first byte sent selects the register and the response arrives one byte later.
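The MCP3204 exchange can be sketched as a pair of helpers that build the transmit bytes and decode the receive bytes of the same 3-byte frame. The function names are hypothetical, and the byte alignment (start bit in bit 2 of the first byte, single-ended mode) is an assumption based on the byte-aligned framing commonly used with this ADC family; check it against the datasheet before relying on it.

```c
#include <assert.h>
#include <stdint.h>

/* Fill the 3-byte command for a single-ended conversion on channel 0..3. */
void mcp3204_build_cmd(uint8_t channel, uint8_t tx[3])
{
    assert(channel <= 3 && tx != 0);
    tx[0] = 0x06;                     /* 0000 0, START=1, SGL=1, D2=0 */
    tx[1] = (uint8_t)(channel << 6);  /* D1 D0, then don't-care bits */
    tx[2] = 0x00;                     /* dummy byte: only clocks the result out */
}

/* Extract the 12-bit result from the bytes received during the same frame:
 * the low nibble of byte 1 holds bits 11..8, byte 2 holds bits 7..0. */
uint16_t mcp3204_parse_result(const uint8_t rx[3])
{
    assert(rx != 0);
    return (uint16_t)(((rx[1] & 0x0Fu) << 8) | rx[2]);
}
```

On hardware, tx and rx would be passed to one full-duplex call such as HAL_SPI_TransmitReceive() with CS held low around it; rx[0] and the upper nibble of rx[1] are ignored.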
Follow-up: How do you distinguish between write-then-read and single full-duplex exchange in SPI communication with a register-based sensor?
Q5. How do you configure and use SPI with DMA on STM32?
SPI DMA on STM32 links two DMA streams to SPI — one for TX (MOSI data), one for RX (MISO data) — and transfers the entire frame buffer without CPU involvement; HAL_SPI_TransmitReceive_DMA() initiates the transaction and the HAL_SPI_TxRxCpltCallback() fires when done. For reading 128 bytes from a W25Q128 flash on SPI1 at 42 MHz, DMA2 Stream 2 (SPI1 RX) and DMA2 Stream 3 (SPI1 TX) handle the transfer in about 24 µs, freeing the CPU to do other work. Without DMA, polling 128 bytes in a tight loop at 42 MHz still takes only 24 µs, so DMA benefit is most significant when the SPI transaction can overlap with unrelated CPU processing like sensor fusion or display rendering.
Follow-up: Why must both TX DMA and RX DMA streams be enabled simultaneously in an SPI DMA transaction even if you only care about received data?
Q6. What is the difference between SPI master and slave mode?
In master mode, the MCU generates the clock, controls CS, and initiates all transfers; in slave mode, the device waits for the master's CS assertion and clock and responds with data that was loaded into its TX shift register before the transaction began. STM32 SPI in slave mode is used when another processor (e.g., a main application processor) reads sensor data from an STM32 acting as a co-processor; the STM32 slave must write its data to the SPI data register before the master starts clocking, or the first bits of the response will be invalid. In slave mode the incoming SCK, MOSI, and NSS signals must meet the SPI peripheral's input setup and hold times; high clock rates on long PCB traces can violate this timing margin and cause first-byte errors.
Follow-up: What is the NSS (slave select) input function in SPI slave mode and how does it prevent bus contention?
Q7. What is the maximum SPI clock frequency and what practical factors limit it?
STM32F4 SPI1 and SPI4 are clocked from APB2 (up to 84 MHz) and can output a maximum SCK of 42 MHz (APB2/2); SPI2 and SPI3 on APB1 (42 MHz) give a maximum SCK of 21 MHz. In practice the board limits the frequency before the peripheral does: a 10 cm FR-4 trace adds only about 0.6 to 0.7 ns of one-way propagation delay, but the round trip (SCK out, MISO back) plus the slave's clock-to-output delay of several nanoseconds must fit inside the sampling window, which is at most the 10 ns bit period at 100 MHz. A W25Q128 datasheet specifies a maximum SCK of 133 MHz, but real PCB traces with vias and stubs often limit reliable operation to 40–80 MHz in production designs. MISO setup and hold times relative to the master's sampling edge of SCK must be verified for the slave at the operating temperature and voltage to ensure no data capture violations.
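Because the STM32F4 SPI clock is derived from the APB clock through a power-of-two divider (2 to 256), choosing a safe SCK for a given slave comes down to picking the smallest divider that stays under the slave's limit. A minimal sketch, with a hypothetical function name:

```c
#include <assert.h>
#include <stdint.h>

/* Return the smallest power-of-two divider in 2..256 such that
 * pclk_hz / divider <= max_sck_hz, or 0 if even /256 is too fast. */
uint32_t spi_pick_divider(uint32_t pclk_hz, uint32_t max_sck_hz)
{
    assert(pclk_hz > 0 && max_sck_hz > 0);
    for (uint32_t div = 2; div <= 256; div *= 2) {
        /* Compare as max * div >= pclk to avoid rounding from integer
         * division; 64-bit math keeps the product from overflowing. */
        if ((uint64_t)max_sck_hz * div >= pclk_hz) {
            return div;
        }
    }
    return 0;
}
```

For SPI1 on an 84 MHz APB2, a 42 MHz limit yields divider 2, while a slave limited to 10 MHz yields divider 16 (5.25 MHz), since divider 8 would give 10.5 MHz.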
Follow-up: How do you increase SPI reliability when operating near the maximum frequency limit?
Q8. How do you read a register from an SPI sensor like BME280?
Reading register 0xF3 (status) from a BME280 requires four steps: assert CS low; send the register address with the read bit (bit 7 in the BME280 protocol) set, which for this register is 0xF3 | 0x80 = 0xF3 because bit 7 is already set; send one dummy byte (0x00 or 0xFF, the value is ignored) while reading the response byte; then deassert CS. The first SPI byte (register address with read bit) selects the register; during the second byte's 8 clock cycles, the BME280 shifts out the register contents on MISO while the master shifts out don't-care data. On STM32 HAL, with CS driven low before the call and high after it: uint8_t txBuf[] = {0xF3, 0xFF}; uint8_t rxBuf[2]; HAL_SPI_TransmitReceive(&hspi1, txBuf, rxBuf, 2, 10); the result is in rxBuf[1], not rxBuf[0].
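The same read sequence can be exercised on a host PC against a stand-in for the sensor. In the sketch below, mock_bme280_transfer() is a hypothetical replacement for the real bus call (it is not a Bosch or ST API) and only knows the chip ID register 0xD0, which returns 0x60 on a BME280; the 0xFF placed in rx[0] is an assumption standing in for an undriven MISO line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BME280_REG_ID     0xD0u
#define BME280_CHIP_ID    0x60u
#define BME280_READ_FLAG  0x80u

/* Hypothetical stand-in for the sensor: it latches the 7-bit address from
 * byte 0 and answers from byte 1 onward. */
void mock_bme280_transfer(const uint8_t *tx, uint8_t *rx, size_t len)
{
    assert(tx != NULL && rx != NULL);
    uint8_t reg = 0;
    for (size_t i = 0; i < len; i++) {
        if (i == 0) {
            reg = tx[0] & 0x7Fu;   /* register address without the R/W bit */
            rx[0] = 0xFFu;         /* nothing valid on MISO yet */
        } else {
            rx[i] = (reg == (BME280_REG_ID & 0x7Fu)) ? BME280_CHIP_ID : 0x00u;
        }
    }
}

/* Read one register: address byte with bit 7 set, then one dummy byte. */
uint8_t bme280_read_reg(uint8_t reg)
{
    uint8_t tx[2] = { (uint8_t)(reg | BME280_READ_FLAG), 0x00u };
    uint8_t rx[2] = { 0, 0 };
    /* On hardware, CS goes low before this call and high after it. */
    mock_bme280_transfer(tx, rx, sizeof tx);
    return rx[1];   /* rx[0] was clocked in while the address went out */
}
```

Swapping the mock for HAL_SPI_TransmitReceive() plus two GPIO writes for CS gives the on-target version; the buffer layout stays the same.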
Follow-up: Why is the useful data in rxBuf[1] and not rxBuf[0] in a 2-byte SPI read transaction?
Q9. What is the SPI transaction overhead and how does it affect high-speed data logging?
SPI transaction overhead includes CS assertion time, inter-byte gaps inserted by HAL, and CS deassert time. On STM32 with HAL at 42 MHz SPI, a 1-byte read has about 2 µs of overhead for the CS toggle and HAL latency, so at 1000 samples/second the overhead consumes 0.2% of CPU time. The picture changes at high sample rates: a 6-axis IMU with an SPI interface (ICM-42688-P, in contrast to the I2C-only MPU-6050) clocked at 24 MHz reads all 14 bytes of accelerometer, gyro, and temperature data in about 5 µs of bus time (15 bytes on the wire including the register address); at 8 kHz ODR (output data rate) this is 4% of CPU time when polled, still feasible for a flight controller. Raw SPI throughput without overhead is one bit per SCK cycle, so 42 MHz SPI gives 42 Mbit/s = 5.25 MB/s.
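The CPU-load figures above follow from two small formulas. The sketch below uses invented function names and assumes a polled (blocking) transfer, so the CPU is busy for the whole transaction.

```c
#include <assert.h>
#include <stdint.h>

/* Bus time in microseconds for `bytes` bytes at `sck_hz`, 8 bits per byte,
 * with no gaps between bytes. */
double spi_frame_us(uint32_t bytes, uint32_t sck_hz)
{
    assert(sck_hz > 0);
    return (double)bytes * 8.0 * 1e6 / (double)sck_hz;
}

/* Percentage of CPU time spent if a blocking transaction of
 * `transaction_us` microseconds runs `rate_hz` times per second. */
double spi_cpu_load_percent(double transaction_us, double rate_hz)
{
    assert(transaction_us >= 0.0 && rate_hz >= 0.0);
    return transaction_us * rate_hz / 1e6 * 100.0;
}
```

A 15-byte frame (1 address byte + 14 data bytes) at 24 MHz takes 5 µs, and 5 µs repeated 8000 times per second is a 4% load; any CS and driver overhead adds to that.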
Follow-up: What is the SPI FIFO depth on STM32F7/H7 and how does it reduce overhead at high sample rates?
Q10. What is SPI over long cables and what special considerations are needed?
SPI over cables longer than 30–50 cm suffers from capacitive loading, signal reflections, and clock-data skew; MISO from the slave becomes unreliable as rise/fall times increase, and the sampling clock edge may arrive before the data is stable. For a 1 m cable connecting a remote SPI ADC (ADS1248) to an STM32, reducing SCK to 1–4 MHz (within the ADC's own SCLK limit), adding 33 Ω series termination on each line, and pairing every signal with a ground return (for example twisted pairs of signal and GND) reduces reflections and noise. Alternatively, the link can be made differential: RS-485 or LVDS line drivers, or isolated two-wire schemes such as isoSPI (LTC6820), carry SPI over several meters and more at reduced bit rates. Digital isolators such as the ADuM4154 add galvanic isolation on the board (SPI clocks up to roughly 17 MHz) but do not by themselves drive long cables; they are used in isolated ADC front-ends for industrial measurement.
Follow-up: What is isoSPI and how does it achieve galvanic isolation in SPI interfaces?
Q11. How is daisy-chaining used in SPI for multiple devices?
In SPI daisy-chain configuration, all devices share a single CS line and SCLK; MOSI of the master connects to MOSI of the first device, whose MISO connects to MOSI of the second device, and so on — data ripples through the chain, with all devices shifting simultaneously. A chain of three 74HC595 shift registers in daisy-chain receives 24 bits (3 bytes) in a single transaction: after 24 clock pulses, the first byte sent is in the last register and the last byte sent is in the first register — requiring data to be sent in reverse order. Daisy-chaining saves CS GPIO pins and PCB routing but all devices must support the same transaction length per frame, making it impractical for mixed device types.
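The ripple-through behavior of the chain is easy to verify with a bit-level model. This is a simplified sketch with invented names: each 74HC595 is reduced to an 8-bit shift register, the output latch (RCLK) is ignored, and the mapping of bits to output pins is not modeled.

```c
#include <assert.h>
#include <stdint.h>

#define CHAIN_LEN 3

/* regs[0] is the shift register wired to the master's MOSI;
 * regs[CHAIN_LEN - 1] is the far end of the chain. */
typedef struct { uint8_t regs[CHAIN_LEN]; } chain595;

/* Clock one bit into the chain: every stage shifts by one position and
 * takes the bit that falls out of the previous stage. */
void chain_clock_bit(chain595 *c, uint8_t bit)
{
    assert(c != 0);
    uint8_t carry = bit & 1u;
    for (int i = 0; i < CHAIN_LEN; i++) {
        uint8_t out = (uint8_t)(c->regs[i] >> 7);   /* bit leaving this stage */
        c->regs[i] = (uint8_t)((c->regs[i] << 1) | carry);
        carry = out;                                /* feeds the next stage */
    }
}

/* Send one byte MSB-first, as an SPI master normally does. */
void chain_send_byte(chain595 *c, uint8_t value)
{
    for (int b = 7; b >= 0; b--) {
        chain_clock_bit(c, (uint8_t)((value >> b) & 1u));
    }
}
```

After sending 0xAA, 0xBB, 0xCC in that order, regs[2] holds 0xAA and regs[0] holds 0xCC, so the byte intended for the far device has to be sent first.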
Follow-up: What limitation of daisy-chain SPI makes it unsuitable for reading different sensors of different register sizes in the same chain?
Q12. What is QSPI (Quad SPI) and where is it used?
Quad SPI uses four bidirectional data lines (IO0–IO3) instead of separate MOSI/MISO lines, so the data phase moves four bits per clock: at an 80 MHz QSPI clock the peak data rate is 320 Mbit/s. The STM32H743 QUADSPI peripheral (OCTOSPI on newer H7 parts) interfaces directly with W25Q256 flash at 80 MHz and maps it into the address space for execute-in-place (XIP), allowing the MCU to run code from external flash as if it were internal flash, with higher latency on cache misses; DDR mode, where the flash supports it, transfers on both clock edges and doubles the peak rate again. Automotive MCUs (Renesas RH850, NXP S32K) use QSPI flash to store boot code and data beyond the on-chip flash capacity, which matters for ADAS applications with large AI model weights.
Follow-up: What is execute-in-place (XIP) mode in QSPI flash and what are its latency characteristics?
Q13. How do you debug SPI communication issues on an oscilloscope?
SPI debugging requires probing all four lines (SCLK, MOSI, MISO, CS) simultaneously on a mixed-signal oscilloscope or logic analyzer; triggering on the CS falling edge captures the complete transaction, and decoding the bit stream against the expected register map confirms correct protocol. A Saleae Logic 8 sampling at 100 MS/s captures STM32F4 SPI at 10 MHz with 10 samples per bit, resolving individual bit values; a BME280 read returning 0xFF on MISO for all bytes indicates the device is not driving MISO (the line floats or is pulled high), pointing to CS not reaching the device, a wrong CPOL/CPHA mode, or missing device power. The most common first-time SPI bug is leaving the CS pin in the wrong default state or forgetting to assert it before the transfer in software.
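What a protocol decoder does with the captured samples can be sketched in a few lines. The function below is a hypothetical helper, not an analyzer vendor API; it assumes the capture covers one CS-low window, that sclk and data were sampled at the same instants, and that bits are sent MSB-first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode the first byte on a data line from logic-analyzer style samples.
 * sclk[i] and data[i] are 0/1 levels captured at the same instant.
 * Returns the number of bits decoded (8 for a complete byte). */
int spi_decode_byte(const uint8_t *sclk, const uint8_t *data, size_t n,
                    uint8_t cpol, uint8_t cpha, uint8_t *out)
{
    assert(sclk != NULL && data != NULL && out != NULL);
    /* Sampling is on a rising edge when CPOL == CPHA, else on a falling one. */
    uint8_t level_after_edge = (cpol == cpha) ? 1u : 0u;
    uint8_t value = 0;
    int bits = 0;
    for (size_t i = 1; i < n && bits < 8; i++) {
        int edge = (sclk[i] != sclk[i - 1]);
        if (edge && sclk[i] == level_after_edge) {
            value = (uint8_t)((value << 1) | (data[i] & 1u));  /* MSB first */
            bits++;
        }
    }
    *out = value;
    return bits;
}
```

Decoding the same capture with each candidate mode and comparing the result against a known register value (a chip ID, for example) is a practical way to spot a CPOL/CPHA mismatch.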
Follow-up: How do you check if the CPOL/CPHA mode mismatch is causing SPI errors using only an oscilloscope?
Q14. What is the inter-byte gap in SPI and how does it affect flash memory programming?
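An inter-byte gap is an idle period between consecutive byte transfers within the same CS assertion, caused by HAL overhead or software delay. Because SPI is clocked by the master, most SPI NOR flash (the W25Q128 included) simply waits while SCK is stopped, so a gap by itself wastes time rather than corrupting data; the risk comes from drivers that toggle CS between bytes or between separate HAL calls, and a few devices do specify a timeout, so check the datasheet. The W25Q128 datasheet requires CS to stay low for the whole page program frame (opcode 0x02, 3 address bytes, then 1 to 256 data bytes, up to 260 bytes in total) and to go high only on a byte boundary after the last data byte; a CS deassert in the middle ends the command early, and the remaining bytes must be sent in a new page program command. Sending header and data under one CS assertion, ideally with DMA for the 256 data bytes (one HAL_SPI_Transmit_DMA() call), removes inter-byte gaps, shortens the time the bus is held, and is the usual choice for flash programming in production firmware.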
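One way to guarantee that a page program goes out as a single uninterrupted transfer is to assemble opcode, address, and data into one buffer first. The helper below is a sketch with an invented name; it assumes 3-byte addressing and the 0x02 page program opcode used by the W25Q128, and it rejects writes that would run past the end of a 256-byte page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define W25Q_CMD_PAGE_PROGRAM 0x02u
#define W25Q_PAGE_SIZE        256u

/* Build opcode + 24-bit address + data into `frame` (at least 260 bytes).
 * Returns the frame length, or 0 if the data is empty, too long, or would
 * cross the end of the 256-byte page that `addr` falls in. */
size_t w25q_build_page_program(uint32_t addr, const uint8_t *data, size_t len,
                               uint8_t *frame)
{
    assert(data != NULL && frame != NULL);
    size_t offset = addr % W25Q_PAGE_SIZE;
    if (len == 0 || len > W25Q_PAGE_SIZE - offset) {
        return 0;
    }
    frame[0] = W25Q_CMD_PAGE_PROGRAM;
    frame[1] = (uint8_t)(addr >> 16);   /* address, most significant byte first */
    frame[2] = (uint8_t)(addr >> 8);
    frame[3] = (uint8_t)addr;
    memcpy(&frame[4], data, len);
    return len + 4;
}
```

The returned buffer can then be handed to a single transmit call with CS held low for its whole duration. A Write Enable command (0x06) in its own CS cycle has to precede each page program; that step is not shown here.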
Follow-up: What is the maximum page program size for W25Q128 flash and what happens if you exceed it?
Q15. How does SPI compare to I2C in terms of speed, wiring, and use cases?
SPI is faster (tens of MHz versus at most 3.4 Mbit/s for I2C high-speed mode), uses more wires (4 vs 2), supports true full-duplex, and has no addressing overhead, which makes it ideal for high-speed point-to-point peripherals like flash, displays, and ADCs. I2C uses only 2 wires, supports multiple masters and, with 7-bit addressing, up to 112 devices on one bus (128 addresses minus the reserved ones), and requires no CS GPIO per device, which makes it ideal for low-speed sensors (BME280, MPU-6050) where bus economy matters. In a wearable with 6 sensors, I2C is chosen to minimize pin count; a camera module streaming 60 MB/s of raw image data (480 Mbit/s) to flash is beyond single-line SPI and needs quad/octal SPI or a parallel interface.
Follow-up: Why can't I2C replace SPI for EEPROM or flash memory interfacing at high throughput?
Common misconceptions
Misconception: SPI Mode 0 and Mode 3 are freely interchangeable with any device.
Correct: Mode 0 (CPOL=0, CPHA=0) and Mode 3 (CPOL=1, CPHA=1) both sample on the rising SCK edge and change data on the falling edge; only the clock idle level differs. That is why many devices (BME280, W25Q128) accept both, but not every device does: a slave that supports only one of them can see an extra or missing edge around CS transitions, and any mismatch in sampling edge (for example Mode 0 against Mode 1) shifts every byte by half a clock cycle and corrupts the data.
Misconception: Separate CS lines are all it takes for SPI slaves to share the MISO and MOSI lines.
Correct: Separate CS lines are required for standard SPI, but MISO must also be tristate (high-Z) whenever the slave is not selected; if a slave drives MISO continuously regardless of its CS state, bus contention occurs, so always verify the slave's MISO behavior in the datasheet.
Misconception: The first byte received in rxBuf[0] during a 2-byte SPI transaction contains the register data.
Correct: In a register-read transaction, the first byte sent is the register address command and the slave responds during the second byte; the useful data is in rxBuf[1] because the slave shifts out data one full byte after receiving the address.
Misconception: SPI at high speed is always reliable on any PCB layout.
Correct: PCB trace length, vias, and capacitive loading create reflections and clock-data skew that limit reliable SPI speed; series termination resistors, controlled impedance traces, and reduced clock speed are required for PCB traces longer than a few centimeters at high SPI frequencies.