A few years ago, in my search for hardware/firmware best practices, I came across a series of “Firmware Friendly” articles written by David Fechser of LSI Logic. (See Related Links on my website.) In one of his articles, “Firmware-friendly reset design,” Fechser discusses three levels of hardware reset. The following table summarizes guidelines for the control that hardware and firmware should have over these types of hardware reset.
Level | Action | Hardware | Firmware |
---|---|---|---|
Hard | Reset the whole block to a known state. | Hardware asserts a reset line into the chip during power-on reset. | Firmware should have the ability to also assert this signal into the block to recover from hard hangs if necessary. |
Abort (Soft) | Abort any active operation. Leave configuration registers unchanged but reset status, buffers, counters, and state machines back to their settings before the operation started. | The hardware would not invoke this reset. | Firmware invokes this to abort running operations or to reset after halted error conditions. |
Halt (Soft Stop) | Halt any active operation but do not reset any registers (except the active or go bit). Depending on its design, the block may or may not be able to resume from where it was stopped. | Hardware invokes this when it detects an error condition. This allows firmware to inspect the registers to look for error causes, after which firmware can invoke an abort to reset the block for the next operation. | Firmware invokes this when it wants to stop operation and inspect registers. Firmware can then either resume (if capable) or abort. |
Note that when the block detects an error condition, it simply halts operations but leaves all registers unchanged. This allows firmware to inspect the registers, look at counters, scan buffers, and check status bits to collect clues about the problem. The more information available to firmware, the more easily problems can be diagnosed and resolved. By not resetting itself on error, the block leaves the decision of when to reset in firmware's hands.
- Best Practice: Design each block to halt but not reset itself on error.
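
To make the three levels concrete, here is a minimal sketch in C that models a block's registers as a plain struct in RAM. Everything in it is an assumption for illustration: the register layout, the `STATUS_ERR_OVERRUN` bit, and the function names are hypothetical and do not come from Fechser's article or any real device. It only shows which state each level touches.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical register model of one hardware block, simulated in RAM. */
typedef struct {
    uint32_t config;      /* configuration, programmed by firmware */
    uint32_t status;      /* latched status and error bits */
    uint32_t byte_count;  /* progress counter for the active operation */
    uint8_t  buffer[16];  /* data buffer */
    bool     go;          /* active ("go") bit */
} block_regs_t;

#define STATUS_ERR_OVERRUN (1u << 0)  /* hypothetical error cause bit */

/* Hard: return the whole block to its known power-on state. */
void block_hard_reset(block_regs_t *b)
{
    memset(b, 0, sizeof *b);
}

/* Abort: keep configuration, clear status, buffer, counter, and go bit. */
void block_abort(block_regs_t *b)
{
    uint32_t saved_config = b->config;
    memset(b, 0, sizeof *b);
    b->config = saved_config;
}

/* Halt: clear only the go bit so every other register stays inspectable. */
void block_halt(block_regs_t *b)
{
    b->go = false;
}

/* What the block does on an error: latch the cause, then halt, never reset. */
void block_on_error(block_regs_t *b, uint32_t cause)
{
    b->status |= cause;
    block_halt(b);
}
```

After `block_on_error()`, the counter, buffer, and status bits still hold whatever they held when the error occurred, so firmware can read them before deciding to call `block_abort()`.
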
In addition, hardware should give firmware a way to do a hard reset on the block. One project I worked on had a block that would occasionally get stuck in a weird state, but the block did not have a firmware-accessible hard reset. Instead, we had to devise a very complicated algorithm in firmware to walk the block back to a state where we could use it again. Worse, because of the limited visibility into the block's internal signals and state machines, we had to make guesses along the way to get the block back to ready. Imagine being stuck in the middle of a maze and then having to walk out blindfolded.
- Best Practice: Provide firmware-accessible hard reset controls to each block.
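
With a firmware-accessible hard reset, recovery can become a short escalation instead of a blindfolded walk: try an abort first and fall back to the hard reset only if the block is still not ready. The sketch below simulates that idea in C. The `sim_block_t` type, its `wedged` flag (standing in for internal state that firmware cannot see or clear with an abort), and all function names are assumptions made up for this example, not a real driver API.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum {
    RECOVERED_BY_ABORT,
    RECOVERED_BY_HARD_RESET,
    RECOVERY_FAILED
} recovery_t;

/* Hypothetical simulated block. */
typedef struct {
    uint32_t config;  /* configuration, programmed by firmware */
    uint32_t status;  /* latched error bits */
    bool     go;      /* active ("go") bit */
    bool     wedged;  /* internal state machine stuck; hidden from firmware */
} sim_block_t;

/* Stands in for reading a READY bit that hardware derives internally. */
static bool sim_is_ready(const sim_block_t *b)
{
    return !b->go && b->status == 0u && !b->wedged;
}

/* Abort clears operation state but cannot reach the stuck state machine. */
static void sim_abort(sim_block_t *b)
{
    b->status = 0u;
    b->go = false;
}

/* Hard reset returns everything, including internal state, to power-on. */
static void sim_hard_reset(sim_block_t *b)
{
    b->config = 0u;
    b->status = 0u;
    b->go = false;
    b->wedged = false;
}

/* Escalate: the least destructive reset first, the hard reset as a last resort. */
recovery_t block_recover(sim_block_t *b)
{
    sim_abort(b);
    if (sim_is_ready(b)) {
        return RECOVERED_BY_ABORT;
    }
    sim_hard_reset(b);
    if (sim_is_ready(b)) {
        return RECOVERED_BY_HARD_RESET;
    }
    return RECOVERY_FAILED;
}
```

In this model a hard reset also clears `config`, so a caller that gets `RECOVERED_BY_HARD_RESET` would need to reprogram the block before starting the next operation.
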
Until the next reset…