Engineers often wrestle with how long hardware will take to complete a requested task so that firmware can take the next step. They implement different designs (in both hardware and firmware) depending on that length of time, and these designs have varying impacts on hardware and firmware complexity and on overall system performance. Understanding their ramifications during the design phase helps balance the load between hardware and firmware.
Based on the hardware and firmware implementation required, we can group these designs into three categories: no delay, short delay, and long delay.
Let’s take aborts in hardware as an example, since implementations exist in each of the three categories: no, short, and long delays. For some aborts there is no delay; the hardware simply returns to its home or idle state, clears counters and buffers, and completes other activities that can be done quickly. Such an operation is so quick that hardware does not need extra logic for a status or interrupt bit. In these cases, firmware can initiate the abort and simply move on to the next step, which may be to set up the hardware for the next job. The key is for hardware to complete the abort before firmware tries to access the block again.
Some abort implementations can take several clock cycles to complete, which means that firmware must wait for completion before accessing the block again. If the delay is short, hardware should provide a status bit that firmware can poll, looping a few times until the task is done before moving on to the next step. If the delay is long, hardware should provide an interrupt bit that firmware will enable. Firmware will then do other processing while waiting for the interrupt to occur. Setting up, waiting for, and responding to an interrupt costs CPU cycles for task swaps, context switches and semaphore handling. Thus, for firmware, polling a status bit is preferable to managing an interrupt if the task will be done after a short delay.
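To make the two approaches concrete, here is a minimal sketch in C. The register layout, the bit names (CTRL_ABORT, STATUS_BUSY, INT_ABORT_DONE) and both function names are assumptions made for illustration, not taken from any real block.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register layout; names and bit positions are assumptions. */
typedef struct {
    volatile uint32_t ctrl;        /* write CTRL_ABORT to request an abort   */
    volatile uint32_t status;      /* STATUS_BUSY stays set until abort done */
    volatile uint32_t int_enable;  /* INT_ABORT_DONE unmasks the interrupt   */
} block_regs_t;

#define CTRL_ABORT      (1u << 0)
#define STATUS_BUSY     (1u << 0)
#define INT_ABORT_DONE  (1u << 0)

/* Short delay: request the abort, then poll the status bit a bounded
 * number of times. Returns true if the block went idle within the budget. */
static bool abort_and_poll(block_regs_t *regs, uint32_t max_polls)
{
    regs->ctrl = CTRL_ABORT;
    for (uint32_t i = 0; i < max_polls; i++) {
        if ((regs->status & STATUS_BUSY) == 0u) {
            return true;
        }
    }
    return false;
}

/* Long delay: unmask the interrupt first, then request the abort, and
 * return at once so the caller can do other work until the handler runs. */
static void abort_and_wait_for_irq(block_regs_t *regs)
{
    regs->int_enable |= INT_ABORT_DONE;
    regs->ctrl = CTRL_ABORT;
}
```

The polling version bounds its loop so that a block that never goes idle cannot hang firmware. The interrupt version leaves the handler and any semaphore handling to the operating system in use.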
Where the line between short and long delays falls must be determined on a case-by-case basis and depends on the hardware platform, operating system and performance requirements. The dividing line could even move dynamically depending on the current operating conditions of the product. To give engineers the flexibility of moving that dividing line, the hardware for short and long delays should be the same, implemented with both a status bit and a maskable interrupt. This flexibility allows engineers to calculate or measure how many loops the polling takes and determine whether polling is acceptable or interrupts are needed.
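One way to take such measurements is to count the polls each wait consumes and keep simple statistics. The sketch below assumes a hypothetical busy_fn hook that reads the status bit; sim_block_t and sim_busy are stand-ins for real hardware so the code can run on a host, and all of these names are invented for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical hook that returns true while the block's status bit is busy. */
typedef bool (*busy_fn)(void *ctx);

/* Running statistics over every measured wait. */
typedef struct {
    uint32_t samples;      /* number of waits measured          */
    uint32_t max_polls;    /* worst case seen so far            */
    uint64_t total_polls;  /* sum of polls, for a later average */
} poll_stats_t;

/* Simulated hardware (an assumption for host testing): the block stays
 * busy for a fixed number of status reads, then reports idle. */
typedef struct {
    uint32_t polls_until_done;
} sim_block_t;

static bool sim_busy(void *ctx)
{
    sim_block_t *blk = (sim_block_t *)ctx;
    if (blk->polls_until_done == 0u) {
        return false;
    }
    blk->polls_until_done--;
    return true;
}

/* Poll until the block is idle or the limit is reached, and record how
 * many busy reads it took. Returns that count. */
static uint32_t poll_and_record(busy_fn busy, void *ctx, uint32_t limit,
                                poll_stats_t *stats)
{
    uint32_t polls = 0u;
    while (polls < limit && busy(ctx)) {
        polls++;
    }
    stats->samples++;
    stats->total_polls += polls;
    if (polls > stats->max_polls) {
        stats->max_polls = polls;
    }
    return polls;
}

/* Decide whether polling is still acceptable for this task, given the
 * worst case measured so far and a product-specific threshold. */
static bool polling_is_acceptable(const poll_stats_t *stats,
                                  uint32_t max_acceptable_polls)
{
    return stats->samples > 0u && stats->max_polls <= max_acceptable_polls;
}
```

Firmware could evaluate polling_is_acceptable at run time and switch that task over to the interrupt once the measured worst case crosses the threshold, which is one way to move the dividing line dynamically.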
For some blocks, the time the abort takes can vary from a short delay if the block is in an idle state to a long delay if the block is busy and needs to terminate gracefully. Since firmware cannot know the current state, it must always assume the worst case. If firmware wants to take advantage of the shorter aborts when they do occur, it could poll for several loops in case the task completes quickly. If the task does not complete in that time, firmware can then enable the interrupt and switch to another task.
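A rough sketch of that poll-first, interrupt-second strategy follows. The register layout and names are hypothetical. The sketch also assumes the completion interrupt is latched, so it still fires if the abort finishes between the last poll and the moment the interrupt is enabled.

```c
#include <stdint.h>

/* Hypothetical register layout; names and bit positions are assumptions. */
typedef struct {
    volatile uint32_t ctrl;        /* write CTRL_ABORT to request an abort   */
    volatile uint32_t status;      /* STATUS_BUSY stays set until abort done */
    volatile uint32_t int_enable;  /* INT_ABORT_DONE unmasks the interrupt   */
} block_regs_t;

#define CTRL_ABORT      (1u << 0)
#define STATUS_BUSY     (1u << 0)
#define INT_ABORT_DONE  (1u << 0)

typedef enum {
    ABORT_DONE,         /* finished during the quick polling window      */
    ABORT_PENDING_IRQ   /* still busy; completion will arrive by interrupt */
} abort_result_t;

/* Request an abort and poll briefly in case the block was idle. If it is
 * still busy afterwards, enable the interrupt and let the caller switch
 * to other work. Assumes a latched interrupt (see the note above). */
static abort_result_t abort_hybrid(block_regs_t *regs, uint32_t quick_polls)
{
    regs->ctrl = CTRL_ABORT;
    for (uint32_t i = 0; i < quick_polls; i++) {
        if ((regs->status & STATUS_BUSY) == 0u) {
            return ABORT_DONE;
        }
    }
    regs->int_enable |= INT_ABORT_DONE;
    return ABORT_PENDING_IRQ;
}
```

The caller moves straight to the next step on ABORT_DONE and yields to another task on ABORT_PENDING_IRQ.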
To help engineers know how to implement the firmware, document the block’s minimum and maximum abort times and the conditions under which they occur. It could be something such as, “if the block is already idle, the abort will complete in 20 ns, otherwise it will take 2-3 us to complete.”
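Documented times like these can be turned into a polling budget once the cost of one poll is known. The helper below is a minimal sketch; polls_for_ns is a hypothetical name, and the cost of a single poll has to be measured on the target platform.

```c
#include <stdint.h>

/* Convert a documented duration into the number of status polls needed
 * to cover it, rounding up. ns_per_poll is the measured cost of one poll
 * on the target (an assumption supplied by the caller). Returns 0 if
 * ns_per_poll is 0, which signals invalid input. */
static uint32_t polls_for_ns(uint32_t duration_ns, uint32_t ns_per_poll)
{
    if (ns_per_poll == 0u) {
        return 0u;
    }
    /* Widen to 64 bits so the round-up addition cannot overflow. */
    uint64_t rounded = (uint64_t)duration_ns + ns_per_poll - 1u;
    return (uint32_t)(rounded / ns_per_poll);
}
```

With an assumed cost of 50 ns per poll, the 20 ns idle case above fits in a single poll, while the 3 us worst case needs 60 polls. Numbers like these help decide whether the worst case belongs on the polling or the interrupt side of the line.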
I used aborts for these examples, but the concepts apply to any firmware-initiated hardware task that could take time to complete. Implementing both status and interrupt bits for short- and long-delay hardware tasks allows firmware to balance the system load and performance by using polling loops or interrupts as appropriate.
Until the next interrupt (which will not occur for at least 2,000,000,000,000us)…