I especially liked the idea of CR2 and CR3 as scratchpad registers when memory access is really slow (386SX and cacheless 386DXs). And the trick of using ESP as a loop counter without disabling interrupts (by making sure it always points to a valid stack location) is just genius.
Yes! I know nothing about low level programming, but the idea of using a register that you don't need for a fast 'memory' location is particularly clever.