You see similar constructs in just about any system that both touches multiple addresses in a single instruction and can take an interrupt partway through that instruction (whether that's an exception raised by the instruction itself, or an external interrupt taken early to keep latency down).
The lowly Cortex-M0 exposes the progress of load/store multiple instructions in architectural state so that, after an interrupt for example, they can be restarted where they left off. They even do this with the multiplier, so if you have the slow but tiny 32-cycle iterative multiplier in your design, you can still get single cycle interrupt latency.
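To make the "restart where it left off" idea concrete, here is a minimal sketch of a resumable load-multiple on a made-up machine. Everything in it (`ToyCpu`, `resume_index`, `load_multiple`, the `interrupt_after` knob) is hypothetical and for illustration only; it is not a model of how any real core encodes its continuation state.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCpu:
    """Hypothetical machine for illustration; not a model of any real core."""
    regs: list = field(default_factory=lambda: [0] * 8)
    resume_index: int = 0  # architecturally visible progress of the multi-word load
    loads_issued: int = 0  # counts memory reads, to show none are repeated


def load_multiple(cpu, memory, base, count, interrupt_after=None):
    """Load `count` words from memory[base:] into regs[0:count].

    Returns True when the instruction completes. If an interrupt arrives after
    `interrupt_after` transfers in this call, returns False and leaves the
    progress in cpu.resume_index so a later call continues from there.
    """
    done_this_call = 0
    while cpu.resume_index < count:
        if interrupt_after is not None and done_this_call == interrupt_after:
            return False  # interrupted; progress is preserved in resume_index
        i = cpu.resume_index
        cpu.regs[i] = memory[base + i]
        cpu.loads_issued += 1
        cpu.resume_index = i + 1
        done_this_call += 1
    cpu.resume_index = 0  # instruction finished, clear the continuation state
    return True


memory = list(range(100, 116))
cpu = ToyCpu()

# First attempt is interrupted after two of the six transfers.
finished = load_multiple(cpu, memory, base=4, count=6, interrupt_after=2)

# A handler would run here, saving and restoring resume_index along with
# the rest of the register state.

# Re-executing the same instruction picks up at the third transfer.
finished_after_resume = load_multiple(cpu, memory, base=4, count=6)
```

The point of the sketch is that the progress counter lives in state the interrupt handler already saves and restores, so resuming needs no undocumented blob on the stack.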
The M68ks had a halfway mechanism: on an exception in the middle of an instruction, they would just barf partially documented internal microcode state onto the stack, in a chip-version-specific format, so that you didn't have to restart the full instruction. Probably the grossest thing about that architecture.
The 68000 didn't do this, but couldn't restart a page fault.
The 68010 didn't either (but could restart a page fault).
The 68020 and 030 did do this horrible thing - doing Unix recursive signals was pretty hard, if not impossible. And you couldn't copy this stuff to the user stack: it wasn't documented, so you couldn't validate it when you pulled it back into the kernel.
The 68040 was sane again (and, I presume, subsequent 68ks).
Really this is part of the CISC vs. RISC thing - RISC instructions tend to have only one side effect, so they either run to completion or not at all, but CISC instructions can have multiple side effects. Consider the infamous PDP-11 instruction "mov -(pc), -(pc)": 3 side effects. 68k instructions are more complex still, with multiple memory indirects and many possible faults. All that crud on the stack represents half-done work.