That may be true, but in an industry that produces millions of new computers every year there's quite a bit of room for outliers. After all, someone has been buying all these specialised expansion cards for all these years.
Graphics, sound and networking may have been the biggest reasons, and they may now be adequately catered for in everyday PCs without the extra hardware. However, the high end of each of those markets still needs more than you get from any basic motherboard and processor, and then there are all the speciality cards as well, catering for who knows how many niche markets.
No, we're all doing that now. This loss of PCIe expansion is really annoying and I'll probably be skipping this next gen largely for that reason. High-end Threadripper is unobtanium and has been for the last couple of years. But I have a GPU from like every generation that I wish I could plug into my PCIe bus and have it be useful. If I could buy a platform that supported that, I would.
I have a PCIe sound card, but I have to say these days I'm actually using a tiny Apple USB dongle for my audio, and it sounds cleaner than the internal sound card does. The dongle was also a fraction of the price of the sound card, and way more convenient.
Capture cards need the bandwidth. Whether they need the latency is arguable, but they need a lot more latency determinism than USB tends to offer out of the box.
Introduced latency on the capture side makes latency tuning your entire production pretty difficult. For non-real-time usage, sure, latency in the 100-200ms range is more than acceptable (assuming it's deterministic, as you pointed out), but in the real-time world? Keeping things within a frame is pretty much required, and with the popularity of software-driven studio workflows across both amateur streaming and professional production, it's been real hard to get reliable performance out of USB hardware that doesn't add frustrating amounts of latency due to pre-ingest compression, or seemingly random amounts of delay due to protocol or CPU time starvation.
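To put "within a frame" next to that 100-200ms range, here's a quick back-of-the-envelope sketch. The frame rates are just common ones picked for illustration, and the helper names are made up for this example.

```python
# Rough arithmetic: how long one frame lasts, and how many frames
# a given capture delay costs. Frame rates below are illustrative.

def frame_time_ms(fps):
    """Duration of a single frame in milliseconds."""
    return 1000.0 / fps

def delay_in_frames(latency_ms, fps):
    """Express a latency as a number of frames at the given frame rate."""
    return latency_ms / frame_time_ms(fps)

for fps in (24, 30, 60):
    budget = frame_time_ms(fps)
    low = delay_in_frames(100, fps)
    high = delay_in_frames(200, fps)
    print(f"{fps} fps: one frame = {budget:.1f} ms, "
          f"100-200 ms = {low:.1f} to {high:.1f} frames late")
```

At 60 fps a single frame is under 17 ms, so a delay that's perfectly fine for offline ingest is several frames behind in a live mix.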
Yep, there's a reason I have Blackmagic quad 4K capture cards in my workstation. Syncing multiple video streams with USB capture cards would be nigh impossible even if you put them on separate USB controllers. Ingest over USB is fine (though slow) but pretty much every USB capture card does its own internal compression, as you point out, and then involves the CPU to decompress it and get it into VRAM or DRAM.
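A rough sketch of why USB capture devices tend to compress before sending: the arithmetic below assumes 8-bit 4:2:2 video (16 bits per pixel on average), ignores blanking and protocol overhead, and compares against the 5 Gbit/s raw signalling rate of USB 3.0. The formats are illustrative assumptions, not what any particular card does.

```python
# Uncompressed video data rate vs. the raw USB 3.0 line rate.
# Assumptions: 8-bit 4:2:2 sampling (16 bits/pixel), active pixels only.

USB3_LINE_RATE_BPS = 5_000_000_000  # raw signalling rate; usable throughput is lower

def uncompressed_video_bps(width, height, fps, bits_per_pixel):
    """Bits per second for an uncompressed video stream (active pixels only)."""
    return width * height * fps * bits_per_pixel

formats = {
    "1080p60": (1920, 1080, 60, 16),
    "4K60": (3840, 2160, 60, 16),
}

for name, params in formats.items():
    rate = uncompressed_video_bps(*params)
    verdict = "fits under" if rate < USB3_LINE_RATE_BPS else "exceeds"
    print(f"{name}: {rate / 1e9:.2f} Gbit/s, {verdict} the 5 Gbit/s line rate")
```

Under those assumptions a single uncompressed 4K60 stream is already past what a USB 3.0 link can signal, before counting any overhead, and a quad card is handling several streams at once.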
Realtime video production is definitely an outlier. You probably want a workstation-class system anyway, with a full TB of RAM so you know that's never an issue.
This is true only for small transfers (not bulk/asynchronous transfers), and only up to about the 99th percentile. Large transfers have some buffer management and handshaking, so they tend to have highly variable latency with a very fat tail.
The latency degradation for large transfers is so noticeable that most audio DACs (not just the ones for gullible audiophiles, but also those for the pro market) use custom drivers and USB protocols. And that's at a tiny fraction of the data rate a capture card would need.
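For a sense of how far apart the two data rates are, here's a small sketch. It assumes stereo 24-bit audio at 48 kHz and an uncompressed 4K60 stream at 16 bits per pixel; both formats are picked for illustration, and other choices will move the ratio around.

```python
# Compare an assumed audio stream to an assumed uncompressed video stream.
# Both formats are illustrative; real devices vary.

def audio_bps(sample_rate_hz, bits_per_sample, channels):
    """Bits per second for uncompressed PCM audio."""
    return sample_rate_hz * bits_per_sample * channels

def video_bps(width, height, fps, bits_per_pixel):
    """Bits per second for uncompressed video (active pixels only)."""
    return width * height * fps * bits_per_pixel

audio = audio_bps(48_000, 24, 2)        # stereo 24-bit / 48 kHz
video = video_bps(3840, 2160, 60, 16)   # 4K60, 8-bit 4:2:2
ratio = video / audio

print(f"audio: {audio / 1e6:.2f} Mbit/s")
print(f"video: {video / 1e9:.2f} Gbit/s")
print(f"video needs roughly {ratio:.0f}x the data rate")
```

With these numbers the gap is a factor of a few thousand, so audio sits orders of magnitude below video and still gets special handling to keep its latency predictable.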
USB everything tends to end in dongle spaghetti. And then plugboard spaghetti to hold all of the power adapters. Just the feature of being an expansion mounting rack and power/data backplane is valuable even if it only ran at basic USB 3.0 speed.