This seems like an easy problem for deep learning, since you have an enormous amount of training data available: take any color image, convert it to grayscale, and you have a pair of training images.
(This is also the case for e.g. the superresolution problem.)
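The pair-construction step described above can be sketched as follows; this is a minimal illustration, and the function name, image size, and use of Pillow/NumPy are my own assumptions, not anything prescribed by a particular colorization project:

```python
# Sketch: build a (grayscale input, color target) training pair from any
# color image. Path, size, and normalization choices are illustrative.
from PIL import Image
import numpy as np

def make_training_pair(path, size=(256, 256)):
    """Return (grayscale input, color target) arrays scaled to [0, 1]."""
    color = Image.open(path).convert("RGB").resize(size)
    gray = color.convert("L")  # luminance-only version becomes the model input
    x = np.asarray(gray, dtype=np.float32)[..., None] / 255.0  # H x W x 1
    y = np.asarray(color, dtype=np.float32) / 255.0            # H x W x 3
    return x, y
```

In practice you would wrap this in a dataset loader and feed batches of such pairs to the network, with the grayscale array as input and the original color array as the target.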
You will probably also need an enormous GPU (24 GB of RAM) to make the model as large as possible for the best generalization you can get (there are so many different types of objects, surfaces, and fabrics, and compositions of them).
This is deep learning, with little connection to any analytical model; it is not thinking like a human :-(. Recently, even good NLP models have needed 24 GB+ for training (they won't fit into 16 GB), and good-quality colorization (no color spills, natural colors) can be expected to be just as demanding.
From the article:
"BEEFY Graphics card. I'd really like to have more memory than the 11 GB in my GeForce 1080TI (11GB). You'll have a tough time with less. The Unet and Critic are ridiculously large but honestly I just kept getting better results the bigger I made them."