
As something of an outsider to deep learning, it intuitively seems that if you could demystify the black box, it would be easier to improve your models (since you would understand where they succeed, where they fail, and why). From this perspective, explainability would be incredibly productive.


I am not convinced the explanations we’d get from these models would be interpretable enough to drive their conceptual improvement. Deep nets just don’t learn nice, interpretable features; nothing in the training objective makes them do so. E.g., they make use of really dubious features [1], and may not actually learn “hierarchies” of features as we thought [2].

[1] https://arxiv.org/abs/1905.02175

[2] https://arxiv.org/abs/1904.00760


Or you can actually visualize and quantify the impact of aggregates of meaningful features within networks, using the methodology described in NetDissect or in TCAV. But of course, that casts doubt on a lot of claimed mechanisms and a good deal of ML research, as I have found out.
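To make the TCAV idea concrete, here is a minimal sketch of a concept-sensitivity score. All the data here is synthetic stand-in activations and gradients (not from a real network), and the concept direction is approximated by a difference of class means rather than the linear classifier the TCAV paper fits; the overall shape of the computation is the same.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for a layer's activations (64-d) on two example sets:
# inputs showing a concept (e.g. "striped") vs. random inputs.
d = 64
concept_acts = rng.normal(loc=0.5, scale=1.0, size=(100, d))
random_acts = rng.normal(loc=0.0, scale=1.0, size=(100, d))

# TCAV fits a linear classifier separating the two sets and takes the
# normal to its decision boundary as the concept activation vector (CAV);
# the difference of means is a cheap stand-in for that normal.
cav = concept_acts.mean(axis=0) - random_acts.mean(axis=0)
cav /= np.linalg.norm(cav)

# Stand-in gradients of a class logit w.r.t. the layer activations,
# one per test input (a real run would backprop through the network).
grads = rng.normal(size=(50, d))

# TCAV score: the fraction of test inputs whose class logit increases
# when the activation moves in the concept direction.
sensitivities = grads @ cav
tcav_score = float((sensitivities > 0).mean())
print(f"TCAV score: {tcav_score:.2f}")
```

With random gradients the score hovers near 0.5; on a real network, a score far from 0.5 (checked against CAVs built from shuffled concept sets) is the paper's evidence that the concept matters to the class.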



