
As something of an outsider to deep learning, it intuitively seems that if you could demystify the black box, it would be easier to improve your models (since you would understand where they succeed, where they fail, and why). From this perspective, explainability would be incredibly productive.


I am not convinced the explanations we’d get from these models would be interpretable enough to drive their conceptual improvement. Deep nets just don’t learn nice, interpretable features; nothing in the training objective makes them do so. E.g., they make use of really dubious features [1], and may not actually learn “hierarchies” of features as we thought [2].

[1] https://arxiv.org/abs/1905.02175

[2] https://arxiv.org/abs/1904.00760


Or you can actually visualize and quantify the impact of aggregates of meaningful features within networks, using the methodology described in NetDissect or in TCAV. But of course, that casts doubt on a lot of claimed mechanisms and a good deal of ML research, as I have found out.
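To make the TCAV idea concrete, here is a minimal sketch of a concept-sensitivity score. All the data here is synthetic stand-in activations and gradients (not from a real network), and the concept direction is approximated by a difference of class means rather than the linear classifier the TCAV paper fits; the overall shape of the computation is the same.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for a layer's activations (64-d) on two example sets:
# inputs showing a concept (e.g. "striped") vs. random inputs.
d = 64
concept_acts = rng.normal(loc=0.5, scale=1.0, size=(100, d))
random_acts = rng.normal(loc=0.0, scale=1.0, size=(100, d))

# TCAV fits a linear classifier separating the two sets and takes the
# normal to its decision boundary as the concept activation vector (CAV);
# the difference of means is a cheap stand-in for that normal.
cav = concept_acts.mean(axis=0) - random_acts.mean(axis=0)
cav /= np.linalg.norm(cav)

# Stand-in gradients of a class logit w.r.t. the layer activations,
# one per test input (a real run would backprop through the network).
grads = rng.normal(size=(50, d))

# TCAV score: the fraction of test inputs whose class logit increases
# when the activation moves in the concept direction.
sensitivities = grads @ cav
tcav_score = float((sensitivities > 0).mean())
print(f"TCAV score: {tcav_score:.2f}")
```

With random gradients the score hovers near 0.5; on a real network, a score far from 0.5 (checked against CAVs built from shuffled concept sets) is the paper's evidence that the concept matters to the class.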



