Abstract
The proliferation of Deep Neural Networks in various domains has increased the need for interpretability of these models. Preliminary work along this line, and the papers that survey it, focus on high-level representation analysis. However, a recent branch of work has concentrated on interpretability at a more granular level: analyzing neurons within these models. In this paper, we survey the work done on neuron analysis, including: i) methods to discover and understand neurons in a network; ii) evaluation methods; iii) major findings, including cross-architectural comparisons, that neuron analysis has unraveled; iv) applications of neuron probing, such as controlling the model and domain adaptation; and v) a discussion of open issues and future research directions.
| Original language | English |
|---|---|
| Pages (from-to) | 1285-1303 |
| Number of pages | 19 |
| Journal | Transactions of the Association for Computational Linguistics |
| Volume | 10 |
| Publication status | Published - 22 Nov 2022 |
Keywords
- Explanation
- Recurrent