[DL] Interpretability, Model Inspection and Representation Analysis

Identifying underlying mechanisms giving rise to observed patterns in the data. When applying deep learning in scientific settings, we can use these observed phenomena as prediction targets, but the ultimate goal remains to understand what attributes give rise to these observations.
- A Survey of Deep Learning for Scientific Discovery (Maithra Raghu, Eric Schmidt, 2020)

This post summarizes Part 7 of A Survey of Deep Learning for Scientific Discovery: Interpretability, Model Inspection and Representation Analysis.

Introduction

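In the sciences, deep learning has mostly been used to interpret specific phenomena. For example, consider a model that takes an amino acid sequence as input and predicts a particular property of a protein as output. Such a model predicts the protein's function by picking up on how the amino acids are arranged.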

Techniques for interpretable deep learning have been studied with this kind of use in mind, and what is asked of them is a fully understandable, step-by-step explanation of the model's decision process. The paper describes these techniques in two broad categories.

1. Feature Attribution (Per Example Interpretability): determining which input features are important

- Examine a specific input by passing it through the trained neural network.

- Determine which parts of that input's features matter for the prediction.

2. Model Inspection (Representation Analysis): determining what makes a particular neuron in the neural network fire

- Examine the important hidden patterns in the data that the deep learning model has learned. For example, in machine translation, representation analysis techniques can be applied to the trained model to reveal the latent linguistic structure it has picked up.
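
To make the two categories concrete, here is a minimal sketch on a toy network. Everything in it is an assumption made for illustration and does not come from the survey: the hand-picked weights `W1` and `W2`, and the helper names `occlusion_attribution` and `top_activating`. For feature attribution it uses occlusion (set one input feature to a baseline value and measure how much the output changes), which is only one of several possible attribution methods. For model inspection it simply looks up which examples in a small dataset make a chosen hidden neuron fire most strongly.

```python
# Toy network with hand-picked weights (hypothetical, for illustration only).
W1 = [[1.0, 0.0, -1.0],   # weights into hidden neuron 0
      [0.0, 2.0, 0.0]]    # weights into hidden neuron 1
W2 = [1.0, 0.5]           # weights from the hidden layer to the scalar output


def hidden(x):
    """ReLU activations of the hidden layer for input vector x."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in W1]


def predict(x):
    """Scalar model output for input vector x."""
    return sum(w * h for w, h in zip(W2, hidden(x)))


def occlusion_attribution(x, baseline=0.0):
    """Feature attribution: change in output when one feature is replaced by a baseline."""
    full = predict(x)
    scores = []
    for i in range(len(x)):
        occluded = list(x)
        occluded[i] = baseline
        scores.append(full - predict(occluded))
    return scores


def top_activating(inputs, neuron, k=1):
    """Model inspection: the k inputs that make a given hidden neuron fire most strongly."""
    return sorted(inputs, key=lambda x: hidden(x)[neuron], reverse=True)[:k]


# 1. Per-example interpretability: which features of this one input matter?
x = [2.0, 1.0, 0.5]
attributions = occlusion_attribution(x)

# 2. Representation analysis: what kind of input does hidden neuron 1 respond to?
dataset = [[2.0, 1.0, 0.5], [0.0, 3.0, 0.0], [1.0, 0.0, 2.0]]
best_for_neuron_1 = top_activating(dataset, neuron=1)

print(attributions)
print(best_for_neuron_1)
```

A positive attribution score means that removing the feature lowers the output, and a negative score means the feature was pushing the output down. On a real model the same two questions are usually asked with gradient-based or perturbation-based tools rather than by hand, and the choice of baseline can change the attribution noticeably.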


Aaron's Tech Blog
