Attention is not not explanation

tags: study paper DSMI lab

paper: Attention is not not explanation

(I'm not a big fan of how this paper is written: lots of double negation, and the structure is a bit messy.)

Introduction

This paper is a follow-up discussion of "Attention is not explanation" (hereafter ==J&W==). Its goal is not to claim that attention is explanatory, but to argue that "attention is not necessarily non-explanatory". It points out several potential problems in J&W's experiments and arguments, and proposes some better ways to analyze the question (though I am not fully convinced).

Main Claim

  1. Attention Distribution is not a Primitive

    The base attention weights are not assigned arbitrarily by the model, but rather computed by an integral component whose parameters were trained alongside the rest of the layers; the way they work depends on each other.

Attention participates in model training; it does not exist independently of the model. So J&W's experiment of randomly permuting attention weights is not entirely appropriate: when constructing an adversary, the model should be retrained.

  2. Existence does not Entail Exclusivity

    We hold that attention scores are used as providing an explanation; not the explanation.

Finding another way to explain a prediction does not mean the original one has no explanatory power; explanations are not unique. This is especially true for binary classification tasks: the input is many words (very high-dimensional) while the output is a single value between 0 and 1, so the dimensionality reduction leaves a lot of room for flexibility.

Defining Explanation

J&W do not clearly define what an explanation is; in fact, past literature defines explanation in different ways. When AI explainability comes up, the following three terms often appear.

  1. transparency: finding the parts of a model that humans can understand.

    Attention mechanisms do provide a look into the inner workings of a model, as they produce an easily-understandable weighting of hidden states.

  2. explainability
    • can support decision making
    • can simulate the human ability to make inferences from what has happened in the past
  3. interpretability
    • relationship between input and output (similar to a regressor in regression)
    • requires experts to help verify

Experiments

Dataset

The same experiments as J&W, but only the binary classification tasks; the QA tasks are not included.

  • Diabetes: whether a patient is diagnosed with diabetes from their ICU discharge summary
  • Anemia: whether the patient is diagnosed with acute (neg.) or chronic (pos.) anemia
  • IMDb: positive or negative sentiment from movie reviews
  • SST: positive or negative sentiment from sentences
  • AgNews: the topic of news articles as either world (neg.) or business (pos.)
  • 20News: the topic of news articles as either baseball (neg.) or hockey (pos.)

Model

same as J&W

  • single-layer bidirectional LSTM with tanh activation
  • attention layer
  • softmax prediction
  • hyperparameters are set to be the same as in J&W
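
A minimal PyTorch sketch of this architecture as I understand it; the additive (tanh) attention scorer, layer sizes, and all names here are my own assumptions rather than the paper's exact configuration (a sigmoid over a single logit stands in for the two-class softmax):

```python
import torch
import torch.nn as nn

class BiLSTMAttention(nn.Module):
    """Single-layer BiLSTM -> additive (tanh) attention -> sigmoid prediction."""

    def __init__(self, vocab_size, embed_dim=128, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        # additive attention: score_t = v^T tanh(W h_t)
        self.attn_proj = nn.Linear(2 * hidden_dim, hidden_dim)
        self.attn_v = nn.Linear(hidden_dim, 1, bias=False)
        self.out = nn.Linear(2 * hidden_dim, 1)  # binary classification head

    def forward(self, tokens, mask):
        h, _ = self.lstm(self.embed(tokens))                              # (B, T, 2H)
        scores = self.attn_v(torch.tanh(self.attn_proj(h))).squeeze(-1)   # (B, T)
        scores = scores.masked_fill(mask == 0, float("-inf"))             # ignore padding
        alpha = torch.softmax(scores, dim=-1)                             # attention weights
        context = torch.bmm(alpha.unsqueeze(1), h).squeeze(1)             # (B, 2H)
        return torch.sigmoid(self.out(context)).squeeze(-1), alpha
```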

Uniform as the Adversary

Attention is not explanation if you don’t need it

On some tasks, giving every word the same weight works about as well as learned attention, so on those tasks attention is indeed not very useful.
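
As an illustration (my own sketch, reusing the hypothetical BiLSTMAttention above), the learned $\alpha$ can be swapped for a uniform distribution over the non-padded tokens. The paper's uniform baseline may instead be trained with attention frozen to uniform; swapping only at inference time is a simplification here:

```python
import torch

def predict_with_uniform_attention(model, tokens, mask):
    """Replace the learned attention with a uniform distribution over real tokens."""
    with torch.no_grad():
        h, _ = model.lstm(model.embed(tokens))                      # (B, T, 2H)
        alpha = mask.float() / mask.float().sum(-1, keepdim=True)   # uniform over non-pad positions
        context = torch.bmm(alpha.unsqueeze(1), h).squeeze(1)
        return torch.sigmoid(model.out(context)).squeeze(-1)
```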

Variance within a Model

(Honestly, I am not sure what this experiment is actually for.)

  • Test whether the variances observed by J&W between trained attention scores and adversarially-obtained ones are unusual.
  • Train 8 models with different initial seeds
  • Plot the distribution of JSD (Jensen-Shannon Divergence) between the attention weights produced by these models
  • IMDb, SST, and Anemia are robust to seed changes.
  • (e): the adversarial attentions produced by J&W are indeed very different from the original model's.
  • (d): negative-label instances in the Diabetes dataset are subject to relatively arbitrary attention distributions across the different random seeds, so attention weights whose distributions differ a lot may still perform well.
  • (f): so the adversarial attentions produced by J&W are not adversarial enough.
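
A small sketch of how the per-instance JSD between two seeds' attention distributions might be computed; the numbers are made up, and note that SciPy's jensenshannon returns the square root of the divergence, hence the squaring:

```python
import numpy as np
from scipy.spatial.distance import jensenshannon

def attention_jsd(alpha_a, alpha_b):
    """Base-2 Jensen-Shannon divergence between two attention distributions."""
    return jensenshannon(alpha_a, alpha_b, base=2) ** 2  # scipy returns the JS *distance*

# e.g. the same instance under two models trained with different random seeds
alpha_seed1 = np.array([0.70, 0.20, 0.05, 0.05])
alpha_seed2 = np.array([0.10, 0.60, 0.20, 0.10])
print(attention_jsd(alpha_seed1, alpha_seed2))
```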

Training an Adversary

So how can a reasonable adversary be constructed? As mentioned above, attention weights do not exist independently of the model, so every new set of attention weights should come from training a new model.

  • Given a base model $M_b$, train a model $M_a$ which

    • provides prediction scores similar to $M_b$'s
    • produces attention distributions very different from $M_b$'s
  • Loss function (a code sketch follows the figure notes below): $L(M_a, M_b)=\sum^N_{i=1}\left[TVD(\hat{y}_a^i,\hat{y}_b^i)-\lambda\, KL(\alpha_a^i\|\alpha_b^i)\right]$
    $TVD(\hat{y}_a^i,\hat{y}_b^i)=\frac{1}{2}\sum_{c}|\hat{y}_{a,c}^i-\hat{y}_{b,c}^i|$, summing over the output classes $c$

  • In the figure below:

    • the more convex the curve, the more easily the attention weights can be manipulated.
    • the x-axis is the attention JSD between a given model and $M_b$
  • Legend:

    • triangle: models trained with different random seeds at a fixed $\lambda$ (each dataset's own best $\lambda$)
    • square: the uniform-weight model
    • plus sign: J&W's adversarial model
    • dots: models trained with different values of $\lambda$
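
A sketch of the training objective above in PyTorch; the tensor shapes, the direction of the KL term, and the use of a plain sum over the batch are my assumptions:

```python
import torch

def adversarial_loss(y_a, y_b, alpha_a, alpha_b, lam, eps=1e-12):
    """L(M_a, M_b) = sum_i [ TVD(y_a^i, y_b^i) - lambda * KL(alpha_a^i || alpha_b^i) ].

    y_a, y_b:         (B, C) predicted class distributions of adversary and base model
    alpha_a, alpha_b: (B, T) attention distributions over the tokens
    """
    tvd = 0.5 * (y_a - y_b).abs().sum(-1)                                      # (B,)
    kl = (alpha_a * ((alpha_a + eps).log() - (alpha_b + eps).log())).sum(-1)   # (B,)
    # minimizing keeps predictions close to M_b while pushing attention away from it
    return (tvd - lam * kl).sum()
```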

Diagnosing attention distribution by guiding simpler models

RNN-style models mix in the influence of neighboring words, so it is hard to rule out whether that context affects the explanation. This paper therefore uses an MLP (a non-contextual model that cannot look at its left and right neighbors) to diagnose if attention weights provide better guides.

  • Impose a set of pretrained attention weights (compared against letting the MLP learn its own)
  • Train the MLP
  • The experiments find that the attention weights from the original model still work best (so attention provides a better guide).
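
A sketch of the diagnostic MLP as I read it (aggregation details and sizes are my assumptions): each token is scored independently, and a fixed, externally supplied attention distribution pools the token representations, so the pooling weights themselves are never trained.

```python
import torch
import torch.nn as nn

class GuidedMLP(nn.Module):
    """Non-contextual MLP whose token representations are pooled by frozen attention weights."""

    def __init__(self, vocab_size, embed_dim=128, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.mlp = nn.Sequential(nn.Linear(embed_dim, hidden_dim), nn.Tanh())  # no token interaction
        self.out = nn.Linear(hidden_dim, 1)

    def forward(self, tokens, frozen_alpha):
        h = self.mlp(self.embed(tokens))                               # (B, T, H)
        context = torch.bmm(frozen_alpha.unsqueeze(1), h).squeeze(1)   # frozen_alpha: (B, T), not trained
        return torch.sigmoid(self.out(context)).squeeze(-1)
```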

Conclusion

  • First, clearly define what an explanation is.
  • J&W's experiments have quite a few holes; this paper provides more reasonable experimental setups.
  • The MLP experiments show that attention is somehow meaningful.
  • Future work: extend the experiments to QA tasks, to languages other than English, and to evaluation by domain experts.

Reference

  • Sarah Wiegreffe and Yuval Pinter. 2019. Attention is not not Explanation. In Proceedings of EMNLP-IJCNLP 2019.
  • Sarthak Jain and Byron C. Wallace. 2019. Attention is not Explanation. In Proceedings of NAACL-HLT 2019.
