
What do you mean by Attention?


Question: 
What do you mean by Attention? What are the types of Attention in Neural Networks?

Expected Answer:
Define attention as a learned distribution (a set of weights) over a neural network's inputs or features, then give a use case and an example. You could also list the types of attention and give an application of each.




A neural attention mechanism equips a neural network with the ability to focus on a subset of its inputs (or features): it selects specific inputs. Let $\mathbf{x} \in \mathbb{R}^d$ be an input vector, $\mathbf{z} \in \mathbb{R}^k$ a feature vector, $\mathbf{a} \in [0, 1]^k$ an attention vector, $\mathbf{g} \in \mathbb{R}^k$ an attention glimpse and $f_\phi(\mathbf{x})$ an attention network with parameters $\phi$. Typically, attention is implemented as

$$\mathbf{a} = f_\phi(\mathbf{x}), \qquad \mathbf{g} = \mathbf{a} \odot \mathbf{z},$$

where $\odot$ is element-wise multiplication, while $\mathbf{z}$ is the output of another neural network $f_\theta(\mathbf{x})$ with parameters $\theta$. We can talk about soft attention, which multiplies features with a (soft) mask of values between zero and one, or hard attention, when those values are constrained to be exactly zero or one, namely $\mathbf{a} \in \{0, 1\}^k$. In the latter case, we can use the hard attention mask to directly index the feature vector: $\tilde{\mathbf{g}} = \mathbf{z}[\mathbf{a}]$, which changes its dimensionality, so that now $\tilde{\mathbf{g}} \in \mathbb{R}^m$ with $m \leq k$.
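The difference between soft and hard attention can be made concrete with a small sketch. It uses only the Python standard library, and several parts are assumptions made for illustration: the `scores` list is a hypothetical stand-in for the raw output of the attention network, a sigmoid is used to squash those scores into [0, 1], and hard attention is obtained by thresholding at 0.5 (in practice hard masks are often sampled instead).

```python
import math


def sigmoid(v):
    """Squash a real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def soft_attention(scores, z):
    """Soft attention: a in [0, 1]^k, glimpse g = a * z element-wise."""
    a = [sigmoid(s) for s in scores]
    g = [a_i * z_i for a_i, z_i in zip(a, z)]
    return a, g


def hard_attention(scores, z, threshold=0.5):
    """Hard attention: a in {0, 1}^k (thresholding is an assumption here).

    Returns the mask, the masked glimpse g (same length as z), and the
    indexed glimpse g_tilde, which keeps only the selected features and
    therefore has length m <= k.
    """
    a = [1 if sigmoid(s) > threshold else 0 for s in scores]
    g = [a_i * z_i for a_i, z_i in zip(a, z)]
    g_tilde = [z_i for a_i, z_i in zip(a, z) if a_i == 1]
    return a, g, g_tilde


# Hypothetical values: z plays the role of the feature vector and
# scores the role of the attention network's pre-activation output.
z = [2.0, -1.0, 0.5, 3.0]
scores = [4.0, -4.0, 0.0, 1.0]

soft_a, soft_g = soft_attention(scores, z)
hard_a, hard_g, hard_g_tilde = hard_attention(scores, z)
```

With soft attention every feature still contributes, only scaled down, so the operation stays differentiable. With the hard mask, `hard_g_tilde` drops the unselected features entirely, which is the change in dimensionality described above.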

Types of Attention:
1. Visual Attention
2. Hard Attention
3. Soft Attention
4. Gaussian Attention

