It says: "we project the context feature map to a query feature map and a key feature map. We then take the dot product of the two feature maps and a softmax to obtain an attention matrix."
But in network.py, line 99, I only found "attention = self.att(inp)". This is what puzzles me: where do the query/key projections, the dot product, and the softmax actually happen?
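To make clear what I expected to see, here is a minimal sketch of the quoted description as I read it. It is written in plain NumPy rather than taken from the repository, and the function name `attention_matrix`, the weight names `w_q` and `w_k`, and all shapes are my own assumptions for illustration.

```python
import numpy as np

def attention_matrix(context, w_q, w_k):
    """Sketch of the quoted description (not the repository's code).

    context: (N, C) context feature map, flattened to N = H * W positions.
    w_q, w_k: (C, D) hypothetical projection weights for query and key.
    """
    query = context @ w_q    # project context to a query feature map, (N, D)
    key = context @ w_k      # project context to a key feature map, (N, D)
    scores = query @ key.T   # dot product between every pair of positions, (N, N)
    # Softmax over the last axis; subtracting the row max keeps exp() stable.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    return weights / weights.sum(axis=-1, keepdims=True)

# Toy input: 6 positions with 4 channels, projected down to 3 dimensions.
rng = np.random.default_rng(0)
context = rng.standard_normal((6, 4))
w_q = rng.standard_normal((4, 3))
w_k = rng.standard_normal((4, 3))
attention = attention_matrix(context, w_q, w_k)
```

Each row of `attention` sums to 1, so it can be read as one position's weights over all positions. My guess is that steps like these are wrapped inside whatever module `self.att` refers to, which would explain the single call at line 99, but I have not confirmed that.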