This is the code repository associated with the paper
"Abstractors and relational cross-attention: An inductive bias for explicit relational reasoning in Transformers" --- Awni Altabaa, Taylor Webb, Jonathan D. Cohen, John Lafferty.
This paper appears in ICLR 2024. The arXiv version is here: https://arxiv.org/abs/2304.00195. The project webpage contains a high-level summary of the paper and can be found here: https://awni00.github.io/abstractor.
The following is an outline of the repo:
- `abstracters.py` implements the main variant of the Abstractor module, which uses positional symbols.
- `symbol_retrieving_abstractor.py` implements an Abstractor with symbol retrieval via symbolic attention.
- `abstractor.py` is a 'simplified' implementation that avoids using TensorFlow's `MultiHeadAttention` layer.
- `autoregressive_abstractor.py` implements sequence-to-sequence Abstractor-based architectures.
- `seq2seq_abstracter_models.py` is an older, less general implementation of sequence-to-sequence models.
- `multi_head_attention.py` is a fork of TensorFlow's implementation of `MultiHeadAttention`, which we have adjusted to support different kinds of activation functions applied to the attention scores.
- `transformer_modules.py` includes implementations of different Transformer modules (e.g., Encoders and Decoders).
- `attention.py` implements the different attention mechanisms used by Transformers and Abstractors, including relational cross-attention (a minimal sketch is given below).
- The `experiments` directory contains the code for all experiments in the paper. See the READMEs therein for details on the experiments and instructions for replicating them.
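For orientation, here is a minimal, single-head sketch of the relational cross-attention idea: queries and keys are computed from the input objects, but the values are learned, input-independent symbols, so the output encodes relations between objects rather than their individual features. This is only an illustration under assumed names; the layer `RelationalCrossAttention` and its arguments are hypothetical, and the repository's actual multi-head, configurable-activation implementation lives in `attention.py` and `multi_head_attention.py`.

```python
import tensorflow as tf

class RelationalCrossAttention(tf.keras.layers.Layer):
    """Illustrative single-head sketch (not the repo's implementation):
    attention weights come from the input, values come from learned symbols."""

    def __init__(self, d_model, n_symbols, **kwargs):
        super().__init__(**kwargs)
        self.d_model = d_model
        self.n_symbols = n_symbols  # maximum number of objects supported

    def build(self, input_shape):
        d_in = int(input_shape[-1])
        self.w_q = self.add_weight(name="w_q", shape=(d_in, self.d_model),
                                   initializer="glorot_uniform")
        self.w_k = self.add_weight(name="w_k", shape=(d_in, self.d_model),
                                   initializer="glorot_uniform")
        # input-independent learned symbols (here, one positional symbol per object)
        self.symbols = self.add_weight(name="symbols",
                                       shape=(self.n_symbols, self.d_model),
                                       initializer="glorot_uniform")

    def call(self, x):
        # x: (batch, n_objects, d_in), with n_objects <= n_symbols
        n = tf.shape(x)[1]
        q = tf.matmul(x, self.w_q)                              # (batch, n, d_model)
        k = tf.matmul(x, self.w_k)                              # (batch, n, d_model)
        scores = tf.matmul(q, k, transpose_b=True)              # pairwise relation scores
        scores /= tf.math.sqrt(tf.cast(self.d_model, x.dtype))
        rel = tf.nn.softmax(scores, axis=-1)                    # relation weights
        # mix the learned symbols (not the inputs) by the relation weights
        return tf.einsum("bij,jd->bid", rel, self.symbols[:n])  # (batch, n, d_model)
```

For example, applying this layer with `d_model=64` to a batch of shape `(2, 10, 32)` would produce an output of shape `(2, 10, 64)`; the output depends on the inputs only through their pairwise relations, which is the inductive bias the paper describes.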