API Reference
L-GATr Networks
We provide two main L-GATr networks, LGATr as a stack of transformer encoders,
and ConditionalLGATr as a stack of transformer decoders.
For tasks that require conditional inputs, you can process the condition with an LGATr
and then include the processed condition through a ConditionalLGATr.
In addition, LGATrSlim and ConditionalLGATrSlim
provide more efficient versions of the respective networks that use only scalar and vector representations.
- LGATr: L-GATr network.
- ConditionalLGATr: Conditional L-GATr network.
- LGATrSlim: L-GATr-slim network.
- ConditionalLGATrSlim: Conditional L-GATr-slim network.
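The conditioning pattern described above (encode the condition with an LGATr, then feed it to a ConditionalLGATr) can be sketched as follows. The two classes below are hypothetical stand-ins that only mimic the data flow; the real networks operate on multivector tensors with attention layers.

```python
class LGATrStub:
    """Stand-in for LGATr: maps tokens to processed tokens of the same shape."""
    def __call__(self, x):
        return [[v * 2.0 for v in token] for token in x]  # placeholder transform

class ConditionalLGATrStub:
    """Stand-in for ConditionalLGATr: consumes inputs plus a processed condition."""
    def __call__(self, x, condition):
        # placeholder for cross-attention: add the mean condition token to each input token
        mean_cond = [sum(col) / len(condition) for col in zip(*condition)]
        return [[xi + ci for xi, ci in zip(token, mean_cond)] for token in x]

encoder = LGATrStub()
decoder = ConditionalLGATrStub()

condition = [[1.0, 0.0], [0.0, 1.0]]            # two condition tokens, two channels
inputs = [[0.5, 0.5], [1.0, -1.0], [0.0, 2.0]]  # three input tokens

processed_condition = encoder(condition)        # step 1: encode the condition
outputs = decoder(inputs, processed_condition)  # step 2: decode with the condition
```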
L-GATr Layers
The LGATr and ConditionalLGATr networks
have a structure similar to standard transformers. We construct them using variants of the standard
transformer layers adapted to the geometric algebra framework.
- L-GATr encoder block.
- L-GATr decoder block.
- Linear layer.
- L-GATr self-attention.
- L-GATr cross-attention.
- MLP with geometric product.
- Geometric product on multivectors.
- Gated nonlinearity on multivectors.
- Layer normalization.
- Dropout on multivectors.
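The gated nonlinearity is a good example of how these layers stay equivariant: each multivector channel is scaled by a scalar-valued gate, and multiplying a multivector by a Lorentz invariant commutes with the group action. A minimal sketch, assuming a 16-component multivector stored as a plain list with the scalar part at index 0 and a sigmoid gate on that scalar (the actual layer may use a different, learned gate):

```python
import math

def gated_nonlinearity(mv):
    """Scale a 16-component multivector by sigmoid(scalar part).

    Assumes component 0 is the scalar (grade-0) part; since the gate is a
    Lorentz invariant, the operation is equivariant.
    """
    gate = 1.0 / (1.0 + math.exp(-mv[0]))  # sigmoid of the invariant scalar
    return [gate * c for c in mv]

mv = [0.0] + [1.0] * 15       # scalar part 0, so the gate is sigmoid(0) = 0.5
out = gated_nonlinearity(mv)  # all components halved
```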
L-GATr Primitives
The L-GATr primitives implement the core equivariant operations and are called by the L-GATr layers.
- Equivariant attention.
- Geometric product.
- Grade dropout.
- Invariants, e.g. inner product, absolute squared norm, pin invariants.
- Linear operations on multivectors, in particular linear basis maps.
- Gated nonlinearities on multivectors.
- Multivector normalization.
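To illustrate the idea behind equivariant attention: the attention weights are built from invariant inner products, so the weighted sum of (covariant) values transforms correctly under the group. A minimal sketch on bare 4-vector tokens with the Minkowski metric, not the library's multivector implementation:

```python
import math

METRIC = [1.0, -1.0, -1.0, -1.0]  # Minkowski signature (+, -, -, -)

def minkowski_dot(p, q):
    """Lorentz-invariant inner product of two 4-vectors."""
    return sum(g * a * b for g, a, b in zip(METRIC, p, q))

def equivariant_attention(queries, keys, values):
    """Attention whose softmax weights are Lorentz invariants.

    Because the weights are invariant and the output is a linear combination
    of the covariant values, the whole map is equivariant.
    """
    outputs = []
    for q in queries:
        scores = [minkowski_dot(q, k) for k in keys]
        m = max(scores)                                # stabilize the softmax
        weights = [math.exp(s - m) for s in scores]
        z = sum(weights)
        weights = [w / z for w in weights]
        outputs.append([sum(w * v[i] for w, v in zip(weights, values))
                        for i in range(4)])
    return outputs

out = equivariant_attention(queries=[[1.0, 0.0, 0.0, 0.0]],
                            keys=[[1.0, 0.0, 0.0, 0.0]],
                            values=[[2.0, 3.0, 4.0, 5.0]])
```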
L-GATr Configuration Classes
L-GATr uses dataclass objects to organize less frequently tuned hyperparameters, such as the number of heads or the MLP nonlinearity.
The MLPConfig, SelfAttentionConfig and CrossAttentionConfig are arguments for the LGATr/ConditionalLGATr modules,
whereas the LGATrConfig is a global object that is accessed within the L-GATr primitives.
- LGATrConfig: Configuration for global settings like the symmetry group.
- SelfAttentionConfig: Configuration for self-attention.
- CrossAttentionConfig: Configuration for cross-attention.
- MLPConfig: Geometric MLP configuration.
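A sketch of how such config dataclasses are typically built and passed to the networks. The field names below are hypothetical placeholders; consult the library source for the actual attributes of each config class.

```python
from dataclasses import dataclass

@dataclass
class SelfAttentionConfigSketch:
    """Hypothetical stand-in for SelfAttentionConfig."""
    num_heads: int = 8
    dropout_prob: float = 0.0

@dataclass
class MLPConfigSketch:
    """Hypothetical stand-in for MLPConfig."""
    activation: str = "gelu"

attn_cfg = SelfAttentionConfigSketch(num_heads=4)
mlp_cfg = MLPConfigSketch(activation="relu")
# These objects would then be handed to LGATr / ConditionalLGATr as arguments,
# while the global config is read directly by the primitives.
```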
Interface to the Geometric Algebra
Before we feed data into L-GATr networks, and after we extract results, we have to convert between common scalar/vector objects and multivectors. This step is simple, but we still provide convenience methods for it. We also include functionality to construct spurions, or reference multivectors, which can be added as extra items or channels to break equivariance at the input level.
- Embedding scalars into multivectors and extracting them again.
- Embedding vectors into multivectors and extracting them again.
- Embedding pseudoscalars into multivectors and extracting them again.
- Embedding axial vectors into multivectors and extracting them again.
- Tools to include reference multivectors ('spurions') for symmetry breaking.
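The vector embedding can be sketched as follows. This assumes a particular basis ordering of the 16 multivector components (scalar, four vector slots, six bivectors, four axial vectors, pseudoscalar); the library's actual ordering and function names may differ.

```python
def embed_vector(v):
    """Embed a four-vector into a 16-component multivector.

    Assumed layout: index 0 is the scalar, indices 1-4 hold the grade-1
    (vector) components, the remaining grades are zero.
    """
    mv = [0.0] * 16
    mv[1:5] = v
    return mv

def extract_vector(mv):
    """Read the four-vector (grade-1) components back out of a multivector."""
    return mv[1:5]

p = [5.0, 1.0, 2.0, 3.0]   # e.g. a particle four-momentum (E, px, py, pz)
mv = embed_vector(p)
assert extract_vector(mv) == p  # embedding and extraction are inverse maps
```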
L-GATr-slim Layers
In addition to the full L-GATr network, we provide a slimmed-down version that uses only scalar and vector representations instead of full multivectors. This approach allows a more efficient implementation while achieving similar performance on all high-energy physics tasks we have tested so far.
- A single block of L-GATr-slim, consisting of self-attention and MLP layers with pre-norm and residual connections.
- Multi-layer perceptron (MLP) for vector and scalar features.
- Gated linear unit (GLU) for vector and scalar features.
- Linear operations for vector and scalar features.
- Joint normalization over vector and scalar features.
- Dropout module for scalar and vector features.
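The joint normalization over scalar and vector features can be sketched as dividing both feature types by one shared invariant scale. This is a minimal sketch under stated assumptions (no learned affine parameters, equal weighting of both feature types, absolute Minkowski squared norms for the vectors); the actual layer may differ in all of these choices.

```python
import math

METRIC = [1.0, -1.0, -1.0, -1.0]  # Minkowski signature (+, -, -, -)

def joint_norm(scalars, vectors, eps=1e-6):
    """Normalize scalar and vector channels by one shared invariant scale.

    The scale is built from scalar squares plus the absolute Minkowski
    squared norms of the vectors, so it is Lorentz invariant and the
    normalization preserves equivariance.
    """
    sq = sum(s * s for s in scalars)
    sq += sum(abs(sum(g * c * c for g, c in zip(METRIC, v))) for v in vectors)
    n = len(scalars) + len(vectors)
    scale = math.sqrt(sq / n) + eps
    return ([s / scale for s in scalars],
            [[c / scale for c in v] for v in vectors])

s_out, v_out = joint_norm([2.0], [[1.0, 0.0, 0.0, 0.0]])
```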