cuEquivariance Kernels

OF3 supports the cuEquivariance triangle_multiplicative_update and triangle_attention kernels, which can speed up both inference and training of the model.

Note: cuEquivariance acceleration can be used while DeepSpeed acceleration is enabled. cuEquivariance takes precedence and, for shapes it does not handle efficiently, falls back to DeepSpeed (if enabled) or to PyTorch. In particular, it falls back for shorter sequences (the threshold is controlled by the CUEQ_TRIATTN_FALLBACK_THRESHOLD environment variable) and for shapes with a hidden dimension > 128 (the diffusion transformer shapes).
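The precedence and fallback rules described above can be sketched in a few lines of Python. This is an illustration only, not OpenFold3's actual dispatch code: the function name pick_backend, the placeholder default threshold, and the exact comparison operators are assumptions. Only the environment variable name and the 128 hidden-dimension limit come from the text above.

```python
import os

# Placeholder only: the real default threshold is not stated in this document.
ASSUMED_DEFAULT_THRESHOLD = 100


def pick_backend(seq_len, hidden_dim, *, use_cueq=True, use_deepspeed=True,
                 threshold=None):
    """Return which backend a call would use under the rules described above.

    Hypothetical helper for illustration; not part of OpenFold3.
    """
    if threshold is None:
        # The variable name comes from the docs; reading it as an int is assumed.
        threshold = int(os.environ.get("CUEQ_TRIATTN_FALLBACK_THRESHOLD",
                                       ASSUMED_DEFAULT_THRESHOLD))

    # cuEquivariance takes precedence, except for short sequences and for
    # shapes with a hidden dimension above 128.
    # Assumption: "shorter" means seq_len <= threshold.
    if use_cueq and seq_len > threshold and hidden_dim <= 128:
        return "cuequivariance"

    # Otherwise fall back to DeepSpeed if it is enabled, else plain PyTorch.
    return "deepspeed" if use_deepspeed else "pytorch"
```

For example, with a threshold of 100, pick_backend(512, 32, threshold=100) selects the cuEquivariance path, while a sequence length of 50 or a hidden dimension of 256 selects the fallback.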

To enable, first install OpenFold3 with cuEquivariance:

pip install openfold3[cuequivariance]
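After installation, a quick way to confirm that the optional dependency is visible to your Python environment is to look it up without importing it. The sketch below assumes the extra installs a module importable as cuequivariance_torch; that module name is an assumption about what the openfold3[cuequivariance] extra pulls in, so adjust it if your environment differs.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(name) is not None


# Assumed module name; not confirmed by this document.
CUEQ_MODULE = "cuequivariance_torch"

if has_module(CUEQ_MODULE):
    print(f"{CUEQ_MODULE} found: cuEquivariance kernels can be enabled")
else:
    print(f"{CUEQ_MODULE} not found: try `pip install openfold3[cuequivariance]`")
```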

Then, to enable these kernels via the runner.yaml, add the following:

model_update:
  presets: 
    - "predict"
    - "low_mem"  # for lower memory systems
  custom:
    settings:
      memory:
        eval:
          use_cueq_triangle_kernels: true
          use_deepspeed_evo_attention: true  # set this to false to use cuEquivariance only

The configuration above applies to inference; similar settings can be used for training.
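If you switch between the cuEquivariance-only and cuEquivariance-plus-DeepSpeed variants often, the runner.yaml fragment can be generated rather than edited by hand. The helper below is a hypothetical convenience, not part of OpenFold3; it uses only the standard library and reproduces the inference keys shown above.

```python
def render_runner_yaml(use_cueq: bool = True,
                       use_deepspeed: bool = True,
                       low_mem: bool = False) -> str:
    """Render the model_update fragment for inference as YAML text.

    Hypothetical helper; the key names are copied from the example above.
    """
    presets = ["predict"] + (["low_mem"] if low_mem else [])
    lines = ["model_update:", "  presets:"]
    lines += [f'    - "{preset}"' for preset in presets]
    lines += [
        "  custom:",
        "    settings:",
        "      memory:",
        "        eval:",
        # YAML booleans are written in lowercase: true / false.
        f"          use_cueq_triangle_kernels: {str(use_cueq).lower()}",
        f"          use_deepspeed_evo_attention: {str(use_deepspeed).lower()}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # cuEquivariance only, on a lower-memory system.
    print(render_runner_yaml(use_deepspeed=False, low_mem=True))
```

Write the returned string to your runner.yaml, or merge it into an existing file, before launching a prediction.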