Anthrogen Introduces Odyssey: A 102B Parameter Protein Language Model that Replaces Attention with Consensus and Trains with Discrete Diffusion

Anthrogen has launched Odyssey, a family of protein language models for sequence and structure generation, protein editing, and conditional design. The production models range from 1.2B to 102B parameters. Anthrogen's research team positions Odyssey as a frontier, multimodal model for real protein design workloads, and notes that an API is in early access.

What problem does Odyssey target?
Protein design couples amino acid sequence with 3D structure and with functional context. Many prior models adopt self-attention, which mixes information across the entire sequence at once. Proteins obey geometric constraints, so long-range effects travel through local neighborhoods in 3D. Anthrogen frames this as a locality problem and proposes a new propagation rule, called Consensus, that better matches the domain.

Input representation and tokenization
Odyssey is multimodal. It embeds sequence tokens, structure tokens, and lightweight functional cues, then fuses them into a shared representation. For structure, Odyssey uses a finite scalar quantizer, FSQ, to convert 3D geometry into compact tokens. Think of FSQ as an alphabet for shapes that lets the model read structure as easily as sequence. Functional cues can include domain tags, secondary structure hints, orthologous group labels, or short text descriptors. This joint view gives the model access to local sequence patterns and long-range geometric relations in a single latent space.
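The article does not publish Odyssey's FSQ configuration, but the general mechanism is simple to sketch. In the toy Python below, the level counts, the tanh squash, and the 3-dimensional latent are illustrative assumptions, not Odyssey's actual settings: each dimension of a structure latent is snapped to a small fixed grid, and the per-dimension indices are packed into a single token id.

```python
import numpy as np

def fsq_quantize(z, levels=(8, 8, 8)):
    """Finite scalar quantization sketch: round each latent dimension to one of
    `levels[i]` evenly spaced values in [-1, 1], then pack the per-dimension
    indices into one integer token (mixed-radix encoding)."""
    z = np.tanh(np.asarray(z, dtype=float))  # squash each dimension into [-1, 1]
    idx = []
    for zi, L in zip(z, levels):
        i = int(round((zi + 1) / 2 * (L - 1)))  # map [-1, 1] -> {0, ..., L-1}
        idx.append(min(max(i, 0), L - 1))
    token = 0
    for i, L in zip(idx, levels):
        token = token * L + i
    return token

# A 3-D structure latent becomes one token from a vocabulary of 8*8*8 = 512 ids.
print(fsq_quantize([0.3, -0.9, 0.05]))
```

Because the codebook is an implicit grid rather than learned vectors, FSQ avoids the codebook-collapse issues of VQ-style tokenizers, which is one reason it is attractive for structure tokenization.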

Backbone change: Consensus instead of self-attention
Consensus replaces global self-attention with iterative, locality-aware updates on a sparse contact or sequence graph. Each layer encourages nearby neighborhoods to agree first, then spreads that agreement outward across the chain and contact graph. This change alters compute. Self-attention scales as O(L²) with sequence length L. Anthrogen reports that Consensus scales as O(L), which keeps long sequences and multi-domain constructs affordable. The company also reports improved robustness to learning-rate choices at larger scales, which reduces brittle runs and restarts.
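Anthrogen has not released the internals of Consensus, so the following is only a minimal sketch of the idea as described: each position updates toward agreement with a fixed set of graph neighbors, so a layer costs O(L·k) for neighbor count k, linear in L rather than quadratic. The specific update rule, the mixing weight, and the toy chain graph are assumptions for illustration.

```python
import numpy as np

def consensus_layer(h, neighbors, alpha=0.5):
    """One locality-aware update: each position moves toward the mean of its
    neighbors' states. With a fixed neighbor count k, the cost is O(L * k),
    i.e. linear in sequence length L (vs O(L^2) for full self-attention)."""
    out = np.empty_like(h)
    for i in range(h.shape[0]):
        local_mean = h[neighbors[i]].mean(axis=0)
        out[i] = (1 - alpha) * h[i] + alpha * local_mean
    return out

# Toy chain graph: each residue sees itself and its sequence neighbors;
# a real contact graph would also link residues that are close in 3D.
L, d = 6, 4
h = np.random.default_rng(0).normal(size=(L, d))
neighbors = [[max(i - 1, 0), i, min(i + 1, L - 1)] for i in range(L)]
for _ in range(3):  # stacking layers spreads local agreement outward
    h = consensus_layer(h, neighbors)
```

Stacking such layers lets long-range information travel through chains of local neighborhoods, which matches the article's framing that long-range effects in proteins propagate through 3D-local interactions.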

Training objective and generation: discrete diffusion
Odyssey trains with discrete diffusion on sequence and structure tokens. The forward process applies masking noise that mimics mutation. The reverse-time denoiser learns to reconstruct consistent sequence and coordinates that work together. At inference, the same reverse process supports conditional generation and editing. You can hold a scaffold, fix a motif, mask a loop, add a functional tag, and then let the model complete the rest while keeping sequence and structure in sync.
Anthrogen reports matched comparisons where diffusion outperforms masked language modeling across evaluation. The page notes lower training perplexities for diffusion versus complex masking, and lower or comparable training perplexities versus simple masking. In validation, diffusion models outperform their masked counterparts, while a 1.2B masked model tends to overfit to its own masking schedule. The company argues that diffusion models the joint distribution of the full protein, which aligns with sequence-plus-structure co-design.
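As a rough illustration of masked discrete diffusion (not Anthrogen's implementation; the noise schedule, step count, and toy predictor are all assumptions), the forward process masks tokens at a noise level t, and the reverse process commits predictions for masked positions over several steps while positions that were never masked act as conditioning, such as a held scaffold or a fixed motif.

```python
import random

MASK = "<mask>"

def forward_mask(tokens, t, rng):
    """Forward diffusion: independently replace each token with MASK with
    probability t (t=0 leaves the sequence clean, t=1 masks everything)."""
    return [MASK if rng.random() < t else tok for tok in tokens]

def reverse_denoise(tokens, predict_fn, steps=4, rng=None):
    """Reverse-process sketch: over `steps` rounds, commit predictions for a
    growing fraction of the still-masked positions. Unmasked positions are
    never touched, so they condition the generation."""
    rng = rng or random.Random(0)
    tokens = list(tokens)
    for s in range(steps, 0, -1):
        masked = [i for i, tok in enumerate(tokens) if tok == MASK]
        n_commit = max(1, len(masked) // s) if masked else 0
        for i in rng.sample(masked, min(n_commit, len(masked))):
            tokens[i] = predict_fn(tokens, i)  # model fills this position
    return tokens

# Toy "model" that always predicts alanine; a real denoiser would condition
# on the full partially-masked sequence and structure tokens.
seq = list("MKTAYIAK")
noised = forward_mask(seq, t=0.5, rng=random.Random(1))
filled = reverse_denoise(noised, predict_fn=lambda toks, i: "A")
```

Editing workflows fall out of the same loop: masking only a loop region and running the reverse process regenerates that span while the rest of the protein stays fixed.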

Key takeaways
- Odyssey is a multimodal protein model family that fuses sequence, structure, and functional context, with production models at 1.2B, 8B, and 102B parameters.
- Consensus replaces self-attention with locality-aware propagation that scales as O(L) and shows robust learning-rate behavior at larger scales.
- FSQ converts 3D coordinates into discrete structure tokens for joint sequence and structure modeling.
- Discrete diffusion trains a reverse-time denoiser and, in matched comparisons, outperforms masked language modeling across evaluation.
- Anthrogen reports better performance with about 10x less data than competing models, which addresses data scarcity in protein modeling.
Editorial Comments
Odyssey is an impressive model because it operationalizes joint sequence and structure modeling with FSQ, Consensus, and discrete diffusion, enabling conditional design and editing under practical constraints. Odyssey scales to 102B parameters with O(L) complexity for Consensus, which lowers cost for long proteins and improves learning-rate robustness. Anthrogen reports diffusion outperforming masked language modeling in matched evaluations, which aligns with co-design objectives. The system targets multi-objective design, including efficiency, specificity, stability, and manufacturability. The research team emphasizes data efficiency near 10x versus competing models, which is material in domains with scarce labeled data.
Check out the Paper and technical details.
The post Anthrogen Introduces Odyssey: A 102B Parameter Protein Language Model that Replaces Attention with Consensus and Trains with Discrete Diffusion appeared first on MarkTechPost.