
Google AI Releases C2S-Scale 27B Model that Translates Complex Single-Cell Gene Expression Data into ‘cell sentences’ that LLMs can Understand

A team of researchers from Google Research, Google DeepMind, and Yale has released C2S-Scale 27B, a 27-billion-parameter foundation model for single-cell analysis built on Gemma-2. The model formalizes single-cell RNA-seq (scRNA-seq) profiles as “cell sentences” (ordered lists of gene symbols), so that a language model can natively parse and reason over cellular states. Beyond benchmarking gains, the research team reports an experimentally validated, context-dependent pathway: CK2 inhibition (silmitasertib/CX-4945) combined with low-dose interferon amplifies antigen presentation, a mechanism that could make “cold” tumors more responsive to immunotherapy. The result is an ~50% increase in antigen presentation in vitro under the combined condition.

Understanding the model

C2S-Scale converts a high-dimensional expression vector into text by rank-ordering genes and emitting the top-K symbols as a gene-name sequence. This representation aligns single-cell data with standard LLM toolchains and allows tasks such as cell-type prediction, tissue classification, cluster captioning, perturbation prediction, and biological QA to be phrased as text prompts and completions.

https://github.com/vandijklab/cell2sentence
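
Conceptually, the conversion is a rank-and-truncate step. The following is a minimal sketch in plain NumPy, illustrative only and not the cell2sentence package’s actual API:

```python
import numpy as np

def cell_to_sentence(expression: np.ndarray, gene_names: list[str], top_k: int = 100) -> str:
    """Rank genes by expression (descending) and emit the top-K symbols as a 'cell sentence'."""
    order = np.argsort(expression)[::-1]                               # gene indices, highest expression first
    top = [gene_names[i] for i in order[:top_k] if expression[i] > 0]  # keep the K most-expressed, drop zero counts
    return " ".join(top)

# Toy example: three genes measured in one cell
genes = ["CD3D", "MS4A1", "NKG7"]
counts = np.array([5.0, 0.0, 12.0])
print(cell_to_sentence(counts, genes, top_k=2))  # -> "NKG7 CD3D"
```

The resulting gene-symbol string can then be dropped into ordinary text prompts, which is what lets downstream tasks reuse standard LLM tooling.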

Training data, stack, and release

C2S-Scale-Gemma-2-27B is built on Gemma-2 27B (decoder-only Transformer), trained on Google TPU v5, and released under CC-BY-4.0. The training corpus aggregates >800 public scRNA-seq datasets spanning >57M cells (human and mouse) with associated metadata and textual context; pretraining unifies transcriptomic tokens and biological text into a single multimodal corpus.
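
Because the weights are open, the model can be loaded like any other causal LM. The sketch below uses Hugging Face transformers; the repository ID and prompt wording are illustrative assumptions, so consult the vandijklab model card for the exact identifiers (a 2B variant is also published).

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "vandijklab/C2S-Scale-Gemma-2-27B"  # assumed repo ID; verify on the Hugging Face model card

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 27B weights require multi-GPU or offloading at this precision
    device_map="auto",
)

# Illustrative prompt: ask for a cell-type prediction from a (truncated) cell sentence.
prompt = "Predict the cell type of the following cell sentence: MALAT1 B2M TMSB4X ACTB HLA-B ..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```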

The key result: an interferon-conditional amplifier

The research team built a dual-context virtual screen over >4,000 drugs to find compounds that boost antigen presentation (MHC-I program) only in immune-context-positive settings (i.e., primary patient samples with low interferon tone) while having negligible effect in immune-context-neutral cell-line data. The model predicted a striking context split for silmitasertib (a CK2 inhibitor): strong MHC-I upregulation with low-dose interferon, little to none without interferon. The research team reports in-lab validation in human neuroendocrine models unseen during training, with the combination (silmitasertib + low-dose interferon) producing a marked, synergistic increase in antigen presentation (≈50% in their assays).
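
The screening logic itself reduces to a two-context comparison per compound. The sketch below is a conceptual illustration, not the team’s released pipeline; the `predict_mhc1_shift` helper and the context labels are hypothetical stand-ins for prompting the model and scoring its predicted MHC-I response.

```python
def dual_context_screen(compounds, predict_mhc1_shift, min_split=0.5):
    """Keep compounds predicted to boost MHC-I only in the immune-context-positive setting."""
    hits = []
    for drug in compounds:
        boost_with_ifn = predict_mhc1_shift(drug, context="primary_sample_low_interferon")
        boost_neutral = predict_mhc1_shift(drug, context="cell_line_immune_neutral")
        split = boost_with_ifn - boost_neutral
        if boost_with_ifn > 0 and split >= min_split:  # effect present only in the immune-positive context
            hits.append((drug, boost_with_ifn, boost_neutral, split))
    # Rank candidates by the size of the context split, largest first
    return sorted(hits, key=lambda h: h[3], reverse=True)
```

Under this framing, silmitasertib is the kind of candidate that survives the filter: a large predicted effect in the interferon-primed context and essentially none in the neutral one.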

The amplifier lowers the response threshold to interferon rather than initiating antigen presentation de novo; flow-cytometry readouts show HLA-A,B,C upregulation only under combined treatment (including IFN-β and IFN-γ), across two neuroendocrine models, with representative MFI gains (e.g., 13.6% @10 nM and 34.9% @1000 nM silmitasertib in one model).

Key Takeaways

  • C2S-Scale 27B (Gemma-2) encodes scRNA-seq profiles as textual “cell sentences,” enabling LLM-native single-cell analysis workflows.
  • In a two-context virtual screen (>4,000 compounds), the model predicted an interferon-conditional amplifier: CK2 inhibition (silmitasertib) boosts MHC-I antigen presentation only with low-dose IFN.
  • Wet-lab tests in human neuroendocrine cell models confirmed the prediction, with an ~50% antigen-presentation increase for silmitasertib + IFN versus either alone; this remains preclinical/in vitro.
  • Open weights and usage docs are live on Hugging Face (vandijklab), with both 27B and 2B Gemma variants for research use.

Editorial Comments

C2S-Scale 27B is a technically credible step for LLMs in biology: translating scRNA-seq into “cell sentences” lets a Gemma-2 model run programmatic queries over cell states and perturbations, and in practice it surfaced an interferon-conditional amplifier, silmitasertib (CK2 inhibition), that increases MHC-I antigen presentation only with low-dose IFN, a mechanism the team then validated in vitro. The value here isn’t headline rhetoric but the workflow: text-native screening across >4k compounds under dual immune contexts to propose a context-dependent pathway that could convert immune-“cold” tumors toward visibility. That said, all evidence is preclinical and bench-scale; the right read is “hypothesis-generating AI” with open weights enabling replication and stress-testing, not a clinical claim.


Check out the Technical Paper, Model on Hugging Face, and GitHub Page for further technical details.
