
Google AI Research Introduces a Novel Machine Learning Approach that Transforms TimesFM into a Few-Shot Learner

Google Research introduces in-context fine-tuning (ICF) for time-series forecasting, referred to as TimesFM-ICF: a continued-pretraining recipe that teaches TimesFM to use a handful of related series supplied directly in the prompt at inference time. The result is a few-shot forecaster that matches supervised fine-tuning while delivering +6.8% accuracy over the base TimesFM across an out-of-domain (OOD) benchmark, with no per-dataset training loop required.

What pain point in forecasting is being eliminated?

Most production workflows still trade off between (a) one model per dataset via supervised fine-tuning (accurate, but heavy on MLOps) and (b) zero-shot foundation models (simple, but not domain-adapted). Google's new approach keeps a single pre-trained TimesFM checkpoint but lets it adapt on the fly using a handful of in-context examples from related series at inference, avoiding per-tenant training pipelines.

How does in-context fine-tuning work under the hood?

Start with TimesFM, a patched, decoder-only transformer that tokenizes 32-point input patches and de-tokenizes 128-point outputs via a shared MLP, and continue pre-training it on sequences that interleave the target history with multiple "support" series. The key change is a learnable common separator token, so cross-example causal attention can mine structure across examples without conflating their trends. The training objective remains next-token prediction; what's new is the context construction, which teaches the model to reason across multiple related series at inference time.
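The post does not publish the exact data pipeline, so the following is only a minimal sketch of what the context construction could look like. The names `to_patches`, `build_icf_sequence`, and the `SEP` placeholder are illustrative, not the actual TimesFM API; only the 32-point input patching and the separator-token idea come from the description above.

```python
import numpy as np

PATCH_LEN = 32   # TimesFM tokenizes 32-point input patches (per the description above)
SEP = "<sep>"    # stand-in for the learnable common separator token

def to_patches(series: np.ndarray, patch_len: int = PATCH_LEN) -> list:
    """Split a 1-D series into fixed-length patches; tail handling is omitted for brevity."""
    usable = (len(series) // patch_len) * patch_len
    return [series[i:i + patch_len] for i in range(0, usable, patch_len)]

def build_icf_sequence(target_history: np.ndarray, support_series: list) -> list:
    """Interleave support examples with the target history, inserting a separator after each
    support series so causal attention can use them without conflating their trends."""
    sequence = []
    for support in support_series:
        sequence.extend(to_patches(support))
        sequence.append(SEP)
    sequence.extend(to_patches(target_history))  # the model then decodes 128-point output patches
    return sequence
```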

https://research.google/blog/time-series-foundation-models-can-be-few-shot-learners/

What exactly is "few-shot" here?

At inference, the user concatenates the target history with k additional time-series snippets (e.g., comparable SKUs, adjacent sensors), each delimited by the separator token. The model's attention layers are now explicitly trained to leverage these in-context examples, analogous to LLM few-shot prompting, but over numeric sequences rather than text tokens. This shifts adaptation from parameter updates to prompt engineering over structured series.
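Reusing the hypothetical `build_icf_sequence` sketch from above, a few-shot call might be assembled as follows; `timesfm_icf.decode` is a placeholder, not a documented API.

```python
import numpy as np

# Target history plus k = 3 support snippets from related series (synthetic data for illustration).
target_history = np.sin(np.linspace(0, 20, 256))
support_series = [np.sin(np.linspace(0, 20, 256) + phase) for phase in (0.3, 0.6, 0.9)]

prompt = build_icf_sequence(target_history, support_series)
# forecast = timesfm_icf.decode(prompt, horizon=128)  # placeholder call; adaptation happens in-context
```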

Does it really match supervised fine-tuning?

On a 23-dataset out-of-domain suite, TimesFM-ICF equals the performance of per-dataset TimesFM-FT while being 6.8% more accurate than TimesFM-Base (geometric mean of scaled MASE). The blog also shows the expected accuracy-latency trade-off: more in-context examples yield better forecasts at the cost of longer inference. A "just make the context longer" ablation indicates that structured in-context examples beat naive long context alone.
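For reference, MASE scales a forecast's mean absolute error by the in-sample error of a naive (seasonal) forecast, and the headline number aggregates per-dataset scores with a geometric mean. Here is a small sketch assuming that convention; the exact scaling used in the benchmark is not spelled out in the post.

```python
import numpy as np

def mase(y_true: np.ndarray, y_pred: np.ndarray, y_train: np.ndarray, season: int = 1) -> float:
    """Mean Absolute Scaled Error: forecast MAE divided by the in-sample MAE
    of a seasonal-naive forecast on the training portion of the series."""
    naive_mae = np.mean(np.abs(y_train[season:] - y_train[:-season]))
    return float(np.mean(np.abs(y_true - y_pred)) / naive_mae)

def geometric_mean(scores: np.ndarray) -> float:
    """Aggregate per-dataset scores; a geometric mean keeps one outlier dataset from dominating."""
    return float(np.exp(np.mean(np.log(scores))))

# Per-dataset scores would then be scaled (e.g., against a baseline's MASE on the same dataset,
# an assumed convention) and summarized as geometric_mean(scaled_scores) across the 23 datasets.
```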

How is this different from Chronos-style approaches?

Chronos tokenizes values into a discrete vocabulary and demonstrated strong zero-shot accuracy along with fast variants (e.g., Chronos-Bolt). Google's contribution here is not another tokenizer or more zero-shot headroom; it is making a time-series foundation model behave like an LLM few-shot learner, learning from cross-series context at inference. That capability closes the gap between "train-time adaptation" and "prompt-time adaptation" for numeric forecasting.

What are the architectural specifics to look at?

The research team highlights: (1) separator tokens to mark boundaries, (2) causal self-attention over mixed histories and examples, (3) continued patching and shared MLP heads, and (4) continued pre-training to instill cross-example behavior. Together, these let the model treat support series as informative exemplars rather than background noise.
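To make point (2) concrete, here is a minimal illustration under an assumed layout (not the released code): with the support examples placed before the target history, ordinary causal masking already lets every target token attend to all support tokens, and the separator positions keep example boundaries explicit.

```python
import numpy as np

def causal_mask_with_separators(support_lens: list, target_len: int):
    """Build a plain causal mask over [support_1, <sep>, ..., support_k, <sep>, target_history]
    and a boolean vector marking where the separator tokens sit."""
    total = sum(support_lens) + len(support_lens) + target_len
    mask = np.tril(np.ones((total, total), dtype=bool))  # position i attends to positions <= i
    sep_positions = np.zeros(total, dtype=bool)
    cursor = 0
    for length in support_lens:
        cursor += length
        sep_positions[cursor] = True                      # separator right after each support block
        cursor += 1
    return mask, sep_positions

mask, seps = causal_mask_with_separators(support_lens=[8, 8], target_len=16)
```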

Summary

Google's in-context fine-tuning turns TimesFM into a practical few-shot forecaster: a single pretrained checkpoint that adapts at inference via curated support series, delivering fine-tuning-level accuracy without per-dataset training overhead. That makes it useful for multi-tenant, latency-bounded deployments where selection of support sets becomes the main control surface.


FAQs

1) What is Google's "in-context fine-tuning" (ICF) for time series?
ICF is continued pre-training that conditions TimesFM to use multiple related series placed in the prompt at inference, enabling few-shot adaptation without per-dataset gradient updates.

2) How does ICF differ from standard fine-tuning and zero-shot use?
Standard fine-tuning updates weights per dataset; zero-shot uses a fixed model with only the target history. ICF keeps weights fixed at deployment but learns during pre-training how to leverage additional in-context examples, matching per-dataset fine-tuning on the reported benchmarks.

3) What architectural or training changes were introduced?
TimesFM undergoes continued pre-training on sequences that interleave target history and support series, separated by special boundary tokens so causal attention can exploit cross-series structure; the rest of the decoder-only TimesFM stack remains intact.

4) What do the results show relative to baselines?
On out-of-domain suites, ICF improves over the TimesFM base model and reaches parity with supervised fine-tuning; it is evaluated against strong time-series baselines (e.g., PatchTST) and prior foundation models (e.g., Chronos).

