Meet MemPrivacy: An Edge-Cloud Framework that Uses Local Reversible Pseudonymization to Protect User Data Without Breaking Memory Utility

As LLM-powered agents move from research to production, one design tension is becoming harder to ignore: the more useful cloud-hosted memory becomes, the more private user data it exposes. Researchers from MemTensor (Shanghai), HONOR Device, and Tongji University have introduced MemPrivacy, a framework that attempts to resolve this tension without sacrificing the utility that makes personalized memory valuable in the first place.

The Core Problem With Cloud Memory

When you interact with an AI agent, your conversation often contains sensitive details such as health conditions, email addresses, financial figures, and passwords. In a typical edge-cloud deployment, the user's device (the edge) handles input, while computation-heavy memory management and reasoning happen in the cloud. This architecture is efficient, but it means raw, unfiltered user data travels to and persists in cloud systems.

The risk isn't theoretical. Prior studies show that multi-turn memory attacks can induce privacy violations with success rates of up to 69%, and leakage attacks against memory systems can reach 75% success. Indirect prompt injection can even manipulate agents into actively eliciting private information from users. Once sensitive content enters cloud logs, vector databases, or external memory stores, it can remain accessible through subsequent storage, retrieval, and reuse stages well beyond the original interaction.

Prior work has tried to address this with masking, that is, replacing sensitive values with tokens like ***. The problem is that masking destroys semantics. If a user asks an agent to draft a doctor's email and their blood pressure reading and email address are both replaced with ***, the cloud model cannot complete the task meaningfully. More principled methods such as differential privacy and cryptographic protection offer stronger guarantees but are difficult to integrate into interactive memory pipelines without degrading response quality.

What MemPrivacy Does Differently

Rather than masking private content, MemPrivacy replaces it with typed placeholders (structured tokens like <Health_Info_1> or <Email_1>) before the input leaves the local device. The cloud model receives semantically intact text and can reason and store memories normally; it just never sees the actual values. When the cloud returns a response containing placeholders, the local device looks up the originals in a secure local database and substitutes them back in. The user sees a fully coherent, personalized response.

This design is called local reversible pseudonymization, and the full pipeline operates in three stages.
Stage 1 (Uplink Desensitization): A lightweight on-device model identifies privacy-sensitive spans in the input, classifies each by type and sensitivity level, and replaces them with typed placeholders. The original-to-placeholder mappings are stored locally and persist across sessions, so the same value always gets the same placeholder.
Stage 2 (Cloud Processing): The sanitized input is sent to the cloud agent or memory system. The typed placeholders preserve enough semantic structure for memory formation and retrieval to function correctly.
Stage 3 (Downlink Restoration): The cloud response, which may contain placeholders, is restored locally via a lightweight database lookup and string substitution, adding negligible latency.
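The round trip described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the detected spans are supplied by hand rather than by the on-device model, and `PlaceholderMap`, `desensitize`, and `restore` are hypothetical names introduced here.

```python
import re
from dataclasses import dataclass, field


@dataclass
class PlaceholderMap:
    """In-memory stand-in for the local mapping store (hypothetical helper)."""
    value_to_token: dict = field(default_factory=dict)
    token_to_value: dict = field(default_factory=dict)
    counters: dict = field(default_factory=dict)

    def token_for(self, value: str, privacy_type: str) -> str:
        # Reuse an existing placeholder so the same value always maps to the same token.
        key = (privacy_type, value)
        if key not in self.value_to_token:
            index = self.counters.get(privacy_type, 0) + 1
            self.counters[privacy_type] = index
            token = f"<{privacy_type}_{index}>"
            self.value_to_token[key] = token
            self.token_to_value[token] = value
        return self.value_to_token[key]


def desensitize(text: str, spans: list, mapping: PlaceholderMap) -> str:
    """Stage 1: replace each detected (value, type) span with a typed placeholder."""
    # Assign tokens in order of appearance, then substitute longer values first
    # so a short value cannot overwrite part of a longer one.
    pairs = [(value, mapping.token_for(value, ptype)) for value, ptype in spans]
    for value, token in sorted(pairs, key=lambda p: len(p[0]), reverse=True):
        text = text.replace(value, token)
    return text


PLACEHOLDER = re.compile(r"<[A-Za-z_]+_\d+>")


def restore(text: str, mapping: PlaceholderMap) -> str:
    """Stage 3: substitute known placeholders back; unknown tokens are left as-is."""
    return PLACEHOLDER.sub(
        lambda m: mapping.token_to_value.get(m.group(0), m.group(0)), text
    )


mapping = PlaceholderMap()
spans = [("160/110", "Health_Info"), ("RC-7291", "Recovery_Code")]  # assumed detector output
uplink = desensitize(
    "My blood pressure today was 160/110. Never mention my recovery code RC-7291.",
    spans,
    mapping,
)
# Stage 2 would send `uplink` to the cloud; here the reply is simulated.
cloud_reply = "Noted: your reading of <Health_Info_1> is recorded."
print(uplink)
print(restore(cloud_reply, mapping))
```

Only `uplink` would cross the network in this sketch; the mapping object stays on the device, which is what makes the pseudonymization reversible locally but opaque to the cloud.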

A Four-Level Privacy Taxonomy

A key contribution by the research team is a four-level privacy taxonomy (PL1–PL4) that defines what gets protected and at what threshold:

PL1 covers general preferences, habits, and stylistic choices that don't identify a person and carry low risk. These are not protected by default.
PL2 includes identifiable PII: real names, phone numbers, email addresses, detailed addresses, account usernames, and combinations that can identify or trace a specific individual.
PL3 covers highly sensitive PII: government document numbers, financial account details, health records, precise location and trajectory data, biometrics, raw communication content, and sensitive identity attributes such as religious beliefs or ethnicity.
PL4 is the highest tier: credentials and secrets that are directly exploitable, including passwords, PINs, verification codes, session tokens, API keys, private keys, seed phrases, and undisclosed business materials. Exposure at this level can directly result in account takeover, financial loss, or large-scale data exfiltration.

Users can configure the masking threshold, for example protecting only PL3 and PL4 or applying full protection across PL2–PL4, which gives granular control over the privacy-utility trade-off.
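As a rough illustration of how such a threshold might be applied, the sketch below orders the four levels and keeps only spans at or above the configured one. The `PrivacyLevel` enum, the `spans_to_mask` function, and the dictionary layout of a detected span are assumptions made for this example, not the project's actual data model.

```python
from enum import IntEnum


class PrivacyLevel(IntEnum):
    """The four taxonomy levels, ordered so they can be compared."""
    PL1 = 1
    PL2 = 2
    PL3 = 3
    PL4 = 4


def spans_to_mask(spans: list, threshold: PrivacyLevel) -> list:
    """Return only the spans whose level is at or above the threshold."""
    return [s for s in spans if PrivacyLevel[s["level"]] >= threshold]


# Hypothetical detector output for one message.
detected = [
    {"text": "spicy food", "level": "PL1", "type": "Preference"},
    {"text": "Jane Doe", "level": "PL2", "type": "Name"},
    {"text": "160/110", "level": "PL3", "type": "Health_Info"},
    {"text": "RC-7291", "level": "PL4", "type": "Recovery_Code"},
]

# Protect only PL3 and PL4: the name and the food preference pass through.
print([s["text"] for s in spans_to_mask(detected, PrivacyLevel.PL3)])
# Full protection across PL2-PL4: only the PL1 preference passes through.
print([s["text"] for s in spans_to_mask(detected, PrivacyLevel.PL2)])
```

Using an ordered enum keeps the comparison explicit and avoids string comparisons such as `"PL3" >= "PL2"`, which happen to work here but would be fragile.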

MemPrivacy-Bench and Model Training

To train and evaluate their approach, the research team built MemPrivacy-Bench, a dataset covering 200 synthetic user profiles and over 155,000 privacy instances (125,776 training, 29,967 test) across balanced Chinese and English dialogue, spanning 7 high-level scenario categories and 23 fine-grained subcategories. The test set contains 615 question-answer pairs across six memory task types: basic memory, temporal reasoning, adversarial questioning, dynamic updating, implicit inference, and information aggregation. Annotations were first generated by a dual-model pipeline using Gemini-3.1-Pro and GPT-5.2, then verified by six human annotators, reaching a final annotation accuracy of 98.08%.

The MemPrivacy extraction models are fine-tuned from Qwen3 base models at 0.6B, 1.7B, and 4B parameter scales using supervised fine-tuning (SFT) followed by reinforcement learning with Group Relative Policy Optimization (GRPO). GRPO estimates advantages from relative rewards across multiple sampled outputs per input, using F1 score as the reward signal, which avoids the computational overhead of a separately trained critic. The 200 user profiles are split into 160 users for training and 40 users for testing.
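To make the group-relative idea concrete, the sketch below scores several sampled extractions for one input with an exact-match span F1 and then normalizes each reward against its own group. It follows the commonly used GRPO normalization (reward minus group mean, divided by group standard deviation); the exact reward shaping, matching rule, and normalization used by the MemPrivacy authors are not specified here, so `span_f1` and `group_relative_advantages` should be read as assumptions.

```python
from statistics import mean, pstdev


def span_f1(predicted, gold) -> float:
    """Exact-match F1 between a predicted and a gold set of (text, type) spans."""
    predicted, gold = set(predicted), set(gold)
    if not predicted and not gold:
        return 1.0  # nothing to find and nothing predicted
    true_pos = len(predicted & gold)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(predicted)
    recall = true_pos / len(gold)
    return 2 * precision * recall / (precision + recall)


def group_relative_advantages(rewards, eps: float = 1e-8):
    """Normalize each reward against its sampling group; no learned critic is needed."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std over the group (an assumption)
    return [(r - mu) / (sigma + eps) for r in rewards]


gold = [("160/110", "Health_Info"), ("RC-7291", "Recovery_Code")]
samples = [
    [("160/110", "Health_Info"), ("RC-7291", "Recovery_Code")],  # both spans found
    [("160/110", "Health_Info")],                                # one span missed
    [],                                                          # nothing found
]
rewards = [span_f1(s, gold) for s in samples]
advantages = group_relative_advantages(rewards)
print(rewards)
print(advantages)
```

Outputs that beat the group average receive a positive advantage and are reinforced, while below-average outputs are pushed down, so the baseline comes from the group itself rather than from a value network.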

Experimental Results

On MemPrivacy-Bench, the best-performing model, MemPrivacy-4B-RL, achieves an F1 score of 85.97%, compared to 78.41% for Gemini-3.1-Pro, the strongest general-purpose model tested. Even the smallest model, MemPrivacy-0.6B-SFT, reaches 83.09% F1, outperforming all general-purpose models evaluated. On the out-of-distribution PersonaMem-v2 benchmark, MemPrivacy-4B-RL achieves 94.48% F1, compared to 92.18% for DeepSeek-V3.2-Think, the best general model on that set.

The comparison also includes OpenAI's recently released, open-source Privacy-Filter, a bidirectional token-classification model for PII detection. It achieves 35.50% F1 on MemPrivacy-Bench, a gap of over 50 percentage points behind the best MemPrivacy model, though it operates at significantly lower latency (0.34s versus roughly 2s for MemPrivacy models on MemPrivacy-Bench).

On downstream memory utility, MemPrivacy was tested across three widely used memory systems: LangMem, Mem0, and Memobase. When protecting all PL2–PL4 content, accuracy drops relative to no-protection baselines are contained to 0.73%–1.30% on MemPrivacy-Bench and 0.71%–1.60% on PersonaMem-v2. By contrast, irreversible masking causes accuracy drops of 16.99%–41.87% on MemPrivacy-Bench, while untyped placeholder masking causes drops of 4.72%–6.67% on MemPrivacy-Bench and 2.67%–8.71% on PersonaMem-v2.

Key Takeaways

MemPrivacy replaces sensitive user data with semantically typed placeholders (e.g., <Health_Info_1>) on-device before cloud transmission, so the cloud memory system never receives raw private values.
The framework introduces a four-level privacy taxonomy (PL1–PL4) ranging from general preferences to directly exploitable credentials, with user-configurable masking thresholds.
MemPrivacy-4B-RL achieves 85.97% F1 on MemPrivacy-Bench and 94.48% on PersonaMem-v2, outperforming GPT-5.2 (68.99%) and Gemini-3.1-Pro (78.41%) on privacy span extraction.
Across LangMem, Mem0, and Memobase, applying MemPrivacy at the PL2–PL4 level limits memory utility loss to within 1.6%, compared to accuracy drops of up to 41.87% with irreversible masking.
Models range from 0.6B to 4B parameters, with per-message inference of roughly two seconds or less, making the framework practical for on-device deployment.

Marktechpost’s Visual Explainer

MemPrivacy

Developer Guide

01 / 07 — Overview

What is MemPrivacy?

MemPrivacy is a privacy-preserving personalized memory management framework for edge-cloud LLM agents, developed by MemTensor, HONOR, and Tongji University.

In a typical edge-cloud agent, your raw input, including sensitive data like health records, emails, and passwords, gets sent directly to the cloud for memory processing. MemPrivacy stops that.

User Input: Raw text with private values
→ On-Device: Detect and replace with typed placeholders
→ Cloud: Sees only placeholders, reasons normally
→ Restore: Original values reinserted locally

Key Idea

Privacy protection is decoupled from semantic destruction. The cloud gets enough structure to reason, but never the actual private values.

02 / 07 — The Problem

Why existing approaches fall short

Most cloud memory systems receive your raw input in plaintext. Once that data enters cloud logs or vector databases, it can persist indefinitely and be retrieved later.

Multi-turn memory attacks can extract user data with up to a 69% success rate, according to published research.
Memory leakage attacks against cloud memory systems reach up to 75% success in documented studies.
Full masking (replacing values with ***) protects privacy but destroys the semantic cues the model needs to complete tasks.
Differential privacy and cryptography offer strong guarantees but are hard to integrate into interactive memory pipelines without major utility loss.

MemPrivacy's answer

Use semantically typed placeholders, not blank masks, so the cloud can still reason about the type and role of information without seeing the actual value.

03 / 07 — Privacy Levels

The Four-Level Privacy Taxonomy (PL1–PL4)

MemPrivacy classifies every detected span into one of four levels. You can configure which levels get masked, e.g. mask only PL3+PL4, or all of PL2–PL4.

PL1 Low
PL2 Identifiable
PL3 Highly Sensitive
PL4 Critical

Level	What it covers	Examples
PL1	Preferences, habits, stylistic choices. Cannot identify a person.	Food preferences, tone choices
PL2	Information that can identify or trace a specific individual.	Full name, email, phone, address, account ID
PL3	Data whose leakage can cause significant harm to safety, health, or finances.	Medical records, bank account, passport number, biometrics, precise location
PL4	Immediately exploitable secrets, usable for account takeover or financial loss.	Passwords, PINs, OTPs, API keys, private keys, session tokens

04 / 07 — Typed Placeholders

How typed placeholders preserve utility

When a privacy span is detected, it's replaced with a structured token that carries the semantic type of the information, not just a blank mask.

// Original user input:
"My blood pressure today was 160/110.
 Reply to [email protected].
 Never mention my recovery code RC-7291."

// After MemPrivacy uplink desensitization:
"My blood pressure today was <Health_Info_1>.
 Reply to [email protected].
 Never mention my <Recovery_Code_1>."

The cloud sees <Health_Info_1> and knows it's health data. It can draft the email correctly. It never sees 160/110 or RC-7291.

Session Persistence

The original-to-placeholder mapping is stored in a local secure database and persists across sessions. The same value always gets the same placeholder, enabling consistent long-term memory.

Multiple spans of the same type are distinguished by incremental indices: <Email_1>, <Email_2>, etc.
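One way to get both properties (stable placeholders across sessions and per-type incremental indices) is a small on-disk table keyed by type and index. The sketch below uses Python's built-in `sqlite3` module; the `LocalMappingStore` class, its table schema, and the example email addresses are hypothetical, and a plain SQLite file is not encrypted, so a real deployment would still need to secure the storage itself.

```python
import sqlite3


class LocalMappingStore:
    """Hypothetical on-device store for original-to-placeholder mappings."""

    def __init__(self, path: str = ":memory:"):
        # Pass a file path instead of ":memory:" to persist across sessions.
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS mapping ("
            " privacy_type TEXT NOT NULL,"
            " idx INTEGER NOT NULL,"
            " value TEXT NOT NULL,"
            " PRIMARY KEY (privacy_type, idx),"
            " UNIQUE (privacy_type, value))"
        )

    def placeholder_for(self, value: str, privacy_type: str) -> str:
        """Return the existing placeholder for a value, or allocate the next index."""
        row = self.db.execute(
            "SELECT idx FROM mapping WHERE privacy_type = ? AND value = ?",
            (privacy_type, value),
        ).fetchone()
        if row is None:
            next_idx = self.db.execute(
                "SELECT COALESCE(MAX(idx), 0) + 1 FROM mapping WHERE privacy_type = ?",
                (privacy_type,),
            ).fetchone()[0]
            with self.db:  # commits the insert on success
                self.db.execute(
                    "INSERT INTO mapping (privacy_type, idx, value) VALUES (?, ?, ?)",
                    (privacy_type, next_idx, value),
                )
            row = (next_idx,)
        return f"<{privacy_type}_{row[0]}>"

    def original_for(self, placeholder: str):
        """Look up the original value for a placeholder, or None if unknown."""
        privacy_type, _, idx = placeholder.strip("<>").rpartition("_")
        if not idx.isdigit():
            return None
        row = self.db.execute(
            "SELECT value FROM mapping WHERE privacy_type = ? AND idx = ?",
            (privacy_type, int(idx)),
        ).fetchone()
        return row[0] if row else None


store = LocalMappingStore()
print(store.placeholder_for("alice@example.com", "Email"))
print(store.placeholder_for("bob@example.com", "Email"))
print(store.placeholder_for("alice@example.com", "Email"))  # same value, same placeholder
print(store.original_for("<Email_2>"))
```

The `UNIQUE (privacy_type, value)` constraint is what guarantees that a value can never be assigned two different placeholders of the same type.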

05 / 07 — Getting Started

Installation & model setup

MemPrivacy models are available at three scales for different edge hardware budgets: 0.6B, 1.7B, and 4B parameters (all based on Qwen3). The 4B-RL model is the strongest.

# Clone the repository
git clone https://github.com/MemTensor/MemPrivacy

# Install dependencies
cd MemPrivacy
pip install -r requirements.txt

# Load model from HuggingFace
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "IAAR-Shanghai/MemPrivacy-4B-RL"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # use the dtype stored in the checkpoint
    device_map="auto",   # place weights on the available device(s)
)

Model collection

All six model variants (0.6B/1.7B/4B × SFT/RL) are available at:
huggingface.co/collections/IAAR-Shanghai/memprivacy

06 / 07 — Integration

Integrating with Mem0, LangMem, or Memobase

MemPrivacy sits between your user-facing application and the cloud memory system. The three-stage pipeline maps directly onto your existing architecture.

1. Uplink: Pass raw user input through the MemPrivacy model. It returns a list of detected spans with (original_text, privacy_level, privacy_type). Replace each span at or above your configured threshold with a typed placeholder. Store mappings locally.
2. Cloud call: Send the desensitized input to your existing memory system (Mem0, LangMem, Memobase) as usual. No changes to the cloud-side configuration are needed.
3. Downlink: Scan the cloud response for placeholders. Query your local mapping database and substitute each placeholder with its original value before displaying to the user.
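The three steps can be wired together as a thin wrapper around whatever memory client you already use. The sketch below is illustrative only: `run_turn`, `stub_detect`, and `stub_cloud` are hypothetical names, the detector is a hard-coded stand-in for the MemPrivacy model, and the cloud call is a plain function rather than a real Mem0, LangMem, or Memobase client.

```python
import re
from typing import Callable, Dict, List, Tuple

Span = Tuple[str, str, str]  # (original_text, privacy_level, privacy_type)
LEVELS = {"PL1": 1, "PL2": 2, "PL3": 3, "PL4": 4}
PLACEHOLDER = re.compile(r"<[A-Za-z_]+_\d+>")


def run_turn(
    user_input: str,
    detect: Callable[[str], List[Span]],
    cloud_call: Callable[[str], str],
    mappings: Dict[str, str],
    threshold: str = "PL2",
) -> str:
    """One protected round trip; `mappings` (placeholder -> original) stays local."""
    text = user_input
    # 1. Uplink: replace spans at or above the threshold with typed placeholders.
    for original, level, privacy_type in detect(user_input):
        if LEVELS[level] < LEVELS[threshold]:
            continue
        token = next((t for t, v in mappings.items() if v == original), None)
        if token is None:
            same_type = sum(
                1 for t in mappings if t.rsplit("_", 1)[0] == f"<{privacy_type}"
            )
            token = f"<{privacy_type}_{same_type + 1}>"
            mappings[token] = original
        text = text.replace(original, token)
    # 2. Cloud call: the memory system only ever receives the desensitized text.
    reply = cloud_call(text)
    # 3. Downlink: restore known placeholders before showing the reply to the user.
    return PLACEHOLDER.sub(lambda m: mappings.get(m.group(0), m.group(0)), reply)


seen_by_cloud = []


def stub_detect(text: str) -> List[Span]:
    # Stand-in for the on-device model's output.
    candidates = [("Jane", "PL2", "Name"), ("160/110", "PL3", "Health_Info")]
    return [span for span in candidates if span[0] in text]


def stub_cloud(text: str) -> str:
    seen_by_cloud.append(text)
    return f"Stored memory: {text}"


local_mappings: Dict[str, str] = {}
reply = run_turn(
    "I'm Jane and my reading was 160/110.",
    stub_detect,
    stub_cloud,
    local_mappings,
    threshold="PL3",
)
print(seen_by_cloud[0])
print(reply)
```

Passing the detector and the cloud client in as callables keeps the privacy layer independent of any particular memory backend, which matches the claim that no cloud-side changes are required.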

Masking threshold config

Set lambda = "PL4" to protect only credentials, "PL3" for PL3+PL4, or "PL2" for full protection. Utility loss at PL4-only is below 0.89% across all tested memory systems.

07 / 07 — Results & Resources

Benchmark results & where to go next

Model	F1 (MemPrivacy-Bench)	F1 (PersonaMem-v2)	Latency
MemPrivacy-4B-RL	85.97%	94.48%	~2s
MemPrivacy-0.6B-RL	84.66%	93.40%	~1.6s
Gemini-3.1-Pro	78.41%	86.59%	~33s
OpenAI-Privacy-Filter	35.50%	85.27%	0.34s

Utility loss when protecting PL2–PL4 content across LangMem, Mem0, and Memobase is within 1.6% vs. no-protection baselines. Irreversible masking causes up to a 41.87% accuracy drop on the same systems.

Code: github.com/MemTensor/MemPrivacy
Models: huggingface.co/collections/IAAR-Shanghai/memprivacy
Paper: arxiv.org/abs/2605.09530

The post Meet MemPrivacy: An Edge-Cloud Framework that Uses Local Reversible Pseudonymization to Protect User Data Without Breaking Memory Utility appeared first on MarkTechPost.
