Meet MemPrivacy: An Edge-Cloud Framework that Uses Local Reversible Pseudonymization to Protect User Data Without Breaking Memory Utility
As LLM-powered agents move from research to production, one design tension is becoming harder to ignore: the more useful cloud-hosted memory becomes, the more private user data it exposes. Researchers from MemTensor (Shanghai), HONOR Device and Tongji University have introduced MemPrivacy, a framework that attempts to resolve this tension without sacrificing the utility that makes personalized memory valuable in the first place.
The Core Problem With Cloud Memory
When you interact with an AI agent, your conversation often contains sensitive details such as health conditions, email addresses, financial figures, and passwords. In a typical edge-cloud deployment, the user’s device (the edge) handles input, while computation-heavy memory management and reasoning happen in the cloud. This architecture is efficient, but it means raw, unfiltered user data travels to and persists in cloud systems.
The risk isn’t theoretical. Prior studies show that multi-turn memory attacks can induce privacy violations with success rates of up to 69%, and leakage attacks against memory systems can reach 75% success. Indirect prompt injection can even manipulate agents into actively eliciting private information from users. Once sensitive content enters cloud logs, vector databases, or external memory stores, it can remain accessible through subsequent storage, retrieval, and reuse stages well beyond the original interaction.
Prior work has tried to address this with masking, which replaces sensitive values with tokens like ***. The problem is that masking destroys semantics. If a user asks an agent to draft an email to their doctor and both their blood pressure reading and their email address are replaced with ***, the cloud model cannot complete the task meaningfully. More principled methods such as differential privacy and cryptographic protection offer stronger guarantees but are difficult to integrate into interactive memory pipelines without degrading response quality.

What MemPrivacy Does Differently
Rather than masking private content, MemPrivacy replaces it with typed placeholders (structured tokens like <Health_Info_1> or <Email_1>) before the input leaves the local device. The cloud model receives semantically intact text and can reason and store memories as usual; it simply never sees the actual values. When the cloud returns a response containing placeholders, the local device looks up the originals in a secure local database and substitutes them back in. The user sees a fully coherent, personalized response.
This design is called local reversible pseudonymization, and the full pipeline operates in three stages. Stage 1 (Uplink Desensitization): A lightweight on-device model identifies privacy-sensitive spans in the input, classifies each by type and sensitivity level, and replaces them with typed placeholders. The original-to-placeholder mappings are stored locally and persist across sessions so the same value always gets the same placeholder. Stage 2 (Cloud Processing): The sanitized input is sent to the cloud agent or memory system. The typed placeholders preserve enough semantic structure for memory formation and retrieval to function correctly. Stage 3 (Downlink Restoration): The cloud response, which may contain placeholders, is restored locally via a lightweight database lookup and string substitution, adding negligible latency.
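The three stages above can be sketched in a few lines of Python. This is a minimal illustration, not the authors’ implementation: the `LocalPseudonymizer` class and its method names are hypothetical, a regular expression stands in for the fine-tuned on-device detection model, and an in-memory dictionary stands in for the secure local database.

```python
import re
from typing import Dict

# Assumption: a regex replaces the on-device detection model so the sketch
# stays self-contained. It only recognizes email addresses.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
# Matches typed placeholders such as <Email_1> or <Health_Info_2>.
PLACEHOLDER_RE = re.compile(r"<[A-Za-z_]+_\d+>")


class LocalPseudonymizer:
    """Hypothetical helper that keeps value<->placeholder mappings on the device."""

    def __init__(self) -> None:
        self.value_to_placeholder: Dict[str, str] = {}
        self.placeholder_to_value: Dict[str, str] = {}
        self.counters: Dict[str, int] = {}

    def _placeholder_for(self, value: str, type_name: str) -> str:
        # Reuse an existing mapping so the same value always gets the same token.
        if value in self.value_to_placeholder:
            return self.value_to_placeholder[value]
        self.counters[type_name] = self.counters.get(type_name, 0) + 1
        placeholder = f"<{type_name}_{self.counters[type_name]}>"
        self.value_to_placeholder[value] = placeholder
        self.placeholder_to_value[placeholder] = value
        return placeholder

    def desensitize(self, text: str) -> str:
        """Stage 1: replace detected spans before the text leaves the device."""
        return EMAIL_RE.sub(
            lambda m: self._placeholder_for(m.group(0), "Email"), text
        )

    def restore(self, text: str) -> str:
        """Stage 3: swap known placeholders back; unknown ones are left untouched."""
        return PLACEHOLDER_RE.sub(
            lambda m: self.placeholder_to_value.get(m.group(0), m.group(0)), text
        )


pseudonymizer = LocalPseudonymizer()
uplink = pseudonymizer.desensitize(
    "Send the report to jane@example.com and cc jane@example.com, bob@example.org."
)
# Stage 2 would happen in the cloud; here a canned reply stands in for it.
cloud_reply = "I emailed <Email_2> and <Email_1>; <Phone_9> is unknown."
downlink = pseudonymizer.restore(cloud_reply)
```

In this sketch the cloud only ever sees `uplink`, and restoration is a dictionary lookup plus string substitution, which is why the article describes the downlink step as adding negligible latency.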
A Four-Level Privacy Taxonomy
A key contribution from the research team is a four-level privacy taxonomy (PL1–PL4) that defines what gets protected and at what threshold:
- PL1 covers general preferences, habits, and stylistic choices that don’t identify a person and carry low risk. These are not protected by default.
- PL2 includes identifiable PII: real names, phone numbers, email addresses, detailed addresses, account usernames, and combinations that can identify or trace a specific individual.
- PL3 covers highly sensitive PII: government document numbers, financial account details, health information, precise location and trajectory data, biometrics, raw communication content, and sensitive identity attributes such as religious beliefs or ethnicity.
- PL4 is the highest tier: credentials and secrets that are directly exploitable, including passwords, PINs, verification codes, session tokens, API keys, private keys, seed phrases, and undisclosed business materials. Exposure at this level can directly result in account takeover, financial loss, or large-scale data exfiltration.
Users can configure the masking threshold, for instance protecting only PL3 and PL4 or applying full protection across PL2–PL4, which gives granular control over the privacy-utility trade-off.
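The configurable threshold can be pictured as a simple filter over detected spans. The following is a rough sketch under assumptions: the `PrivacyLevel` enum, the `Span` tuple, the `spans_to_protect` function, and the `Preference` and `Password` type names are all hypothetical, and the example values are made up.

```python
from enum import IntEnum
from typing import List, NamedTuple


class PrivacyLevel(IntEnum):
    PL1 = 1  # general preferences and habits
    PL2 = 2  # identifiable PII
    PL3 = 3  # highly sensitive PII
    PL4 = 4  # directly exploitable credentials and secrets


class Span(NamedTuple):
    text: str
    type_name: str
    level: PrivacyLevel


def spans_to_protect(spans: List[Span], threshold: PrivacyLevel) -> List[Span]:
    """Keep only the spans at or above the user's configured threshold."""
    return [s for s in spans if s.level >= threshold]


# Hypothetical detector output for one message.
detected = [
    Span("prefers short replies", "Preference", PrivacyLevel.PL1),
    Span("jane@example.com", "Email", PrivacyLevel.PL2),
    Span("blood pressure 150/95", "Health_Info", PrivacyLevel.PL3),
    Span("hunter2", "Password", PrivacyLevel.PL4),
]

strict = spans_to_protect(detected, PrivacyLevel.PL2)   # PL2-PL4
relaxed = spans_to_protect(detected, PrivacyLevel.PL3)  # PL3-PL4 only
```

With the threshold at PL3 the email address would be sent to the cloud in the clear, trading some privacy for more context; at PL2 it would be replaced with a placeholder as well.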

MemPrivacy-Bench and Model Training
To train and evaluate their approach, the research team built MemPrivacy-Bench, a dataset covering 200 synthetic user profiles and over 155,000 privacy instances (125,776 training, 29,967 test) across balanced Chinese and English dialogue, spanning 7 high-level scenario categories and 23 fine-grained subcategories. The test set contains 615 question-answer pairs across six memory task types: basic memory, temporal reasoning, adversarial questioning, dynamic updating, implicit inference, and information aggregation. Annotations were first generated by a dual-model pipeline using Gemini-3.1-Pro and GPT-5.2, then verified by six human annotators, reaching a final annotation accuracy of 98.08%.
The MemPrivacy extraction models are fine-tuned from Qwen3 base models at 0.6B, 1.7B, and 4B parameter scales using supervised fine-tuning (SFT) followed by reinforcement learning with Group Relative Policy Optimization (GRPO). GRPO estimates advantages from the relative rewards of multiple sampled outputs per input, using the F1 score as the reward signal, which avoids the computational overhead of a separately trained critic. The data is split by user: 160 users for training and 40 users for the test set.
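To make the GRPO step more concrete, here is a small sketch of how group-relative advantages can be computed from F1 rewards. It follows the commonly described GRPO formulation (each reward normalized by the mean and standard deviation of its own sample group); whether MemPrivacy uses exactly this normalization, or exact-match span sets for F1, is an assumption, and the function names and example spans are illustrative.

```python
from statistics import mean, pstdev
from typing import List, Set


def span_f1(predicted: Set[str], gold: Set[str]) -> float:
    """Exact-match F1 between a predicted span set and the gold span set."""
    if not predicted and not gold:
        return 1.0
    true_pos = len(predicted & gold)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(predicted)
    recall = true_pos / len(gold)
    return 2 * precision * recall / (precision + recall)


def group_relative_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """Normalize each reward against its own group, so no critic model is needed."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# One input, three sampled extractions (values are made up).
gold = {"jane@example.com", "150/95"}
samples = [
    {"jane@example.com", "150/95"},  # finds both spans
    {"jane@example.com"},            # misses one span
    {"Tuesday"},                     # extracts a non-sensitive span only
]
rewards = [span_f1(s, gold) for s in samples]
advantages = group_relative_advantages(rewards)
```

Outputs that score above the group average receive positive advantages and are reinforced; those below it are discouraged. The baseline comes from the sampled group itself rather than from a learned value function.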
Experimental Results
On MemPrivacy-Bench, the best-performing model, MemPrivacy-4B-RL, achieves an F1 score of 85.97%, compared to 78.41% for Gemini-3.1-Pro, the strongest general-purpose model tested. Even the smallest model, MemPrivacy-0.6B-SFT, reaches 83.09% F1, outperforming all general-purpose models evaluated. On the out-of-distribution PersonaMem-v2 benchmark, MemPrivacy-4B-RL achieves 94.48% F1, compared to 92.18% for DeepSeek-V3.2-Think, the best general model on that set.
OpenAI’s recently open-sourced Privacy-Filter, a bidirectional token-classification model for PII detection, achieves 35.50% F1 on MemPrivacy-Bench, a gap of over 50 percentage points behind the best MemPrivacy model, though it operates at substantially lower latency (0.34s versus roughly 2s for MemPrivacy models on MemPrivacy-Bench).
On downstream memory utility, MemPrivacy was tested across three widely used memory systems: LangMem, Mem0, and Memobase. When protecting all PL2–PL4 content, accuracy drops relative to no-protection baselines are contained to 0.73%–1.30% on MemPrivacy-Bench and 0.71%–1.60% on PersonaMem-v2. By contrast, irreversible masking causes accuracy drops of 16.99%–41.87% on MemPrivacy-Bench, while untyped placeholder masking causes drops of 4.72%–6.67% on MemPrivacy-Bench and 2.67%–8.71% on PersonaMem-v2.
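A toy comparison helps show why the three strategies differ in how much they leave for the cloud model to work with. The snippet below is only an illustration: the function names are made up, the `<MASK_n>` format for untyped placeholders is an assumption (the article does not specify one), and the detected spans are hard-coded.

```python
from typing import Dict, List, Tuple

# (value, type) pairs as a detector might return them; values are made up.
spans: List[Tuple[str, str]] = [
    ("jane@example.com", "Email"),
    ("150/95", "Health_Info"),
]
message = "Email my reading of 150/95 to my doctor from jane@example.com."


def irreversible_mask(text: str, found: List[Tuple[str, str]]) -> str:
    """Every value collapses to the same token, so nothing can be restored."""
    for value, _ in found:
        text = text.replace(value, "***")
    return text


def untyped_placeholders(text: str, found: List[Tuple[str, str]]) -> str:
    """Values stay distinguishable and reversible, but their type is hidden."""
    for index, (value, _) in enumerate(found, start=1):
        text = text.replace(value, f"<MASK_{index}>")
    return text


def typed_placeholders(text: str, found: List[Tuple[str, str]]) -> str:
    """Values are hidden while the placeholder still says what kind of thing it is."""
    counts: Dict[str, int] = {}
    for value, type_name in found:
        counts[type_name] = counts.get(type_name, 0) + 1
        text = text.replace(value, f"<{type_name}_{counts[type_name]}>")
    return text


masked = irreversible_mask(message, spans)
untyped = untyped_placeholders(message, spans)
typed = typed_placeholders(message, spans)
```

Only the typed version tells the cloud model that one hidden value is a health reading and the other an email address, which is consistent with the smaller accuracy drops reported for typed placeholders.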
Key Takeaways
- MemPrivacy replaces sensitive user data with semantically typed placeholders (e.g., <Health_Info_1>) on-device before cloud transmission, so the cloud memory system never receives raw private values.
- The framework introduces a four-level privacy taxonomy (PL1–PL4) ranging from general preferences to directly exploitable credentials, with user-configurable masking thresholds.
- MemPrivacy-4B-RL achieves 85.97% F1 on MemPrivacy-Bench and 94.48% on PersonaMem-v2, outperforming GPT-5.2 (68.99%) and Gemini-3.1-Pro (78.41%) on MemPrivacy-Bench privacy span extraction.
- Across LangMem, Mem0, and Memobase, applying MemPrivacy at the PL2–PL4 level limits memory utility loss to within 1.6%, compared to accuracy drops of up to 41.87% with irreversible masking.
- Models range from 0.6B to 4B parameters, with per-message inference of roughly two seconds, making the framework suitable for on-device deployment.
Check out the Paper and Model Weights.
The post Meet MemPrivacy: An Edge-Cloud Framework that Uses Local Reversible Pseudonymization to Protect User Data Without Breaking Memory Utility appeared first on MarkTechPost.
