OpenAI Releases Privacy Filter: A 1.5B-Parameter Open-Source PII Redaction Model with 50M Active Parameters
OpenAI just quietly dropped something worth paying close attention to. Released on Hugging Face under an Apache 2.0 license, Privacy Filter is an open, bidirectional token-classification model purpose-built for detecting and redacting personally identifiable information (PII) in text. It is small enough to run in a web browser or on a laptop and fast enough for high-throughput data sanitization pipelines.
What It Does
Privacy Filter is a Named Entity Recognition (NER) model, but one tuned specifically for the privacy use case. It detects eight classes of sensitive spans: account_number, private_address, private_email, private_person, private_phone, private_url, private_date, and secret. The secret class covers credential formats, project-specific token patterns, and high-entropy strings; the model card explicitly calls out missed detection of 'novel credential formats' and 'secrets split across surrounding syntax' as known failure modes, which signals what the class is trained to target.
The intended use case is clear: dev teams that need to scrub datasets, clean logs, or pre-process user-generated content before it enters a training pipeline or gets stored in a data warehouse. Because it runs on-premises and on commodity hardware, it fits squarely into the growing set of edge-deployable AI tools that organizations can adopt without routing sensitive data to a third-party API.
The Architecture is the Real Story
Privacy Filter has 1.5 billion total parameters but only 50 million active parameters at inference time. That gap, roughly 30x, is explained entirely by the model's sparse mixture-of-experts (MoE) feed-forward design.
Architecturally, the model is 'similar to gpt-oss, albeit of a smaller size.' It is built on 8 pre-norm transformer blocks with a residual stream width (d_model) of 640. Attention uses grouped-query attention (GQA) with rotary positional embeddings (RoPE): 14 query heads over 2 KV heads, meaning 7 query heads share each KV head, which reduces the memory footprint of the key-value cache considerably compared with standard multi-head attention. RoPE is also what enables the model's 128,000-token context window. The feed-forward layers use sparse MoE with 128 total experts and top-4 routing per token: for each token, 4 of the 128 experts are activated, and all other expert parameters remain dormant. This is exactly the mechanism that produces the 30x gap between total and active parameter counts.
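The two efficiency mechanisms above reduce to simple arithmetic. The following back-of-the-envelope sketch uses only the figures reported in the article (14 query heads, 2 KV heads, 128 experts, top-4 routing); it is illustrative, not OpenAI's implementation:

```python
# Back-of-the-envelope numbers for the two sparsity/efficiency mechanisms
# described above, using the head and expert counts from the article.

N_QUERY_HEADS = 14
N_KV_HEADS = 2
N_EXPERTS = 128
TOP_K = 4

# GQA: the KV cache stores keys/values only for the KV heads, so its size
# relative to standard multi-head attention (one KV pair per query head) is:
kv_cache_ratio = N_KV_HEADS / N_QUERY_HEADS
print(f"KV cache is {kv_cache_ratio:.0%} of the MHA equivalent (1/7)")

# MoE: only the top-4 of 128 experts run per token, so the fraction of
# expert parameters active for any given token is:
active_expert_fraction = TOP_K / N_EXPERTS
print(f"active expert fraction per token: {active_expert_fraction} (1/32)")
```

Since the vast majority of the 1.5B parameters sit in the expert FFNs, running only 1/32 of them per token is what pulls the active count down to roughly 50M.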
A Three-Phase Training Pipeline
What makes this model architecturally unusual isn't just its size, but how it was built. Privacy Filter was produced in three distinct phases.
First, it was pretrained autoregressively as a standard next-token-prediction language model, in the tradition of GPT-style decoders. Second, that checkpoint was architecturally converted: the language-model head was replaced with a token-classification head over the privacy label taxonomy, and the attention mechanism was switched from causal (unidirectional) to bidirectional banded attention with a band size of 128, giving each token an effective context window of 257 tokens (the token itself plus 128 on each side). Third, the converted model was post-trained with a supervised classification loss, a distinct fine-tuning phase on labeled PII data, separate from the architectural conversion step.
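The banded attention pattern from the conversion step is easy to visualize as a mask. A minimal sketch (my own illustration of the masking pattern, not OpenAI's code): each token may attend to itself plus 128 positions on each side, for 257 total.

```python
# Bidirectional banded attention mask: token i may attend to token j
# iff |i - j| <= band. With band=128 this gives the 257-token effective
# window described in the article. Illustrative sketch only.
import numpy as np

def banded_mask(seq_len: int, band: int = 128) -> np.ndarray:
    """Boolean [seq_len, seq_len] mask; True where attention is allowed."""
    idx = np.arange(seq_len)
    return np.abs(idx[:, None] - idx[None, :]) <= band

mask = banded_mask(seq_len=512)
# A token in the middle of the sequence sees exactly 128 + 1 + 128 positions:
print(int(mask[256].sum()))  # 257
# A token at the start sees only itself plus 128 to the right:
print(int(mask[0].sum()))    # 129
```

Unlike a causal mask, this band is symmetric, which is what makes the converted model an encoder rather than a decoder.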
The autoregressive pretraining gives the model rich language representations learned from far more data and compute than any task-specific budget would support. The architectural conversion enables bidirectional context, which is essential for NER: a name like 'Alice' in 'Alice Smith called' is unambiguous, but with only left context it could be missed. The supervised post-training then specializes these representations for the privacy detection task.
Compared with classical masked-language-model approaches like BERT, this is a post-training conversion of an autoregressive model rather than a native masked-LM setup, a meaningful difference in how the base representations were formed.
Constrained Viterbi Decoding Instead of Argmax
The label scheme Privacy Filter uses is BIOES: Begin, Inside, Outside, End, Single. Each of the 8 privacy categories gets 4 boundary-tagged token classes (B-, I-, E-, S-), plus the shared background class O, yielding 33 total output classes per token. For a sequence of length T, the output logits have shape [T, 33].
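The 33-class arithmetic can be made concrete by enumerating the tag set (label string formatting is my assumption; only the category names and counts come from the article):

```python
# Enumerating the BIOES tag set: 8 categories x 4 boundary tags, plus the
# shared background class "O", yields the 33 per-token output classes.
CATEGORIES = [
    "account_number", "private_address", "private_email", "private_person",
    "private_phone", "private_url", "private_date", "secret",
]
TAGS = ["B", "I", "E", "S"]  # Begin, Inside, End, Single

labels = ["O"] + [f"{tag}-{cat}" for cat in CATEGORIES for tag in TAGS]
print(len(labels))  # 33
```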
Rather than taking a per-token argmax over these 33 logits, which can produce incoherent label sequences such as a B- tag followed immediately by an S- tag, the model runs a constrained Viterbi decoder at inference time. The decoder uses linear-chain transition scoring and enforces valid BIOES boundary transitions. It scores full label paths using start, transition, and end terms, along with six transition-bias parameters controlling behaviors such as background persistence, span entry, span continuation, span closure, and boundary-to-boundary handoff. This global path optimization improves span coherence and boundary stability by making each token decision depend on sequence-level structure, not just local logits, which is particularly valuable in noisy or mixed-format text.
Those six transition-bias parameters are also user-tunable at runtime. This lets developers push toward broader, more contiguous masking for improved recall, or tighten boundaries for improved precision, without retraining the model.
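A minimal sketch of what constrained Viterbi decoding over a BIOES tag set looks like. This is my own simplified illustration: the transition constraints follow the standard BIOES rules, and a single flat transition score stands in for the six tunable bias parameters described above.

```python
# Constrained Viterbi decoding over BIOES tags (illustrative sketch).
# Valid transitions: after O/E-/S- a span may start (B-, S-) or stay
# outside (O); after B-/I- the same span must continue (I-) or close (E-).
import numpy as np

def allowed(prev: str, nxt: str) -> bool:
    """Is prev -> nxt a legal BIOES transition? Tags are 'O' or 'T-cat'."""
    pt, pc = (prev[0], prev[2:]) if prev != "O" else ("O", None)
    nt, nc = (nxt[0], nxt[2:]) if nxt != "O" else ("O", None)
    if pt in ("O", "E", "S"):               # outside, or span just closed
        return nt in ("O", "B", "S")
    if pt in ("B", "I"):                    # inside an open span
        return nt in ("I", "E") and nc == pc
    return False

def viterbi(logits: np.ndarray, labels: list[str]) -> list[str]:
    """Best legal label path for per-token logits of shape [T, C]."""
    T, C = logits.shape
    trans = np.full((C, C), -np.inf)
    for i, a in enumerate(labels):
        for j, b in enumerate(labels):
            if allowed(a, b):
                trans[i, j] = 0.0           # tunable biases would go here
    score = logits[0].copy()
    for j, b in enumerate(labels):          # spans cannot start mid-way
        if b[0] in ("I", "E"):
            score[j] = -np.inf
    back = np.zeros((T, C), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + trans       # [prev, next] path scores
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + logits[t]
    for j, b in enumerate(labels):          # spans cannot end half-open
        if b[0] in ("B", "I"):
            score[j] = -np.inf
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return [labels[i] for i in reversed(path)]
```

Even when the raw per-token argmax would emit an illegal sequence (say, a stray S- tag in the middle of a span), the transition matrix forces the decoder onto the best legal path, which is exactly the span-coherence benefit described above.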
Key Takeaways
- OpenAI released Privacy Filter, an open-source PII redaction model under Apache 2.0, capable of detecting eight sensitive span categories including account_number, private_person, secret, and more, deployable on-premises without routing data to an external API.
- The model has 1.5B total parameters but only 50M active at inference, thanks to a sparse MoE feed-forward design with 128 experts and top-4 routing per token, making it lightweight enough to run in a browser or on a laptop.
- The backbone is architecturally similar to gpt-oss: 8 pre-norm transformer blocks, d_model=640, grouped-query attention with RoPE, and a sparse MoE FFN. It was first pretrained autoregressively, then converted to a bidirectional banded-attention encoder, then post-trained with a supervised classification loss.
- At inference, it runs constrained Viterbi decoding over a BIOES label scheme rather than per-token argmax, producing coherent span boundaries with six tunable transition-bias parameters that let engineers adjust the precision/recall tradeoff at runtime without retraining.
Check out the Model Weights.
The post OpenAI Releases Privacy Filter: A 1.5B-Parameter Open-Source PII Redaction Model with 50M Active Parameters appeared first on MarkTechPost.
