
OpenAI Releases Privacy Filter: A 1.5B-Parameter Open-Source PII Redaction Model with 50M Active Parameters

OpenAI just quietly dropped something worth paying attention to. Released on Hugging Face under the Apache 2.0 license, Privacy Filter is an open, bidirectional token classification model designed to identify and redact personally identifiable information (PII) in text. It’s small enough to run in a web browser or on a laptop and fast enough for high-throughput data cleaning pipelines.

What it does

The Privacy Filter is a named entity recognition (NER) model, but one tuned specifically for the privacy use case. It detects eight categories of sensitive spans: account_number, private_address, private_email, private_person, private_phone, private_url, private_date, and secret. The secret class covers authentication formats, project-specific token patterns, and high-entropy strings; the model card explicitly calls out missed detections of ‘novel authentication formats’ and ‘secrets separated from the surrounding syntax’ as known failure modes, which indicates what the class is trained to catch.

The intended use case is clear: dev teams need to clean datasets, scrub logs, or preprocess user-generated content before it enters a training pipeline or is stored in a data warehouse. Because it runs on-premises and on commodity hardware, it fits seamlessly into a growing set of easy-to-use AI tools that organizations can adopt without moving sensitive data to a third-party API.
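For teams that want to try this kind of local redaction pass, here is a minimal sketch assuming the model is exposed through the standard Hugging Face token-classification interface. The repo id, label names, and the example output are placeholders rather than confirmed details from the release, and the model’s own constrained decoder (discussed below) may require the reference inference code rather than the generic pipeline.

```python
# Minimal local-redaction sketch. Model id and label names are assumptions
# for illustration; check the actual model card on Hugging Face.
from transformers import pipeline

pii_tagger = pipeline(
    "token-classification",
    model="openai/privacy-filter",   # hypothetical repo id, not confirmed
    aggregation_strategy="simple",   # merge B-/I-/E-/S- pieces into whole spans
)

def redact(text: str) -> str:
    """Replace every detected PII span with a bracketed category placeholder."""
    # Process spans right-to-left so earlier character offsets stay valid.
    spans = sorted(pii_tagger(text), key=lambda s: s["start"], reverse=True)
    for s in spans:
        text = text[: s["start"]] + f"[{s['entity_group'].upper()}]" + text[s["end"] :]
    return text

print(redact("Contact Alice Smith at alice@example.com or +1 555 0100."))
# e.g. "Contact [PRIVATE_PERSON] at [PRIVATE_EMAIL] or [PRIVATE_PHONE]."
# (illustrative output; actual label strings may differ)
```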

The Architecture Is the Real Story

The Privacy Filter has 1.5 billion parameters in total but only 50 million active parameters at inference time. That gap, almost 30x, is entirely explained by the mixture-of-experts (MoE) design of its feed-forward layers.

Architecturally, the model is ‘similar to gpt-oss, albeit smaller in size.’ It is built from 8 pre-norm transformer blocks with a residual stream width (d_model) of 640. Attention uses grouped-query attention (GQA) with rotary positional embeddings (RoPE) – 14 query heads over 2 KV heads, meaning 7 query heads share each KV head – which shrinks the key-value cache significantly compared to standard multi-head attention. RoPE also underpins the model’s native context window of 128,000 tokens. The feed-forward layers use a small MoE with 128 experts in total and top-4 routing per token: for each token, 4 of the 128 experts are active, and all other expert parameters stay idle. This is the mechanism that produces the roughly 30x gap between total and active parameter counts.
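To make the active-versus-total distinction concrete, below is an illustrative top-k MoE layer in PyTorch using the dimensions quoted above (d_model=640, 128 experts, top-4). The expert hidden size and routing details are assumptions for the sketch, not the released implementation.

```python
# Illustrative top-k MoE feed-forward layer (not the released implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, d_model=640, n_experts=128, top_k=4, d_hidden=1024):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(), nn.Linear(d_hidden, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x):                                    # x: [tokens, d_model]
        gates = F.softmax(self.router(x), dim=-1)            # [tokens, n_experts]
        top_w, top_idx = gates.topk(self.top_k, dim=-1)      # keep 4 experts per token
        top_w = top_w / top_w.sum(dim=-1, keepdim=True)      # renormalize the 4 gates
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e in top_idx[:, slot].unique().tolist():
                mask = top_idx[:, slot] == e                 # tokens routed to expert e
                out[mask] += top_w[mask, slot].unsqueeze(-1) * self.experts[e](x[mask])
        return out

moe = TopKMoE()
tokens = torch.randn(16, 640)
print(moe(tokens).shape)   # torch.Size([16, 640]); only 4 of 128 experts run per token
```

Only the router and the four selected experts do work for any given token, which is why the parameters that matter for inference cost are a small fraction of the total.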

A Three-Stage Training Pipeline

What makes this model unique is not just its size, but how it is built. The Privacy Filter is built in three distinct stages.

First, the model was pretrained autoregressively as a standard next-token-prediction language model, in the tradition of GPT-style decoders. Second, the architecture was structurally converted: the language-model head was replaced with a token-classification head over the privacy label taxonomy, and attention was changed from causal (unidirectional) to bidirectional banded attention with a band size of 128, giving each token an active context window of 257 tokens (the token itself plus 128 on each side). Third, the converted model was post-trained with a supervised classification loss – a separate fine-tuning phase on PII-labeled data, distinct from the architecture-conversion step.
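The banded bidirectional attention is easy to picture as a mask. The sketch below, which assumes a plain boolean attention mask, shows how a band of 128 on each side yields the 257-token active window per position; it is an illustration of the idea, not the model’s actual attention code.

```python
# Sketch of a bidirectional banded attention mask: each token may attend to
# itself plus up to 128 neighbors on each side (257 positions total).
import torch

def banded_bidirectional_mask(seq_len: int, band: int = 128) -> torch.Tensor:
    """Boolean [seq_len, seq_len] mask; True where attention is allowed."""
    pos = torch.arange(seq_len)
    # |i - j| <= band keeps a symmetric band around the diagonal (no causal cutoff).
    return (pos[:, None] - pos[None, :]).abs() <= band

mask = banded_bidirectional_mask(512)
print(mask.shape, int(mask[256].sum()))   # torch.Size([512, 512]) 257
```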

Autoregressive pretraining gives the model rich language representations learned from far more data and compute than any task-specific budget could support. The conversion enables bidirectional context, which matters for NER – a token like ‘Alice’ in ‘Alice Smith called’ is easy to miss with only left-to-right context, because the evidence that it is a person name sits to its right. Supervised post-training then specializes those representations for the privacy task.

Compared to classical masked-language-model approaches like BERT, this is a post-hoc conversion of a causal model rather than a natively masked-LM setup – a meaningful difference in how the underlying representations are built.

Constrained Viterbi Decoding Instead of Argmax

The Privacy Filter’s label scheme uses BIOES – Begin, Inside, Outside, End, Single. Each of the 8 privacy classes gets four positional tags (B-, I-, E-, S-), and a shared O background class brings the total to 33 output classes per token. For a sequence of length T, the output logits have shape [T, 33].
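The label-space arithmetic is easy to verify in a couple of lines; the category names follow the list earlier in the article, and the exact label strings in the released model may differ.

```python
# 8 categories x 4 positional tags + one shared O tag = 33 labels.
CATEGORIES = [
    "account_number", "private_address", "private_email", "private_person",
    "private_phone", "private_url", "private_date", "secret",
]
TAGS = ["B", "I", "E", "S"]            # Begin, Inside, End, Single
LABELS = ["O"] + [f"{t}-{c}" for c in CATEGORIES for t in TAGS]
print(len(LABELS))                      # 33
```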

Rather than taking the per-token argmax over those 33 logits, which could produce an inconsistent label sequence such as a B- tag followed immediately by an S- tag, the model uses a constrained Viterbi decoder. The decoder applies linear-chain transition scores and enforces valid BIOES boundary transitions. It scores complete label paths using start, transition, and end terms, together with six transition-bias parameters that control behaviors such as staying outside a span, opening a span, continuing a span, closing a span, and transitioning directly between spans. This global decoding improves span consistency and boundary stability by making each token’s decision depend on sequence-level structure rather than local logits alone – which is especially important for noisy or mixed-format text.
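To see what constrained decoding buys over per-token argmax, here is a deliberately simplified Viterbi sketch over a single entity class (5 BIOES tags instead of 33), with one span-bias knob standing in for the six transition-bias terms. The parameterization of the released decoder is not public, so treat this purely as an illustration of the technique.

```python
# Simplified constrained Viterbi over BIOES tags, in log space, one entity class.
import numpy as np

TAGS = ["O", "B", "I", "E", "S"]
NEG = -1e9  # stands in for "transition not allowed"

def transition_matrix(span_bias: float = 0.0) -> np.ndarray:
    """trans[i, j] = score for moving from TAGS[i] to TAGS[j]; invalid moves get NEG.
    A positive span_bias makes entering or staying in spans cheaper (recall-leaning)."""
    allowed = {
        "O": {"O", "B", "S"}, "B": {"I", "E"}, "I": {"I", "E"},
        "E": {"O", "B", "S"}, "S": {"O", "B", "S"},
    }
    trans = np.full((5, 5), NEG)
    for i, a in enumerate(TAGS):
        for j, b in enumerate(TAGS):
            if b in allowed[a]:
                trans[i, j] = span_bias if b in {"B", "I", "S"} else 0.0
    return trans

def viterbi(logits: np.ndarray, span_bias: float = 0.0) -> list[str]:
    """logits: [T, 5] per-token tag scores. Returns the best valid BIOES tag path."""
    trans = transition_matrix(span_bias)
    start_ok = np.array([0.0, span_bias, NEG, NEG, span_bias])  # can't start mid-span
    end_ok = np.array([0.0, NEG, NEG, 0.0, 0.0])                # can't end mid-span
    T = logits.shape[0]
    score = logits[0] + start_ok
    back = np.zeros((T, 5), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + trans + logits[t][None, :]      # [prev, next]
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0)
    score = score + end_ok
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return [TAGS[i] for i in reversed(path)]

# Per-token argmax on these logits would yield B, S, E, which is not a valid
# span structure; the constrained decoder repairs it into a coherent B-I-E span.
logits = np.array([[0.1, 0.9, 0.0, 0.0, 0.2],
                   [0.3, 0.1, 0.4, 0.2, 0.5],
                   [0.2, 0.0, 0.1, 0.6, 0.1]])
print(viterbi(logits))   # ['B', 'I', 'E']
```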

Those six transition-bias parameters are also adjustable at runtime. That lets developers bias the decoder toward broader, more aggressive masking for higher recall, or tighten it for higher precision, without retraining the model.
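Continuing the simplified sketch above (it reuses that snippet’s viterbi function and numpy import), nudging even the single illustrative bias shows the kind of runtime precision/recall trade the article describes:

```python
# Borderline logits: O barely beats S at each position. Shifting the span bias
# flips the decision without touching the model weights.
borderline = np.array([[0.50, 0.0, 0.0, 0.0, 0.45],
                       [0.50, 0.0, 0.0, 0.0, 0.45]])
print(viterbi(borderline, span_bias=+0.2))   # recall-leaning    -> ['S', 'S']
print(viterbi(borderline, span_bias=-0.2))   # precision-leaning -> ['O', 'O']
```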

Key Takeaways

  • OpenAI has released Privacy Filter, an open-source PII detection model under Apache 2.0 that identifies eight sensitive categories, including account_number, private_person, secret, and more, and it can run locally without sending data to an external API.
  • The model has 1.5B total parameters but only 50M active at inference, thanks to an MoE feed-forward design with 128 experts and top-4 routing per token, which makes it light enough to run in a browser or on a laptop.
  • The backbone resembles gpt-oss: 8 pre-norm transformer blocks, d_model=640, grouped-query attention with RoPE, and a small MoE FFN; it was first pretrained autoregressively, then converted to a bidirectional banded-attention encoder, and finally post-trained with a supervised classification loss.
  • For decoding, it uses constrained Viterbi decoding over the BIOES label scheme rather than per-token argmax, producing consistent spans, with six adjustable transition-bias parameters that let developers shift the precision/recall tradeoff at runtime without retraining.

Check out the model weights on Hugging Face.
