privacy-filter-nemotron
A fine-grained English PII token-classification model: a fine-tune of
openai/privacy-filter by OpenMed on NVIDIA's Nemotron-PII dataset. It labels
every token with a BIOES tag over 55 PII categories (221 classes), trading
the multilingual sibling's language breadth for category depth: identity,
contact, address, dates, government IDs, financial, healthcare, enterprise,
vehicle and digital entities (including api_key, ipv4/ipv6 and mac_address).
For multilingual text, use privacy-filter-multilingual instead.
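
The class count follows from the tagging scheme: BIOES gives each category
four tags (Begin, Inside, End, Single) and adds one shared O tag for
non-PII tokens, so 55 categories yield 4 * 55 + 1 = 221 classes. A minimal
sketch of that arithmetic follows; the "B-api_key" label spelling is an
assumption for illustration, and only four of the 55 category names are
used, so check the model's own label map for the real strings.

```python
# Sketch of how a BIOES label set is sized. The "PREFIX-category" spelling
# is an assumed convention, not taken from the model's label map.

def bioes_labels(categories):
    """Return one shared 'O' tag plus B/I/E/S tags for each category."""
    labels = ["O"]
    for cat in categories:
        for prefix in ("B", "I", "E", "S"):
            labels.append(f"{prefix}-{cat}")
    return labels


def num_classes(n_categories):
    """Four positional tags per category, plus the single outside tag."""
    return 4 * n_categories + 1


# Four of the category names mentioned in the description above.
sample = bioes_labels(["api_key", "ipv4", "ipv6", "mac_address"])
print(len(sample))        # 17
print(num_classes(55))    # 221
```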
In LocalAI this model serves as a PII detector for the NER redactor tier.
Set known_usecases to [token_classify] (as below); any other model can then
opt into redaction by listing this one under pii.detectors. The detection policy
(which categories to mask vs block, and the score threshold) lives on this
model's own pii_detection block (see the overrides below). It runs locally
with no Python: the standalone privacy-filter backend serves it through its
TokenClassify RPC, which applies a constrained BIOES Viterbi decode and
reports entity spans as UTF-8 byte offsets.
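
To make the decode step concrete, here is a minimal sketch of a constrained
BIOES Viterbi pass followed by conversion to UTF-8 byte-offset spans. It is
an illustration of the general technique, not the backend's implementation:
the single "name" category, the label spelling, the hand-written scores and
the whitespace-preserving tokens are all assumptions. The constraint is that
B and I may only be followed by I or E of the same category, and a sequence
may not end inside an open span.

```python
NEG_INF = float("-inf")

def split(label):
    """'B-name' -> ('B', 'name'); 'O' -> ('O', None)."""
    if label == "O":
        return "O", None
    prefix, cat = label.split("-", 1)
    return prefix, cat

def allowed(prev, cur):
    """BIOES transition rule: an open span must continue or close."""
    pp, pc = split(prev)
    cp, cc = split(cur)
    if pp in ("B", "I"):
        return cp in ("I", "E") and cc == pc
    return cp in ("O", "B", "S")

def viterbi(scores, labels):
    """scores[t][k] is the log-score of labels[k] at token t."""
    k = len(labels)
    best = [scores[0][j] if split(labels[j])[0] in ("O", "B", "S") else NEG_INF
            for j in range(k)]
    back = []
    for t in range(1, len(scores)):
        new, ptr = [], []
        for j in range(k):
            score, i = max((best[i], i) for i in range(k)
                           if allowed(labels[i], labels[j]))
            new.append(score + scores[t][j])
            ptr.append(i)
        best = new
        back.append(ptr)
    # The path must end outside a span or on a closing tag.
    last = max((best[j], j) for j in range(k)
               if split(labels[j])[0] in ("O", "E", "S"))[1]
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return [labels[j] for j in path]

def byte_spans(tokens, tags):
    """Merge tags into (category, start_byte, end_byte) spans."""
    out, pos, start = [], 0, None
    for tok, tag in zip(tokens, tags):
        end = pos + len(tok.encode("utf-8"))
        prefix, cat = split(tag)
        if prefix in ("B", "S"):
            start = pos
        if prefix in ("E", "S"):
            out.append((cat, start, end))
        pos = end
    return out

labels = ["O", "B-name", "I-name", "E-name", "S-name"]
tokens = ["Señor", " ", "José", " ", "Díaz", "!"]
scores = [                       # columns follow `labels`
    [0.0, -5.0, -5.0, -5.0, -5.0],
    [0.0, -5.0, -5.0, -5.0, -5.0],
    [-3.0, -0.2, -5.0, -5.0, -2.0],
    [-0.5, -5.0, -1.0, -5.0, -5.0],   # per-token argmax is O: invalid after B
    [-3.0, -5.0, -5.0, -0.2, -4.0],
    [0.0, -5.0, -5.0, -5.0, -5.0],
]

greedy = [labels[max(range(5), key=row.__getitem__)] for row in scores]
tags = viterbi(scores, labels)
spans = byte_spans(tokens, tags)
print(greedy)  # ['O', 'O', 'B-name', 'O', 'E-name', 'O']
print(tags)    # ['O', 'O', 'B-name', 'I-name', 'E-name', 'O']
print(spans)   # [('name', 7, 18)]
```

The per-token argmax produces a B that is never closed, while the
constrained decode recovers one well-formed span. The offsets are bytes, not
characters: "José Díaz" starts at character 6 but byte 7, so slice the
encoded text, for example "".join(tokens).encode("utf-8")[7:18].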
Architecture: gpt-oss-style sparse MoE (8 layers, d_model 640, 128 experts
with top-4 routing, ~1.5B total / ~50M active parameters per token),
bidirectional banded attention, the o200k tokenizer and a 221-way
token-classification head; served via the openai-privacy-filter
architecture. The F16 weights are ~2.8 GB. (A smaller Q8_0 quant exists on
the GGUF repo for RAM-constrained use; validate it on your own data, since
for PII a single dropped span is a leak.)
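
As a rough sanity check on those sizes, the sketch below estimates weight
memory from the rounded parameter count. It assumes every parameter is
stored at the stated precision; real GGUF files usually keep some tensors at
higher precision and add metadata, so treat the Q8_0 figure as a ballpark
and check the actual file size on the repo. The 8.5 bits per weight for
Q8_0 comes from its block layout of 32 int8 values plus one 16-bit scale.

```python
GIB = 2**30

def weight_bytes(n_params, bits_per_weight):
    """Bytes needed to store n_params weights at a given average precision."""
    return n_params * bits_per_weight / 8

total_params = 1.5e9    # "~1.5B total" from the description (rounded)
active_params = 50e6    # "~50M active per token" (rounded)

f16_gib = weight_bytes(total_params, 16) / GIB
q8_0_gib = weight_bytes(total_params, 8.5) / GIB   # 34 bytes per 32 weights

print(f"F16  estimate: {f16_gib:.2f} GiB")    # F16  estimate: 2.79 GiB
print(f"Q8_0 estimate: {q8_0_gib:.2f} GiB")   # Q8_0 estimate: 1.48 GiB
print(f"active share:  {active_params / total_params:.1%}")  # active share:  3.3%
```

The F16 estimate lines up with the ~2.8 GB figure when that figure is read
in binary units. Sparse routing lowers compute per token, not resident
memory: all 128 experts still have to be loaded.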