Pseudonymize sensitive data before any LLM gets to see it.
noirdoc is an open-source Python library and CLI, plus a plugin for Claude Code, for local, reversible pseudonymization of personal data in documents. The engine runs on your machine, is MIT-licensed, and is maintained by Nextaim.
Here's what it looks like before anything leaves your machine.
Anna Müller, born March 12, 1985, granted her tax advisor Markus Schmidt in Munich a comprehensive power of attorney on April 3, 2024.
<<PERSON_1>>, born <<DATE_1>>, granted her tax advisor <<PERSON_2>> in <<CITY_1>> a comprehensive power of attorney on <<DATE_2>>.
What the CLI does for you.
Four building blocks that together give you what honest local pseudonymization actually needs.
Reliably detects German PII.
The detectors are trained on German contracts, letters, and HR documents and reliably find names, addresses, IBANs, tax IDs, and phone numbers — even when the formatting isn't quite clean.
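As a rough illustration of what tolerating messy formatting can mean for a single PII type, the sketch below finds German IBANs written with spaces, hyphens, or lowercase letters and keeps only candidates that pass the standard mod-97 checksum. It is a rule-based stand-in, not noirdoc's trained detectors, and the names `IBAN_CANDIDATE`, `iban_checksum_ok`, and `find_german_ibans` are hypothetical.

```python
import re

# "DE" followed by 20 digits, each optionally preceded by a space or hyphen.
IBAN_CANDIDATE = re.compile(r"\bDE(?:[ \-]?\d){20}\b", re.IGNORECASE)


def iban_checksum_ok(iban):
    """Check the mod-97 checksum that every valid IBAN satisfies."""
    compact = re.sub(r"[^0-9A-Za-z]", "", iban).upper()
    # Move the country code and check digits to the end, then map A..Z to 10..35.
    rearranged = compact[4:] + compact[:4]
    digits = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(digits) % 97 == 1


def find_german_ibans(text):
    """Return normalized German IBANs found in text, skipping checksum failures."""
    hits = []
    for match in IBAN_CANDIDATE.finditer(text):
        candidate = match.group(0)
        if iban_checksum_ok(candidate):
            hits.append(re.sub(r"[ \-]", "", candidate).upper())
    return hits
```

The checksum step is what keeps a loose pattern from flagging arbitrary 22-character strings that merely look like an IBAN.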
Pseudonymization with a mapping.
Every token points back to the original value. You use the redacted document with your LLM workflow and restore the real data in the response afterwards — all locally.
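To make the idea concrete, here is a minimal sketch of a reversible mapping in plain Python. It is not noirdoc's implementation: the `PlaceholderMap` class and its methods are hypothetical, the entity list is supplied by hand rather than detected, and only the `<<TYPE_N>>` token shape is taken from the example above.

```python
import re


class PlaceholderMap:
    """Toy reversible mapping between original values and <<TYPE_N>> tokens."""

    def __init__(self):
        self._by_value = {}  # (entity_type, original value) -> token
        self._by_token = {}  # token -> original value
        self._counters = {}  # entity_type -> highest index handed out so far

    def token_for(self, entity_type, value):
        """Return the token for a value, creating one on first sight."""
        key = (entity_type, value)
        if key not in self._by_value:
            index = self._counters.get(entity_type, 0) + 1
            self._counters[entity_type] = index
            token = f"<<{entity_type}_{index}>>"
            self._by_value[key] = token
            self._by_token[token] = value
        return self._by_value[key]

    def redact(self, text, entities):
        """Replace each (entity_type, value) pair in text with its token."""
        # Assign tokens in the order the entities are listed ...
        pairs = [(value, self.token_for(etype, value)) for etype, value in entities]
        # ... but replace longer values first so a short value cannot split a longer one.
        for value, token in sorted(pairs, key=lambda p: len(p[0]), reverse=True):
            text = text.replace(value, token)
        return text

    def reveal(self, text):
        """Swap every known token back to its original value; leave unknown tokens alone."""
        return re.sub(
            r"<<[A-Z_]+_\d+>>",
            lambda m: self._by_token.get(m.group(0), m.group(0)),
            text,
        )


mapping = PlaceholderMap()
redacted = mapping.redact(
    "Anna Müller lives in Munich.",
    [("PERSON", "Anna Müller"), ("CITY", "Munich")],
)
print(redacted)                  # <<PERSON_1>> lives in <<CITY_1>>.
print(mapping.reveal(redacted))  # Anna Müller lives in Munich.
```

Because the same value always maps to the same token, an LLM response that mentions a token can be translated back even when the surrounding text is entirely new.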
PDF, DOCX, TXT, and Markdown.
Pseudonymized documents keep their original format. The LLM sees a cleaned copy while you keep working with the original file.
Runs entirely on your machine.
The models run locally and the mapping stays on your disk. No API call ever leaves your machine — not even to us.
Here's how you get noirdoc onto your machine.
Install via pip, then call it from Python or straight from the shell.
$ pip install noirdoc
# with all optional detectors
$ pip install "noirdoc[full]"
$ noirdoc models pull

from noirdoc import Redactor

r = Redactor(namespace="mandant-mueller")

r.redact_file("vertrag.pdf", output="vertrag-clean.pdf")
r.redact_file("brief.docx", output="brief-clean.docx")

# translate responses back (llm_response holds the LLM's reply to the redacted input)
original = r.reveal_text(llm_response)

# one-shot: mapping is discarded
$ noirdoc redact vertrag.pdf -o vertrag-clean.pdf

# persistent: mapping is preserved
$ noirdoc redact --namespace mandant-mueller brief.docx -o brief-clean.docx
$ noirdoc reveal --namespace mandant-mueller brief-clean.docx -o brief-revealed.docx
$ noirdoc lookup --namespace mandant-mueller "<>"

MIT License · github.com/nextaim-de/noirdoc
The same engine powers our chat for sensitive data.
If you'd rather not deal with pip, Python, and models yourself — Noirdoc Chat is the managed version: same pseudonymization code, multiple models, and a GDPR-grade DPA, with no setup on your end.
Claude Code, without your data ever reaching Claude.
The plugin pseudonymizes your inputs locally before they reach Claude — and restores the responses automatically afterwards.
# add the marketplace once
$ /plugin marketplace add nextaim-de/noirdoc-claude-plugin
# install the plugin inside Claude Code
$ /plugin install noirdoc@nextaim
Redacts without lifting a finger.
As soon as you open or read a protected file in Claude Code, the plugin replaces names, IBANs, and IDs with placeholders locally — before Claude gets to see anything.
Real values stay in your own terminal.
Run `noirdoc reveal` to see the original — but only in your own shell, never inside the Claude Code transcript. The conversation stays clean.
You decide what's protected.
Glob rules like `./incoming/**` or `*.contract.*` decide which files are pseudonymized automatically. Everything else stays untouched.
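How the plugin evaluates these rules is not spelled out here, so the following is only a sketch of one plausible reading, using Python's standard `fnmatch` module: rules containing a slash are matched against the project-relative path, and rules without one are matched against the file name alone. The function name `is_protected` is hypothetical.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def is_protected(path, rules):
    """Return True if a project-relative path matches any glob rule."""
    p = PurePosixPath(path)  # normalizes away a leading "./"
    relative = p.as_posix()
    for rule in rules:
        if rule.startswith("./"):
            rule = rule[2:]
        if "/" in rule:
            # Path-style rule: compare against the whole relative path.
            if fnmatchcase(relative, rule):
                return True
        elif fnmatchcase(p.name, rule):
            # Name-style rule: compare against the file name only.
            return True
    return False


rules = ["./incoming/**", "*.contract.*"]
print(is_protected("incoming/2024/brief.docx", rules))  # True
print(is_protected("README.md", rules))                 # False
```

Note that `fnmatch` lets `*` cross directory separators, so in this sketch `./incoming/**` covers everything beneath `incoming/` at any depth.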
Mapping stays on your machine.
Pseudonymized copies live in `.noirdoc/cache/` and the reversible mapping stays local. No API call ever leaves your machine — not even to us.
MIT License · github.com/nextaim-de/noirdoc-claude-plugin
Pick the path that fits you.
Locally with the OSS CLI, inside your editor with the plugin, or managed through Noirdoc Chat — the pseudonymization underneath is always the same.