# RAG Pipeline Protection

> Protect PII in Retrieval-Augmented Generation pipelines with selective ingestion redaction and query-time tokenization

Learn how to build RAG pipelines where personal data never reaches your LLM provider. Blindfold provides two protection layers: **selective ingestion redaction** (strip contact info before indexing, keep names for searchability) and **query-time tokenization** (protect context and questions before the LLM, restore real data in responses).

## Why RAG Needs PII Protection

RAG pipelines are one of the most common places where PII leaks into LLMs. Documents retrieved from your knowledge base (support tickets, customer records, internal reports) often contain personal data. When those documents are embedded, stored, and retrieved, the PII flows through multiple systems:

1. **Retrieval results**: documents with PII are injected into LLM prompts
2. **LLM provider logs**: your provider sees the full prompt, including retrieved PII

The privacy boundary is at the **LLM API call**, not the vector store. Your vector store is internal infrastructure; the LLM provider is an external third party.

Blindfold protects data at both layers: selectively strip contact info from documents before they enter the vector store, and tokenize everything before it reaches the LLM.

## Security Trade-offs

There is no one-size-fits-all approach to PII in RAG pipelines.
The right choice depends on your threat model:

| Approach                              | Names in vector store | Name-based search        | PII at LLM boundary | Complexity |
| ------------------------------------- | --------------------- | ------------------------ | ------------------- | ---------- |
| **Selective redaction** (recommended) | Yes                   | Yes                      | No (tokenized)      | Low        |
| **Full redaction**                    | No                    | No (content-based only)  | No                  | Low        |
| **Tokenize with stored mapping**      | No (tokens only)      | Yes (via reverse lookup) | No                  | High       |

### Selective Redaction (Recommended)

Redact **contact info** (emails, phones, IBANs) at ingestion and **keep person names** for searchability. At query time, search with the original question (names match), then tokenize context + question in a single call before the LLM.

This is the approach used in all cookbook examples and described below.

### Full Redaction

Redact **all PII** at ingestion. This gives the strongest privacy (no personal data anywhere), but you lose the ability to search by name. The vector store can only match based on surrounding content.

### Tokenize with Stored Mapping (Advanced)

Tokenize at ingestion and store the mapping. Build a reverse lookup to translate real names in queries to tokens. No PII in the vector store **and** name-based search works. See the [advanced section below](#advanced-tokenize-with-stored-mapping) for details.

## Two Protection Layers

### Layer 1: Selective Ingestion Redaction

Redact contact info from documents before embedding and indexing. Names are kept so the vector store can match name-based queries.
```python
from blindfold import Blindfold

blindfold = Blindfold(api_key="your-api-key")

documents = [
    "Customer John Smith (john@example.com) reported a billing error.",
    "Maria Garcia (+34 612 345 678) requested a data export.",
]

safe_documents = []
for doc in documents:
    # Redact contact info only, keep names searchable
    result = blindfold.redact(doc, entities=["email address", "phone number"])
    safe_documents.append(result.text)
    # "Customer John Smith ([EMAIL_ADDRESS]) reported a billing error."

# Index safe_documents into your vector store
```

```javascript
import { Blindfold } from '@blindfold/sdk';

const blindfold = new Blindfold({ apiKey: 'your-api-key' });

const documents = [
  'Customer John Smith (john@example.com) reported a billing error.',
  'Maria Garcia (+34 612 345 678) requested a data export.',
];

const safeDocuments = [];
for (const doc of documents) {
  // Redact contact info only, keep names searchable
  const result = await blindfold.redact(doc, {
    entities: ['email address', 'phone number'],
  });
  safeDocuments.push(result.text);
}

// Index safeDocuments into your vector store
```

```python
from langchain_blindfold import BlindfoldPIITransformer
from langchain_core.documents import Document

# Redact contact info only, keep names searchable
transformer = BlindfoldPIITransformer(
    pii_method="redact",
    entities=["email address", "phone number"],
)

docs = [
    Document(page_content="Customer John Smith (john@example.com) reported a billing error."),
    Document(page_content="Maria Garcia (+34 612 345 678) requested a data export."),
]

safe_docs = transformer.transform_documents(docs)

# Index safe_docs into your vector store
```

**Why keep names?** If person names were redacted at ingestion, they would be stored as `[PERSON]`. At query time, names are tokenized to numbered person tokens instead. Neither placeholder matches the other, so searching for "Hans Mueller" cannot find `[PERSON]` in the vector store.
Keeping names at ingestion solves this and lets users search by name. Contact info (emails, phones) is rarely searched for and should always be redacted.

### Layer 2: Query-Time Tokenization

After retrieval, tokenize the context and question **in a single call** before they reach the LLM. Then detokenize the response to restore real data.

```python
from blindfold import Blindfold
from openai import OpenAI

blindfold = Blindfold(api_key="your-api-key")
openai_client = OpenAI()

question = "What happened with John Smith's billing issue?"

# Step 1: Search with original question, names match in vector store
# (assumes `collection` is an existing ChromaDB-style collection from Layer 1)
results = collection.query(query_texts=[question], n_results=3)
context = "\n\n".join(results["documents"][0])

# Step 2: Single tokenize call for consistent token numbering
prompt_text = f"Context:\n{context}\n\nQuestion: {question}"
tokenized = blindfold.tokenize(prompt_text)

# Step 3: Send to LLM, no PII in the prompt
response = openai_client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "Answer using the provided context."},
        {"role": "user", "content": tokenized.text},
    ],
)
ai_response = response.choices[0].message.content

# Step 4: Detokenize to restore real names in the response
final = blindfold.detokenize(ai_response, tokenized.mapping)
print(final.text)
```

```javascript
import { Blindfold } from '@blindfold/sdk';
import OpenAI from 'openai';

const blindfold = new Blindfold({ apiKey: 'your-api-key' });
const openai = new OpenAI();

const question = "What happened with John Smith's billing issue?";

// Step 1: Search with original question, names match
// (assumes `collection` is an existing ChromaDB-style collection from Layer 1)
const results = await collection.query({
  queryTexts: [question],
  nResults: 3,
});
const context = results.documents[0].join('\n\n');

// Step 2: Single tokenize call for consistent token numbering
const promptText = `Context:\n${context}\n\nQuestion: ${question}`;
const tokenized = await blindfold.tokenize(promptText);

// Step 3: Send to LLM, no PII
const response = await openai.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [
    { role: 'system', content: 'Answer using the provided context.' },
    { role: 'user', content: tokenized.text },
  ],
});
const aiResponse = response.choices[0].message.content;

// Step 4: Detokenize
const final = blindfold.detokenize(aiResponse, tokenized.mapping);
console.log(final.text);
```

```python
from blindfold import Blindfold
from langchain_openai import ChatOpenAI
from langchain_core.runnables import RunnableLambda

blindfold_client = Blindfold(api_key="your-api-key")

def retrieve_and_tokenize(question: str) -> dict:
    # Retrieve with original question, names match
    # (assumes `retriever` is an existing LangChain retriever over the Layer 1 index)
    docs = retriever.invoke(question)
    context = "\n\n".join(doc.page_content for doc in docs)

    # Single tokenize call for consistent token numbering
    prompt_text = f"Context:\n{context}\n\nQuestion: {question}"
    tokenized = blindfold_client.tokenize(prompt_text)
    return {"tokenized_text": tokenized.text, "mapping": tokenized.mapping}

chain = RunnableLambda(retrieve_and_tokenize) | ...  # LLM + detokenize
```

**Why a single tokenize call?** If you tokenize the context and question separately, each call produces independent token numbering. The context call might map its first person token to "Hans Mueller" while the question call maps the same token to "Marie Dupont", creating mapping conflicts when the two mappings are combined for detokenization. A single call on the combined text ensures consistent numbering.
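To make the conflict concrete, here is a minimal sketch that does not call Blindfold at all. `toy_tokenize` is a hypothetical stand-in that restarts its numbering on every call, and the `<Person_N>` token format is an illustrative assumption, not necessarily what the SDK emits.

```python
def toy_tokenize(text, known_names):
    """Hypothetical stand-in for a tokenizer: numbering restarts at 1 per call."""
    mapping = {}
    for name in known_names:
        if name in text:
            token = f"<Person_{len(mapping) + 1}>"  # illustrative token format
            mapping[token] = name
            text = text.replace(name, token)
    return text, mapping


NAMES = ["Hans Mueller", "Marie Dupont"]
context = "Hans Mueller reported a billing error."
question = "Did Marie Dupont open a ticket too?"

# Two separate calls: each mapping starts numbering from 1
ctx_text, ctx_map = toy_tokenize(context, NAMES)
q_text, q_map = toy_tokenize(question, NAMES)

# Tokens that appear in both mappings but point at different people
conflicts = {t for t in ctx_map.keys() & q_map.keys() if ctx_map[t] != q_map[t]}

# One call on the combined prompt: every person gets a distinct token
combined_text, combined_map = toy_tokenize(
    f"Context:\n{context}\n\nQuestion: {question}", NAMES
)

print("conflicting tokens:", conflicts)
print("combined mapping:", combined_map)
```

In the two-call case the same token refers to two different people, so no single merged mapping can detokenize the LLM response correctly. The combined call avoids the problem because numbering is assigned once across the whole prompt.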
## Protection Method Comparison

Choose the right protection method for your RAG use case:

| Method       | Reversible     | Best for                                | Example output                |
| ------------ | -------------- | --------------------------------------- | ----------------------------- |
| **Redact**   | No             | Ingestion: permanent PII removal        | `[PERSON]`, `[EMAIL_ADDRESS]` |
| **Tokenize** | Yes            | Queries: protect input, restore output  | Numbered per-entity tokens    |
| **Encrypt**  | Yes (with key) | Regulated data requiring audit trail    | `ENC_a8f3b2...`               |
| **Hash**     | No             | Analytics: consistent pseudonymous IDs  | `HASH_a3f8b9c2d4e5`           |

**Recommended pattern:** Use `redact` with `entities` at ingestion time (Layer 1) to strip contact info while keeping names. At query time (Layer 2), search with the original question and `tokenize` the combined context + question before the LLM call. This gives you searchability by name and full PII protection at the LLM boundary.

## Advanced: Tokenize with Stored Mapping

For the strongest privacy with full searchability (no PII in the vector store **and** name-based search), tokenize at ingestion and store the mapping. This is the most complete architecture but requires managing a mapping store.

**How it works:**

1. **Ingestion**: `tokenize()` each document → store tokenized text in vector store + store mapping securely
2. **Query**: Build a reverse lookup from stored mappings. Replace real names in the query with their tokens before searching
3. **LLM**: Tokenized context + tokenized query → LLM sees only tokens
4. **Response**: Detokenize using stored mappings

```python
from blindfold import Blindfold

blindfold = Blindfold(api_key="your-api-key")

# === Ingestion ===
documents = [...]
mapping_store = {}  # In production: encrypted DB or secrets manager

# `vectorstore` and `llm` are placeholders for your own vector store and LLM client
for doc in documents:
    result = blindfold.tokenize(doc)
    # Store tokenized text in vector store
    vectorstore.add(result.text)
    # Store mapping securely (keyed by doc ID or merged globally).
    # Caveat: a global merge is only safe if tokens are unique across documents.
    # Each tokenize() call numbers tokens independently, so key by doc ID otherwise.
    mapping_store.update(result.mapping)

# Build reverse lookup: real value → token
reverse_lookup = {v: k for k, v in mapping_store.items()}

# === Query ===
question = "What happened with Hans Mueller?"

# Replace known real values with their tokens, longest first so a short
# value never clobbers part of a longer one
for real_value, token in sorted(
    reverse_lookup.items(), key=lambda kv: len(kv[0]), reverse=True
):
    question = question.replace(real_value, token)
# question now contains the person token in place of "Hans Mueller"

# Search with tokenized query: tokens match tokens in vector store
results = vectorstore.query(question, n_results=3)

# Context is already tokenized, question is already tokenized
# Send directly to LLM, no PII
response = llm.generate(context=results, question=question)

# Detokenize for the user
final = blindfold.detokenize(response, mapping_store)
```

**Trade-offs:**

* Requires managing a mapping store (encrypted DB, secrets manager)
* Reverse lookup needs exact string matching (partial names may not match)
* More complex than the selective-redaction approach
* But: **strongest privacy with full searchability**, with no PII in the vector store at all

`detokenize()` is a free local operation with no API call. This means the mapping store is the only infrastructure you need to manage.

## Policy Recommendations

Match your compliance policy to your use case:

| Use case         | Policy     | Region | Key entities detected                              |
| ---------------- | ---------- | ------ | -------------------------------------------------- |
| General RAG      | `basic`    | none   | Names, emails, phones, addresses, credit cards     |
| EU customer data | `gdpr_eu`  | `eu`   | Names, emails, IBANs, national IDs, DOB, addresses |
| US healthcare    | `hipaa_us` | `us`   | All 18 HIPAA identifiers (SSN, MRN, DOB, etc.)
| Payment data     | `pci_dss`  | none   | Credit cards, CVVs, expiration dates               |
| Maximum coverage | `strict`   | none   | All supported entity types, lowest threshold       |

```python
# GDPR-compliant RAG: redact contact info, keep names
blindfold = Blindfold(api_key="your-key", region="eu")
result = blindfold.redact(document, policy="gdpr_eu", entities=[
    "email address", "phone number", "iban",
    "credit card number", "address", "date of birth",
    "national id number",
])

# HIPAA-compliant RAG
blindfold = Blindfold(api_key="your-key", region="us")
result = blindfold.redact(document, policy="hipaa_us")
```

## Performance Tips

* **Batch redaction at ingestion**: use `blindfold.redact_batch()` for processing multiple documents in one API call
* **Async processing**: use `AsyncBlindfold` for concurrent document processing during ingestion
* **Detokenization is free**: `detokenize()` is a local string replacement, no API call required
* **Cache redacted documents**: once documents are redacted and indexed, no further Blindfold calls are needed for retrieval

## Cookbook Examples

Complete, runnable examples for every RAG framework:

* Selective redaction + search-first tokenization
* TypeScript OpenAI + ChromaDB RAG pipeline
* BlindfoldPIITransformer + retrieve-then-tokenize
* LangChain.js RAG with inline PII protection
* Retrieve-then-tokenize with LlamaIndex
* LlamaIndex.TS with single tokenize call
* Multi-turn EU support chatbot with `gdpr_eu` policy
* TypeScript multi-turn EU support chatbot

### Strategy Deep-Dives

Standalone examples for each ingestion strategy, to compare trade-offs side by side:

* Keep names, redact contact info: simplest approach
* TypeScript version of the selective redact strategy
* Tokenize everything, store per-document mappings
* TypeScript version of the stored mapping strategy
* Same person = same token everywhere: best search quality
* TypeScript version of the consistent registry strategy
* All 3 strategies side by side with CLI selection
* TypeScript version: all 3
strategies with CLI selection

### Role-Based Access Control (RBAC)

Use Blindfold policies to implement role-based PII control with the same vector store and different privacy levels per user role:

* Doctor, nurse, billing, researcher: each role sees different PII levels
* TypeScript version of the role-based PII control example
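As a rough illustration of the role-based idea, the sketch below picks which entity types to hide per role. It is a simplification, not the cookbook's implementation: the role assignments are hypothetical, the entity names are reused from the examples earlier in this guide, and the helper `entities_hidden_from` is invented for this sketch.

```python
# Entity names reused from the redaction examples in this guide.
ALL_ENTITIES = [
    "email address",
    "phone number",
    "address",
    "date of birth",
    "national id number",
]

# Hypothetical assignments: which entity types each role must NOT see.
HIDDEN_BY_ROLE = {
    "doctor": [],
    "nurse": ["national id number"],
    "billing": ["date of birth", "national id number"],
    "researcher": ALL_ENTITIES,
}


def entities_hidden_from(role: str) -> list[str]:
    """Return the entity types to hide for a role.

    Unknown roles fail closed: they get the most restrictive list.
    """
    return list(HIDDEN_BY_ROLE.get(role.lower(), ALL_ENTITIES))


for role in ("doctor", "billing", "intern"):
    print(role, "->", entities_hidden_from(role))
```

The returned list could be passed as the `entities` argument shown in Layer 1 before the retrieved context is displayed or sent onward. How the SDK treats an empty list is not covered here, so a caller would probably skip the call entirely when nothing needs hiding.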