# RAG Pipeline Protection

> Protect PII in Retrieval-Augmented Generation pipelines with selective ingestion redaction and query-time tokenization

Learn how to build RAG pipelines where personal data never reaches your LLM provider. Blindfold provides two protection layers: **selective ingestion redaction** (strip contact info before indexing, keep names for searchability) and **query-time tokenization** (protect context and questions before the LLM, restore real data in responses).

## Why RAG Needs PII Protection

RAG pipelines are one of the most common ways PII leaks into LLMs. Documents retrieved from your knowledge base (support tickets, customer records, internal reports) often contain personal data. Once those documents are embedded, stored, and retrieved, the PII is exposed at two points:

1. **Retrieval results** — documents with PII are injected into LLM prompts
2. **LLM provider logs** — your provider sees the full prompt, including retrieved PII

The privacy boundary is at the **LLM API call**, not the vector store. Your vector store is internal infrastructure; the LLM provider is an external third party. Blindfold protects data at both layers: selectively strip contact info from documents before they enter the vector store, and tokenize everything before it reaches the LLM.

## Security Trade-offs

There is no one-size-fits-all approach to PII in RAG pipelines. The right choice depends on your threat model:

| Approach                              | Names in vector store | Name-based search        | PII at LLM boundary | Complexity |
| ------------------------------------- | --------------------- | ------------------------ | ------------------- | ---------- |
| **Selective redaction** (recommended) | Yes                   | Yes                      | No (tokenized)      | Low        |
| **Full redaction**                    | No                    | No — content-based only  | No                  | Low        |
| **Tokenize with stored mapping**      | No (tokens only)      | Yes (via reverse lookup) | No                  | High       |

### Selective Redaction (Recommended)

Redact **contact info** (emails, phones, IBANs) at ingestion — **keep person names** for searchability. At query time, search with the original question (names match), then tokenize context + question in a single call before the LLM.

This is the approach used in the main cookbook examples and described below. The other two strategies have their own standalone examples under Strategy Deep-Dives.

### Full Redaction

Redact **all PII** at ingestion. Strongest privacy — no personal data anywhere — but you lose the ability to search by name. The vector store can only match based on surrounding content.

### Tokenize with Stored Mapping (Advanced)

Tokenize at ingestion and store the mapping. Build a reverse lookup to translate real names in queries to tokens. No PII in the vector store **and** name-based search works. See the [advanced section below](#advanced-tokenize-with-stored-mapping) for details.

## Two Protection Layers

### Layer 1: Selective Ingestion Redaction

Redact contact info from documents before embedding and indexing. Names are kept so the vector store can match name-based queries.

<Tabs>
  <Tab title="Python">
    ```python theme={null}
    from blindfold import Blindfold

    blindfold = Blindfold(api_key="your-api-key")

    documents = [
        "Customer John Smith (john@example.com) reported a billing error.",
        "Maria Garcia (+34 612 345 678) requested a data export.",
    ]

    safe_documents = []
    for doc in documents:
        # Redact contact info only — keep names searchable
        result = blindfold.redact(doc, entities=["email address", "phone number"])
        safe_documents.append(result.text)
        # "Customer John Smith ([EMAIL_ADDRESS]) reported a billing error."

    # Index safe_documents into your vector store
    ```
  </Tab>

  <Tab title="JavaScript">
    ```javascript theme={null}
    import { Blindfold } from '@blindfold/sdk';

    const blindfold = new Blindfold({ apiKey: 'your-api-key' });

    const documents = [
      'Customer John Smith (john@example.com) reported a billing error.',
      'Maria Garcia (+34 612 345 678) requested a data export.',
    ];

    const safeDocuments = [];
    for (const doc of documents) {
      // Redact contact info only — keep names searchable
      const result = await blindfold.redact(doc, {
        entities: ['email address', 'phone number'],
      });
      safeDocuments.push(result.text);
    }

    // Index safeDocuments into your vector store
    ```
  </Tab>

  <Tab title="LangChain">
    ```python theme={null}
    from langchain_blindfold import BlindfoldPIITransformer
    from langchain_core.documents import Document

    # Redact contact info only — keep names searchable
    transformer = BlindfoldPIITransformer(
        pii_method="redact",
        entities=["email address", "phone number"],
    )

    docs = [
        Document(page_content="Customer John Smith (john@example.com) reported a billing error."),
        Document(page_content="Maria Garcia (+34 612 345 678) requested a data export."),
    ]

    safe_docs = transformer.transform_documents(docs)
    # Index safe_docs into your vector store
    ```
  </Tab>
</Tabs>

<Info>
  **Why keep names?** If person names were redacted at ingestion, they would be stored as `[PERSON]`, while names in a tokenized query become `<Person_1>`. Neither placeholder matches the other or the real name, so a search for "Hans Mueller" could not find `[PERSON]` in the vector store. Keeping names at ingestion avoids this and lets users search by name. Contact info (emails, phones) is rarely searched for and should always be redacted.
</Info>
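
The mismatch is easy to reproduce locally. The sketch below is a toy stand-in, not the Blindfold SDK: `redact_contact_info` and `redact_everything` are hypothetical helpers built on a simple regex and string replacement, and a plain substring check stands in for vector search.

```python
import re

# Simplified email pattern, good enough for this illustration only
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def redact_contact_info(text: str) -> str:
    """Toy selective redaction: emails are replaced, names are untouched."""
    return EMAIL_RE.sub("[EMAIL_ADDRESS]", text)


def redact_everything(text: str, names: list[str]) -> str:
    """Toy full redaction: emails and the given names are replaced."""
    text = redact_contact_info(text)
    for name in names:
        text = text.replace(name, "[PERSON]")
    return text


doc = "Customer John Smith (john@example.com) reported a billing error."
selective = redact_contact_info(doc)
full = redact_everything(doc, ["John Smith"])

# Substring matching stands in for a name-based search
print("John Smith" in selective)  # True
print("John Smith" in full)       # False
```

With selective redaction the name survives in the indexed text, so a query that mentions it still has something to match.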

### Layer 2: Query-Time Tokenization

After retrieval, tokenize the context and question **in a single call** before they reach the LLM. Then detokenize the response to restore real data.

<Tabs>
  <Tab title="Python">
    ```python theme={null}
    from blindfold import Blindfold
    from openai import OpenAI

    blindfold = Blindfold(api_key="your-api-key")
    openai_client = OpenAI()

    question = "What happened with John Smith's billing issue?"

    # Step 1: Search with original question — names match in vector store
    results = collection.query(query_texts=[question], n_results=3)
    context = "\n\n".join(results["documents"][0])

    # Step 2: Single tokenize call — consistent token numbering
    prompt_text = f"Context:\n{context}\n\nQuestion: {question}"
    tokenized = blindfold.tokenize(prompt_text)

    # Step 3: Send to LLM — no PII in the prompt
    response = openai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Answer using the provided context."},
            {"role": "user", "content": tokenized.text},
        ],
    )
    ai_response = response.choices[0].message.content

    # Step 4: Detokenize — restore real names in the response
    final = blindfold.detokenize(ai_response, tokenized.mapping)
    print(final.text)
    ```
  </Tab>

  <Tab title="JavaScript">
    ```javascript theme={null}
    import { Blindfold } from '@blindfold/sdk';
    import OpenAI from 'openai';

    const blindfold = new Blindfold({ apiKey: 'your-api-key' });
    const openai = new OpenAI();

    const question = "What happened with John Smith's billing issue?";

    // Step 1: Search with original question — names match
    const results = await collection.query({
      queryTexts: [question], nResults: 3,
    });
    const context = results.documents[0].join('\n\n');

    // Step 2: Single tokenize call — consistent token numbering
    const promptText = `Context:\n${context}\n\nQuestion: ${question}`;
    const tokenized = await blindfold.tokenize(promptText);

    // Step 3: Send to LLM — no PII
    const response = await openai.chat.completions.create({
      model: 'gpt-4o-mini',
      messages: [
        { role: 'system', content: 'Answer using the provided context.' },
        { role: 'user', content: tokenized.text },
      ],
    });
    const aiResponse = response.choices[0].message.content;

    // Step 4: Detokenize
    const final = blindfold.detokenize(aiResponse, tokenized.mapping);
    console.log(final.text);
    ```
  </Tab>

  <Tab title="LangChain">
    ```python theme={null}
    from blindfold import Blindfold
    from langchain_openai import ChatOpenAI
    from langchain_core.runnables import RunnableLambda

    blindfold_client = Blindfold(api_key="your-api-key")

    def retrieve_and_tokenize(question: str) -> dict:
        # Retrieve with original question — names match
        docs = retriever.invoke(question)
        context = "\n\n".join(doc.page_content for doc in docs)

        # Single tokenize call — consistent token numbering
        prompt_text = f"Context:\n{context}\n\nQuestion: {question}"
        tokenized = blindfold_client.tokenize(prompt_text)
        return {"tokenized_text": tokenized.text, "mapping": tokenized.mapping}

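    llm = ChatOpenAI(model="gpt-4o-mini")

    def ask_and_detokenize(state: dict) -> str:
        # The LLM sees only the tokenized prompt
        reply = llm.invoke([
            ("system", "Answer using the provided context."),
            ("human", state["tokenized_text"]),
        ])
        # Restore real values using the mapping from the tokenize step
        return blindfold_client.detokenize(reply.content, state["mapping"]).text

    chain = RunnableLambda(retrieve_and_tokenize) | RunnableLambda(ask_and_detokenize)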
    ```
  </Tab>
</Tabs>

<Warning>
  **Why a single tokenize call?** If you tokenize the context and question separately, each call produces independent token numbering. Context might map `<Person_1>` to "Hans Mueller" while the question maps `<Person_1>` to "Marie Dupont" — creating mapping conflicts. A single call on the combined text ensures consistent numbering.
</Warning>
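
The conflict can be shown with a toy tokenizer. This is a minimal sketch under stated assumptions: `toy_tokenize` is a hypothetical stand-in that restarts numbering on every call and returns a token-to-value mapping. It is not how the Blindfold API detects entities.

```python
def toy_tokenize(text: str, known_names: list[str]) -> tuple[str, dict[str, str]]:
    """Toy stand-in for a tokenizer: numbering restarts at 1 on every call."""
    mapping: dict[str, str] = {}
    # Number names in order of first appearance within this text
    present = sorted((n for n in known_names if n in text), key=text.index)
    for i, name in enumerate(present, start=1):
        token = f"<Person_{i}>"
        mapping[token] = name  # token -> real value
        text = text.replace(name, token)
    return text, mapping


names = ["Hans Mueller", "Marie Dupont"]
context = "Hans Mueller reported a billing error."
question = "Did Marie Dupont request a data export?"

# Separate calls: both mappings claim <Person_1> for different people
_, context_mapping = toy_tokenize(context, names)
_, question_mapping = toy_tokenize(question, names)
print(context_mapping["<Person_1>"], "vs", question_mapping["<Person_1>"])

# Single call on the combined text: each person gets a distinct token
combined_text, combined_mapping = toy_tokenize(
    f"Context:\n{context}\n\nQuestion: {question}", names
)
print(combined_mapping)
```

Merging the two separate mappings would silently drop one of the people, which is why the combined prompt is tokenized once.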

## Protection Method Comparison

Choose the right protection method for your RAG use case:

| Method       | Reversible     | Best for                                | Example output                    |
| ------------ | -------------- | --------------------------------------- | --------------------------------- |
| **Redact**   | No             | Ingestion — permanent PII removal       | `[PERSON]`, `[EMAIL_ADDRESS]`     |
| **Tokenize** | Yes            | Queries — protect input, restore output | `<Person_1>`, `<Email Address_1>` |
| **Encrypt**  | Yes (with key) | Regulated data requiring audit trail    | `ENC_a8f3b2...`                   |
| **Hash**     | No             | Analytics — consistent pseudonymous IDs | `HASH_a3f8b9c2d4e5`               |

<Tip>
  **Recommended pattern:** Use `redact` with `entities` at ingestion time (Layer 1) to strip contact info while keeping names. At query time (Layer 2), search with the original question and `tokenize` the combined context + question before the LLM call. This gives you searchability by name and full PII protection at the LLM boundary.
</Tip>
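
To make "consistent pseudonymous IDs" concrete, here is a minimal sketch of a keyed hash. It is an illustration only: `toy_hash_id`, the demo secret, and the 12-character truncation are assumptions chosen to resemble the example output in the table, not Blindfold's actual hashing scheme.

```python
import hashlib
import hmac


def toy_hash_id(value: str, secret: bytes) -> str:
    """Derive a stable, non-reversible ID from a value with a keyed hash."""
    digest = hmac.new(secret, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"HASH_{digest[:12]}"


secret = b"demo-only-secret"  # hypothetical key for this example
first = toy_hash_id("john@example.com", secret)
again = toy_hash_id("john@example.com", secret)
other = toy_hash_id("maria@example.com", secret)

print(first == again)  # True: the same input always yields the same ID
print(first == other)  # False: different inputs yield different IDs
```

Because the same input always maps to the same ID, records can be joined or counted per person without storing the underlying value.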

## Advanced: Tokenize with Stored Mapping

For the strongest privacy with full searchability — no PII in the vector store **and** name-based search — tokenize at ingestion and store the mapping. This is the most complete architecture but requires managing a mapping store.

**How it works:**

1. **Ingestion**: `tokenize()` each document → store tokenized text in vector store + store mapping securely
2. **Query**: Build a reverse lookup from stored mappings. Replace real names in the query with their tokens before searching
3. **LLM**: Tokenized context + tokenized query → LLM sees only tokens
4. **Response**: Detokenize using stored mappings

```python theme={null}
from blindfold import Blindfold

blindfold = Blindfold(api_key="your-api-key")

# === Ingestion ===
documents = [...]
mapping_store = {}  # In production: encrypted DB or secrets manager

for doc in documents:
    result = blindfold.tokenize(doc)
    # Store tokenized text in vector store
    vectorstore.add(result.text)
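    # Store mapping securely. Each tokenize() call numbers tokens independently,
    # so a global merge can overwrite an earlier <Person_1>; key mappings by
    # document ID (or use a consistent registry) if tokens collide.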
    mapping_store.update(result.mapping)

# Build reverse lookup: real value → token
reverse_lookup = {v: k for k, v in mapping_store.items()}

# === Query ===
question = "What happened with Hans Mueller?"

# Replace known real values with their tokens
for real_value, token in reverse_lookup.items():
    question = question.replace(real_value, token)
# question: "What happened with <Person_1>?"

# Search with tokenized query — tokens match tokens in vector store
results = vectorstore.query(question, n_results=3)

# Context is already tokenized, question is already tokenized
# Send directly to LLM — no PII
response = llm.generate(context=results, question=question)

# Detokenize for the user
final = blindfold.detokenize(response, mapping_store)
```

**Trade-offs:**

* Requires managing a mapping store (encrypted DB, secrets manager)
* Reverse lookup needs exact string matching (partial names may not match)
* More complex than the selective-redaction approach
* But: **strongest privacy with full searchability** — no PII in the vector store at all
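
The exact-matching caveat also affects replacement order. If the reverse lookup contains both a full name and a shorter value it contains, replacing the shorter one first corrupts the longer one. The sketch below shows one way to avoid that by replacing longest values first. `tokenize_query` and the sample lookup are hypothetical and not part of the SDK.

```python
def tokenize_query(question: str, reverse_lookup: dict[str, str]) -> str:
    """Replace known real values with their tokens, longest value first."""
    for real_value in sorted(reverse_lookup, key=len, reverse=True):
        question = question.replace(real_value, reverse_lookup[real_value])
    return question


# Hypothetical lookup where one stored value is a prefix of another
reverse_lookup = {"Hans": "<Person_2>", "Hans Mueller": "<Person_1>"}

print(tokenize_query("What happened with Hans Mueller?", reverse_lookup))
# What happened with <Person_1>?
```

Replacing in dictionary order instead would turn the query into `What happened with <Person_2> Mueller?`, which leaks the surname and matches the wrong token.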

<Info>
  `detokenize()` is a free local operation — no API call. This means the mapping store is the only infrastructure you need to manage.
</Info>
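
Conceptually, local detokenization is a string replacement driven by the mapping. The following is a minimal sketch of that idea, assuming the mapping goes from token to real value. `toy_detokenize` is a hypothetical helper, not the SDK's implementation.

```python
def toy_detokenize(text: str, mapping: dict[str, str]) -> str:
    """Replace each token in the text with its stored real value."""
    for token, real_value in mapping.items():
        text = text.replace(token, real_value)
    return text


mapping = {
    "<Person_1>": "John Smith",
    "<Email Address_1>": "john@example.com",
}
llm_reply = "<Person_1> was refunded; a receipt went to <Email Address_1>."

print(toy_detokenize(llm_reply, mapping))
# John Smith was refunded; a receipt went to john@example.com.
```

Tokens that the LLM never mentions are simply left unused, and any token missing from the mapping stays in the text as-is.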

## Policy Recommendations

Match your compliance policy to your use case:

| Use case         | Policy     | Region | Key entities detected                              |
| ---------------- | ---------- | ------ | -------------------------------------------------- |
| General RAG      | `basic`    | —      | Names, emails, phones, addresses, credit cards     |
| EU customer data | `gdpr_eu`  | `eu`   | Names, emails, IBANs, national IDs, DOB, addresses |
| US healthcare    | `hipaa_us` | `us`   | All 18 HIPAA identifiers (SSN, MRN, DOB, etc.)     |
| Payment data     | `pci_dss`  | —      | Credit cards, CVVs, expiration dates               |
| Maximum coverage | `strict`   | —      | All supported entity types, lowest threshold       |

```python theme={null}
from blindfold import Blindfold

# GDPR-compliant RAG: redact contact info, keep names
blindfold = Blindfold(api_key="your-key", region="eu")
result = blindfold.redact(document, policy="gdpr_eu", entities=[
    "email address", "phone number", "iban", "credit card number",
    "address", "date of birth", "national id number",
])

# HIPAA-compliant RAG
blindfold = Blindfold(api_key="your-key", region="us")
result = blindfold.redact(document, policy="hipaa_us")
```
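
If one codebase serves several of these use cases, the table can be kept as data rather than scattered through call sites. This is a sketch under stated assumptions: the use-case keys, `PolicyChoice`, and `client_kwargs` are hypothetical names, and only the policy and region values come from the table above.

```python
from typing import NamedTuple, Optional


class PolicyChoice(NamedTuple):
    policy: str
    region: Optional[str]  # None where the table lists no region


# Policy and region values copied from the table; keys are illustrative
POLICY_BY_USE_CASE = {
    "general": PolicyChoice("basic", None),
    "eu_customer_data": PolicyChoice("gdpr_eu", "eu"),
    "us_healthcare": PolicyChoice("hipaa_us", "us"),
    "payment_data": PolicyChoice("pci_dss", None),
    "maximum_coverage": PolicyChoice("strict", None),
}


def client_kwargs(use_case: str, api_key: str) -> dict:
    """Build constructor kwargs, adding region only when one is listed."""
    choice = POLICY_BY_USE_CASE[use_case]
    kwargs = {"api_key": api_key}
    if choice.region is not None:
        kwargs["region"] = choice.region
    return kwargs


choice = POLICY_BY_USE_CASE["eu_customer_data"]
print(client_kwargs("eu_customer_data", "your-key"), choice.policy)
# Intended use: Blindfold(**client_kwargs(...)).redact(document, policy=choice.policy)
```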

## Performance Tips

* **Batch redaction at ingestion** — use `blindfold.redact_batch()` for processing multiple documents in one API call
* **Async processing** — use `AsyncBlindfold` for concurrent document processing during ingestion
* **Detokenization is free** — `detokenize()` is a local string replacement, no API call required
* **Cache redacted documents** — once documents are redacted and indexed, no further Blindfold calls are needed for retrieval
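
For concurrent ingestion, it usually helps to cap the number of in-flight requests. The sketch below shows that pattern with plain `asyncio`. The `AsyncBlindfold` method signatures are not shown on this page, so `fake_redact` is a hypothetical stand-in coroutine. Swap in the real async redact call when adapting it.

```python
import asyncio
import re
from typing import Awaitable, Callable

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


async def redact_all(
    documents: list[str],
    redact_one: Callable[[str], Awaitable[str]],
    max_concurrency: int = 5,
) -> list[str]:
    """Redact documents concurrently with a cap on in-flight calls."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(doc: str) -> str:
        async with semaphore:
            return await redact_one(doc)

    # gather returns results in the same order as the input documents
    return await asyncio.gather(*(guarded(doc) for doc in documents))


async def fake_redact(doc: str) -> str:
    """Stand-in for a network call to an async redaction client."""
    await asyncio.sleep(0)
    return EMAIL_RE.sub("[EMAIL_ADDRESS]", doc)


docs = [
    "Customer John Smith (john@example.com) reported a billing error.",
    "Maria Garcia (maria@example.com) requested a data export.",
]
safe_docs = asyncio.run(redact_all(docs, fake_redact, max_concurrency=2))
print(safe_docs)
```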

## Cookbook Examples

Complete, runnable examples for OpenAI + ChromaDB, LangChain + FAISS, and LlamaIndex, in Python and Node.js:

<CardGroup cols={2}>
  <Card title="OpenAI + ChromaDB (Python)" icon="python" href="https://github.com/blindfold-dev/blindfold-cookbook/tree/main/examples/rag-openai-python">
    Selective redaction + search-first tokenization
  </Card>

  <Card title="OpenAI + ChromaDB (Node.js)" icon="js" href="https://github.com/blindfold-dev/blindfold-cookbook/tree/main/examples/rag-openai-node">
    TypeScript OpenAI + ChromaDB RAG pipeline
  </Card>

  <Card title="LangChain + FAISS (Python)" icon="python" href="https://github.com/blindfold-dev/blindfold-cookbook/tree/main/examples/rag-langchain-python">
    BlindfoldPIITransformer + retrieve-then-tokenize
  </Card>

  <Card title="LangChain + FAISS (Node.js)" icon="js" href="https://github.com/blindfold-dev/blindfold-cookbook/tree/main/examples/rag-langchain-node">
    LangChain.js RAG with inline PII protection
  </Card>

  <Card title="LlamaIndex (Python)" icon="python" href="https://github.com/blindfold-dev/blindfold-cookbook/tree/main/examples/rag-llamaindex-python">
    Retrieve-then-tokenize with LlamaIndex
  </Card>

  <Card title="LlamaIndex (Node.js)" icon="js" href="https://github.com/blindfold-dev/blindfold-cookbook/tree/main/examples/rag-llamaindex-node">
    LlamaIndex.TS with single tokenize call
  </Card>

  <Card title="GDPR Customer Support (Python)" icon="shield-check" href="https://github.com/blindfold-dev/blindfold-cookbook/tree/main/examples/rag-customer-support-python">
    Multi-turn EU support chatbot with the `gdpr_eu` policy
  </Card>

  <Card title="GDPR Customer Support (Node.js)" icon="shield-check" href="https://github.com/blindfold-dev/blindfold-cookbook/tree/main/examples/rag-customer-support-node">
    TypeScript multi-turn EU support chatbot
  </Card>
</CardGroup>

### Strategy Deep-Dives

Standalone examples for each ingestion strategy — compare trade-offs side by side:

<CardGroup cols={2}>
  <Card title="Selective Redact (Python)" icon="python" href="https://github.com/blindfold-dev/blindfold-cookbook/tree/main/examples/rag-selective-redact-python">
    Keep names, redact contact info — simplest approach
  </Card>

  <Card title="Selective Redact (Node.js)" icon="js" href="https://github.com/blindfold-dev/blindfold-cookbook/tree/main/examples/rag-selective-redact-node">
    TypeScript version of the selective redact strategy
  </Card>

  <Card title="Stored Mapping (Python)" icon="python" href="https://github.com/blindfold-dev/blindfold-cookbook/tree/main/examples/rag-stored-mapping-python">
    Tokenize everything, store per-document mappings
  </Card>

  <Card title="Stored Mapping (Node.js)" icon="js" href="https://github.com/blindfold-dev/blindfold-cookbook/tree/main/examples/rag-stored-mapping-node">
    TypeScript version of the stored mapping strategy
  </Card>

  <Card title="Consistent Registry (Python)" icon="python" href="https://github.com/blindfold-dev/blindfold-cookbook/tree/main/examples/rag-consistent-registry-python">
    Same person = same token everywhere — best search quality
  </Card>

  <Card title="Consistent Registry (Node.js)" icon="js" href="https://github.com/blindfold-dev/blindfold-cookbook/tree/main/examples/rag-consistent-registry-node">
    TypeScript version of the consistent registry strategy
  </Card>

  <Card title="Strategy Comparison (Python)" icon="code-compare" href="https://github.com/blindfold-dev/blindfold-cookbook/tree/main/examples/rag-strategy-comparison-python">
    All 3 strategies side by side with CLI selection
  </Card>

  <Card title="Strategy Comparison (Node.js)" icon="code-compare" href="https://github.com/blindfold-dev/blindfold-cookbook/tree/main/examples/rag-strategy-comparison-node">
    TypeScript version — all 3 strategies with CLI selection
  </Card>
</CardGroup>

### Role-Based Access Control (RBAC)

Use Blindfold policies to implement role-based PII control — same vector store, different privacy levels per user role:

<CardGroup cols={2}>
  <Card title="RBAC with Policies (Python)" icon="shield-check" href="https://github.com/blindfold-dev/blindfold-cookbook/tree/main/examples/rag-rbac-policies-python">
    Doctor, nurse, billing, researcher — each role sees different PII levels
  </Card>

  <Card title="RBAC with Policies (Node.js)" icon="shield-check" href="https://github.com/blindfold-dev/blindfold-cookbook/tree/main/examples/rag-rbac-policies-node">
    TypeScript version of the role-based PII control example
  </Card>
</CardGroup>
