FAQ - Blindfold

General Questions

What happens to my data when I use Blindfold?

Your data never leaves your control.

Text is processed in real-time and not stored
PII mappings are returned to you (not stored by us)
No training on your data
EU data residency (GDPR compliant)
SOC 2 compliant infrastructure

Data flow:

1. You send text → 2. We detect PII → 3. Return protected text → 4. Data deleted

We only store metadata (request counts, API usage) for billing purposes.

How accurate is the PII detection?

Very high accuracy across 60+ entity types:

Email addresses: ~99% accuracy
Phone numbers: ~95% accuracy
Names: ~90-95% accuracy (varies by language)
Credit cards: ~98% accuracy (with Luhn validation)
Medical records: ~92% accuracy

Detection uses GLiNER, a state-of-the-art AI model trained specifically for PII detection.

Tip: Use policy="strict" for maximum detection or adjust score_threshold for your needs.

What languages are supported?

15+ languages with automatic detection:

Native Support (Highest Accuracy):

English, German, French, Spanish, Italian, Portuguese, Dutch, Polish, Russian

Zero-Shot Support (High Accuracy):

Czech, Slovak, Danish, Swedish, Norwegian, Romanian

Experimental:

Chinese, Japanese, Arabic

No configuration needed - the engine automatically detects the language.

See All Languages

Complete language support details

Can I detect custom entity types?

Yes! Use zero-shot detection with natural language descriptions:

# Detect custom entities
result = client.tokenize(
    "Order #ORD-2024-XYZ, SKU: PROD-789",
    entities=["order number", "product sku"]
)

No training required - just describe what you want to detect in plain English:

"order number", "booking reference", "employee id"
"internal code", "project name", "case number"
Industry-specific identifiers

Mix custom entities with standard ones for complete protection.

What's the difference between policies?

Policies are pre-configured entity sets for compliance:

Policy	Entity Count	Use Case
`basic`	3 types	General PII (names, emails, phones)
`gdpr_eu`	15+ types	European data protection
`hipaa_us`	11+ types	US healthcare compliance
`pci_dss`	8+ types	Payment card industry
`strict`	60+ types	Maximum protection

Use policies instead of listing entities manually:

# ✅ Easy with policy
result = client.tokenize(text, policy="gdpr_eu")

# ❌ Manual (harder to maintain)
result = client.tokenize(text, entities=["person", "email", ...])  # ...plus 15 more

How do I handle false positives?

Several strategies reduce false positives:

1. Increase Detection Threshold

# Only detect high-confidence matches
result = client.tokenize(
    text,
    entities=["person", "email address"],
    score_threshold=0.80  # Higher threshold for fewer false positives
)

2. Filter Specific Entity Types

# Only detect specific entities
result = client.tokenize(
    text,
    entities=["email address", "phone number"]  # Skip names
)

3. Post-Process Results

# Review detected entities before using
for entity in result.detected_entities:
    if entity.score < 0.70:
        # Skip low-confidence detections
        continue
    # Only entities scoring 0.70 or higher reach this point

4. Use Allowlists

# Skip known safe values (implement client-side)
# original_value is the text of one detected entity
safe_values = ["John Doe", "support@company.com"]
if original_value not in safe_values:
    pass  # Apply protection here
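
The threshold and allowlist strategies above can be combined in one client-side filter. This is a minimal sketch: the `Detection` class and the `keep_detections` helper are hypothetical stand-ins written for this example, not part of the Blindfold SDK, and the only assumption carried over from the snippets above is that each detection has a score.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """Hypothetical stand-in for one detected entity (not the SDK's class)."""
    text: str
    label: str
    score: float


def keep_detections(detections, min_score=0.70, safe_values=frozenset()):
    """Return detections that meet the score cutoff and are not allowlisted."""
    return [
        d for d in detections
        if d.score >= min_score and d.text not in safe_values
    ]


detections = [
    Detection("John Doe", "person", 0.95),
    Detection("support@company.com", "email address", 0.99),
    Detection("Paris", "person", 0.41),
    Detection("jane@example.com", "email address", 0.97),
]
kept = keep_detections(
    detections,
    min_score=0.70,
    safe_values={"John Doe", "support@company.com"},
)
print([d.text for d in kept])
```

Only the detection that is both high-confidence and not on the allowlist survives, so protection would be applied to that value alone.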

Technical Questions

What are the API limits?

Limits by plan:

Plan	Free	Pay As You Go
Characters	500K / month	Unlimited
Max text per request	5K chars	500K chars
Price	$0	$0.50 / 1M chars
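
Text longer than the per-request maximum has to be split before it is sent. The sketch below assumes that breaking on spaces is acceptable for your content. `chunk_text` is a hypothetical helper, not an SDK function, and an entity that straddles a chunk boundary (a multi-word name, for example) could still be missed.

```python
def chunk_text(text, max_chars=5_000):
    """Split text into pieces of at most max_chars, preferring breaks at spaces."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        if end < len(text):
            # Back up to the last space inside the window, if there is one.
            split_at = text.rfind(" ", start, end)
            if split_at > start:
                end = split_at + 1  # keep the space with the left-hand chunk
        chunks.append(text[start:end])
        start = end
    return chunks


sample = "word " * 3_000  # 15,000 characters
chunks = chunk_text(sample, max_chars=5_000)
print(len(chunks), max(len(c) for c in chunks))
```

Joining the chunks reproduces the original text exactly, so results can be reassembled in order after each piece is processed.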

Handling rate limits:

import time

try:
    result = client.tokenize(text)
except APIError as e:
    if e.status_code == 429:
        # Rate limited - wait and retry
        time.sleep(60)
        result = client.tokenize(text)
    else:
        # Not a rate limit - don't swallow the error
        raise
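
For repeated 429 responses, a fixed 60-second wait can be replaced with exponential backoff. The following is a generic sketch, not an SDK feature: `call_with_backoff` and `FakeRateLimit` are hypothetical names, and the fake exception only mimics an error object with a `status_code` attribute so the example runs on its own.

```python
import random
import time


def call_with_backoff(fn, is_rate_limited, max_attempts=5, base_delay=1.0,
                      sleep=time.sleep):
    """Call fn(), retrying with exponential backoff while it is rate limited."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            # Re-raise anything that is not a rate limit, or the final failure.
            if not is_rate_limited(exc) or attempt == max_attempts - 1:
                raise
            # 1s, 2s, 4s, ... plus up to 10% jitter to avoid synchronized retries.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))


class FakeRateLimit(Exception):
    """Stand-in for an API error carrying an HTTP status code."""
    status_code = 429


calls = {"count": 0}


def flaky_request():
    # Fails twice with a 429, then succeeds.
    calls["count"] += 1
    if calls["count"] < 3:
        raise FakeRateLimit()
    return "ok"


waits = []
result = call_with_backoff(
    flaky_request,
    is_rate_limited=lambda exc: getattr(exc, "status_code", None) == 429,
    sleep=waits.append,  # record delays instead of actually sleeping
)
print(result, len(waits))
```

With the SDK, `fn` would be something like `lambda: client.tokenize(text)`, and the predicate would check the `status_code` of the `APIError` used in the snippet above.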

Can I use Blindfold in the browser?

Not recommended - API keys should stay server-side.

❌ Bad (API key exposed):

// Client-side code - NEVER do this
const client = new Blindfold({ apiKey: 'sk-...' });

✅ Good (Server-side API route):

// Client
fetch('/api/protect', {
  method: 'POST',
  body: JSON.stringify({ text: userInput })
});

// Server (Next.js API route)
import { Blindfold } from '@blindfold/sdk';

export async function POST(req) {
  const client = new Blindfold({
    apiKey: process.env.BLINDFOLD_API_KEY  // Server-side only
  });

  const { text } = await req.json();
  const result = await client.tokenize(text);

  return Response.json(result);
}

Use edge functions, serverless functions, or backend API routes.

How do I restore tokenized data?

Use the mapping returned from tokenize():

# Step 1: Tokenize
protected = client.tokenize("John Doe, john@example.com")

print(protected.text)
# "<person_1>, <email_address_1>"

print(protected.mapping)
# {"<person_1>": "John Doe", "<email_address_1>": "john@example.com"}

# Step 2: Send protected text to AI
ai_response = send_to_ai(protected.text)

# Step 3: Detokenize AI response
original = client.detokenize(
    text=ai_response,
    mapping=protected.mapping
)

print(original.text)
# "Hello John Doe, I received your message at john@example.com"

Important:

Store mapping securely (Redis, encrypted DB, session)
Set expiration (e.g., 24 hours)
Without mapping, data cannot be restored
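
The expiration advice can be illustrated with a small in-memory store. This is only a sketch under stated assumptions: `MappingStore` is a hypothetical class, it keeps data unencrypted in process memory, and a production setup would use one of the options listed above (for example Redis with a key expiry, or an encrypted database).

```python
import time


class MappingStore:
    """In-memory token mapping store with per-entry expiration (illustrative only)."""

    def __init__(self, ttl_seconds=24 * 60 * 60, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # session_id -> (expires_at, mapping)

    def put(self, session_id, mapping):
        self._entries[session_id] = (self._clock() + self._ttl, dict(mapping))

    def get(self, session_id):
        """Return the mapping, or None if it is missing or expired."""
        entry = self._entries.get(session_id)
        if entry is None:
            return None
        expires_at, mapping = entry
        if self._clock() >= expires_at:
            del self._entries[session_id]  # drop expired PII promptly
            return None
        return mapping


# Demo with a fake clock so expiry can be shown without waiting.
now = [0.0]
store = MappingStore(ttl_seconds=60, clock=lambda: now[0])
store.put("session-1", {"<person_1>": "John Doe"})
fresh = store.get("session-1")
now[0] = 61.0
expired = store.get("session-1")
print(fresh, expired)
```

Once an entry has expired, the tokens in any stored AI response can no longer be restored, which is the trade-off the expiration setting controls.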

What's the difference between tokenize, mask, and redact?

Choose the right method for your use case:

Method	Reversible	Example	Use Case
Tokenize	✅ Yes	`<person_1>`	AI processing, chatbots
Mask	❌ No	`***3456`	Display to users
Redact	❌ No	`""` (removed)	Permanent removal
Hash	❌ No	`ID_a3f8b9`	Analytics, matching
Encrypt	✅ Yes	`gAAAAABh...`	Secure storage
Synthesize	❌ No	`Jane Smith` (fake)	Testing, demos

Example workflows:

# AI Chatbot → Use tokenize (reversible)
protected = client.tokenize(user_input)
ai_response = send_to_ai(protected.text)
final = client.detokenize(ai_response, protected.mapping)

# Display to User → Use mask (show last 4)
masked = client.mask("Card: 4532-7562-9102-3456")
# "Card: ***************3456"

# Audit Logs → Use redact (permanent)
logged = client.redact("User SSN: 123-45-6789")
# "User SSN: "

# Analytics → Use hash (consistent IDs)
hashed = client.hash("user@example.com")
# "ID_a3f8b9c2d4e5f6g7" (always same for same input)
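
To make the irreversible methods more concrete, here is a plain-Python illustration of masking and consistent hashing. It is not how Blindfold implements `mask` or `hash`: the helper names, the use of SHA-256, and the ID length are all assumptions chosen for this example.

```python
import hashlib


def mask_keep_last(value, visible=4, mask_char="*"):
    """Replace all but the last `visible` characters with mask_char."""
    if visible <= 0:
        return mask_char * len(value)
    return mask_char * max(len(value) - visible, 0) + value[-visible:]


def stable_id(value, length=16):
    """Derive a deterministic pseudonymous ID from a value using SHA-256."""
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
    return "ID_" + digest[:length]


print(mask_keep_last("4532-7562-9102-3456"))
print(stable_id("user@example.com") == stable_id("user@example.com"))
```

The same input always yields the same ID, which is what makes hashed values usable for joins and counts. An unsalted hash of a guessable value such as an email address can be reversed by brute force, so a keyed hash (HMAC) is the safer choice if you roll your own.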

Does Blindfold work with all AI providers?

Yes! Blindfold is provider-agnostic. The pattern is always the same: tokenize, send to AI, detokenize.

✅ OpenAI (GPT-4o, GPT-4, o1)
✅ Anthropic (Claude Sonnet, Opus, Haiku)
✅ Google (Gemini 2.5 Flash, Pro)
✅ AWS Bedrock, Azure OpenAI
✅ LangChain, LlamaIndex, Vercel AI SDK
✅ Cohere, Hugging Face, self-hosted models
✅ Any LLM API

OpenAI:

from blindfold import Blindfold
from openai import OpenAI

bf = Blindfold()
client = OpenAI()

safe = bf.tokenize(user_input)
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": safe.text}]
)
result = bf.detokenize(response.choices[0].message.content, safe.mapping)

Anthropic Claude:

from blindfold import Blindfold
import anthropic

bf = Blindfold()
client = anthropic.Anthropic()

safe = bf.tokenize(user_input)
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": safe.text}]
)
result = bf.detokenize(response.content[0].text, safe.mapping)

Google Gemini:

from blindfold import Blindfold
from google import genai

bf = Blindfold()
client = genai.Client()

safe = bf.tokenize(user_input)
response = client.models.generate_content(
    model="gemini-2.5-flash", contents=safe.text
)
result = bf.detokenize(response.text, safe.mapping)

Any Provider:

from blindfold import Blindfold

bf = Blindfold()

safe = bf.tokenize(user_input)
response = your_ai_provider.chat(safe.text)
result = bf.detokenize(response, safe.mapping)

See All Integrations

Code examples for every major AI provider and framework

Can I use Blindfold without an API key?

Yes! All SDKs include local mode with 80+ regex-based entity types, zero dependencies, and no API key required.

In local mode, no data ever leaves your infrastructure: everything runs in-process with no network calls.

from blindfold import Blindfold

# No API key needed
client = Blindfold()
result = client.tokenize("Contact john@example.com or call +1-555-1234")

Local mode vs Cloud API:

Feature	Local Mode	Cloud API
Entity types	80+ (regex-based)	60+ NLP + 80+ regex
API key	Not needed	Required
Data privacy	Never leaves your infrastructure	Processed in EU/US, not stored
Names & addresses	Not supported	NLP-powered detection
Compliance policies	Not available	GDPR, HIPAA, PCI DSS
Audit logs	Not available	Full audit trail

Upgrade path: When you need NLP-powered detection (names, addresses, organizations), compliance policies, or audit logs, add an API key to switch to the Cloud API.
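
To show what regex-based detection means in practice, the sketch below tokenizes two entity types with the standard library only. It is an illustration of the general technique, not Blindfold's local mode: the patterns are deliberately simple, `tokenize_local` is a hypothetical function, and only the `<label_n>` token format is taken from the examples above.

```python
import re

# Deliberately simple patterns for illustration; real detectors need stricter rules.
PATTERNS = {
    "email_address": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone_number": re.compile(r"\+?\d[\d\s().-]{6,}\d"),
}


def tokenize_local(text):
    """Replace regex matches with numbered tokens and return (text, mapping)."""
    mapping = {}
    for label, pattern in PATTERNS.items():
        counter = 0

        def replace(match):
            nonlocal counter
            counter += 1
            token = f"<{label}_{counter}>"
            mapping[token] = match.group(0)
            return token

        text = pattern.sub(replace, text)
    return text, mapping


protected, mapping = tokenize_local("Contact john@example.com or call +1-555-1234")
print(protected)
print(mapping)
```

Patterns like these work for values with a fixed shape, which is why names and addresses need the NLP-powered detection listed in the table.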

Compliance & Privacy

Does Blindfold support HIPAA compliance?

Yes, for healthcare applications:

✅ Use policy="hipaa_us" for healthcare data
✅ Detects PHI (Protected Health Information)
✅ Business Associate Agreement (BAA) available
✅ Encrypted data transmission

Protected entities:

Names, SSN, medical record numbers
Health insurance IDs
Medical conditions, medications
Dates of birth

# HIPAA-compliant processing
result = client.tokenize(
    patient_data,
    policy="hipaa_us"
)

Can I get a custom Data Processing Agreement (DPA)?

Yes, DPAs are available for all paid plans.

What’s included:

Data processing terms
Security measures
Subprocessor list
Your rights and obligations
Incident response procedures

To request a DPA:

Email: hello@blindfold.dev
Subject: “DPA Request”
Include: Company name, plan tier

Standard DPAs provided within 2 business days.

Pricing & Plans

Is there a free tier?

Yes! Free tier includes:

✅ 500K characters per month
✅ All 60+ entity types
✅ All global policies
✅ 18 languages supported
✅ 3 team members, 2 API keys
✅ Dashboard & audit logs

Perfect for:

Testing and development
Proof of concepts
Small projects

Sign Up Free

Get started in 5 minutes

How is usage calculated?

Usage is measured in input characters processed:

Each API call counts the number of characters in the text field
Batch requests count total characters across all texts
Policy management and dashboard usage are free

Example:

“Hello, my name is John Doe” = 26 characters
A 1,000-word email ≈ 5,000 characters

Free plan: 500K characters/month included. Pay As You Go: $0.50 per 1M characters, no limit. Billed monthly via Stripe.
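
The pricing arithmetic can be checked with a few lines of Python. This is a rough estimator only: the function names are made up for this example, and it assumes the service counts characters the same way Python's `len()` does, which may not hold for every Unicode input.

```python
FREE_MONTHLY_CHARS = 500_000          # Free plan allowance stated above
PAYG_USD_PER_MILLION_CHARS = 0.50     # Pay As You Go rate stated above


def count_chars(texts):
    """Total input characters across a batch of texts."""
    return sum(len(t) for t in texts)


def payg_cost_usd(total_chars):
    """Estimated Pay As You Go cost for a month's character volume."""
    return total_chars / 1_000_000 * PAYG_USD_PER_MILLION_CHARS


def fits_free_plan(total_chars):
    """True if the monthly volume stays within the Free plan allowance."""
    return total_chars <= FREE_MONTHLY_CHARS


print(count_chars(["Hello, my name is John Doe"]))
monthly = 2_000 * 5_000  # e.g., 2,000 emails of about 5,000 characters each
print(fits_free_plan(monthly), payg_cost_usd(monthly))
```

At that volume (10M characters) the Free plan is exceeded and the Pay As You Go estimate comes to $5.00 for the month.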

Still Have Questions?

Contact Support

Email us at hello@blindfold.dev - we typically respond within 24 hours
