Quick Start

Get up and running with LLM Sanitizer in four steps.

Step 1 — Create an Account

Register for a free account at /signup.html or via the API:

curl -X POST https://api.llmsanitizer.com/api/v1/auth/register \
  -H "Content-Type: application/json" \
  -d '{"email": "you@example.com", "password": "your-password"}'

Step 2 — Get Your API Key

Log in to receive a JWT, then create an API key:

# Login
curl -X POST https://api.llmsanitizer.com/api/v1/auth/login \
  -H "Content-Type: application/json" \
  -d '{"email": "you@example.com", "password": "your-password"}'

# Create API key (use JWT from login response)
curl -X POST https://api.llmsanitizer.com/api/v1/keys \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"name": "my-app-key"}'
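The JWT from the login response goes into the Authorization header of the key-creation call. A minimal Python sketch; the "token" field name is an assumption, so check your actual login response:

```python
import json

def bearer_headers(login_response_body: str, token_field: str = "token") -> dict:
    """Build headers for JWT-authenticated calls from a login response body.

    NOTE: the "token" field name is an assumption; inspect your real
    login response to confirm where the JWT is returned.
    """
    jwt = json.loads(login_response_body)[token_field]
    return {"Authorization": f"Bearer {jwt}",
            "Content-Type": "application/json"}

# Hypothetical login response body:
headers = bearer_headers('{"token": "eyJhbGciOiJIUzI1NiJ9.e30.sig"}')
```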

Step 3 — Configure a Policy

Optionally create a custom security policy. The default policy uses balanced mode.

curl -X POST https://api.llmsanitizer.com/api/v1/user-policies \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "strict-no-pii",
    "mode": "strict",
    "categories": {
      "prompt_injection": true,
      "pii": true,
      "profanity": false
    }
  }'
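Before POSTing, a client can pre-validate the payload shape. A small Python sketch (this mirrors, but does not replace, the server-side validation):

```python
VALID_MODES = {"strict", "balanced", "permissive"}

def validate_policy(policy: dict) -> list:
    """Return a list of problems with a custom-policy payload (empty means OK)."""
    problems = []
    if not policy.get("name"):
        problems.append("name is required")
    if policy.get("mode") not in VALID_MODES:
        problems.append("mode must be strict, balanced, or permissive")
    if not all(isinstance(v, bool) for v in policy.get("categories", {}).values()):
        problems.append("category values must be booleans")
    return problems

problems = validate_policy({"name": "strict-no-pii", "mode": "strict",
                            "categories": {"prompt_injection": True,
                                           "pii": True, "profanity": False}})
```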

Step 4 — Sanitize Your First Input

curl -X POST https://api.llmsanitizer.com/proxy/v1/sanitize \
  -H "Content-Type: application/json" \
  -H "X-API-Key: YOUR_API_KEY" \
  -d '{"input": "Hello, can you help me with my project?"}'
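In application code, the same call can be assembled programmatically. A minimal Python sketch that builds the request pieces (the helper itself performs no network I/O; send the pieces with any HTTP client):

```python
import json

def build_sanitize_request(user_input: str, api_key: str,
                           base_url: str = "https://api.llmsanitizer.com"):
    """Assemble URL, headers, and body for POST /proxy/v1/sanitize."""
    url = f"{base_url}/proxy/v1/sanitize"
    headers = {"X-API-Key": api_key, "Content-Type": "application/json"}
    body = json.dumps({"input": user_input})
    return url, headers, body

url, headers, body = build_sanitize_request(
    "Hello, can you help me with my project?", "YOUR_API_KEY")
```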

Authentication

LLM Sanitizer uses API key authentication for all proxy and sanitization endpoints. Include your API key in the X-API-Key header with every request.

Header Format

X-API-Key: sk_live_abc123def456...

API keys are scoped to your account and can be managed via the dashboard or the /api/v1/keys endpoints. Each key can be named for easy identification and revoked independently.

JWT Authentication

Account management endpoints (creating keys, managing policies) require JWT authentication. Obtain a JWT by logging in via /api/v1/auth/login and pass it as a Bearer token in the Authorization header.

Endpoints

POST /proxy/v1/sanitize

Sanitize user input before sending to your LLM. Returns a risk assessment with category breakdowns.

Request Body:

  • input (string, required)
    The user message to sanitize.
  • policy (string, optional)
    Policy mode: "strict", "balanced", or "permissive". Default: "balanced".
  • policyId (string, optional)
    ID of a custom policy to use instead of a preset mode.

Response:

{
  "allowed": true,
  "risk": "low",
  "score": 0.12,
  "categories": [],
  "sanitizedInput": "Hello, can you help me?",
  "piiDetected": [],
  "processingMs": 3.8
}
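A typical client gates the LLM call on this response, forwarding sanitizedInput rather than the raw user text. A minimal sketch:

```python
import json

# The documented example response from above:
SAMPLE = '''{"allowed": true, "risk": "low", "score": 0.12, "categories": [],
"sanitizedInput": "Hello, can you help me?", "piiDetected": [],
"processingMs": 3.8}'''

def gate(response_body: str):
    """Return the text to forward to the LLM, or None when blocked."""
    resp = json.loads(response_body)
    if not resp["allowed"]:
        return None
    # Forward the redacted text so detected PII never reaches the model.
    return resp.get("sanitizedInput")
```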

POST /proxy/v1/sanitize/output

Validate LLM output before showing to users. Detects system prompt leaks, harmful content, and PII in responses.

  • output (string, required)
    The LLM response to validate.
  • policy (string, optional)
    Policy mode. Default: "balanced".

POST /proxy/v1/chat

Proxy endpoint that sanitizes input, forwards to your LLM provider, then validates the output. A complete round-trip protection pipeline.

  • messages (array, required)
    Chat messages array in OpenAI format.
  • model (string, optional)
    LLM model to use. Default: "gpt-4".
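A request body for this endpoint can be sketched in Python as follows (a client-side convenience, not part of the API itself):

```python
import json

def build_chat_body(messages: list, model: str = "gpt-4") -> str:
    """JSON body for POST /proxy/v1/chat; messages use OpenAI chat format."""
    for m in messages:
        if not {"role", "content"} <= set(m):
            raise ValueError("each message needs role and content")
    return json.dumps({"messages": messages, "model": model})

body = build_chat_body([{"role": "user", "content": "Summarize this document."}])
```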

Policies

Security policies control which threat categories are active and how sensitive detection should be. Three preset modes are available:

  • strict: Maximum protection. All categories enabled, lowest thresholds. Best for customer-facing applications.
  • balanced: Default. Good detection with fewer false positives. Suitable for most applications.
  • permissive: Minimal blocking. Only high-confidence threats are flagged. For internal tools and testing.

Threat Categories

Each category can be individually enabled or disabled in custom policies:

  • prompt_injection: Override instructions, context switching, instruction manipulation
  • jailbreak: DAN prompts, developer mode, character roleplay exploits
  • system_prompt_extraction: Attempts to reveal the system prompt or configuration
  • pii: Emails, SSNs, credit cards, phone numbers, addresses
  • profanity: Obscene language, vulgarities, slurs
  • hate_speech: Discriminatory, racist, or bigoted content
  • threats: Threats of violence, intimidation, harm
  • harassment: Bullying, personal attacks, targeted abuse
  • sexual_content: Explicit sexual material, NSFW content
  • criminal: Illegal activities, drug manufacturing, weapon instructions
  • self_harm: Suicide, self-injury, eating disorder promotion
  • misinformation: Demonstrably false claims, conspiracy theories
  • social_engineering: Manipulation, phishing, confidence tricks
  • encoding_attacks: Base64, hex, Unicode obfuscation of malicious content
  • multilingual_injection: Injection attempts in non-English languages
  • data_exfiltration: Attempts to extract training data or model information
  • toxicity: General toxic or abusive language
  • spam: Repetitive, promotional, or meaningless content

PII Types

LLM Sanitizer detects and optionally redacts the following personally identifiable information:

  • email: Email addresses (user@example.com)
  • ssn: US Social Security numbers (XXX-XX-XXXX)
  • credit_card: Credit/debit card numbers (Visa, Mastercard, Amex, etc.)
  • phone: Phone numbers (US and international formats)
  • api_key: API keys and secrets (AWS, OpenAI, Stripe, etc.)
  • ip_address: IPv4 and IPv6 addresses
  • address: Physical mailing addresses
  • date_of_birth: Birth dates in common formats
  • passport: Passport numbers
  • drivers_license: Driver's license numbers

When PII is detected, the response includes a piiDetected array with type, location, and redacted value. If using the strict policy, inputs containing PII are blocked by default.
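A sketch of applying those redactions client-side, assuming each piiDetected entry carries "start", "end", and "redacted" keys (illustrative names; verify them against a real response):

```python
def apply_redactions(text: str, pii_items: list) -> str:
    """Replace detected PII spans with their redacted values.

    Assumes each entry has "start", "end", and "redacted" keys; these
    names are illustrative, so verify them against a real response.
    """
    # Work right-to-left so earlier offsets stay valid after each splice.
    for item in sorted(pii_items, key=lambda i: i["start"], reverse=True):
        text = text[:item["start"]] + item["redacted"] + text[item["end"]:]
    return text

masked = apply_redactions(
    "Mail me at bob@example.com please",
    [{"type": "email", "start": 11, "end": 26, "redacted": "[EMAIL]"}])
```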

Response Format

All sanitization endpoints return a consistent JSON response:

{
  "allowed": false,
  "risk": "critical",
  "score": 0.94,
  "categories": [
    {
      "name": "prompt_injection",
      "score": 0.94,
      "severity": "critical",
      "details": "Override instruction pattern detected"
    }
  ],
  "sanitizedInput": null,
  "piiDetected": [],
  "message": "Input blocked: prompt injection detected",
  "processingMs": 4.2
}

Field Reference

  • allowed (boolean)
    Whether the input passed all policy checks.
  • risk (string)
    Risk level: "none", "low", "medium", "high", "critical".
  • score (number)
    Overall risk score from 0.0 (safe) to 1.0 (maximum risk).
  • categories (array)
    Detected threat categories with individual scores and details.
  • sanitizedInput (string | null)
    The cleaned input with PII redacted. Null if the input was blocked.
  • piiDetected (array)
    List of PII items found, with type and redacted values.
  • processingMs (number)
    Processing time in milliseconds.

Error Codes

  • 400 Bad Request: Invalid JSON, missing required fields, or malformed input. Check the error message for details.
  • 401 Unauthorized: Missing or invalid API key. Ensure your X-API-Key header is present and correct.
  • 403 Forbidden: Your API key does not have permission for this operation, or your account is suspended.
  • 429 Rate Limited: You have exceeded your plan's request limit. Upgrade your plan or wait for the limit to reset.
  • 500 Internal Server Error: An unexpected error occurred. If this persists, contact support with the request ID from the response headers.

All error responses follow this format:

{
  "error": "Rate Limited",
  "message": "You have exceeded 1000 requests this month. Upgrade to Pro for higher limits.",
  "code": 429
}
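A client can branch on the status code to decide whether a retry makes sense. A sketch (treating 429 and 500 as retryable is a client-side policy choice, not an API guarantee):

```python
import json

RETRYABLE = {429, 500}  # client-side choice, not an API guarantee

def handle_error(status: int, body: str):
    """Map an error response to (should_retry, message_for_logs)."""
    payload = json.loads(body)
    return status in RETRYABLE, payload.get("message", payload.get("error", ""))

retry, msg = handle_error(429, json.dumps({
    "error": "Rate Limited",
    "message": "You have exceeded 1000 requests this month.",
    "code": 429}))
```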