Quick Start
Get up and running with LLM Sanitizer in four steps.
Step 1 — Create an Account
Register for a free account at /signup.html or via the API:
```bash
curl -X POST https://api.llmsanitizer.com/api/v1/auth/register \
  -H "Content-Type: application/json" \
  -d '{"email": "you@example.com", "password": "your-password"}'
```
Step 2 — Get Your API Key
Log in to receive a JWT token, then create an API key:
```bash
# Log in
curl -X POST https://api.llmsanitizer.com/api/v1/auth/login \
  -H "Content-Type: application/json" \
  -d '{"email": "you@example.com", "password": "your-password"}'

# Create an API key (use the JWT from the login response)
curl -X POST https://api.llmsanitizer.com/api/v1/keys \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"name": "my-app-key"}'
```
Step 3 — Configure a Policy
Optionally, create a custom security policy. If you skip this step, requests use the default balanced mode.
```bash
curl -X POST https://api.llmsanitizer.com/api/v1/user-policies \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "strict-no-pii",
    "mode": "strict",
    "categories": {
      "prompt_injection": true,
      "pii": true,
      "profanity": false
    }
  }'
```
Step 4 — Sanitize Your First Input
```bash
curl -X POST https://api.llmsanitizer.com/proxy/v1/sanitize \
  -H "Content-Type: application/json" \
  -H "X-API-Key: YOUR_API_KEY" \
  -d '{"input": "Hello, can you help me with my project?"}'
```
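The same call can be made from application code. A minimal sketch in Python using only the standard library (the endpoint URL and headers are the ones documented in this guide; error handling and the actual send are left out):

```python
import json
import urllib.request

SANITIZE_URL = "https://api.llmsanitizer.com/proxy/v1/sanitize"

def build_sanitize_request(text, api_key, policy=None):
    """Build the HTTP request for POST /proxy/v1/sanitize."""
    body = {"input": text}
    if policy is not None:
        body["policy"] = policy  # "strict", "balanced", or "permissive"
    return urllib.request.Request(
        SANITIZE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )

req = build_sanitize_request("Hello, can you help me with my project?", "YOUR_API_KEY")
# Send with: response = urllib.request.urlopen(req)
```

Sending the request (for example with `urllib.request.urlopen(req)` or a library like `requests`) returns the JSON risk assessment shown in the Response Format section.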
Authentication
LLM Sanitizer uses API key authentication for all proxy and sanitization endpoints. Include your API key in the X-API-Key header with every request.
Header Format
```
X-API-Key: sk_live_abc123def456...
```
API keys are scoped to your account and can be managed via the dashboard or the /api/v1/keys endpoints. Each key can be named for easy identification and revoked independently.
JWT Authentication
Account management endpoints (creating keys, managing policies) require JWT authentication. Obtain a JWT by logging in via /api/v1/auth/login and pass it in the Authorization: Bearer header.
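The two schemes map to two different headers. A small illustration in Python (header names are as documented above; the token values are placeholders):

```python
# Account-management calls (creating keys, managing policies)
# use the JWT returned by /api/v1/auth/login:
jwt_headers = {
    "Authorization": "Bearer YOUR_JWT_TOKEN",
    "Content-Type": "application/json",
}

# Proxy and sanitization calls use an API key instead:
api_key_headers = {
    "X-API-Key": "sk_live_abc123def456",
    "Content-Type": "application/json",
}
```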
Endpoints
POST /proxy/v1/sanitize
Sanitize user input before sending to your LLM. Returns a risk assessment with category breakdowns.
Request Body:
- `input` (string, required): The user message to sanitize.
- `policy` (string, optional): Policy mode: "strict", "balanced", or "permissive". Default: "balanced".
- `policyId` (string, optional): ID of a custom policy to use instead of a preset mode.
Response:
```json
{
  "allowed": true,
  "risk": "low",
  "score": 0.12,
  "categories": [],
  "sanitizedInput": "Hello, can you help me?",
  "piiDetected": [],
  "processingMs": 3.8
}
```
POST /proxy/v1/sanitize/output
Validate LLM output before showing to users. Detects system prompt leaks, harmful content, and PII in responses.
Request Body:
- `output` (string, required): The LLM response to validate.
- `policy` (string, optional): Policy mode. Default: "balanced".
POST /proxy/v1/chat
Proxy endpoint that sanitizes input, forwards to your LLM provider, then validates the output. A complete round-trip protection pipeline.
Request Body:
- `messages` (array, required): Chat messages array in OpenAI format.
- `model` (string, optional): LLM model to use. Default: "gpt-4".
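A chat proxy request body might look like the following. This is a sketch built from the fields above; the message objects follow the usual OpenAI role/content shape:

```python
import json

chat_body = {
    "model": "gpt-4",  # optional; defaults to "gpt-4"
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize my meeting notes."},
    ],
}

payload = json.dumps(chat_body)  # send as the POST body with your X-API-Key header
```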
Policies
Security policies control which threat categories are active and how sensitive detection should be. Three preset modes are available:
| Mode | Description |
|---|---|
| strict | Maximum protection. All categories enabled, lowest thresholds. Best for customer-facing applications. |
| balanced | Default. Good detection with fewer false positives. Suitable for most applications. |
| permissive | Minimal blocking. Only high-confidence threats are flagged. For internal tools and testing. |
Threat Categories
Each category can be individually enabled or disabled in custom policies:
| Category | Description |
|---|---|
| prompt_injection | Override instructions, context switching, instruction manipulation |
| jailbreak | DAN prompts, developer mode, character roleplay exploits |
| system_prompt_extraction | Attempts to reveal system prompt or configuration |
| pii | Emails, SSNs, credit cards, phone numbers, addresses |
| profanity | Obscene language, vulgarities, slurs |
| hate_speech | Discriminatory, racist, or bigoted content |
| threats | Threats of violence, intimidation, harm |
| harassment | Bullying, personal attacks, targeted abuse |
| sexual_content | Explicit sexual material, NSFW content |
| criminal | Illegal activities, drug manufacturing, weapon instructions |
| self_harm | Suicide, self-injury, eating disorder promotion |
| misinformation | Demonstrably false claims, conspiracy theories |
| social_engineering | Manipulation, phishing, confidence tricks |
| encoding_attacks | Base64, hex, Unicode obfuscation of malicious content |
| multilingual_injection | Injection attempts in non-English languages |
| data_exfiltration | Attempts to extract training data or model information |
| toxicity | General toxic or abusive language |
| spam | Repetitive, promotional, or meaningless content |
PII Types
LLM Sanitizer detects and optionally redacts the following personally identifiable information:
| Type | Description |
|---|---|
| email | Email addresses (user@example.com) |
| ssn | US Social Security Numbers (XXX-XX-XXXX) |
| credit_card | Credit/debit card numbers (Visa, Mastercard, Amex, etc.) |
| phone | Phone numbers (US, international formats) |
| api_key | API keys and secrets (AWS, OpenAI, Stripe, etc.) |
| ip_address | IPv4 and IPv6 addresses |
| address | Physical mailing addresses |
| date_of_birth | Birth dates in common formats |
| passport | Passport numbers |
| drivers_license | Driver's license numbers |
When PII is detected, the response includes a piiDetected array with type, location, and redacted value. If using the strict policy, inputs containing PII are blocked by default.
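Client code can use the piiDetected array to decide how to proceed. A hedged sketch, assuming each entry carries a type field matching the table above (the exact field names inside each entry are assumptions, not part of the documented schema):

```python
def handle_pii(response, block_types=frozenset({"ssn", "credit_card"})):
    """Return the text to forward to the LLM, or None to refuse.

    `response` is the parsed JSON body from /proxy/v1/sanitize.
    `block_types` is an app-level choice of PII types to hard-block.
    """
    found = {item["type"] for item in response.get("piiDetected", [])}
    if found & block_types:
        return None  # high-risk PII present: do not forward at all
    # Otherwise forward the redacted text the service already produced.
    return response.get("sanitizedInput")

resp = {
    "allowed": True,
    "piiDetected": [{"type": "email", "redacted": "[EMAIL]"}],
    "sanitizedInput": "Contact me at [EMAIL]",
}
```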
Response Format
All sanitization endpoints return a consistent JSON response:
```json
{
  "allowed": false,
  "risk": "critical",
  "score": 0.94,
  "categories": [
    {
      "name": "prompt_injection",
      "score": 0.94,
      "severity": "critical",
      "details": "Override instruction pattern detected"
    }
  ],
  "sanitizedInput": null,
  "piiDetected": [],
  "message": "Input blocked: prompt injection detected",
  "processingMs": 4.2
}
```
Field Reference
- `allowed` (boolean): Whether the input passed all policy checks.
- `risk` (string): Risk level: "none", "low", "medium", "high", "critical".
- `score` (number): Overall risk score from 0.0 (safe) to 1.0 (maximum risk).
- `categories` (array): Detected threat categories with individual scores and details.
- `sanitizedInput` (string | null): The cleaned input with PII redacted. Null if the input was blocked.
- `piiDetected` (array): List of PII items found, with type and redacted values.
- `message` (string): Human-readable explanation when an input is blocked.
- `processingMs` (number): Processing time in milliseconds.
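Putting the fields together, a caller typically gates on allowed and forwards only sanitizedInput to the LLM. A minimal sketch using the two sample responses from this section:

```python
def gate(sanitize_response):
    """Return the text safe to send to the LLM, or None if blocked."""
    if sanitize_response["allowed"]:
        return sanitize_response["sanitizedInput"]
    # Blocked: surface sanitize_response.get("message") to the user
    # instead of calling the LLM at all.
    return None

ok = {"allowed": True, "risk": "low", "sanitizedInput": "Hello, can you help me?"}
blocked = {"allowed": False, "risk": "critical", "sanitizedInput": None,
           "message": "Input blocked: prompt injection detected"}
```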
Error Codes
| Code | Description |
|---|---|
| 400 | Bad Request. Invalid JSON, missing required fields, or malformed input. Check the error message for details. |
| 401 | Unauthorized. Missing or invalid API key. Ensure your X-API-Key header is present and correct. |
| 403 | Forbidden. Your API key does not have permission for this operation or your account is suspended. |
| 429 | Rate Limited. You have exceeded your plan's request limit. Upgrade your plan or wait for the limit to reset. |
| 500 | Internal Server Error. An unexpected error occurred. If this persists, contact support with the request ID from the response headers. |
All error responses follow this format:
```json
{
  "error": "Rate Limited",
  "message": "You have exceeded 1000 requests this month. Upgrade to Pro for higher limits.",
  "code": 429
}
```
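Of these codes, 429 is the one usually worth retrying after a delay. A sketch of a retry wrapper around a generic `send` callable that returns a (status, body) pair; the exponential backoff schedule here is an application choice, not part of the API:

```python
import time

def call_with_retry(send, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call send() -> (status_code, json_body), retrying 429 responses
    with exponential backoff. Raises on any other 4xx/5xx status."""
    for attempt in range(max_retries + 1):
        status, body = send()
        if status == 429 and attempt < max_retries:
            sleep(base_delay * (2 ** attempt))  # wait 1s, 2s, 4s, ...
            continue
        if status >= 400:
            raise RuntimeError(f"HTTP {status}: {body.get('message', '')}")
        return body
```

Injecting `sleep` as a parameter keeps the backoff testable without real delays.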