Security 10 March 2026 · 12 min read

Self-Hosted KYC vs Cloud KYC: Privacy, Cost, and Control

Every cloud KYC provider processes your users' faces on someone else's servers. Self-hosted identity verification keeps biometric data on infrastructure you control. Here's the complete comparison — data sovereignty, GDPR compliance, real cost analysis, and an honest look at when each approach actually makes sense.

The Cloud KYC Default

If you search "KYC API" today, you'll find Onfido, Jumio, Veriff, Sumsub, and a dozen others. They all work roughly the same way: your application collects an ID photo and a selfie, sends them to the provider's cloud infrastructure, their servers run face matching and document verification, and you get back a result — passed, failed, or needs review.

The integration is straightforward. Drop in an SDK or iframe, call an API, handle the webhook. Most providers offer a polished dashboard, compliance documentation, and support teams. For a product manager evaluating options, it's the path of least resistance. Ship in a week, deal with compliance later.

But there's a detail buried in every cloud KYC architecture diagram that rarely gets the attention it deserves: your users' faces and identity documents are processed on someone else's servers. Their biometric data — the mathematical representation of their face, the high-resolution image of their passport — transits a network you don't control, is decoded in memory you can't inspect, and is processed by models running on hardware you'll never see.

For a long time, this was the only option. Running your own face recognition pipeline required a team of ML engineers, GPU clusters, and months of development. The cloud providers solved a genuinely hard problem, and the trade-off felt acceptable.

That trade-off has changed. Open-source models have caught up. ONNX Runtime runs inference on commodity CPUs. INT8 quantization makes it fast enough for production. The question is no longer can you run identity verification on your own infrastructure — it's whether the cloud default still makes sense for your use case.

The Problem with Sending Faces to the Cloud

Biometric data isn't like a shipping address or an email. You can change your password. You can get a new credit card number. You cannot change your face. A biometric data breach is permanent and irrevocable. That's why regulators treat it differently — and why the architecture of your KYC pipeline matters more than most teams realize.

GDPR Article 28 requires that data controllers (you) only use processors (your KYC provider) that provide "sufficient guarantees" about data protection. But it goes deeper: when your KYC provider sends photos to a cloud AI service for face matching, that cloud service is a sub-processor. You now have a chain of custody that looks like this:

Sub-processor chain

Your app (controller)
  → KYC provider (processor)
    → Amazon Rekognition (sub-processor)
      → AWS internal infrastructure (sub-sub-processor?)
        → Data retention policies you've never read

Each link in that chain is a liability surface. Each has its own data processing agreement, its own retention policy, its own incident response procedure. Under GDPR, you — the controller — are responsible for all of them. Under Brazil's LGPD and India's DPDPA, the obligations are similar: you must know where personal data goes, who processes it, and under what terms.

Then there's data residency. The EU's Schrems II ruling (2020) invalidated the Privacy Shield framework and made cross-border data transfers legally precarious; the 2023 EU-US Data Privacy Framework covers only US recipients certified under it. If your users are in Germany but your KYC provider routes face matching through a US-based cloud API without a valid transfer mechanism, you may have an Article 44 violation. India's DPDPA lets the government restrict transfers of personal data to notified countries, and sectoral regulators can impose stricter localization on top. Brazil requires that data subjects be informed of international transfers.
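The chain-of-custody and residency concerns above can be made concrete with a small audit. This is a minimal sketch, not a compliance tool: the `Hop` type, the region codes, and the `cross_border_hops` helper are hypothetical names introduced for illustration, and a real transfer assessment depends on legal mechanisms this code does not model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    """One entity in the processing chain (all fields are illustrative)."""
    name: str
    role: str      # "controller", "processor", or "sub-processor"
    region: str    # where this hop processes the data, e.g. "EU" or "US"


def cross_border_hops(chain: list[Hop], home_region: str) -> list[Hop]:
    """Return every hop that processes data outside the user's home region."""
    return [hop for hop in chain if hop.region != home_region]


# Hypothetical chains mirroring the diagram above; the regions are assumptions.
cloud_chain = [
    Hop("Your app", "controller", "EU"),
    Hop("KYC provider", "processor", "EU"),
    Hop("Cloud face-matching API", "sub-processor", "US"),
]
self_hosted_chain = [Hop("Your app", "controller", "EU")]

for hop in cross_border_hops(cloud_chain, home_region="EU"):
    print(f"Review transfer basis for: {hop.name} ({hop.role}, {hop.region})")
```

Every hop the function returns is one more agreement, retention policy, and transfer mechanism to document. For the single-entity chain it returns nothing.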

The Clearview AI precedent. Clearview scraped billions of faces from the public internet and sold face recognition to law enforcement. They were fined €20 million by Italy, £7.5 million by the UK, and €20 million by Greece — all for processing biometric data without adequate consent or legal basis. The lesson: regulators are paying attention to face data, and "we sent it to a cloud API" is not a defense.

None of this means cloud KYC providers are acting in bad faith. Most have solid security teams and legitimate data processing agreements. The point is structural: every additional processor in the chain is an additional risk surface, an additional compliance obligation, and an additional entity that has seen your users' faces. If the architecture can avoid that chain entirely, it should.

What Self-Hosted KYC Means

Self-hosted KYC means running the entire identity verification pipeline — face detection, face matching, document OCR, MRZ extraction, anti-spoofing, fraud detection — on infrastructure you own or control. No photos leave your servers. No biometric data transits a third party's network. The ML models run locally, the data stays local, and you control every aspect of retention, encryption, and deletion.

In practice, this means Docker containers running ONNX models on your own hardware. The key technologies that make this viable in 2026:

ONNX Runtime

Microsoft's open-source inference engine runs neural networks on standard CPUs with near-GPU performance for many workloads. No CUDA, no specialized hardware, no cloud dependency.

Open-source models

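ArcFace (face matching), OnnxTR (document OCR), Depth Anything (liveness depth), SCRFD (face detection): production-grade models with freely available weights. Licenses vary by model and by checkpoint, and some pretrained weights are released for non-commercial use only, so confirm the terms before shipping commercially.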

INT8 quantization

Quantized models are 3–4x smaller and roughly 2x faster than their FP32 counterparts, with negligible accuracy loss. This is what makes CPU-only inference viable at production scale.
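To see where the size reduction comes from, here is a toy sketch of symmetric per-tensor INT8 quantization in plain Python. It illustrates the arithmetic only and is not how ONNX Runtime's quantization tooling is invoked. The function names and sample weights are made up for the example.

```python
import struct


def quantize_int8(weights: list[float]) -> tuple[list[int], float]:
    """Symmetric per-tensor quantization: map floats onto [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: list[int], scale: float) -> list[float]:
    """Recover approximate float values from the INT8 codes."""
    return [q * scale for q in quantized]


# Made-up weights, purely for illustration.
weights = [0.12, -0.98, 0.45, 0.003, -0.27, 0.76]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

fp32_bytes = len(struct.pack(f"{len(weights)}f", *weights))  # 4 bytes per value
int8_bytes = len(struct.pack(f"{len(codes)}b", *codes))      # 1 byte per value
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(f"FP32: {fp32_bytes} bytes, INT8: {int8_bytes} bytes")
print(f"Largest round-trip error: {max_error:.4f} (bounded by scale/2 = {scale / 2:.4f})")
```

The 4x ratio here is the ceiling for the weights alone. Real model files keep scales and some tensors in higher precision, which is consistent with the 3–4x range quoted above.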

Container orchestration

Docker Compose or Kubernetes handles deployment. The entire ML pipeline, API, database, and key management system ship as containers. Deploy to any Linux server, on any cloud provider, or on bare metal in your own data center.

FaceVault's architecture is built on exactly this stack. We don't use any cloud AI APIs. Every neural network — ArcFace for face matching, OnnxTR for OCR, Depth Anything for liveness, MediaPipe for face detection — runs locally via ONNX Runtime. The data flow for a verification is:

Self-hosted data flow

User's device
  → Your server (TLS 1.3)
    → Local face detection (ONNX)
    → Local face matching (ONNX)
    → Local OCR + MRZ (ONNX + PassportEye)
    → Local anti-spoofing (multi-signal fusion)
    → Local document fraud analysis
  → Result (never left the server)

Zero third parties. Zero sub-processors. Zero cross-border data transfers. The photo arrives, gets processed, gets encrypted at rest, and eventually gets purged — all on the same machine.
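The "local face matching" step in a flow like this typically reduces to comparing two embedding vectors. The sketch below assumes the embeddings have already been produced by a face model and compares them with cosine similarity against a threshold. The 0.4 threshold, the function names, and the tiny 4-dimensional vectors are placeholders for illustration, not FaceVault's actual values (ArcFace-style embeddings are usually 512-dimensional).

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def is_same_person(id_embedding: list[float], selfie_embedding: list[float],
                   threshold: float = 0.4) -> bool:
    """Hypothetical decision rule: match when similarity clears the threshold."""
    return cosine_similarity(id_embedding, selfie_embedding) >= threshold


# Toy vectors standing in for real embeddings.
id_photo = [0.9, 0.1, 0.3, 0.2]
selfie_same = [0.8, 0.2, 0.35, 0.15]
selfie_other = [-0.2, 0.9, -0.4, 0.1]

print(is_same_person(id_photo, selfie_same))
print(is_same_person(id_photo, selfie_other))
```

Nothing in this comparison needs a network call, which is the point of the architecture: once the embeddings exist locally, the match decision is a few lines of arithmetic.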

The Real Cost Comparison

The cost structures of cloud KYC and self-hosted KYC are fundamentally different. Cloud KYC is a variable cost that scales linearly with volume. Self-hosted KYC is a fixed cost that gets cheaper per verification as volume increases. Understanding where the lines cross is critical for making the right decision.

Factor	Cloud KYC	Self-Hosted KYC
Per-verification cost	$0.50 – $5.00	$0 (after infrastructure)
Infrastructure cost	$0 (provider handles it)	$50 – $200/mo (capable server)
Cost at 100 verifications/mo	$50 – $500	$50 – $200 (fixed)
Cost at 1,000 verifications/mo	$500 – $5,000	$50 – $200 (fixed)
Cost at 10,000 verifications/mo	$5,000 – $50,000	$100 – $400 (may need scaling)
Cost scaling	Linear (O(n))	Flat then step-wise
Engineering cost	Low (SDK integration)	Medium (deployment + ops)

The break-even point depends on your cloud provider's per-verification price, but it's typically in the 500 – 1,000 verifications per month range. Below that, cloud is often cheaper when you factor in the engineering time to deploy and maintain a self-hosted solution. Above that, self-hosted wins — and the gap widens rapidly.

At 10,000 verifications per month, a cloud KYC provider charging $1.50 per check costs you $15,000/month. A self-hosted solution on a $200/month dedicated server handles the same volume for $200/month. That's $177,600 per year in savings. At enterprise scale, the numbers become absurd.
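The arithmetic above is easy to rerun with your own numbers. The following sketch models the two cost curves from the table. The function names are illustrative, and `capacity_per_server` is an assumed figure rather than a measured benchmark.

```python
import math


def cloud_monthly_cost(verifications: int, price_per_check: float) -> float:
    """Cloud KYC: cost grows linearly with volume."""
    return verifications * price_per_check


def self_hosted_monthly_cost(verifications: int, server_cost: float,
                             capacity_per_server: int) -> float:
    """Self-hosted: flat per server, stepping up when capacity is exceeded."""
    servers = max(1, math.ceil(verifications / capacity_per_server))
    return servers * server_cost


def break_even_volume(price_per_check: float, server_cost: float) -> int:
    """Smallest monthly volume at which one server costs no more than cloud."""
    return math.ceil(server_cost / price_per_check)


volume, price, server = 10_000, 1.50, 200.0
cloud = cloud_monthly_cost(volume, price)
# Assumption: one server can absorb 20,000 verifications per month.
hosted = self_hosted_monthly_cost(volume, server, capacity_per_server=20_000)
annual_savings = (cloud - hosted) * 12

print(f"Cloud: ${cloud:,.0f}/mo, self-hosted: ${hosted:,.0f}/mo")
print(f"Annual savings: ${annual_savings:,.0f}")
print(f"Infrastructure-only break-even: {break_even_volume(price, server)} checks/mo")
```

The infrastructure-only break-even at these prices (134 checks per month) is lower than the 500 to 1,000 range quoted above because the functions ignore engineering time for deployment and maintenance.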

Hidden costs to watch for. Cloud KYC providers often charge extra for features that are table stakes in a self-hosted deployment: data retention beyond 30 days, webhook delivery, API access to raw results, additional geographic regions, and "premium" anti-spoofing checks. Read the pricing page carefully.

Self-hosted isn't free, of course. You need someone who understands Docker, ONNX models, and basic ML operations. But you don't need a team of ML engineers. The models are pre-trained and pre-quantized. Deployment is docker compose up. The ongoing maintenance is monitoring, updating container images, and occasionally tuning thresholds.

Privacy Architecture: How FaceVault Does It

Self-hosted KYC is only as private as the implementation. Running models locally is the foundation, but a complete privacy architecture requires encryption, key management, access control, and data lifecycle enforcement. Here's how FaceVault layers these controls:

AES-256-GCM encryption at rest

Every photo, every face embedding, every JSON cache file is encrypted with AES-256-GCM before it touches the filesystem. Unique 12-byte nonce per file. PII fields in the database are encrypted with the same scheme. Photos on disk are .jpg.enc — not JPEG, not openable, just noise.

HashiCorp Vault key management

The master encryption key lives inside HashiCorp Vault's Transit engine — never exported, never in an env var, never on disk. The API uses a Data Encryption Key (DEK) generated by Vault, cached in memory, and re-derived on restart. Scoped token: encrypt/decrypt only, no admin access.

BYOK (Bring Your Own Key)

Pro-tier and above customers can provide their own AES-256 key. Photos and PII are encrypted with the customer's key, wrapped via Vault Transit. Delete your key and all your data becomes cryptographically irrecoverable: instant crypto-shredding, no data recovery possible, even by us.

Auto-purge with configurable retention

Photos are automatically purged after the retention window (7–90 days, tier-dependent). Purge is irreversible: files deleted from disk, paths NULLed in the database, face embeddings wiped, PII fields cleared. Not a soft delete — a shutil.rmtree() on the session directory.
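A retention sweep of this kind can be sketched in a few lines. This is an illustration under stated assumptions, not FaceVault's code: the one-directory-per-session layout and the `purge_expired_sessions` name are hypothetical, age is judged by directory modification time for simplicity, and the database steps (NULLing paths, wiping embeddings, clearing PII) are not modeled.

```python
import shutil
import time
from pathlib import Path
from typing import Optional


def purge_expired_sessions(root: Path, retention_days: int,
                           now: Optional[float] = None) -> list[str]:
    """Hard-delete session directories older than the retention window.

    Returns the names of the purged directories so the caller can clear
    the matching database rows in the same job.
    """
    now = time.time() if now is None else now
    cutoff = now - retention_days * 86_400
    purged = []
    for session_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        if session_dir.stat().st_mtime < cutoff:
            shutil.rmtree(session_dir)  # irreversible: files are gone, not flagged
            purged.append(session_dir.name)
    return purged


# Example call (the path is a placeholder):
# purge_expired_sessions(Path("/var/lib/kyc/sessions"), retention_days=30)
```

Returning the purged names rather than just a count makes it easier to keep disk and database in step, since a row that still points at a deleted file is its own kind of bug.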

Tor hidden service

For maximum privacy, FaceVault is accessible via a Tor hidden service. End-to-end encrypted at the network layer, no exit nodes, no DNS leaks. Users who need it — journalists, activists, whistleblowers — can verify their identity without revealing their network location.

Zero telemetry, zero cloud AI

No cloud AI APIs in the ML pipeline. No analytics. No telemetry. No phone-home. The server doesn't call out to any external service during verification. The only network traffic is between the user's device and the API server.

The privacy stack in summary: local inference (no third-party processors) + encryption at rest (AES-256-GCM + Vault Transit) + BYOK (customer-controlled keys) + auto-purge (configurable retention) + Tor (network anonymity). Each layer is independent. Each assumes the others might fail. Defense in depth, not a single point of trust.

When Cloud KYC Makes Sense

It would be dishonest to pretend self-hosted is always the right answer. Cloud KYC exists for good reasons, and there are legitimate scenarios where it's the better choice:

Low volume (<100 verifications/month)

At low volume, the fixed cost of running your own infrastructure is hard to justify. A cloud provider charging $2 per verification costs you $200/month for 100 checks, about the same as a server at the top of the range above, and with zero ops burden.

No engineering team

If you don't have anyone who can deploy Docker containers, monitor a server, and debug ML pipeline issues, a managed cloud service is pragmatically the right call. Self-hosted requires some infrastructure competency.

Speed to market

You're a startup that needs KYC live by next Tuesday. A cloud SDK integration can ship in hours. A self-hosted deployment takes days, at minimum. If time is your scarcest resource, cloud buys you that time.

Regulatory acceptance of cloud processing

Some jurisdictions and compliance frameworks accept cloud processing as long as adequate DPAs are in place. If your regulator is comfortable with it and your users don't care, the cloud path has less friction.

The honest answer is: cloud KYC is easier for small teams with low volume and no privacy-sensitive users. If that describes you today, start with a cloud provider. You can always migrate to self-hosted later when volume grows or regulatory requirements tighten. The important thing is understanding the trade-offs you're making, not pretending they don't exist.

When Self-Hosted KYC Wins

Self-hosted isn't for everyone. But for the scenarios where it matters, it matters a lot:

✓ High volume (cost savings): Above 500–1,000 verifications per month, the economics favor self-hosted decisively. At 10K+ per month, you're saving tens of thousands of dollars annually. At 100K+, the savings fund your entire engineering team.

✓ Regulated industries: Banking, healthcare, insurance, government. Regulators increasingly ask where biometric data is processed, not just whether it's protected. "It never leaves our infrastructure" is the cleanest answer to a compliance audit.

✓ GDPR and data sovereignty requirements: If you need to guarantee that biometric data never leaves a specific jurisdiction, self-hosted is the only architecture that can make that guarantee without caveats. Deploy the server in Frankfurt, and the data stays in Frankfurt. Full stop.

✓ Privacy-sensitive users: Cryptocurrency exchanges, VPN providers, privacy-focused platforms, whistleblower systems. Your users chose your product because of privacy. Sending their faces to a cloud AI API undermines your entire value proposition.

✓ Military and government applications: Air-gapped networks and SCIFs cannot call out to external APIs by definition, and FedRAMP-scoped systems can only use authorized services. In these environments self-hosted isn't a preference; it's often the only option.

✓ Multi-region deployments: If you serve users globally and need to guarantee data residency per region, self-hosted gives you a simple solution: deploy a server in each region. EU data stays on the EU server. APAC data stays on the APAC server. No routing tables, no complex DPAs, no cross-border transfer mechanisms.

The pattern: self-hosted KYC wins when data control matters more than integration speed. If your users, regulators, or security team asks "where are the faces processed?" — you want the answer to be "on our servers, and only our servers."

The Hybrid Approach

There's a false binary in the "cloud vs self-hosted" debate. Most teams don't actually want to run their own Kubernetes cluster with ML models. They want the privacy benefits of self-hosted without the operational burden. This is where a hybrid approach — or more precisely, an API-hosted service built on self-hosted principles — fills the gap.

FaceVault's model works like this:

FaceVault: API-hosted, self-hosted principles

✓ API-hosted: We run the infrastructure. You call our API. No Docker, no server management, no ML ops on your side.

✓ All ML on our metal: Every neural network runs on servers we own. Not AWS, not GCP, not Azure. Bare metal, in a European data center, under our physical control.

✓ Zero cloud sub-processors: No Rekognition, no Cloud Vision, no Textract. The entire ML pipeline is self-contained. Your data touches exactly one entity: us.

✓ Encryption at rest: AES-256-GCM with Vault Transit key management. BYOK for Pro+ customers. Crypto-shredding on key deletion.

✓ Auto-purge: Configurable retention with irreversible deletion. GDPR Article 17 compliance built into the architecture, not bolted on after.

✓ Self-hosted option: Enterprise customers who need data on their own infrastructure can deploy FaceVault on-premise. Same codebase, same models, your hardware.

This gives most teams the best of both worlds: the convenience of an API integration (no ML ops, no infrastructure management) with the privacy guarantees of a self-hosted solution (no cloud sub-processors, encryption at rest, data sovereignty). The only third party in the chain is FaceVault — and we're transparent about exactly what we do with the data, how it's encrypted, and when it's purged.

For teams that need absolute control — air-gapped networks, government classified environments, or compliance frameworks that prohibit any external processor — the self-hosted deployment option exists. Same Docker containers, same ONNX models, your hardware, your network, your rules.

Migration Checklist

If you're currently using a cloud KYC provider and considering a move to self-hosted, here's what the migration involves. This isn't trivial, but it's well-defined work with a clear end state.

Hardware requirements

A modest dedicated server or VPS handles it comfortably. No GPU required — INT8 quantized ONNX models run efficiently on commodity x86 CPUs. Scale vertically by adding cores and RAM as volume grows.

Data migration

Export verification results from your current provider. Most providers offer CSV or API exports. You probably don't need to migrate the raw photos — just the verification decisions, extracted data, and session metadata. Historical photos should be purged, not migrated.
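Dropping image references during the export step can be as simple as copying only the columns you intend to keep. The sketch below is illustrative: the column names and sample rows are hypothetical, and a real provider export will use different fields.

```python
import csv
import io

# Hypothetical column names; adjust to your provider's export format.
KEEP_COLUMNS = ("session_id", "decision", "document_type", "verified_at")


def strip_export(source_csv: str, keep: tuple[str, ...] = KEEP_COLUMNS) -> str:
    """Copy only the listed columns, discarding photo URLs and anything else."""
    reader = csv.DictReader(io.StringIO(source_csv))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(keep), extrasaction="ignore",
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


# Made-up export with two image columns that should not be migrated.
export = (
    "session_id,decision,document_type,verified_at,selfie_url,id_photo_url\n"
    "s-001,passed,passport,2026-01-14T09:30:00Z,https://example.invalid/a.jpg,https://example.invalid/b.jpg\n"
    "s-002,failed,id_card,2026-01-15T11:02:00Z,https://example.invalid/c.jpg,https://example.invalid/d.jpg\n"
)
print(strip_export(export))
```

An allow-list of columns is safer here than a deny-list: if the provider adds a new image or biometric field later, it is dropped by default.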

Testing pipeline accuracy

Before going live, run your self-hosted pipeline against a test set of known-good and known-bad verifications. Compare acceptance rates, rejection rates, and false positive/negative ratios against your cloud provider's historical performance. Tune thresholds until your accuracy meets your requirements.
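Threshold tuning against a labelled test set comes down to two error rates. The sketch below computes the false accept rate (FAR) and false reject rate (FRR) at a given similarity threshold and picks the lowest threshold that stays within a FAR budget. The scores are made up, and the function names are illustrative rather than part of any particular pipeline.

```python
def error_rates(genuine_scores: list[float], impostor_scores: list[float],
                threshold: float) -> tuple[float, float]:
    """Return (false_accept_rate, false_reject_rate) at a similarity threshold."""
    false_accepts = sum(s >= threshold for s in impostor_scores)
    false_rejects = sum(s < threshold for s in genuine_scores)
    return (false_accepts / len(impostor_scores),
            false_rejects / len(genuine_scores))


def lowest_threshold_meeting_far(genuine_scores, impostor_scores,
                                 candidates, max_far):
    """Pick the smallest candidate threshold whose FAR is within budget."""
    for threshold in sorted(candidates):
        far, frr = error_rates(genuine_scores, impostor_scores, threshold)
        if far <= max_far:
            return threshold, far, frr
    return None  # no candidate satisfies the budget


# Made-up similarity scores: same-person pairs and different-person pairs.
genuine = [0.82, 0.74, 0.69, 0.91, 0.55, 0.78, 0.66, 0.88]
impostor = [0.12, 0.31, 0.44, 0.08, 0.52, 0.27, 0.19, 0.36]

for t in (0.3, 0.4, 0.5, 0.6):
    far, frr = error_rates(genuine, impostor, t)
    print(f"threshold={t:.1f}  FAR={far:.3f}  FRR={frr:.3f}")
```

Raising the threshold can only lower FAR and raise FRR, so the lowest threshold that meets the FAR budget is also the one that rejects the fewest genuine users.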

Compliance documentation

Update your privacy policy, data processing agreements, and DPIA (Data Protection Impact Assessment). The good news: your documentation gets simpler, because you're removing processors from the chain. Update your ROPA (Records of Processing Activities) to reflect the new architecture.

Monitoring and alerting

Set up monitoring for inference latency, memory usage, disk space, and verification success rates. The ML pipeline is deterministic, so if accuracy drops, something changed: model corruption, resource exhaustion, configuration drift, or a shift in the inputs such as new document types or camera quality. Structured logging with timestamps makes debugging straightforward.
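Two of those signals, inference latency and verification success rate, can be tracked with a small rolling window. This is a minimal sketch: the `PipelineMonitor` class is hypothetical, and the window size, latency limit, and success-rate floor are placeholder values to be replaced with your own baselines.

```python
import statistics
from collections import deque


class PipelineMonitor:
    """Rolling window over recent verifications (thresholds are placeholders)."""

    def __init__(self, window: int = 500, p95_limit_ms: float = 2000.0,
                 min_success_rate: float = 0.85):
        self.latencies_ms = deque(maxlen=window)
        self.outcomes = deque(maxlen=window)
        self.p95_limit_ms = p95_limit_ms
        self.min_success_rate = min_success_rate

    def record(self, latency_ms: float, succeeded: bool) -> None:
        """Store one verification; the oldest entry falls out of the window."""
        self.latencies_ms.append(latency_ms)
        self.outcomes.append(succeeded)

    def alerts(self) -> list[str]:
        """Return human-readable alerts for the current window."""
        problems = []
        if len(self.latencies_ms) >= 2:
            # With n=20 the last cut point is the 95th percentile.
            p95 = statistics.quantiles(self.latencies_ms, n=20)[-1]
            if p95 > self.p95_limit_ms:
                problems.append(f"p95 latency {p95:.0f} ms exceeds limit")
        if self.outcomes:
            rate = sum(self.outcomes) / len(self.outcomes)
            if rate < self.min_success_rate:
                problems.append(f"success rate {rate:.0%} below floor")
        return problems


monitor = PipelineMonitor(window=100)
for _ in range(100):
    monitor.record(latency_ms=300.0, succeeded=True)
print(monitor.alerts())
```

Memory and disk space are usually better left to host-level tooling. The pipeline-specific value is in watching the success rate, since that is the signal generic infrastructure monitoring does not know about.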

Team training

Your ops team needs to understand the basics: how to restart the pipeline, how to read verification logs, how to handle failed verifications, where to find trust scores and anti-spoofing signals. This is a day of training, not a month. The system is designed to be operationally simple.

Parallel run (recommended)

Run both systems in parallel for 2–4 weeks. Route a percentage of traffic to the self-hosted pipeline and compare results. This catches edge cases, calibration issues, and operational gaps before you fully cut over. Cloud providers typically bill per verification, so the overlap cost is manageable.
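One way to implement the traffic split is to hash the session ID into a bucket, so a given session always lands in the same group. The sketch below assumes sampled sessions are evaluated by both pipelines so their decisions can be paired. The function names, session ID format, and decision labels are all illustrative.

```python
import hashlib


def in_parallel_sample(session_id: str, percent: int) -> bool:
    """Deterministically decide whether a session joins the parallel run.

    Hashing the session ID keeps the assignment stable across retries,
    unlike a random draw per request.
    """
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < percent


def agreement_rate(pairs: list[tuple[str, str]]) -> float:
    """Share of sessions where both pipelines returned the same decision."""
    if not pairs:
        raise ValueError("no paired decisions to compare")
    return sum(cloud == hosted for cloud, hosted in pairs) / len(pairs)


# Made-up paired decisions: (cloud result, self-hosted result).
paired = [("passed", "passed"), ("failed", "failed"),
          ("passed", "review"), ("passed", "passed")]
print(f"Agreement: {agreement_rate(paired):.0%}")

sample = [f"session-{i}" for i in range(10_000)]
share = sum(in_parallel_sample(s, 10) for s in sample) / len(sample)
print(f"Share of sessions in the parallel run: {share:.1%}")
```

The disagreements are the useful output: each mismatched pair is a candidate edge case or calibration issue to review before cutover.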

Realistic timeline: For a team with basic Docker/infrastructure experience, expect 1–2 weeks for initial deployment and testing, plus 2–4 weeks of parallel run. Total migration: roughly 3–6 weeks. The engineering cost is front-loaded; once deployed, the ongoing operational burden is minimal.

Making the Decision

Here's the framework, distilled:

If you...	Consider...
Do <100 verifications/month	Cloud KYC or FaceVault Free tier (50/month)
Do 100–1,000/month with privacy needs	FaceVault API-hosted (self-hosted principles, no ops burden)
Do 1,000+/month and want cost savings	Self-hosted or FaceVault Pro (break-even point)
Are in a regulated industry	Self-hosted or FaceVault with BYOK
Need air-gapped / on-premise deployment	Self-hosted (enterprise license)
Serve privacy-sensitive users (crypto, VPN)	FaceVault API-hosted (Tor support, zero cloud AI)
Need maximum speed to market	Cloud KYC (ship now, migrate later if needed)

The question isn't "is self-hosted better?" in the abstract. It's "does the privacy, cost, and control benefit of self-hosted KYC outweigh the operational simplicity of cloud KYC for your specific situation?" For many teams, the answer is yes — especially as regulatory pressure around biometric data continues to tighten globally.

The trend is clear. The EU AI Act classifies remote biometric identification as high-risk. India's DPDPA lets the government restrict cross-border transfers of personal data. Brazil's LGPD imposes strict consent requirements for biometric processing. Illinois' BIPA has generated over $5 billion in settlements. Every year, the regulatory bar for handling face data gets higher. Self-hosted KYC, or at minimum KYC with no cloud sub-processors, is increasingly the path of least regulatory friction.

Start with the question regulators will eventually ask: "Where are your users' faces processed, who has access, and can you prove it?" If your current architecture can answer that cleanly, you're in good shape. If it can't, it might be time to consider a different approach.

Ready to See How It Works?

FaceVault gives you the privacy of self-hosted KYC with the convenience of an API. No cloud AI. Encryption at rest. Auto-purge. BYOK. Tor support. Start with 50 free verifications per month — no credit card required.


References & Further Reading

Why We Don't Use Cloud AI APIs — local inference, zero third-party processors

Your Face Is Encrypted Before It Hits Disk — AES-256-GCM + Vault Transit key management

Building Privacy-First KYC: Why We Delete Your Face — auto-purge, retention windows, verify-then-forget

FaceVault Is Now on Tor — .onion hidden service for maximum network privacy

We Made Our AI 3x Faster by Making It Dumber — INT8 quantization makes CPU-only inference viable

Deepfake Defense: An IDS/IPS for Identity Verification — the 12-signal anti-spoofing pipeline

GDPR Article 28 — Processor Obligations — sub-processor chains and controller responsibilities

GDPR Article 44 — Transfer of Personal Data — cross-border data transfer requirements

ONNX Runtime — open-source ML inference engine for CPU and GPU
