Engineering 24 February 2026 · 12 min read

We Can See Your Heartbeat Through Your Camera

Your face changes colour 72 times a minute. You can't see it. We can. This is the story of how we use remote photoplethysmography — detecting blood flow through a standard webcam — to prove that the person on the other side of the camera is alive.

The Invisible Pulse

Every time your heart beats, it pushes a wave of oxygenated blood through your capillaries. When that wave reaches the capillaries in your face, it changes the amount of light your skin absorbs — specifically in the green channel. The change is tiny: about 0.1-0.5% of the total reflected light. Invisible to the human eye.

But not invisible to a camera sensor recording at 30 frames per second.

If you extract the average green pixel intensity from a face region across 90 frames, you get a noisy time series. Apply a bandpass filter to isolate the 0.7–4.0 Hz range (42–240 BPM), run an FFT, and a peak emerges at the person's heart rate. That peak is a physiological fingerprint that cannot be faked by a photograph, a screen replay, or a deepfake video that doesn't model subcutaneous blood flow.

This technique is called remote photoplethysmography — rPPG. It was first demonstrated by Verkruysse et al. in 2008 using a $30 webcam and ambient light. Nearly two decades later, it's one of the most powerful liveness signals available — and FaceVault runs it on every camera-based verification session.

The intuition. A pulse oximeter clips to your finger and shines a light through it to measure blood flow. rPPG does the same thing — except the "light source" is ambient room lighting, the "sensor" is the camera, and the "finger" is your face. Same physics, no contact required.

Why It Matters for KYC

The core problem in identity verification isn't matching faces. ArcFace does that with 99.83% accuracy. The problem is: is the face real?

A printed photograph has no pulse. A screen replay of a recorded video has no pulse. A deepfake generated frame-by-frame has no pulse. Even a sophisticated real-time face swap running through OBS has no pulse — the face swapping algorithm preserves skin colour but doesn't model the micro-fluctuations caused by blood flow.

rPPG is the one signal that requires an actual cardiovascular system on the other side of the camera. You can't fake physics.

Printed photo: DEAD. Zero temporal variation. The FFT shows a flat noise floor, with no peak anywhere in the 0.7–4.0 Hz band.

Screen replay: DEAD. The screen refresh rate (60 Hz) dominates. Any residual "pulse" from the original video is buried under display artefacts and moiré patterns.

Deepfake / face swap: DEAD. GAN-generated frames model skin tone, not hemodynamics. The colour variations that encode blood flow are treated as noise and smoothed out by the generator.

Capturing 90 Frames in 3 Seconds

rPPG capture happens client-side, inside the user's browser, during the selfie step. While the user is looking at the camera for their selfie photo, we're quietly recording 90 frames at 30 FPS in the background. The user doesn't notice — it takes exactly 3 seconds.

Client-side capture (simplified)
// 128x96 canvas — tiny frames, fast upload
const canvas = document.createElement('canvas');
canvas.width = 128;
canvas.height = 96;
const ctx = canvas.getContext('2d');
const frames = [];

// Capture at ~30 FPS until 90 frames are collected (~3 seconds)
const interval = setInterval(() => {
  // video: the <video> element playing the getUserMedia stream
  ctx.drawImage(video, 0, 0, 128, 96);
  canvas.toBlob(blob => frames.push(blob),
    'image/jpeg', 0.6);
  if (frames.length >= 90) clearInterval(interval);
}, 33);  // ~30 FPS

// SHA-256 hash chain for tamper detection
// hash[i] = SHA256(hash[i-1] || frame[i])

Each frame is 128×96 pixels at JPEG quality 0.6 — roughly 4 KB per frame. The entire 90-frame payload is about 360 KB. That's less than a single high-res selfie.

The frames are uploaded in a single batch to POST /{session_id}/rppg. The server validates each frame has a valid JPEG SOI header (0xFF 0xD8 0xFF) and silently discards any that don't — no error messages, no oracle for attackers to probe.

Hash chain integrity. Every frame is incrementally hashed using the Web Crypto API: hash[i] = SHA256(hash[i-1] || frame[i]). This creates a tamper-evident chain — reordering, dropping, or substituting even a single frame breaks the chain. The final hash is sent alongside the frames for server-side verification.
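Server-side, recomputing the chain takes only a few lines. Here is a minimal sketch in Python with hashlib, assuming the chain is seeded with an empty byte string and the client reports the final hash as hex (both details are assumptions, not specified above):

```python
import hashlib

def verify_hash_chain(frames: list[bytes], claimed_final_hash: str) -> bool:
    """Recompute hash[i] = SHA256(hash[i-1] || frame[i]) over the received
    frames and compare against the final hash the client sent."""
    h = b""  # assumed seed: the chain starts from an empty byte string
    for frame in frames:
        h = hashlib.sha256(h + frame).digest()
    return h.hex() == claimed_final_hash
```

Because each link folds in the previous digest, reordering, dropping, or substituting any one frame changes every subsequent hash, so only the byte-exact sequence the browser captured reproduces the final value.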

The POS Method: Extracting Pulse from Pixels

The server receives 90 tiny JPEG frames. Now we need to extract a blood volume pulse signal from pixel data. We use the Plane Orthogonal to Skin (POS) method by Wang et al. (2017) — a colour-space projection specifically designed for rPPG.

Here's the pipeline, step by step:

Step 1: Extract colour signals

For each frame, crop the centre 50% of the face region (avoiding hair and background). Compute the mean R, G, and B pixel values. This gives us three time series of 90 values each: R(t), G(t), B(t).

Step 2: Normalise

Divide each channel by its temporal mean to remove the static skin colour component. What remains is the relative fluctuation — the tiny periodic changes caused by blood flow. A person with dark skin and a person with light skin will produce different absolute RGB values, but after normalisation, the pulse signal has comparable amplitude.

Step 3: POS projection

This is the key insight from Wang et al. The blood volume pulse affects R, G, and B channels differently. POS projects the normalised signals onto two orthogonal axes that separate pulse from motion noise:

S1 = G(t) - B(t)
S2 = G(t) + B(t) - 2×R(t)
α = std(S1) / std(S2)
pulse = S1 + α × S2

The adaptive α parameter balances the two components based on the signal's own statistics. This makes POS robust to different skin tones, lighting conditions, and camera white balance settings.
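Steps 1–3 condense to a few lines of NumPy. This is a simplified sketch: the published POS method applies the projection in short overlapping windows and overlap-adds the results, whereas here it runs once over the whole 90-frame clip:

```python
import numpy as np

def pos_pulse(rgb: np.ndarray) -> np.ndarray:
    """POS projection on a (T, 3) array of per-frame mean R, G, B values.
    Returns a raw pulse signal with one value per frame."""
    # Step 2: temporal normalisation removes the static skin colour
    norm = rgb / rgb.mean(axis=0)
    r, g, b = norm[:, 0], norm[:, 1], norm[:, 2]
    # Step 3: project onto the two POS axes and combine adaptively
    s1 = g - b
    s2 = g + b - 2.0 * r
    alpha = np.std(s1) / (np.std(s2) + 1e-9)
    return s1 + alpha * s2
```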

Step 4: Bandpass filter

Apply a 3rd-order Butterworth bandpass filter (0.7–4.0 Hz). The low cutoff at 0.7 Hz (42 BPM) removes breathing and slow lighting changes. The high cutoff at 4.0 Hz (240 BPM) removes camera noise and electrical interference. What survives this filter is, almost exclusively, the cardiac pulse signal.

Why Butterworth? Maximally flat magnitude response in the passband — no ripples that could distort the pulse waveform. We use scipy's filtfilt (forward-backward filtering) for zero phase distortion.
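As a sketch, the filter stage with the parameters above (3rd-order Butterworth, 0.7–4.0 Hz, scipy's filtfilt) looks like this:

```python
import numpy as np
from scipy.signal import butter, filtfilt

FS = 30.0  # camera frame rate in Hz

def bandpass_pulse(signal, low=0.7, high=4.0, order=3):
    """Butterworth bandpass run forward and backward (filtfilt), so the
    filter contributes zero phase distortion to the pulse waveform."""
    nyq = FS / 2.0
    b, a = butter(order, [low / nyq, high / nyq], btype="band")
    return filtfilt(b, a, signal)
```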

Finding the Heartbeat in the Frequency Domain

After filtering, we have a clean-ish pulse signal in the time domain. But we need to measure it. Is there actually a heartbeat in here, or just noise?

We run an FFT (Fast Fourier Transform) on the filtered signal, zero-padded to the next power of 2 (at least 256 points). This transforms the time-domain pulse into a frequency spectrum — a graph showing how much energy is present at each frequency.

What the FFT reveals

Live person: a clear dominant peak at the cardiac frequency, e.g. 1.2 Hz (72 BPM). SNR > 5 dB.

Printed photo: a flat noise floor. No dominant frequency. SNR < 1 dB.

We measure two things from the power spectrum:

Peak frequency — the dominant frequency in the 0.7–4.0 Hz band. Multiply by 60 to get BPM. A real person at rest will produce a peak between 50 and 120 BPM.

Signal-to-Noise Ratio (SNR) — how much the peak stands out above the surrounding noise. A real pulse typically produces an SNR above 5 dB. A photograph produces an SNR below 1 dB — there's nothing to detect but sensor noise.
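Both measurements fall straight out of the power spectrum. A sketch, with one assumption flagged: the noise floor here is the mean power of the remaining in-band bins, which may differ from the production SNR estimate:

```python
import numpy as np

FS = 30.0  # frames per second

def spectrum_peak(pulse, low=0.7, high=4.0):
    """Zero-pad to at least 256 points, FFT, and return (BPM, SNR in dB)
    for the dominant frequency in the cardiac band."""
    n = max(256, 1 << (len(pulse) - 1).bit_length())  # next power of 2
    freqs = np.fft.rfftfreq(n, d=1.0 / FS)
    power = np.abs(np.fft.rfft(pulse, n)) ** 2
    band = (freqs >= low) & (freqs <= high)
    band_freqs, band_power = freqs[band], power[band]
    i = int(np.argmax(band_power))
    bpm = band_freqs[i] * 60.0
    # Assumed SNR definition: peak power vs mean of the other in-band bins
    noise = np.mean(np.delete(band_power, i)) + 1e-12
    snr_db = 10.0 * np.log10(band_power[i] / noise)
    return bpm, snr_db
```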

From Spectrum to Score

The FFT gives us a peak frequency and an SNR. The scoring function converts these into a single rPPG score from 0.0 to 1.0:

SNR scoring (70% weight)

SNR > 7 dB → 1.0
SNR > 5 dB → 0.8
SNR > 3 dB → 0.6
SNR ≤ 3 dB → 0.2

BPM plausibility (30% weight)

50–120 BPM → 1.0
40–150 BPM → 0.7
Outside that range → 0.3
Final score
rppg_score = snr_score × 0.70 + bpm_score × 0.30

# Example: SNR = 6.5 dB (score 0.8), BPM = 72 (score 1.0)
# rppg_score = 0.8 × 0.7 + 1.0 × 0.3 = 0.86

SNR carries 70% of the weight because it's the primary indicator of whether a genuine cardiac signal is present. BPM plausibility is a sanity check — even a noisy signal should produce a biologically plausible heart rate if it's real.
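The tiers transcribe directly into a scoring function. How values sitting exactly on a threshold are handled is an assumption; the weights and tier values come from the tables above:

```python
def rppg_score(snr_db: float, bpm: float) -> float:
    """Combine SNR (70%) and BPM plausibility (30%) into a 0.0-1.0 score."""
    if snr_db > 7.0:
        snr = 1.0
    elif snr_db > 5.0:
        snr = 0.8
    elif snr_db > 3.0:
        snr = 0.6
    else:
        snr = 0.2
    if 50.0 <= bpm <= 120.0:
        plaus = 1.0
    elif 40.0 <= bpm <= 150.0:
        plaus = 0.7
    else:
        plaus = 0.3
    return snr * 0.70 + plaus * 0.30
```

Running the worked example, rppg_score(6.5, 72) reproduces the 0.86 above.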

One Signal in a 12-Signal Orchestra

rPPG doesn't work alone. It's one of 12 anti-spoofing signals in FaceVault's fusion engine, each exploiting a different physical property that attacks can't simultaneously satisfy:

rPPG: blood flow
Eye specular: corneal reflections
Blendshapes: micro-expressions
GAN texture: spectral forensics
Depth: 3D geometry
Blink: eye closure
Moiré FFT: screen detection
ELA: splice detection
+4 more: EXIF, noise, colour, JPEG

rPPG carries a 10% weight in the fusion. That might seem low — but it's by design. Not every session can generate rPPG data (file uploads from older devices, browsers that block camera access). The fusion engine normalises weights across available signals only, so when rPPG is present, it pulls its weight. When it's absent from a camera session, a missing rPPG penalty of 0.15 is injected — because a live camera session that produces no heartbeat signal is suspicious.
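Here is a sketch of that renormalisation. The signal names and the exact mechanism for applying the penalty are illustrative assumptions; only the 10% rPPG weight and the 0.15 penalty figure come from the text:

```python
def fuse(scores: dict, weights: dict, camera_session: bool) -> float:
    """Weighted fusion over available signals only: weights for missing
    signals are dropped and the remainder renormalised to sum to 1.
    A camera session with no rPPG data takes a flat penalty."""
    avail = {k: w for k, w in weights.items() if k in scores}
    total = sum(avail.values())
    fused = sum(scores[k] * w / total for k, w in avail.items())
    if camera_session and "rppg" not in scores:
        fused -= 0.15  # missing-rPPG penalty for live camera sessions
    return max(0.0, fused)
```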

Defense in depth. An attacker who somehow fools the rPPG signal still has to simultaneously produce correct corneal reflections, micro-expression dynamics, 3D depth geometry, and texture patterns that don't match known GAN fingerprints. Read more in our Deepfake Defense deep dive.

What It Defeats

Printed photographs: BLOCKED. Zero temporal variation in skin colour. The FFT shows nothing but a flat noise floor. SNR < 1 dB, score ≈ 0.14.

Screen replays: BLOCKED. The screen's refresh rate (50/60 Hz) adds a dominant artefact far above the cardiac band. Any residual pulse from the original recording is destroyed by the display's gamma curve and backlight PWM.

Deepfakes & face swaps: BLOCKED. GAN generators model appearance, not hemodynamics. The sub-pixel colour fluctuations that encode blood flow are treated as noise by the generator and smoothed away. No generator architecture in production today preserves rPPG signals.

Silicone masks: WEAKENED. Thin masks may transmit some blood flow signal from the wearer's face underneath, but the signal is severely attenuated (SNR 2–4 dB vs a typical 5–7 dB), pushing the score into the review band. Combined with depth and texture signals, masks are consistently flagged.

Where It Struggles

rPPG isn't magic. It has real limitations, and we think being honest about them matters more than marketing:

Low light

In very dim environments (< 50 lux), the camera sensor's noise floor overwhelms the pulse signal. SNR drops below 3 dB and the score bottoms out. This is physics — no algorithm can extract signal from noise that isn't there.

Excessive head movement

The current implementation uses a fixed ROI (centre 50%). Large head movements shift the face out of the ROI, causing discontinuities in the colour signal. Face-tracking ROI would help — it's on the roadmap.

Browser compatibility

rPPG requires getUserMedia() and a stable video stream. Some browsers, privacy extensions, or corporate firewalls block camera access. When rPPG frames can't be captured, the system falls back to the remaining 11 signals. No user is ever rejected solely because rPPG was unavailable.

Graceful degradation. FaceVault's fusion engine is designed for missing signals. If rPPG fails, the weights redistribute proportionally across the remaining signals. A session can still pass with a high trust score — rPPG just makes the decision more confident when it's available.

By the Numbers

Frames captured: 90
Capture duration: 3 s
Total payload: 360 KB
Bandpass range: 0.7–4.0 Hz
SNR of a real pulse: 5+ dB
Server analysis time: 50 ms

Your Face Proves You're Alive

Remote photoplethysmography is one of those technologies that sounds like science fiction until you see the math. Extract colour channels from video frames. Normalise. Project onto a colour space tuned for hemodynamics. Filter. FFT. Read the peak.

And there it is — a heartbeat, measured through a webcam, from 90 frames captured in 3 seconds, weighing less than a single photograph. No special hardware. No infrared sensors. No finger clips. Just physics, signal processing, and the fact that your skin changes colour with every beat of your heart.

That's the liveness signal that deepfakes can't fake. And it's running on every FaceVault verification session, right now.

References & Further Reading

Algorithmic Principles of Remote PPG — Wang et al., IEEE TBME 2017 (the POS method used by FaceVault)

Remote plethysmographic imaging using ambient light — Verkruysse et al., Optics Express 2008 (first webcam rPPG demonstration)

DeepFake Detection: A Survey — Tolosana et al., IEEE TIFS 2020

DeepFakesON-Phys: rPPG for Face Forgery Detection — Hernández-Ortega et al., 2020 (rPPG as a deepfake detector)

Deepfake Defense: An IDS/IPS for Identity Verification — the full 12-signal anti-spoofing pipeline

How FaceVault Verifies a Face in Under 30 Seconds — the verification pipeline this signal feeds into