Your Devices Disagree. Here's Why That Matters.
If you wore an Apple Watch and Oura Ring to bed last night, you got two different sleep times. Your Garmin and Apple Watch show different resting heart rates. This is not a failure of any single device - it is a fundamental reality of consumer health measurement that nobody talks about.
The Measurement Problem
Every health metric you track is an estimate, not a measurement. A polysomnography lab measures sleep. Your Apple Watch infers sleep from movement patterns and heart rate. Different sensors, different algorithms, different estimates. This is a consequence of how each sensor works, not a bug.
The problem compounds with multiple data sources. If Apple Health has sleep data from your watch, phone, mattress sensor, and a third-party app, it timestamps everything but does not reconcile. You end up with four overlapping sleep records and zero clarity on which one is right.
Same night. Three devices. Three different answers.
35-minute spread in total sleep time. 38-minute spread in deep sleep. 4 bpm spread in resting heart rate. All from the same night, on the same person.
Sleep: The Hardest Reconciliation Problem
Sleep staging is the single hardest inference problem in consumer health. No wrist sensor or ring can directly measure brain waves. Everything you see in your sleep app is a probability estimate built on proxy signals.
Polysomnography
EEG electrodes on scalp measure brain waves directly. Delta waves indicate deep sleep. Mixed frequency with rapid eye movement indicates REM.
Only method that truly measures sleep stages. Everything else is inference from proxy signals.
Wrist PPG + Accelerometer
Optical heart rate sensor on wrist plus motion detection. Algorithms classify stillness combined with low HR and specific HRV patterns into stages.
Validation studies show 78-82% agreement with PSG for sleep/wake detection, dropping to 60-70% for stage classification.
Ring PPG + Temperature
Finger arteries sit closer to the skin surface, producing a cleaner PPG signal. Skin temperature naturally drops during deep sleep, providing an additional classification signal.
Slight edge on some overnight metrics due to finger vascular anatomy, but carries its own biases in position detection.
Mattress Ballistocardiography
Pressure sensors embedded in the mattress detect micro-movements from heartbeat and breathing patterns. No skin contact required.
Good at total sleep time estimation. Weaker on stage classification because it lacks direct cardiovascular measurement.
Phone Proximity
Detects when the phone is placed down and picked up. Uses ambient light sensor and touch screen inactivity. No biometric data whatsoever.
Least accurate by a wide margin. Cannot detect when you actually fell asleep, only when you stopped using your phone.
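The agreement percentages quoted for wrist devices are usually computed epoch by epoch: the night is cut into short windows (30 seconds is the standard PSG scoring epoch), and each device label is compared with the PSG label for the same window. A minimal sketch of that calculation follows. The hypnograms are made-up illustrative data, and the function names are hypothetical, not taken from any validation study or vendor SDK.

```python
def epoch_agreement(reference, device):
    """Fraction of epochs where the device label matches the reference label."""
    if len(reference) != len(device):
        raise ValueError("hypnograms must cover the same epochs")
    if not reference:
        raise ValueError("hypnograms must not be empty")
    matches = sum(r == d for r, d in zip(reference, device))
    return matches / len(reference)


def to_sleep_wake(hypnogram):
    """Collapse stage labels into a two-class sleep/wake hypnogram."""
    return ["wake" if stage == "wake" else "sleep" for stage in hypnogram]


# Illustrative 10-epoch hypnograms (invented for this example).
psg = ["wake", "light", "light", "deep", "deep",
       "deep", "light", "rem", "rem", "wake"]
watch = ["wake", "light", "deep", "deep", "deep",
         "light", "light", "light", "rem", "light"]

stage_agreement = epoch_agreement(psg, watch)
sleep_wake_agreement = epoch_agreement(to_sleep_wake(psg), to_sleep_wake(watch))
```

On this toy night the watch agrees with PSG on 9 of 10 epochs for sleep/wake but only 6 of 10 for stages, which mirrors the pattern in the validation figures: telling sleep from wake is much easier than telling stages apart.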
Where They Disagree - and Why
The same night of sleep, measured by three devices. Each bar shows the reported value from each source. The spread reveals how much disagreement exists for every single metric.
Total Sleep Time
35 min spread
Apple Watch counts extended motionless periods as light sleep even when awake. Oura may miss sleep onset in unusual positions. iPhone has no idea when you actually fell asleep.
Deep Sleep
38 min spread
Most contentious metric. A 2023 validation study found Apple Watch overestimated deep sleep by ~18 min/night vs PSG, while Oura underestimated by ~12 min. Without EEG, both are guessing from proxy signals.
REM Sleep
22 min spread
Heart rate becomes more variable during REM, which helps detection. But some devices misclassify light sleep with elevated HRV as REM, inflating the number.
Sleep Latency
14 min spread
Each device uses different thresholds for the transition from "lying still but awake" to "asleep." iPhone just tracks when you put the phone down, not when sleep actually began.
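The "spread" for each metric is simply the gap between the highest and lowest value any device reported for the same night. A small sketch of that arithmetic is below. The deep-sleep figures are the 1h 20m and 42m readings used in the averaging example later on this page; the total-sleep figures are invented placeholders chosen only to illustrate a 35-minute gap.

```python
def spread(readings):
    """Return max minus min across device readings for one metric (same units)."""
    values = list(readings.values())
    if not values:
        raise ValueError("need at least one reading")
    return max(values) - min(values)


# Deep sleep in minutes: 1h 20m and 42m, as quoted in the averaging example.
deep_sleep_min = {"Apple Watch": 80, "Oura": 42}

# Total sleep in minutes: hypothetical placeholder values, not real readings.
total_sleep_min = {"Apple Watch": 452, "Oura": 431, "iPhone": 417}

deep_spread = spread(deep_sleep_min)    # 38
total_spread = spread(total_sleep_min)  # 35
```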
The Naive Solutions That Don't Work
When people first discover their devices disagree, the instinct is to apply simple heuristics. None of them hold up under scrutiny.
“Pick the most expensive device”
Accuracy does not scale linearly with price. The $300 Oura Ring Gen 3 and $800 Apple Watch Ultra have comparable validation numbers in peer-reviewed studies. A 2023 Sleep Medicine Reviews analysis found no significant correlation between device price and PSG agreement rates.
“Average across all devices”
The average of two wrong numbers is still wrong. If Apple Watch reports 1h 20m of deep sleep and Oura reports 42m, averaging to 61 minutes has no physiological basis. Each error has different causes and magnitudes. Averaging hides the errors instead of correcting them.
“Let Apple Health sort it out”
Apple Health stores all data sources with timestamps but performs no reconciliation. For display, it simply picks the most recent write or your preferred source. If your mattress sensor, watch, phone, and a third-party app all log sleep, Apple Health keeps all four overlapping records and never resolves the conflicts between them.
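To see what "four overlapping records" means in practice, here is a rough sketch that finds every pair of sleep records covering the same stretch of time. The record layout (a dict with `source`, `start`, and `end`) is a simplified assumption made for this example, not the HealthKit schema, and the timestamps are invented.

```python
from datetime import datetime
from itertools import combinations


def overlapping_pairs(records):
    """Return (source_a, source_b, overlap_minutes) for each pair that overlaps."""
    pairs = []
    for a, b in combinations(records, 2):
        start = max(a["start"], b["start"])
        end = min(a["end"], b["end"])
        if start < end:  # the two intervals share some time
            minutes = (end - start).total_seconds() / 60
            pairs.append((a["source"], b["source"], minutes))
    return pairs


# One night as logged by four hypothetical sources (made-up times).
records = [
    {"source": "watch", "start": datetime(2024, 1, 1, 23, 10), "end": datetime(2024, 1, 2, 6, 50)},
    {"source": "phone", "start": datetime(2024, 1, 1, 22, 45), "end": datetime(2024, 1, 2, 7, 5)},
    {"source": "mattress", "start": datetime(2024, 1, 1, 23, 0), "end": datetime(2024, 1, 2, 6, 55)},
    {"source": "app", "start": datetime(2024, 1, 1, 23, 20), "end": datetime(2024, 1, 2, 6, 40)},
]

conflicts = overlapping_pairs(records)
```

Four sources produce six conflicting pairs for a single night. Detecting the overlap is the easy part; deciding which record to believe for each minute is the reconciliation problem.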
What Intelligent Reconciliation Looks Like
Vora does not pick a winner. It does not average. It builds a reconciled timeline that is more accurate than any individual source.
Sensor-Aware Weighting
Each sensor excels at something specific. The Oura Ring's finger-based PPG produces cleaner overnight HRV signals. Wrist accelerometers catch micro-awakenings that rings miss. Manual sleep/wake times add context no sensor provides.
Context Normalization
A 3am wrist PPG reading carries different confidence than an 11pm reading when you are still moving. Vora weighs each data point by measurement context: time of night, motion artifacts, skin contact quality, and recent activity.
Timeline Construction
Rather than choosing one device, Vora builds a unified minute-by-minute sleep timeline. Where devices agree, confidence is high. Where they diverge, the system uses sensor-specific knowledge to resolve conflicts.
Multi-Source Amplification
The more data sources you connect, the more accurate your data becomes. Two devices are better than one. Three are better than two. Each additional source adds signal that helps resolve ambiguity in the others.
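One simple way to picture a unified minute-by-minute timeline is a weighted vote: for each minute, every source casts a vote for the stage it reported, weighted by how much that source is trusted. The sketch below is an illustration of that general idea only. It is not Vora's algorithm, the fixed per-source weights are hypothetical, and a real system would presumably vary the weights by context (time of night, motion, skin contact) rather than keep them constant.

```python
from collections import defaultdict


def reconcile_timeline(timelines, weights):
    """Weighted vote per minute; returns a list of (label, confidence) tuples."""
    lengths = {len(labels) for labels in timelines.values()}
    if len(lengths) != 1:
        raise ValueError("all timelines must cover the same minutes")
    n_minutes = lengths.pop()

    result = []
    for minute in range(n_minutes):
        votes = defaultdict(float)
        for source, labels in timelines.items():
            votes[labels[minute]] += weights[source]
        label = max(votes, key=votes.get)
        # Confidence is the winning label's share of the total weight.
        result.append((label, votes[label] / sum(votes.values())))
    return result


# Five minutes of made-up stage labels from three hypothetical sources.
timelines = {
    "ring": ["wake", "light", "deep", "deep", "rem"],
    "watch": ["wake", "light", "light", "deep", "rem"],
    "mattress": ["light", "light", "light", "deep", "light"],
}
weights = {"ring": 5, "watch": 4, "mattress": 2}  # illustrative trust weights

timeline = reconcile_timeline(timelines, weights)
labels = [label for label, _ in timeline]
```

Where all three sources agree, confidence is 1.0. In the third minute the ring says deep sleep while the watch and mattress say light, so the result is "light" with a confidence of only 6/11, which flags that minute as contested rather than hiding the disagreement.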
When 5 BPM Changes the Story
Resting heart rate trends are early indicators of overtraining, illness onset, and cardiovascular fitness changes. But a 4-6 bpm spread between devices can turn a genuine physiological signal into noise.
Resting Heart Rate: Same Person, Same Day
Why it matters: A genuine 4 bpm increase over 2 weeks is a classic overtraining signal. A spurious 4 bpm increase from switching measurement windows is noise. If your app cannot distinguish these, your trends are unreliable.
HRV Is Not One Metric
Heart Rate Variability is a family of metrics. Comparing numbers across devices without understanding which variant they report is meaningless.
Measurement Window Matters
HRV during deep sleep at 3am is physiologically different from HRV at 7am while standing. Comparing a WHOOP reading to an Apple Watch reading means comparing different things measured at different times.
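To make the "family of metrics" point concrete, two widely used HRV variants are RMSSD (root mean square of successive differences between beats) and SDNN (standard deviation of the beat-to-beat intervals). The sketch below computes both from the same series of RR intervals. The interval values are invented for illustration, and the sample standard deviation is one common convention for SDNN.

```python
import math
import statistics


def rmssd(rr_ms):
    """Root mean square of successive differences between RR intervals (ms)."""
    if len(rr_ms) < 2:
        raise ValueError("need at least two RR intervals")
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def sdnn(rr_ms):
    """Sample standard deviation of RR intervals (ms)."""
    return statistics.stdev(rr_ms)


# Made-up RR intervals in milliseconds: a slow, steady drift.
rr = [800, 820, 840, 860, 880, 900]

rmssd_value = rmssd(rr)  # 20.0
sdnn_value = sdnn(rr)    # about 37.4
```

The same heartbeats yield 20 ms under one definition and roughly 37 ms under the other. Neither is wrong, but putting them on the same chart would suggest a change that never happened.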
Vora's approach: Normalize by measurement method, time window, and device before comparing across time. When you connect a new device, Vora calibrates it against your existing baseline so trends reflect genuine physiological change, not device artifacts.
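A simple way to picture calibrating a new device against an existing baseline is an offset estimate: over the days when both devices reported, take the typical difference between them, then remove it from the new device's readings. This sketch is one possible approach under that assumption, not a description of Vora's implementation. The function names and resting heart rate values are hypothetical.

```python
from statistics import median


def estimate_offset(baseline, new_device):
    """Median of (new minus baseline) over days where both devices reported."""
    shared_days = sorted(set(baseline) & set(new_device))
    if not shared_days:
        raise ValueError("no overlapping days to calibrate on")
    return median(new_device[day] - baseline[day] for day in shared_days)


def calibrate(new_device, offset):
    """Shift new-device readings onto the baseline device's scale."""
    return {day: value - offset for day, value in new_device.items()}


# Made-up resting heart rate readings (bpm), keyed by date.
old_watch = {"2024-01-01": 52, "2024-01-02": 53, "2024-01-03": 52}
new_ring = {"2024-01-01": 56, "2024-01-02": 57, "2024-01-03": 56, "2024-01-04": 57}

offset = estimate_offset(old_watch, new_ring)  # 4
adjusted = calibrate(new_ring, offset)
```

Without the adjustment, switching devices would look like a 4 bpm jump in resting heart rate, the same size as the overtraining signal described above. After calibration, the reading for 2024-01-04 becomes 53 bpm and lines up with the existing trend.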
Why This Matters for Everything Else
Data reconciliation is not just a technical exercise. Every downstream decision in your health app depends on the accuracy of the data feeding it. Bad data in means bad recommendations out.
Sleep & HRV Data Quality
If your sleep data is wrong by 35 minutes and your HRV is comparing incompatible metrics, the foundation is cracked.
Recovery Score Accuracy
Recovery depends on sleep quality and HRV trends. Inaccurate inputs produce recovery scores that do not reflect your actual readiness.
Training Recommendations
If recovery is miscalculated, your app either pushes you too hard on low-recovery days or holds you back when you are ready to perform.
Nutrition Adjustments
Caloric and macronutrient targets depend on training load and recovery status. Wrong recovery data cascades into wrong nutrition advice.
Health Score & Long-Term Trends
Your overall Health Score and trend analysis depend on every upstream metric being accurate. One weak link compromises the entire intelligence chain.
Most health apps treat data ingestion as plumbing - a problem that is already solved. Vora treats it as the foundation. Every recommendation, every score, every trend insight is only as good as the data it is built on.
Connect your devices. See better data.
Stop guessing which device is right. Vora reconciles every source into one accurate timeline - so your health decisions are built on data you can trust.