We Tested ChatGPT's ECG Reader. Here's What It Missed.

Qaly Heart
Qaly is built by Stanford engineers and cardiologists, including Dr. Marco Perez, a Stanford Associate Professor of Medicine, Stanford Cardiac Electrophysiologist, and Co-PI of the Apple Heart Study.

Hello heart hero,

You've probably used ChatGPT to interpret your blood work or get insights on symptoms. It's surprisingly helpful for a lot of health questions. But ECGs? That's where things get tricky.

So we decided to put ChatGPT's ECG Reader to the test. Can it handle basic rhythm checks? What about measuring PR and QTc intervals? And most importantly, can it catch life-threatening rhythms like atrial fibrillation or ventricular tachycardia?

We tested it against real ECGs, running each one multiple times to check for consistency. Here's what we found.

The Test: 7 Real ECGs

We selected seven ECGs covering common and critical rhythms:

  • Sinus Tachycardia with PVCs
  • Sinus Rhythm
  • SVT Sustained
  • Sinus Arrhythmia with 1st Degree Block
  • Atrial Fibrillation with Normal Heart Rate
  • Atrial Fibrillation with High Heart Rate
  • Ventricular Tachycardia

Our methodology: For each ECG, we first reviewed it using the Qaly app (with Bobby the Bot, verified by certified cardiac technicians), then asked ChatGPT to "Calculate my PQRST values and review my ECG."

ECG 1: Sinus Tachycardia with PVCs

Sinus tachycardia with PVCs
Here is the first test ECG. As you can see, it has a heart rate of 113 BPM. We also labeled PVCs in this recording (the original PDF provided to GPT did not include the PVC markings).

Correct Analysis:

  • Heart rate: 113 bpm
  • 6 PVCs, with 5 in a trigeminy pattern
  • PQRST Intervals:
    • PR: 132 ms
    • QRS: 85 ms
    • QT: 337 ms (QTc 462 ms)
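
For readers curious how QT relates to QTc: the QTc values listed in this article are consistent with Bazett's formula, QTc = QT / sqrt(RR), where RR is the time between beats in seconds. That is our inference from the numbers, not a description of how the Qaly app computes QTc. A minimal sketch, with `qtc_bazett` as a hypothetical helper name:

```python
import math


def qtc_bazett(qt_ms: float, heart_rate_bpm: float) -> float:
    """Rate-corrected QT in ms using Bazett's formula: QT / sqrt(RR in seconds)."""
    rr_seconds = 60.0 / heart_rate_bpm  # average time between beats
    return qt_ms / math.sqrt(rr_seconds)


# QT and heart rate as listed for ECG 1 in this article
print(round(qtc_bazett(337, 113)))  # 462
```

Bazett's formula is only one of several rate corrections, and it tends to overcorrect at high heart rates, so treat the output as a rough cross-check rather than a clinical measurement.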

ChatGPT's Performance: Not great. ChatGPT failed to identify the PVCs and provided PR and QTc values significantly different from the actual measurements.

Failed review of ECG with Sinus Tachycardia with PVCs by ChatGPT's ECG Reader
The red text shows our comments - what ChatGPT did correctly and what it got wrong.

ECG 2: Sinus Rhythm

Example of Sinus Rhythm from Apple Watch ECG
Here is our simple ECG - Normal Sinus Rhythm

This was the easiest recording in our test - normal sinus rhythm with these intervals:

  • PR: 132 ms
  • QRS: 85 ms
  • QT: 337 ms (QTc 462 ms)

ChatGPT's Performance: Acceptable. ChatGPT correctly identified the rhythm. QRS and PR fell within the ranges it provided (though those ranges were quite broad). It declined to give a specific QTc value but correctly stated it "appears normal," albeit with low confidence.

Review of ECG with Sinus Rhythm by ChatGPT's ECG Reader
As shown, ChatGPT is mostly correct here - both the PR and QRS estimates are accurate, though the reported ranges are quite broad.

ECG 3: SVT Sustained

Example of Apple Watch ECG with SVT
3rd test! Sustained SVT with a fast, regular rhythm throughout the tracing.

Correct Analysis:

  • Rhythm: SVT Sustained
  • PQRST Intervals:
    • PR: Not measurable (no visible P waves)
    • QRS: 85 ms
    • QT: 337 ms (QTc 462 ms)

ChatGPT's Performance: Failed. ChatGPT misread it as Sinus Tachycardia (though it did suggest SVT as a possibility). It attempted to provide a PR interval for a recording with absent P waves. It did correctly note that QRS was normal.

Failed review of ECG with SVT by ChatGPT's ECG Reader
ChatGPT did not recognize that P waves are absent from the recording, which led it to misidentify the rhythm as sinus tachycardia rather than SVT.

ECG 4: Sinus Arrhythmia with 1st Degree AV Block

Example of Apple Watch with 1st degree AV Block
Next test! At first glance, a simple ECG. But notice how long the PR intervals are and how wide the QRS complexes are.

Correct Analysis:

  • Rhythm: Sinus Arrhythmia with 1st Degree AV Block
  • PQRST Intervals:
    • PR: 220 ms (Prolonged)
    • QRS: 144 ms (Wide)
    • QT: 435 ms (QTc 456 ms)
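
Once intervals are measured accurately, flagging them is simple arithmetic. The sketch below uses commonly cited adult cutoffs (PR above 200 ms for 1st degree AV block, QRS of 120 ms or more for a wide complex). These thresholds and the helper name `flag_intervals` are our assumptions for illustration, not something taken from the Qaly app or from ChatGPT.

```python
from typing import List, Optional

PR_PROLONGED_MS = 200  # assumed cutoff: PR above this suggests 1st degree AV block
QRS_WIDE_MS = 120      # assumed cutoff: QRS at or above this counts as wide


def flag_intervals(pr_ms: Optional[float], qrs_ms: float) -> List[str]:
    """Return plain-language flags for a measured PR interval and QRS duration."""
    flags = []
    if pr_ms is None:
        # No visible P waves, so there is no PR interval to measure
        flags.append("PR not measurable")
    elif pr_ms > PR_PROLONGED_MS:
        flags.append("prolonged PR")
    if qrs_ms >= QRS_WIDE_MS:
        flags.append("wide QRS")
    return flags


# PR and QRS as listed for ECG 4 in this article
print(flag_intervals(220, 144))  # ['prolonged PR', 'wide QRS']
```

The hard part is the measurement itself, not the comparison, which is why a wrong PR or QRS reading leads directly to a missed finding.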

ChatGPT's Performance: Complete failure. It missed both the sinus arrhythmia and the 1st degree block. It also failed to measure the QRS duration accurately.

Failed review of ECG with Sinus Arrhythmia by ChatGPT's ECG Reader
ChatGPT missed both the prolonged PR interval and the wide QRS complex, and also failed to recognize the irregular RR intervals.

ECG 5: Atrial Fibrillation with Normal Heart Rate

Example of Atrial Fibrillation on Apple Watch
Time for a serious test! Atrial fibrillation (AFib) - notice that the Apple Watch incorrectly flags it as sinus rhythm.

Correct Analysis:

  • Rhythm: Atrial Fibrillation with heart rate of 80 BPM
  • PQRST Intervals:
    • PR: Not measurable
    • QRS: 91 ms
    • QT: 378 ms (QTc 436 ms)

ChatGPT's Performance: Pretty good! ChatGPT correctly flagged atrial fibrillation, and the PQRST values fell within the ranges it provided.

Good review of ECG with atrial fibrillation by ChatGPT's ECG Reader
A pretty good result! It correctly recognized that PR intervals could not be measured, read the QRS complex correctly, and provided the correct rhythm.

But here's the problem: We ran the same ECG through ChatGPT again with the identical prompt, and got different results. We tested this 10 times total. The results ranged from "No sign of afib" to "Definitely afib."

Out of 10 attempts:

  • 2 times: Didn't detect afib
  • 2 times: "Possible afib"
  • 6 times: Confidently stated afib
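
To put those counts in perspective, here is a small tally of the 10 runs. The labels ("missed", "possible", "confident") are our shorthand for the three outcomes listed above, and the "agreement" figure is simply the share of runs that gave the most common answer, a rough consistency measure we chose for illustration.

```python
from collections import Counter

# Outcomes of the 10 identical runs, as reported in the list above
runs = ["missed"] * 2 + ["possible"] * 2 + ["confident"] * 6

counts = Counter(runs)
total = len(runs)

confident_rate = counts["confident"] / total      # runs that clearly stated AFib
missed_rate = counts["missed"] / total            # runs that did not detect AFib
agreement = counts.most_common(1)[0][1] / total   # share of the most common answer

print(f"confident: {confident_rate:.0%}, missed: {missed_rate:.0%}, agreement: {agreement:.0%}")
```

A consistent reader would score 100% agreement on the same input, whether right or wrong.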

Below you can see the answers from the second and third runs.

Failed review of ECG with atrial fibrillation by ChatGPT's ECG Reader
Here is the second run of the same ECG. This time, ChatGPT failed to identify the absence of P waves and did not provide the correct rhythm.

Good review of ECG with atrial fibrillation by ChatGPT's ECG Reader
This is the third run. Once again, the interpretation is correct, and even better than in the first run.

ECG 6: Atrial Fibrillation with High Heart Rate

Example of Apple Watch ECG with Atrial Fibrillation
Next test with atrial fibrillation - this time with a high heart rate.

Correct Analysis:

  • Rhythm: Atrial Fibrillation with heart rate of 140 BPM
  • PQRST Intervals:
    • PR: Not measurable
    • QRS: 82 ms
    • QT: 304 ms (QTc 464 ms)

ChatGPT's Performance: Failed. ChatGPT did not provide the correct rhythm and did not notice the absence of P waves or the irregular RR intervals, both hallmark features of atrial fibrillation.

Failed review of ECG with atrial fibrillation by ChatGPT's ECG Reader
ChatGPT struggled most with atrial fibrillation at a high heart rate. It did not provide the correct interpretation, and repeated runs of the same ECG produced the same result.

ECG 7: Ventricular Tachycardia

Example of Ventricular Tachycardia on Apple Watch
Time for a critical test: ventricular tachycardia, a potentially life-threatening rhythm.

Correct Analysis:

  • Rhythm: Ventricular Tachycardia (sustained)
  • Heart rate of 154 BPM
  • PQRST Intervals:
    • PR: Not measurable
    • QRS: 164 ms
    • QT: 289 ms (QTc 463 ms)

ChatGPT's Performance: Missed the diagnosis entirely. While it correctly noted that PR intervals couldn't be measured, it failed to identify the wide QRS complexes - a critical finding in ventricular tachycardia.

Failed review of ECG with ventricular tachycardia by ChatGPT's ECG Reader
ChatGPT missed the wide QRS complex, leading to an incorrect rhythm classification.

Conclusion

Is ChatGPT a reliable ECG reader? No.

Out of 7 ECGs tested, ChatGPT struggled significantly:

  • Missed critical diagnoses: Failed to identify PVCs, SVT, sinus arrhythmia, 1st degree block, high-rate atrial fibrillation, and ventricular tachycardia.
  • Inaccurate measurements: Provided incorrect PR and QTc values that differed significantly from actual measurements.
  • Inconsistent results: When we ran the same atrial fibrillation ECG 10 times with the identical prompt, the answers ranged from "no afib" (2 runs) to "possible afib" (2 runs) to "definitely afib" (6 runs).
  • Inconsistent formatting: Beyond the diagnostic inconsistencies, ChatGPT presented results in different formats each time - sometimes using bullet points, sometimes paragraphs, sometimes tables. This lack of standardization makes it difficult to quickly compare results or extract key information reliably.
  • Fundamental errors: Attempted to measure PR intervals when P waves weren't present, missed wide QRS complexes in ventricular tachycardia, and failed to recognize irregular RR intervals in atrial fibrillation.

The one bright spot? ChatGPT showed reasonable performance on atrial fibrillation with a normal heart rate, though even there its answer changed between runs.

The bottom line: ChatGPT may be helpful for general health questions, but it is not reliable for ECG interpretation.

Get your ECG checked by certified experts within minutes on the Qaly app.
