Working on the ambulance a couple of days ago, I had my first scene flight in quite a while. A 27-year-old male presented with classic stroke signs and symptoms (slurred speech, facial droop, unilateral weakness). It was a bit of a confusing case because the patient was so young (though he had several CVA risk factors) and not hypertensive. However, my partner and I were able to rule out the common CVA mimics: hypoglycemia, Todd's paralysis, drug intoxication, and so on. So we flew him (less than an hour from symptom onset, great times!).
And the patient was discharged from the ED later that day.
It was an uncommon enough case that I presented it to the paramedic class, and we all got into a great discussion about their experiences with CVA mimics, the Cincinnati Prehospital Stroke Scale, and so forth. As a spinoff of that, I started digging into the research surrounding prehospital stroke scales and screens. In this post, I'll examine the research behind what I believe is the most commonly used scale, the Cincinnati Prehospital Stroke Scale (CPSS).
Origin of the CPSS
The CPSS was first published in 1997, in Academic Emergency Medicine. Here's a link to the free full-text PDF:
http://onlinelibrary.wiley.com/doi/10.1111/j.1553-2712.1997.tb03665.x/pdf
I have some issues with the study's methodology and patient population. The abstract says a "prospective, observational, cohort study" was performed. When you read the methods section, however, it turns out both the "stroke" and "non-stroke" groups were pulled from a previously published study on thrombolytic therapy. What criteria did the authors use to include patients in the thrombolytic study? Did that somehow bias the patient selection for this study? I don't know.
Looking at the demographics of the two groups, there are a couple of anomalies. The "non-stroke" group is almost exactly three times the size of the "stroke" group. Curiously, the "non-stroke" group was also almost twice as old (the authors didn't report a range or standard deviation, so I'm not sure how widely the ages of the two groups varied). That's the opposite of what I would expect, considering the epidemiology and risk factors for CVA.
Nonetheless, the authors pushed ahead and found, through statistical analysis, that the presence of facial palsy, a difference in arm strength, and dysarthria, when assessed together in this small group of patients, were 100% sensitive and 92% specific for predicting the presence of stroke. They decided that dysarthria was going to be difficult to distinguish from aphasia, so they changed their model to "abnormal speech", reran the numbers, and decided that the modified test had 100% sensitivity and 88% specificity.
Heady claims.
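For reference, sensitivity and specificity fall straight out of a 2x2 contingency table. Here's a minimal sketch; the counts below are back-calculated to match the paper's reported 100%/88% and group sizes (74 stroke, 225 non-stroke), not taken from its actual data tables:

```python
def sensitivity_specificity(tp, fn, fp, tn):
    """Compute sensitivity and specificity from 2x2 contingency table counts.

    tp: screen positive, stroke present (true positives)
    fn: screen negative, stroke present (false negatives)
    fp: screen positive, no stroke (false positives)
    tn: screen negative, no stroke (true negatives)
    """
    sensitivity = tp / (tp + fn)  # fraction of actual strokes the screen catches
    specificity = tn / (tn + fp)  # fraction of non-strokes it correctly clears
    return sensitivity, specificity

# Hypothetical counts sized like the original cohort (74 stroke, 225 non-stroke):
sens, spec = sensitivity_specificity(tp=74, fn=0, fp=27, tn=198)
print(f"sensitivity = {sens:.0%}, specificity = {spec:.0%}")  # 100%, 88%
```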
My take on the article: this was a small group of patients (299 total, only 74 of whom were diagnosed with stroke). The inclusion criteria may have been biased by drawing patient data from a thrombolytic therapy trial (which presumably had some pretty tight inclusion criteria, given the inherent risks of tPA and the like). Overall, I'm not really comfortable with how the CPSS was created.
Validation Studies
Of course, like any good clinical prediction rule, after being created, the CPSS needed to be validated. So three of the original authors grabbed two other MDs and published a study entitled "Cincinnati Prehospital Stroke Scale: Reproducibility and Validity". Here's a link to the PubMed citation; unfortunately, the article isn't available free full-text:
http://www.ncbi.nlm.nih.gov/pubmed/10092713
Since the article isn't available full-text, let me summarize. The authors took a total of 2 MDs and 24 EMTs and paramedics and had them score patients identified as "stroke" or "non-stroke". The patients were drawn as a convenience sample from the ED and from patients on the neurology ward (with CVA, TIA, and several other neuro conditions). The convenience sample in the ED? The authors wrote, "An attempt was made to identify patients with chief complaints that were suggestive of stroke or of other diseases that could be mistaken for stroke". The numbers, again, were a little disproportionate: 49 in the "stroke" group and 122 in the "non-stroke" group. Interestingly, the mean ages of the two groups were flip-flopped from the original study: 55.8 in the non-stroke group and 62.5 in the stroke group. (That sounds a little more like what I'd expect.) For the analysis of sensitivity and specificity, the authors eliminated 11 patients with a diagnosis of TIA, further lowering the "stroke" group to 38 (vs. 122 non-stroke patients).

For results, the authors stated that a single abnormality on the CPSS had a sensitivity and specificity of 66% and 87% for physicians and 59% and 89% for prehospital providers, respectively. Three abnormalities had values of 11% and 99% for docs and 13% and 98% for medics (also respectively). Their conclusion from the abstract: "The CPSS has excellent reproducibility among prehospital personnel and physicians. It has good validity in identifying patients with stroke who are candidates for thrombolytic therapy, especially those with anterior circulation stroke."
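To make the one-vs-three-findings idea concrete: the CPSS is called positive at some threshold of abnormal findings, and raising that threshold trades sensitivity for specificity. Here's a minimal sketch of that scoring logic; the function and its threshold parameter are my own illustration, not anything specified in the paper:

```python
def cpss_positive(facial_droop: bool, arm_drift: bool,
                  abnormal_speech: bool, threshold: int = 1) -> bool:
    """Call the screen positive if at least `threshold` CPSS items are abnormal.

    threshold=1 favors sensitivity (any single finding flags the patient);
    threshold=3 favors specificity (all three items must be abnormal).
    """
    findings = sum([facial_droop, arm_drift, abnormal_speech])
    return findings >= threshold

# A patient with isolated slurred speech and nothing else:
print(cpss_positive(False, False, True, threshold=1))  # True  -> flagged
print(cpss_positive(False, False, True, threshold=3))  # False -> screened out
```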
My take on the article: reproducibility, sure. Validation I'm not so sure of. The patient group sizes were similarly lopsided. The study doesn't identify any specific inclusion criteria for the ED patients; whoever the doc thought the test might work on in the ED at that particular time made the cut. I think the age differences between the patient groups were a little more realistic. However, the study was conducted in the hospital, not in the environment where EMTs and paramedics would actually be using the CPSS. And the sensitivity values in this study were far different from the sensitivity originally derived by chi-square analysis: 66% and 59% instead of 100%. (Additionally, the 95% confidence interval on the MD group's sensitivity for one abnormal CPSS finding ran from 49% to 80%. That seems like a pretty broad range!)
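That width isn't surprising with only 38 stroke patients in the denominator. Here's a quick sketch of a 95% confidence interval for a proportion, using a simple normal (Wald) approximation; I'm back-calculating roughly 25 of 38 true positives from the reported 66%, so treat the counts as illustrative (the paper likely used an exact method, which is why this lands near, not exactly on, the published 49-80%):

```python
import math

def proportion_ci(successes: int, n: int, z: float = 1.96):
    """95% Wald confidence interval for a proportion (normal approximation)."""
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)

# ~66% sensitivity on 38 stroke patients -> roughly 25 true positives
low, high = proportion_ci(successes=25, n=38)
print(f"{25/38:.0%} (95% CI {low:.0%} to {high:.0%})")  # ~66% (51% to 81%)
```

The small denominator, not anything about the CPSS itself, is what makes the interval so wide; quadrupling the sample at the same rate would roughly halve it.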
And thus, the CPSS became "validated" and approved for prime-time use. There were a few other studies in my PubMed/CINAHL search that gave me some data on the CPSS; these were studies published to test other stroke screening tools. I'll cover those other tests in Part 2 of this topic, but here's a quick table summarizing the predictive values of the CPSS in those studies:
| Study | # Patients | CPSS Sensitivity | CPSS Specificity |
| --- | --- | --- | --- |
| Kothari R, et al (1997) (original study) | 299 | 100% | 88% |
| Kothari R, et al (1999) (validation study mentioned above) | 160 (those that made the analysis) | 66% (best of two groups) | 89% (best of two groups) |
| Bray J, et al (2005) | 100 | 95% | 56% |
| Mingfeng H, et al (2012) | 540 | 88.77% | 68.79% |
| Studnek JR, et al (2013) | 416 | 79% | 23.9% |
| Frendl, et al (2009) | 154 | 74% | 41% |
One caveat: some studies reported sensitivity and specificity values for 1, 2, or 3 items on the CPSS being abnormal. In the table above, I listed the values for one abnormal criterion. Actually, let's look at the two studies that did examine sensitivities and specificities at multiple thresholds:
| Study | CPSS-1 Sensitivity | CPSS-1 Specificity | CPSS-2 Sensitivity | CPSS-2 Specificity | CPSS-3 Sensitivity | CPSS-3 Specificity |
| --- | --- | --- | --- | --- | --- | --- |
| Kothari, et al (1999) | 59% | 88% | 27% | 96% | 13% | 98% |
| Frendl, et al (2009) | 74% | 41% | 37% | 64% | 21% | 73% |
As you can see, the specificities from a study performed in the field (Frendl) don't even come close to those reported in the ED-based study (Kothari). I think that's important, because we're using these scales/screens to identify patients who might benefit from thrombolytic therapy and get them preferentially to a hospital capable of providing it.

That's a common goal; however, depending on the service you work for, it takes different logistical forms. At the EMS job I used to have, it entailed driving the patient about 15 minutes further. At the EMS job I currently have, it entails bringing in a helicopter. The greater the risk to the patient, the more sure I want to be that the juice is worth the squeeze. For my old job, high sensitivity and lots of false positives were acceptable in the face of that risk; at my current one, false positives are risky for everyone, so I want something with great specificity.
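To put rough numbers on that risk calculus: the fraction of screen-positive patients who are actually having a stroke (the positive predictive value) depends on both specificity and how common stroke is among the patients we screen. Here's a hedged sketch using Bayes' rule; the 20% prevalence figure is my own assumption for illustration, not from any of these studies:

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Fraction of screen-positive patients who actually have a stroke (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Same assumed 20% stroke prevalence, two reported operating points:
for label, sens, spec in [("Kothari 1999, ED, 1 finding", 0.66, 0.87),
                          ("Frendl 2009, field, 1 finding", 0.74, 0.41)]:
    ppv = positive_predictive_value(sens, spec, prevalence=0.20)
    print(f"{label}: PPV = {ppv:.0%}")  # ~56% vs ~24%
```

Under those assumptions, roughly three out of four field-screen positives would be false alarms, which is a very different proposition when each positive means launching a helicopter rather than driving 15 extra minutes.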
So what does all this data mean? In my opinion (and it's just that, an opinion; read the articles yourself and draw your own conclusions, and bear in mind that I have very little formal training in statistics or research methodology, so if I'm drawing incorrect conclusions, let me know!):
I think the original patient group the CPSS was derived from is a little "hinky". I don't think the scale was validated as conclusively as the original authors claimed. Other studies with more subjects have shown a wide range of specificity values, roughly 24-69%. I question whether the CPSS has actually been "validated" at all. And I'm looking for an alternative that offers greater specificity more consistently, given the risks to my patients.
In Part 2, I'll post some info about alternative prehospital stroke scales/screens and discuss their strengths and weaknesses.