Circumstantial Evidence-based Medicine

I was reading a journal article recently, covering a topic for which there is precious little information available. The topic, for purposes here, doesn’t matter. The point is that at the end of the article, when the author included the obligatory self-criticism, he lamented, “the primary weakness of our study is that it is not evidence-based.” I wanted to throw up. The article had done a masterful job of filling an information void, yet political correctness, using the buzz phrase “evidence-based medicine,” has become so powerful that it turns bright-shining humans into dim-witted lackeys scrambling for acceptance.

What the author should have said is this: “the primary weakness of our study is that it was not a prospective, randomized controlled trial.” Clearly stated. Specifically stated. More accurately stated. Yet, not quite sufficient if one is looking for politically correct self-flagellation.

How did it come to this? What is evidence-based medicine? Haven’t we been doing this all along?

The buzz phrase emerged about 1990, but did not gain a foothold until a decade later when it was linked to the sister-concept of “guideline-based” medicine, sometimes referred to disparagingly as “cookbook medicine.” If you’ve ever wondered why your doctor spends more time in the exam room examining his or her computer than examining you, it’s because they are under pressure to document layers of trivia 1) to allow more effective billing, 2) to meet oppressive accreditation guidelines, and 3) to weather the storm of criticism if someone (peer, attorney, patient) should question whether or not proper guidelines were followed.

How did this sociologic transition begin? Here are some reasons – Rising costs and limited resources. The decision of physician-led groups to police themselves rather than wait for the government to step in (of course, it’s always “other specialties” that are causing the problems). Accreditation organizations ballooning like any bureaucracy, generating more and more requirements to follow guidelines, and more importantly, to document whether it’s true or not. The computer revolution that allowed full access to published literature, creating awareness of relative ignorance. Scientific developments occurring so rapidly, no human can keep up, again deferring to computer science. The rising status of epidemiologists (medicine without the blood) and public health specialists who anointed themselves as the only neutral parties worthy of establishing guidelines (part of their ascension to the throne included studies that demonstrated the inability of many physicians to take conclusions from randomized controlled trials and counsel an individual patient correctly). As usual, the causes are multi-factorial.

What is the definition of “evidence-based medicine” anyway? You may have heard the claim that “Medicine is both an art and a science.” When I heard this phrase growing up in the home of a “general practitioner,” I was told that “art” pertained to bedside manner. Today, the “art” means something different, in my view. Bedside manner is in its own class, better described as “ethics” or “humanity” or plain ol’ kindness and empathy. Instead, the “art” of medicine is filling in the blanks that are left by pure science, using logic and wisdom derived from available facts. Alternatively stated, the “art” is using reason to fill the gaps left by empiricism. It is impossible to settle every issue in medicine with a prospective, randomized trial. Therefore, there will always be blanks that need to be filled in, sometimes using that nebulous tool “judgement.”

Francis Bacon (1561-1626) is sometimes referred to as the “father of empiricism” (and/or “father of scientific method”) based on his philosophical stance that inductive reasoning should guide science, not the old-fashioned syllogism or rational deductive reasoning. But even Bacon cringed at schools of thought based on pure empiricism, claiming that this approach “gives birth to dogmas more deformed and monstrous than the Sophistical or Rational School.” His famous parable of the spider (pure rationalism, spewing forth silk from within to weave a web), the ant (pure empiricism, collecting grains of dirt, but nothing from within), and the ideal of the honey bee (a blend of both empiricism and rationalism, collecting pollen and offering honey in return) reveals that, for all his devotion to inductive reasoning and empiricism, he believed one should be well-grounded in reason, i.e., rational thought. It is the blend of empiricism and rationalism that generates honey.

Evidence-based medicine uses a process that is admirable, organizing what was already known about high quality evidence vs. low quality into systematic rankings for these quality levels, used both for individual publications and for guidelines. While some claim the top of the pecking order is the prospective, randomized trial, there’s actually a qualifier that generates even higher quality evidence – double-blinded – that is, both the patient and the researcher are blinded to the intervention, be it pill or placebo. It should be evident that some trials cannot be double-blinded, and I’m referring to the area where I practice – up front medicine heavily focused on radiology and surgery. It’s difficult to ask a surgeon to perform a procedure blindfolded, or a radiologist to interpret an MRI without looking at the picture (though radiologists can be blinded to the final pathology).

Dropping down a notch on the evidence scale opens up all sorts of potential bias, too numerous to describe here. But even a well-designed clinical trial has one overriding problem – it may not translate to the real world. The paradox here is that the greater the number of exclusion factors one uses to control for variables in the clinical trial (raising the quality of the data), the more restricted is the population to whom the results apply. While guidelines are sometimes careful to note these limitations, this does not always translate to actual practice.

Still, evidence-based medicine is inherently a worthy goal. The problem is going overboard. Academic departments in Evidence-based Medicine have emerged (maybe this is just re-christening of Epidemiology), and organizations sprout more and more guidelines, which are much more than suggestions – they are very strong recommendations that put a physician on the defensive if not followed. Even though these guidelines may be physician-generated, insurers don’t necessarily follow suit, instead generating their own sets of guidelines, which differ from one insurer to another.

Rigid devotion to empiricism has many untoward side effects, including the development of guidelines that are logically inconsistent. For instance, guidelines for SERM risk reduction (pharmacologic risk reduction) in high-risk women are based on the inclusion/exclusion criteria of the clinical trials that proved effectiveness. Fine. Then, the use of high-risk screening with breast MRI is based on the criteria used in different clinical trials. Fine. But now the bottom line: inclusion/exclusion criteria were markedly different for these two available interventions. As a result, women who qualify as high-risk for SERM risk reduction may not qualify for MRI, and vice versa. The illogical result? “Here, take this pill every day for 5 years to lower your risk of breast cancer, and here’s the host of side effects you need to know about, including uterine cancer or even death due to pulmonary embolus. And by the way, you’re not at high enough risk to warrant recommending a breast MRI.” Really?

In 2013, one of the nation’s pre-eminent breast oncologists, Harold Burstein, MD (Dana-Farber/Harvard), wrote an engaging editorial in The Breast about his experience at the St. Gallen (Switzerland) breast cancer conference, entitled: “Expert opinion vs. guideline based care: The St. Gallen Case Study.” He wrote, “In contrast to the current American craze for detailed guidelines and pathways, the St. Gallen meeting unabashedly seeks to find expert consensus. There are no checklists of tests. No defined pathways. No lists of preferred regimens. No arrows pointing one of three ways based on a decision node….The tenor is to provide a direction for care that covers most of the patients rather than to script the design of care to be given to all patients with few exceptions…”

Dr. Burstein realized that this approach can be viewed in a negative light: “The looseness of the St. Gallen (conference) process alternately charms and appalls many observers from the U.S. This is particularly the case for those who look to St. Gallen to define standards of care that are transmittable to third-party payors, hospital administrators, and programmers who write electronic health record templates.”

Perhaps Dr. Burstein was in the “charmed” group by virtue of his master’s degree in the history of science, where one is exposed to the many philosophical theories as to what constitutes “the scientific method,” along with the fact that many major scientific discoveries used no methodology at all, other than rational thought. He closes his editorial with, “The current enthusiasm for guidelines and pathways has innumerable merits. But one necessary weakness is the assumption that clinical expertise can be fully bottled, packaged and shipped around the world. For those who cherish learning from wise colleagues and exploring the endless variations of clinical care, it is a delight that meetings such as St. Gallen continue to flourish.”

My thoughts on the topic are identical, but my spin a little more critical. Whereas these guidelines serve well to bring everyone up to a minimum standard, they do not encourage excellence above and beyond guidelines. Quite the contrary, the absence of a guideline can squash excellence. Witness the fact that it took Mel Silverstein, MD, arguably the most knowledgeable doctor in the U.S. on DCIS (Stage 0 breast cancer), 12 years to get his recommendation for “wide excision alone” into the NCCN guidelines. Why? His reasoning was superb, not to mention the cost-effectiveness of his approach, but his data was considered “low quality,” i.e., from non-randomized observational studies, even though he followed a strict protocol. As a breast surgeon, I adopted his system as supremely logical the first time I heard it. Those of us who accepted the Van Nuys protocol had to endure criticism from peers (for not irradiating everyone with DCIS) while Dr. Silverstein fought for recognition of his approach. Even after he managed to get his guidelines into print, there’s a notation that this is a “2b” recommendation, based on low level evidence — an asterisk, much like Roger Maris.

My beef is not with the concept of evidence-based medicine and associated guidelines, in principle. My beef is with the by-product of obsessive preoccupation that seems to go hand-in-hand. I can offer many examples in my area of expertise (especially breast MRI) where excellence is squashed, and ignorance perpetuated, through slavish devotion to illogical guidelines.

Another twisted by-product of “neutral evidence-based medicine” is the fact that guidelines are no longer considered reliable when written by experts in an area who are also providers of health care in that same area. Now there’s a new concept. How do you find an expert to help establish guidelines who does not practice in that particular area? Answer: you don’t. You use experts in numbers and statistics, not experts who actually use the proposed guidelines.

Understanding that one purpose of evidence-based medicine is to eradicate human bias (an impossible task), I can agree to go as far as restricting experts to non-voting status on guideline committees, or at an absolute minimum, allowing experts to testify at guideline meetings in order to put things into perspective through revealing nuance lost in raw statistics. Instead, some “think tanks” totally exclude practicing experts from the process. A good example is the U.S. Preventive Services Task Force on breast cancer screening, where they not only refused to consider any observational studies of screening mammography, but also refused to hear testimony from radiologic experts on screening mammography, much less have one serve as a non-voting member of the committee.

My beef is with the fact that while, in principle, evidence-based medicine is a worthy goal to provide a stronger basis for science in medicine, in fact, it is evolving with a more extended goal, that is, science to the exclusion of art. It is “high-quality data,” which may or may not correlate with Reality, to the exclusion of logic and wisdom. Ultimately, it will serve to control medical practice by those who don’t do it.

Is the Diagnosis of Breast Cancer Subjective?

A recent article in J.A.M.A. (Journal of the American Medical Association) prompted national media coverage followed by fleeting anxiety in the breast cancer community. Why “fleeting?” Because the same problem has been exposed every few years since 1991, but the ramifications are so overwhelming that it’s easier to ignore the problem entirely. The title of the article was misleading – “Diagnostic Concordance Among Pathologists Interpreting Breast Biopsy Specimens.” A more accurate title would have replaced “Concordance” with “Discordance,” given that the findings were shocking (unless you’ve followed this controversy for the past 24 years). In brief summation of the study, pathologists don’t agree on which patients have atypical hyperplasia (AH) vs. ductal carcinoma in situ (DCIS) even though the clinical implications are huge. For the former (AH), at most, the recommendation is a wide excision at the site of the AH. A comprehensive breast center will also refer the AH patient for high-risk counseling and interventional options such as aggressive screening. But if the flipped coin lands on DCIS, it’s “cancer,” and that includes radiation therapy and possible endocrine therapy. Some women even opt for bilateral mastectomies.

In lieu of going to the animal lab for my “research year” of surgical residency, I opted to spend the academic year of 1977-78 in a surgical pathology fellowship at UCLA, in what turned out to be the pivotal year of my career. If I had to describe, in one word, my most lasting impression from that experience, I would choose “subjective.” Clinicians without pathology experience believe that the findings under the microscope are completely objective and as close to pure science as anything in medicine. Often, this is the case. But not always. And certain problems, such as atypical ductal hyperplasia (ADH) vs. DCIS, are highly subjective.

In the 1970s, David Page, MD (Vanderbilt) introduced breast cancer risk levels associated with various benign biopsy findings, which brought “atypical hyperplasia” out of the lab and into the clinic. Critics countered with an article by Rosai et al in the American Journal of Surgical Pathology in 1991 that revealed the classification system as being too subjective for clinical use, with wide disagreement in diagnoses among experts. Dr. Page and other experts responded in 1992 with an article in that same journal showing that strong agreement could be achieved after consensus training – among experts, that is. There was no extrapolation to general pathologists. Even then, the agreement was in the eye of the beholder. In the view of pathologists, concordance was excellent. But from the perspective of clinicians, not so much. When the distinction between AH and DCIS was considered, at least one expert disagreed with the other 5 in most of the cases. Yet, clinicians have been treating these diagnoses as black-and-white entities for decades.

So, in the 2015 article now making a splash, it’s a double whammy. If experts don’t agree, how did these researchers establish the “true” diagnosis for each biopsy against which the “average” pathologist was to be compared? The fine print reveals that 3 experts were unanimous on the diagnosis in only 75% of cases on the first try, though differences were eventually hashed out to form a consensus-derived diagnosis. In the second whammy, the same biopsy slides were reviewed by the study group, that is, 115 pathologists who then proved to be disturbingly discordant from the consensus, especially when it came to differentiating AH from DCIS. (I won’t belabor here the shocking discordance in 5 of 72 cases of completely benign findings where a significant number of pathologists called the lesions “invasive cancer.” Nor will I address here the equally shocking finding that at least one pathologist labeled 22 of these 72 completely benign cases as DCIS. Those findings are a different problem than “subjectivity.”) In short, the 2015 conclusion is identical to what many have been saying all along, only using remarkably sophisticated techniques and statistics to add punctuation to a sentence that was written 24 years ago.

In 1991, still fresh from my subjective enlightenment at UCLA, I made a 35mm teaching slide for academia that claimed serious trouble would eventually brew if we didn’t acknowledge this problem and merge AH and low-grade DCIS into one diagnosis – call it “borderline” if you will. Treatment, I claimed, should be the same for both entities, i.e., wide excision alone. The gynecologists had the same issue going on with “severe dysplasia” of the cervix vs. “carcinoma in situ,” but they had already done the smart thing, recommending the same treatment (cervical cone) for both diagnoses. The distinction is subjective, the treatment is not. In 1993, results of the NSABP B-17 clinical trial indicated that all women with mammographically-discovered DCIS should undergo lumpectomy and radiation therapy. Other studies confirmed the same, always using the dichotomous approach of separating ADH from DCIS as distinct entities, treating the latter aggressively.

Dr. Mel Silverstein and Dr. Mike Lagios have done more than any of us with regard to this problem by introducing a scoring system (Van Nuys Prognostic Index in 1997) that ends up guiding treatment so that small areas of low grade DCIS are excised as you would AH, without radiation. But acceptance of “no radiation” came only after 12 years of mud-slinging conflict, and even today, the evidence-based medicine forces make sure that everyone knows that “excision alone” for selected cases of DCIS is only “2b” evidence (weak evidence, as opposed to prospective, randomized trials).

The controversy has far-reaching implications, not only with regard to correct diagnosis and treatment, but also when it comes to screening. Anti-screening activists love to parade this issue around in its nakedness when discussing the harms of screening. And, in fact, they are correct. As long as we stumble over the subjectivity of AH vs. DCIS, as long as we keep irradiating women with borderline lesions, as long as women undergo bilateral mastectomy for these marginal lesions, then this controversy is truly the greatest harm of screening, as it is mostly a by-product of widespread mammography followed by subjective pathology. Forget mammography call-backs, “unnecessary biopsies” and the like, highly overstated as harms by anti-screening critics. The real potential for harm is not with radiologic standards, but with our unwillingness to adopt a “borderline lesion” approach in this problem of AH vs. low-grade DCIS, thus avoiding overtreatment.

There is very little discordance when it comes to high-grade comedo-type DCIS. The problem is distinguishing low or moderate grade DCIS from atypical ductal hyperplasia (AH or ADH). A diagnosis in this category should be called a “borderline lesion,” and standard treatment should be wide excision followed by high-risk counseling. Then, the benefit of knowing about a significant risk factor is enhanced, the harms of screening minimized, and everyone is happy, sort of. Will it happen? Of course not. It’s far too sensible, and would require retractions from countless experts.

I’ve been harping about this controversy since Rosai’s 1991 article, even before results from the clinical trials that added radiation therapy to DCIS management. Considering all of the above, perhaps it’s not so strange that I limit my new patient practice to women with “tissue risks” found on breast biopsy, e.g. AH/DCIS. It’s considered “going the extra mile” when I opt for an additional pathology opinion from well-known experts. Yet, if the experts don’t agree…what next? A big part of my role is explaining the nature of this controversy while, at the same time, offering guidance, erring on the side of caution.