The Alchemy of Lumpectomy Margin Guidelines

First, it was lumpectomy margins for invasive disease, and then in late August 2016, margins for DCIS. In the former case, it was “no ink on tumor,” and in the latter, “2mm is adequate as long as radiation is planned.” After a quarter-century of breast conservation, with absolute chaos reigning the entire time as to what constitutes an adequate margin, the majority are shouting, “Amen!”

Now, go back and read my July 2016 blog “Guidelines Morphing Into Canon.” Guidelines are great, as long as they are used as guidelines. Instead, guidelines have a way of becoming “standard of care,” and from there, it’s a small leap to canon, with any departure being a grave violation of patient care, and certainly not worthy of reimbursement by the insurer who only has the patient’s best interest at heart.

Lumpectomy margins are probably the most random, irreproducible, inaccurate measure we have in breast cancer management today. I sometimes joke that “pathologic margins are only a surrogate for, well, true margins.” But somehow, if you collect enough experts together in the same room, representing three major cancer organizations (SSO, ASTRO and ASCO), then wave a statistical wand over a plethora of clinical studies, you can turn lead into gold!

Having spent a year in the pathology lab, and being intimately familiar with tissue processing, I have a different view. From the time the lumpectomy specimen is being removed from the patient, aberrations begin to occur that give phony confidence when the margin is assessed. From the specimen radiograph that can knock off the outer rim of fat, to the handling of the gross specimen, to the ink that slides down into the crevices, to the ink that can be dragged into the specimen with the scalpel, and on and on, I can tell you why false-positive margins emerge. And then, on the false-negative side, there are even more reasons. The point is that margins are highly unscientific. So, how do p-values correct this? They don’t, but they give the illusion of certainty and the delusion of evidence-based medicine.

There are so many variables that should enter into the decision for re-excision that I can’t begin to list them here. The first problem with this new standard is that “DCIS” is addressed as though it were a single entity. But a 5.0cm DCIS where the extent is not seen well on imaging, and with margins of only 2mm at multiple sites around the specimen…well, it’s not even in the same category as the 0.5cm DCIS that fits on one slide where you see the entire lesion.

As DCIS grows, the margins become less and less reliable, and imaging of all types can be unreliable as a roadmap as well. How many prospective, randomized clinical trials have been performed to test the value of radiation in women with Grade 1 or 2 DCIS that measures under 1.0cm? Answer: None. Yet, these women are told to undergo radiation because prospective RCTs have shown benefit to radiation, even in “good DCIS,” failing to mention that so-called “good DCIS” in those trials, when Grade 1 or 2, can be up to 2.5cm in the study protocol, with 3mm margins. That’s a different animal than the 0.5cm DCIS that was also 0.5cm on imaging that I described above, especially if you can accomplish 1.0cm margins. In this latter situation, recurrence rates with wide excision alone are close to zero, making it very difficult for radiation to add anything other than trouble.

Take note, by the way, that if one thinks this through (called rational thought, often a passé approach in today’s climate), and you get only 2mm margins on your first excision of a 0.5cm Grade 1 DCIS, the Triumvirate would spare you the re-excision (and this is where they are being praised), but insist upon radiation. In contrast, you could perform re-excision and spare the patient radiation therapy. Which can cause more trouble – a re-excision or unnecessary radiation? I hope the answer here is self-evident.

The congratulatory comments have been pouring in for the New Triumvirate of Margins – SSO, ASTRO and ASCO – with other societies chiming in their support. The take-away message is that this 2mm directive is a wonderful advance in that it will cut down on the need for re-excisions. In fact, the leader of this entire “margin guideline” movement is a breast surgeon who, only a few years before this crusade, published one of the highest re-excision rates ever reported – 60%. The reason for this reinvention of one’s self is unknown, but it’s remarkable that one can go from assuring the world that a 60% re-excision rate is okay, to the extraordinary steps taken these past few years to minimize re-excisions.

Regardless, this movement to cut down on re-excisions is no doubt admirable. We are trying to minimize overtreatment these days. Critics point out that DCIS should not be called cancer at all, but there’s a more elemental question than that – many of the small, low grade DCIS lesions I see in consultation are not DCIS but ADH (atypical ductal hyperplasia), best termed a “borderline lesion.”

Many clinicians would be shocked to understand the subjectivity that comes with DCIS, esp. small, low grade lesions. The articles have been published (most recently by Elmore et al, JAMA 2015), but the implications are so overpowering that surgeons and radiation oncologists just tuck the information away in their individual safe places. But the fact remains, we have known since the days of the BCDDP study in the 1970s that there are lesions for which there is no agreement on the diagnosis. My solution to this problem is staggeringly simple – admit these lesions are “borderline” and treat them all with wide excision, NO RADIATION. This alone would spare more overtreatment than this entire Margin Triumvirate trend, which has been focused on a lesser endpoint. My solution is not original. A small minority of pathologists have been trying to draw attention to the problem for over 50 years, but clinicians will have none of it. They want black-and-white from a medical discipline (pathology) that happens to have shades of gray as part of its core.

So, here’s how I rank the most urgent problems today with regard to DCIS:

  1. Unnecessary radiation therapy for a non-life-threatening condition and for which there is no survival difference after treatment (the current Margin Triumvirate was not able to reach a conclusion or recommendation on this most important issue).
  2. The correct diagnosis, which ought to be “borderline lesion” in some cases, the working definition being whenever two experts disagree (this phenomenon is so well known in the hidden world of pathology that practicing pathologists will even tell you, correctly, that “Dr. X would call this lesion atypical ductal hyperplasia, while Dr. Y would call this DCIS. So which diagnosis do you want?”).
  3. How do we handle the “STOP Pre-op MRI” movement when, for the DCIS patient, 2% (our publication) will have a life-threatening invasive cancer on the opposite side of their known DCIS? In other words, without MRI, one in every 50 patients undergoing some minimal treatment of their DCIS is going to have an untreated invasive cancer on the opposite side. Imagine anywhere else in the body where “wrong side surgery” is performed as a matter of routine once every 50 times! Yet, pre-op breast MRI is maligned from every angle possible, and especially so by the leadership of the Margin Triumvirate. Remarkably, because pre-op MRI in DCIS does not help much with the index lesion, it is maligned even more than when used for invasive disease. Yet, for our tiny 2% (nothingness to a statistician), it’s a matter of life and death, bringing up the possibility that MRI is even MORE important for DCIS than invasive disease. After all, we don’t alter survival with pre-op MRI in invasive cancer, but what about our 2% of women with DCIS? We’re talking about 1,200 women each year diagnosed in the U.S. with DCIS who actually have mammographically invisible invasion on the opposite side.
  4. How do we accept the notion that nothing is important beyond 2mm when a vast body of data exists that shows a direct relationship between margin size and local recurrence rates? Do we ignore the entire body of work by Silverstein and Lagios (USC/Van Nuys Prognostic Index), replicated by others? Do we reduce the seemingly infinite clinical presentations of DCIS and call it a single entity and offer simplistic solutions in the name of “generalizability”? DCIS is a complex array of entities, and there are many breast surgeons who put an extraordinary amount of thought into re-excising as needed, not when guidelines dictate a one-size-fits-all approach.

 

Re-excisions are not the most pressing problem with regard to lumpectomy, but of all the outstanding issues, why did this group try to tackle the least scientific? Why did they focus on turning lead into gold?

I have said it many times before – when it comes to prospective, randomized trials that involve surgery or radiology, there is no such thing as pure science, and that means evidence-based medicine cannot thrive here like elsewhere. Why? Inadequate blinding and huge differences in quality. Excellence is put on the back burner, and the focus is on “generalizability” – or, stated alternatively, the acceptance of substandard care into the trials.

In contrast to this poorly controlled situation, a miracle drug can be tested with triple-blinding (even the study sponsors don’t know who is taking what), and the pill being studied is standardized with a high degree of quality control. Imagine the absurdity of a clinical trial for a new drug, with each participating hospital responsible for manufacturing its own drug, free of quality controls, with these directions: “Whatever works best for you. We want the results to be generalizable.” That’s exactly what happens in surgery and radiology trials. If there is a more unscientific parameter than MARGINS that has been subjected to evidence-based medicine, I’d like to know what it is.

Oh, that’s right. The American Cancer Society subjected the clinical breast examination to the rigors of evidence-based medicine, and decided it should be stopped! I guess there really is no limit to the absurdity to which statisticians and their death-to-rationalism clinicians will go.

The Notion That “Good Science Means Less Screening” Has to Stop

Fast forward to a time when the predicted water shortage extends to the entire country. And rather than admit the shortage and the need to ration water, the government begins a campaign to convince the public that daily bathing is potentially harmful, and offers little in the way of health benefits. In fact, bathing is painted as an unnecessary luxury, and no one bathes like Americans anyway. We should slow down to every other day, or every week, or not at all. The population responds with the usual division – pro-bathers vs. anti-bathers, and those who cling to the middle. Journalists jump into the fray, pointing out the dangers of soapy water going down the drains, the unnatural alteration of our skin flora, and the harmful chemicals that are absorbed from perfumed soap. Finally, the government steps in and enforces weekly showering by mandating timed controls on our shower heads. “Once a week” becomes the new mantra taught to all in order to control the minds of the population.

Well, we are experiencing this very phenomenon when it comes to mammographic screening for the early detection of breast cancer. Rather than admit that it has become too costly (depending on how costs are calculated) to screen annually starting at 40, we are in the midst of a propaganda campaign designed to limit or eliminate mammographic screening. Out of the thousands of breast cancer books on the market, you might be surprised to learn that, other than textbooks and monograph/pamphlets, there are no books with the lay public in mind, having the single goal of justifying “start at 40, annually” screening guidelines. It’s a “dog bites man” issue, in that the justification for early diagnosis is self-evident. Or so we thought in years prior. Today, a concerted effort is being made to cut back or eliminate breast cancer screening, and these anti-screening forces are gaining ground even among some breast cancer lay activists. The new mantra is: harms outweigh benefits. You can find plenty of books to support this view as the American public is being re-programmed to bail out of their love affair with screening. So, if the title of my upcoming book seems self-evident, it is not. If the topic seems self-evident, it is not. Who would have thought that we would ever need a book to justify early diagnosis? The best I can tell, this book will be the first to tackle the anti-screeners head-on, and to justify a bottom line that has been misplaced – that is, less screening means more breast cancer deaths. Period.

Mammography and Early Breast Cancer Detection: How Screening Saves Lives is a polemic that justifies the pro-screening school of thought. The book will be available later this fall, and I’ll offer the Table of Contents below as a teaser. One can readily see that this will not be a dry recitation of medical facts, but instead, a tour through the smoke-filled rooms inhabited by public health experts who are, remarkably, peddling death, while claiming “Trust us, we’re doing what’s best for the population as a whole.”

http://www.mcfarlandbooks.com/book-2.php?id=978-1-4766-6610-5

Table of Contents

Acknowledgments vii
Preface 1
1. Last Word vs. Final Word 5
2. Early Diagnosis May Be the Key, but It’s Not a Lock 10
3. Biology Can Trump, but Size Matters 16
4. Prostate Is Not Breast, So Give It a Rest 22
5. The Four Horsemen That Inflate the Power of Mammography 31
6. The Four Horsemen Are Throttled by Clinical Trials, but O Canada! 41
7. The Mammography Civil War (1993-1997) 49
8. The Number Games 61
9. The (Over)Selling of Mammography 67
10. The Evidence for Evidence-Based Medicine (or, How to Raise the Bar of Bias: An Editorial) 74
11. Blame It on Canada (and Something’s Rotten in Denmark, Too) 82
12. Overdiagnosis: Embracing Your Inner Malignancy 88
13. Overdiagnosis Part 2: A Way Out of the Wet Paper Bag 95
14. The Task Force Opens Fire 105
15. The Zombies Among Us 121
16. Circumstantial Evidence-Based Medicine 128
17. The Social Tsunami of Anti-Screening 137
18. The 2015 ACS Peace Accord–Science or Societal Pressure? 144
19. A Journey to the Pathology Lab to View the By-Products of Screening 152
20. Risk-Based Screening–It Feels So Right, but Wait… 168
21. The Greatest Story Never Told 178
22. The Myth of Mammography 184
23. Do These Genes Make Me Look Dense? 192
24. The Emperor of All Modalities 201
25. The Bright Side of the Dark Side of the Force 210
26. The Crystal Ball Is Fair to Partly Cloudy 222
Chapter Notes 229
Bibliography 241
Index 247

GUIDELINES MORPHING INTO CANON

Guideline medicine is relatively new, that is, the practice of following published guidelines based on evidence-based medicine. These guidelines are derived from consensus panels of clinicians, with whom insurers may or may not agree; however, these third party payors feel the pressure to cover diagnosis and treatment as outlined by physicians. Thus, the strength in collective expertise allows one to practice good medicine with greater ease…usually.

But there are problems, one of which is the fact that experts don’t always agree. Two sets of eyes can look at the same data and come away with different interpretations. And, when it comes to surgical and radiologic guidelines, things can really get murky because prospective, randomized, controlled trials cannot be purely blinded. Partial blinding can be done, but these trials are never as pristine as drug trials where the dummy-pill control looks just like the real thing, and neither patient nor doctor knows who is taking what. So, when it comes to surgical technique and radiologic interpretive skills, there is an unsettling tendency toward the “tall poppy syndrome,” that is, chopping the head off excellence in the name of uniformity.

Some clinical trials make no effort whatsoever to ensure quality care. The reasoning is this: your results must be “generalizable” to the community standard because you can’t expect everyone to become an expert. Centers of Excellence have thus lost considerable ground. When pre-operative MRI, for instance, was subjected to a so-called “high quality” prospective, randomized controlled trial (COMICE), the epidemiology might have been statistically sound, but the technology was substandard, the interpretations were substandard, cooperation and communication with surgeons were poor, and the outcomes widely misunderstood. To many, the p-values were the only important feature, and pre-op MRI was widely condemned. After all, it was a “prospective, randomized controlled trial” and the mere utterance of those words renders magical truth.

Yet, for those who have worked hard to achieve excellence in the use of breast MRI, our outcome data is completely different than the COMICE trial results, and we can make a very strong case that MRI should be done routinely…IF IT IS DONE WELL. But critics say if you only study MRI at centers of excellence, your outcomes will have no external validity. Okay. So should we all aspire to mediocrity? “Bad MRI is worse than no MRI at all,” is an adage we have been using ever since the introduction of breast MRI into clinical practice. Yet, the pro-MRI data is routinely ignored, while policy-makers cling to substandard MRI results. A meta-analysis was even performed where mediocrity was studied in the collective sense, somehow rising above the “garbage in, garbage out” designation.

Most clinicians are familiar with the epidemiologic biases – selection, lead time, length time and overdiagnosis. But there are minor biases as well. One of them is called “file cabinet” bias or “shelf bias,” slang for the fact that enormous amounts of data go unpublished. What is the relationship between the 1% of data that gets published versus the 99% real life data that sits on shelves or in file cabinets? For instance, I have detailed data for perhaps the largest series of consecutive, routine pre-op MRI in existence. Most centers use pre-op MRI selectively or rarely, and their data is highly skewed – that is, there’s a reason why some women get MRI and some don’t, so any attempt to compare the two groups is highly biased, with the MRI group composed of younger patients, denser breasts, lobular histology, and extensive in situ components. When “no difference” in outcomes is reported with or without MRI, it may well be that the MRI group would have had much worse results without the MRI. Maybe not. However, the mere fact that a cavalier dismissal of MRI is the conclusion of these studies, in the face of glaring selection differential, is a stunning disregard of the power of bias.

But our data at Mercy Breast Center-OKC is unique. We can compare cancer yields in sub-groups to a degree greater than anyone (to my knowledge), by virtue of the complete absence of selection bias. The downside is that we have no real-time control group, so we cannot make claims about MRI vs. no MRI. With only historical controls for reference, we cannot provide high quality data as to what would have happened without MRI. But what we surrendered on that front actually strengthened our data on another front – sub-group comparisons.

When it comes to sub-group comparisons among those women who have undergone pre-op MRI, our data is unique and invaluable. This is one reason we have persisted with routine pre-op MRI as long as we have, given the apparent absence of a comparable data bank anywhere. I’ve personally kept the extensive database for 13 years, logging in over 2,000 studies, with close comparison of final path to what the MRI predicted. Yet, the data beyond our first 603 patients remains unpublished (at this point), so it doesn’t even count. And when we published our results after 603 patients in the American Journal of Surgery (Vol 196: 389-397), we had no idea that MRI was going to have its feet held to the flames in a way that was never done for pre-op mammography (for which there is not a shred of evidence, btw, that pre-op mammograms for palpable tumors alter the outcomes demanded of MRI). As a result, we didn’t spend a lot of time in that publication addressing the sub-groups, namely age, breast density, and histology.

So, when I see guidelines that say, “Pre-op breast MRI is an option for women with dense breasts and/or lobular histology,” I have to respond: “Based on what evidence?” Even though I agree with using MRI in these instances (at a minimum), the belief that MRI cancer yields are higher in dense breasts and lobular histology is not as clear-cut as most believe. If you look at our (unpublished) data, you are drawn to this conclusion – either do MRI routinely or not at all. The sub-groups targeted by the vast majority have nearly the same cancer yields as those where MRI is thought to be unnecessary. Yes, there are differences, especially with lobular histology, but not a clear cut-off point. What irony! “Selective use of pre-op MRI” is based almost entirely on a “gut feel” without empiric back-up, and our decision to keep performing pre-op MRI routinely is based on actual data, the gold standard of evidence-based medicine. Yet, we are the ones under fire.

In the community setting, we have no resources or residents to help with the publication process, so what we’ve managed so far is done at considerable burden in the private sector. But even when we do publish, our results have been largely ignored, perhaps because they run counter to the party line. Witness what should have been a highly provocative article that we published on the use of pre-op MRI in patients newly diagnosed with DCIS. At the time, it was the largest series of DCIS and pre-op MRI ever reported (The Breast Journal 2012; 18:420-427). Furthermore, our implications involved survival differences in DCIS patients, unlike invasive disease where survival should not even be an endpoint with regard to the use of pre-op MRI.

Many believe that pre-op MRI is of limited value in invasive disease when it comes to better local management of the index lesion, and then absolutely worthless in DCIS. And, because these critics are focused entirely on the index lesion, as are nearly all publications, I cannot disagree when it comes to the known area of DCIS. DCIS is so hard to define during surgical excision that a good road map doesn’t help much, if at all. But we asked a different question entirely, based on data from MD Anderson (Dawood et al. Ann Surg Oncol 2008: 15:244-249) where they reported a large observational series of 799 DCIS patients, treated unilaterally between 1976 and 2005 (no MRI is assumed for the vast majority, if not all), with the finding that the most likely event after treatment was invasive cancer (at a rate of 3.9% after only 2.9 years), usually in the opposite breast. Even more concerning was the associated increase in disease-specific mortality following the second event. Commentators (Drs. Lagios and Silverstein) pointed out that with such short follow-up, the contralateral invasion was probably present at the time of DCIS diagnosis, but remained undiagnosed.

Our series was intended to address that possibility. We ignored the index DCIS lesion, we ignored other areas of DCIS found on MRI, and we focused entirely on “elsewhere” sites of invasion that were not connected to the known DCIS, and in fact, were either in separate quadrants or were contralateral. Our results were nearly identical to the MD Anderson numbers, with “elsewhere” invasion present in 3.5% of our 285 patients.

Occult ipsilateral invasion might be treated incidentally with whole breast radiation (untreated if partial breast radiation is used), though 4/5 patients with ipsilateral “elsewhere” invasion turned out to have Stage IIA disease. But most unsettling were the findings in the contralateral breast where there will be no treatment other than possible endocrine therapy, generally considered inadequate as sole treatment for invasive disease. And again, there was a surprising proportion of Stage IIA disease, this time in half of the patients. It is very difficult to call untreated Stage IIA breast cancer “subclinical” as was the adjective of choice of a vocal MRI critic in describing all MRI discoveries.

This 2% risk of untreated contralateral invasion introduces an intriguing scenario. What do you do with this bit of knowledge? Do you perform 98 routine MRIs to capture the 2? But if you don’t, you are, in essence, performing “wrong side surgery” in 1 out of every 50 DCIS patients. Take that wrong side surgery and apply it anywhere else in the body, and you’ve got gross malpractice. But in DCIS management, you have 2% of 60,000 women every year in the U.S., or 1,200 patients, who have their DCIS treated on one side, while life-threatening cancer is left in place on the opposite side (ignoring the trend toward bilateral mastectomies). I don’t have the answer to the best approach here, but I would run from dogma that says “Don’t routinely perform pre-operative breast MRI.” 1,200 lives are at stake every year with untreated (unknown) contralateral invasive cancer, and the current standard of care says, “Don’t worry about it. Stick your head in the sand. MRI is bad enough for invasive disease and plays no role whatsoever in DCIS.”
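The arithmetic behind those numbers can be sketched in a few lines. The 2% rate and the 60,000 annual DCIS diagnoses are the figures quoted above; the function names are only illustrative, and the sketch assumes the rate applies uniformly and that MRI finds every occult contralateral cancer, which is a simplification.

```python
def missed_contralateral_cases(annual_dcis_cases: int, occult_rate_pct: int) -> int:
    """Expected patients per year with undetected contralateral invasion."""
    # Integer percent of the annual cohort.
    return annual_dcis_cases * occult_rate_pct // 100


def mris_per_case_found(occult_rate_pct: int) -> float:
    """Routine MRIs performed per occult cancer found (assumes MRI detects all of them)."""
    return 100 / occult_rate_pct


ANNUAL_DCIS_CASES = 60_000  # U.S. DCIS diagnoses per year, as quoted in the text
OCCULT_RATE_PCT = 2         # contralateral invasion rate, as quoted in the text

missed = missed_contralateral_cases(ANNUAL_DCIS_CASES, OCCULT_RATE_PCT)
per_case = mris_per_case_found(OCCULT_RATE_PCT)
print(f"{missed} patients per year; about {per_case:.0f} MRIs per cancer found")
```

Put another way, the "98 MRIs to capture the 2" and the "1 out of every 50" are the same 2% figure seen from two directions.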

Enter the new organization called “Choosing Wisely,” http://www.choosingwisely.org, which asks specialty societies to list 5 things that should NOT be done in their specialty. An organization to which I belong, the American Society of Breast Surgeons, fell for the trap and opted to include “routine pre-op MRI” as one of the five “Don’ts.” This is a new level of dogmatism we haven’t seen since Halsted, something that has no place in science. Rather than guidelines that suggest proper treatment, we are now slipping into different territory, that is, condemnation. The very definition of wisdom decries dogma and eschews condemnation. So, as one who has a strong handle on our own extensive data plus the published literature, it is disconcerting to hear this new school of thought, which says, “WE can’t make it work, so YOU have to stop doing it.”

Our DCIS article should have started a buzz, but it got no attention at all, and to my knowledge has never been referenced. I don’t have the answer, but 2% wrong side surgery is deeply troubling. Some of the mysterious breast cancer-specific deaths of women who have only been diagnosed with DCIS, but then show up with metastatic breast cancer (landmark article in the October 2015 issue of JAMA Oncol 1:888-896 by Narod et al) could easily be due to undiagnosed invasion in the opposite breast. There are other explanations for the bizarre results in that study, of course, but I’m just tossing another possibility into the ring.

Remember, it is not our data that raised the issue of reduced mortality after DCIS and second events. It was the MD Anderson observational data that documented diminished survival in those women who were diagnosed with invasive disease within a mere 2.9 years of their DCIS, most commonly in the opposite breast. Would MRI up front at the time of DCIS diagnosis have improved survival? I don’t know, but it is certainly possible. A known delay in diagnosis of invasive breast cancer of even 6 months buys you a guilty verdict in a malpractice trial, but we really don’t know what the consequences are of ignoring the 2% who undergo wrong side surgery for DCIS, when it ought to be invasive cancer treatment for the other breast.

“Choosing Wisely” needs to reconsider their use of the word “wisdom.” Perhaps, they should call themselves “Choosing Efficiently.” And perhaps, the American Society of Breast Surgeons should think again about their willingness to say, “Don’t.” In a sense, knowledge shrinks as wisdom grows (not mine – it’s a quote from philosopher Alfred North Whitehead). We need more policy makers who have read Nobel laureate Dr. Richard Feynman’s essays on scientific wisdom. Of the many insightful quips of this transcendent individual, one of my favorites is paraphrased like this: “When your results verify your hypothesis and excitement abounds, stop and think through all possible alternative explanations.”

This is no small issue. Guidelines are just that. Even Johnny Depp in Pirates of the Caribbean, when challenged about not following the “pirate code,” told his adversaries that pirate laws were “really more like guidelines.” Unfortunately, when some policy-makers use the word “guidelines,” what they intend are “rules,” and nothing demonstrates more clearly that guidelines are morphing into canon than the word “DON’T.” Halsted would be delighted to see this resurrection of dogma. Others have said it before me – dogma is the enemy of science.

Dense Is As Dense Does

Throughout my career, I’ve enjoyed the serendipity of Forrest Gump, strolling through history and meeting the right people at the right time. Working in a geographically isolated, non-academic community hospital, odds would have ordinarily kept me in lockdown as far as contributing anything to medical research. But to draw upon the overworked quote of Louis Pasteur, “Chance favors only the prepared mind.”

By studying two problems intensely, over the course of many years, the insight gained allowed me to enter two arenas of expertise – 1) breast imaging theory, with a focus on mammographically invisible cancers, and 2) quantification of breast cancer risk. As it turned out, the vast majority of experts were jousting elsewhere, and largely by default, I was able to claim expertise in two areas. Then, who would have predicted 25 years ago that these two agenda items would merge, in the form of “risk-based multi-modality imaging”?

Granted, some of my Forrest Gump experience was facilitated by my years in academia, but that only laid the groundwork. I did not walk onto the stage until I had been in the community setting for quite a while. I won’t name drop here (as I’ve done it excessively elsewhere), but my 3rd career evolved through the influence of many key people after I left my area of original training, i.e., general surgery. Career 1 was private practice in general surgery, Los Angeles, focusing on trauma. Career 2 was a dedicated breast surgeon beginning at my alma mater in 1989. Career 3 was a risk assessment and genetics expert (1st M.D. in Oklahoma to begin BRCA genetic testing, on Day One 1996), using multi-modality imaging based on risk levels.

Now, after 25 years of personal study, both mammographic density and risk assessment have been thrust to the forefront, and everyone has an opinion, it seems. As of 2015, all accredited cancer programs (by the Commission on Cancer) are required to provide risk assessment and genetic testing services. And, as of 2016 in Oklahoma, women are to be informed about the increased cancer risk and decreased sensitivity of their dense mammograms, according to new state legislation (see May 2016 blog). Unfortunately, there are no teeth in the legislation requiring insurers to pay for what needs to be done next.

As with most legislation, one problem is solved and many problems are created in the process. The good news is that women with extremely dense mammograms (10% of the female population where there is a near white-out) will be notified by letter of their “condition.” The bad news is that another 40% will be notified as well, when their risk and decreased sensitivity are not really that much different than the 40% one step down in density. A bell curve bisected is a false dichotomy, and those 40% of women just above the cut-off may be unduly alarmed, while the 40% below the cut-off will have a false sense of security.

If we use the 4 levels prescribed by the American College of Radiology, we have:
A = predominantly fatty replaced (10% of women)
B = scattered fibroglandular densities (40% of women)
C = heterogeneously dense tissue (40% of women)
D = extremely dense tissue (10% of women)

Again, a bell curve bisected. Nothing for A & B. “The Works” for C & D. While those at the extremes are pretty clear cut, the vast majority of women are bunched in the middle, and one radiologist might call you a B, while another would call you a C. Or, a single radiologist can call you a B one year and a C another year. (I won’t list the various bizarre scenarios that arise out of that degree of subjectivity.)

Density is far more complicated than the 4 traditional levels. In the past, quantification was tried: Level 1 = under 25% dense; Level 2 = 25-50%; Level 3 = 50-75%; Level 4 = over 75% dense. Yet, no matter how the definition has changed, radiologists still group patients into the same bell curve. The problem is that there is a strong qualitative aspect as well as a quantitative level of density, and this alone renders the pro-density activists on shaky ground.

Years ago, I began a practice that has never changed. Because our goal is to find cancer in the size range around 1.0cm, when assessing a density pattern in a new patient, I take a 1.0cm image in my head and move it around the X-ray to see how easy it would be for a cancer to hide. If there are large patches, it’s a concern, even if the overall density is less than 25%. And, the reverse is true. This is my attempt to practice so-called “precision medicine” (while the formal guidelines for screening that are promoted under this same moniker are often nothing of the sort).

The false dichotomy problem boils down to this – cancer can hide on low density mammograms. Yet now, those women who don’t get the letter are going to think they are home free. If legislators would review the screening MRI data, they wouldn’t be so quick to jump on the “dense breast legislation bandwagon,” which has now impacted 30 states (and counting). In those clinical trials where high-risk women had both mammograms and MRI performed, even the low density patients had a 50% miss rate on mammography.

High-density screening will usher in improvements through multi-modality imaging (mostly ultrasound) for those who garner a C or a D by the radiologist. But what I’m worried about is the 40% of women in the B group who are going to be tricked into thinking that mammograms are going to have a 90% detection rate. This is not true! The American College of Radiology used to admit this through their recommended reporting system that stated “sensitivity might be reduced” for women at that 2nd level. But that word of caution is no longer required. Now, only the C and D patients have this sensitivity disclaimer. In effect, this move only sharpens the distinction of the false dichotomy – A&B on one side, C&D on the other – no problem in the one group, trouble in the other. In fact, we’re dealing with a 0 to 100% continuum.

Complicating the matter are ethnic differences, where Asians have a higher mammographic density than whites, but a lower risk for breast cancer. Then, there is the obese patient who has a lower density level on mammography, but a higher risk for the development of breast cancer. This is not a straightforward issue, as usually portrayed. Even the pro-density activists are not promoting accurate information, falling into the false dichotomy trap.

In short, risk level is not the best way to determine who should do more or less screening. Density is not the best way either. A combination of “Risk and Density” (says Forrest Gump) is the best way to select patients for doing something more. As for doing less for women without risk and without density, well, it’s a tough question. Certainly, the 10% of women with “fatty replaced breasts” are going to do fine with annual mammograms and no adjunct screening modalities. But for everyone else, it’s a struggle to come up with a strategy that assures early detection.

Maybe you see now why I’ve spent more than 20 years helping scientists who are trying to develop a screening blood test to detect breast cancer, which would then prompt the need for multi-modality imaging if positive. And, why I’m working with computer scientists who are trying to develop image analysis systems that improve upon the human eye, again allowing better selection of patients for multi-modality imaging.

Things are looking up, though. The introduction of tomosynthesis (3-D mammography) is the first significant technologic advancement in the history of mammography, in that more cancers are clearly detected. Furthermore, these detections are taking place more often in dense breasts where “architectural distortions” are seen through the density by virtue of thin slices. Those patches of white, where my imaginary 1.0cm tumor can hide, are no longer so effective in camouflaging the cancer. The standard radiologist disclaimer, “it’s like looking for a snow man in a snow storm” is not as true as it once was. Tomosynthesis can sometimes see a vague outline of the snowman where 2-D mammography cannot; Ultrasound works like radar and can see the snowman by using sound waves; and MRI (or molecular imaging) lights up the snowman with the flare of contrast enhancement.

Now, if we can only figure out a way to make it all cost-effective. Odd that we really don’t need any technologic breakthroughs to find breast cancer early. The miracles of technology are already at our disposal. The only problem is trying to figure out who to put on what machine and when. And the final answer is not going to come through false dichotomies.

Do These Genes Make Me Look Dense?

CHAPTER 23 — from Mammography and Early Breast Cancer Detection: How Screening Saves Lives by Alan B. Hollingsworth, MD (McFarland & Company, Jefferson, NC, 2016).

…a history of mammography leading to current controversies, a primer on the epidemiology of screening, and a polemic designed to endorse the need for multi-modality imaging in order to improve early detection.

LINK to Publisher’s Page: http://www.mcfarlandbooks.com/book-2.php?id=978-1-4766-6610-5

***

Shortly before the new year of 2004, an educator from Connecticut underwent her annual screening mammography, as she had been doing for years. Like before, the results were “All clear.” Six weeks later, she was diagnosed with Stage IIIC breast cancer. She was stunned to learn, for the first time, that she had very dense breast tissue on mammography, the feeble explanation as to why her cancer had been invisible (probably detectable several years earlier with other modalities).

 

Nancy M. Cappello, PhD has since joined the list of solo game-changers, that is, non-celebrities who have risen from anonymity to turn their misfortune into far-reaching upheavals that have permanently altered how we manage breast cancer. In the footsteps of Rose Kushner (ending the one-step biopsy/frozen section/mastectomy approach), Betty Rollin (First You Cry, the book and movie that introduced many to the emotional impact of mastectomy), Susan Komen and her sister Nancy Brinker (initially, the promotion of screening mammography, then breast cancer research in general), Nancy Cappello set fire to sweeping reforms as to how women are informed about their breast density.

 

As of this writing, nearly half of the 50 states have passed legislation requiring radiologists to inform patients when mammograms are dense (white on X-ray) and to describe additional imaging modalities that might help, most notably ultrasound. Federal legislation is being considered as well. Through the Are You Dense? Organization (RUdense.org) that Dr. Cappello founded, Connecticut became the first state to require that high density information be given to patients, and she has since encouraged all states to do the same. Additionally, Connecticut became the first state to pass legislation requiring third party payors to cover ultrasound screening for women with dense breast tissue.

 

What added fuel to Dr. Cappello’s conflagration was the discovery that the medical profession knew about the problem of breast density all along, but did precious little to inform the public.

 

I sympathize without excuse. I do have mixed emotions about legislated physician practice, but this bundle of provocative data about density was dropped into a black hole, it seems. As an aside, it has always been a curious phenomenon that some information new to the scene is processed quickly and new standards adopted overnight, while other innovations or ideas are ignored. After decades of medical literature warning about the danger of breast density, it was like the boy who cried “Wolf,” with readers becoming numb to the data, if they paid attention at all.

 

And for those historians of breast density who have already groaned at the pun, it was Dr. John N. Wolfe who first drew attention to mammographic density patterns, publishing his classification system of N1, P1, P2 and DY that decorated the bottom of mammography reports starting in 1976(1). I read my first mammogram report with the Wolfe pattern noted in 1980, my first year in private practice. The reaction was straightforward at that time – so what? There were no good alternative methods for imaging, and the focus then was not on mammographically invisible cancers. In fact, Dr. Wolfe proposed his system entirely on the basis of imparted risk for breast cancer, yet there were no interventions available at the time short of preventive mastectomy. In fact, Dr. Wolfe was so convinced that women with the DY pattern were headed for breast cancer (45% lifetime risk by his calculations) that he felt they should seriously consider preventive mastectomies. And with the introduction of breast implants at about this same time, “subcutaneous mastectomies with implants” were already being performed for reasons far less impressive than Wolfe DY patterns.

 

But then, we entered a silent period, where at the clinical level, Wolfe patterns were gradually dismissed, sometimes as an old-fashioned “folly.” Certain investigators, though, were using different classification schemes, but still coming up with the same conclusion – denser breasts translated to greater risk. Even so, the entire focus remained on imparted risk. It was only later that clinicians began to appreciate that breast density was double jeopardy. Not only was density imparting elevated risk, but also it was responsible for hidden cancers missed by the very mammography that defined density in the first place.

 

What is most peculiar in re-tracing my steps during this era is that no one was arguing anything different. It wasn’t controversial, rather, it was esoteric. The risk data for density was consistent no matter how the density levels were described – the highest density breasts had roughly a 4-6 fold relative risk for breast cancer when compared to the lowest density breasts. And, lagging behind, only a handful of studies prior to 1990 showed the danger of density when it came to early detection. The vast majority of clinicians believed the “90% Sensitivity” for mammography, independent of density levels.

 

A word about 4-6 fold relative risk. Recall from the Number Games chapter that relative risks (RRs) are fractions with a numerator and a denominator. When the word hit the street about the 4-6 fold risk, many patients were terrorized by these numbers. The problem with this particular RR is that the number applies to the highest density compared to the lowest density, the latter being present in only 10-15% of the population (hardly the average woman). In epidemiology-speak, the “referent” was this low density group, not the average woman. Indeed, the “average” density patient is at 2-fold risk for breast cancer when compared to predominantly fatty breasts.

 

How can someone be “average risk” and “2-fold risk” at the same time? By switching out the denominators. That’s why they are called relative risks. The RR will change if you alter either numerator or denominator. Forget the math if it’s not your thing – if we compare women with extremely dense tissue to the average patient (not the low density group) we get a more acceptable RR of 2.0. This degree of risk is more in line with having a first-degree relative diagnosed with breast cancer at age 45, whereas an RR of 4.0 would be like having two first-degree relatives diagnosed with breast cancer at 45.
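For readers who do like the math, the denominator switch can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and the inputs are the round figures quoted in this chapter (4-fold for extremely dense versus fatty breasts, 2-fold for average density versus fatty breasts), not values from any one study.

```python
def rebase_relative_risk(rr_group_vs_old_ref, rr_new_ref_vs_old_ref):
    """Re-express a relative risk against a different reference group.

    Both inputs must share the same original reference group
    (here, women with predominantly fatty breasts).
    """
    return rr_group_vs_old_ref / rr_new_ref_vs_old_ref


# Round figures quoted in the text, both relative to the lowest density group.
rr_extreme_vs_fatty = 4.0
rr_average_vs_fatty = 2.0

# Same women, same cancers, different denominator.
rr_extreme_vs_average = rebase_relative_risk(rr_extreme_vs_fatty, rr_average_vs_fatty)
print(rr_extreme_vs_average)  # 2.0
```

Nothing about the patient changes between the two numbers. Only the comparison group does.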

 

While nearly all the focus was on the imparted risk of mammographic density, pioneering radiologists interested in ultrasound saw immediate application for screening women with dense breasts. Radiologist Dr. Thomas Kolb, then at Columbia University in New York City, was certainly at the forefront and became an activist for screening ultrasound. Within a decade, at least 10 large studies, totaling 60,000 patients, had been published(2), all showing similar results – the number of additional cancers discovered after negative mammograms and negative clinical exam ranged from 2.71 per 1,000 to 4.61 per 1,000. Roughly speaking, this is a relative 50-100% improvement over mammograms alone. The Society of Breast Imaging made its recommendation for ultrasound screening accordingly.
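The “relative 50-100%” figure depends on how many cancers mammography alone is assumed to find. Here is a minimal sketch, assuming a baseline of 5 cancers per 1,000, which is the low end of the 5-7 per 1,000 range quoted later in this chapter. The baseline choice and the function name are illustrative assumptions, not values taken from the pooled studies.

```python
def relative_gain_percent(extra_per_1000, baseline_per_1000):
    """Percent increase in cancers found when extra detections are added to a baseline yield."""
    return extra_per_1000 / baseline_per_1000 * 100


BASELINE_PER_1000 = 5.0  # assumed mammography-only yield (illustrative)

low = relative_gain_percent(2.71, BASELINE_PER_1000)
high = relative_gain_percent(4.61, BASELINE_PER_1000)
print(f"{low:.0f}% to {high:.0f}%")  # 54% to 92%
```

With a baseline of 7 per 1,000 instead, the same arithmetic gives a lower range (about 39% to 66%), which is why the figure is best read as a rough estimate.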

 

The evidence for MRI screening began trickling in about this same time as well, so at my facility, we created a “Breast Density” brochure in the early 2000s, describing the double jeopardy of density as well as the recommendation to consider multi-modality imaging with either ultrasound or MRI. At the time we initiated this program, there were no screening guidelines for multi-modality imaging at all, so we developed a scoring system that combined risk and density for patient selection, giving equal weight to both. After all, if one is going to recommend a second tier of imaging, then it should be based on the probability that the first tier is going to fail(3) – and the most powerful predictor for first tier failure is breast density.

 

While the primary interest in breast density seemed fixated on the associated cancer risk rather than the hidden cancer rate, even the risk agenda failed to generate momentum. To this day, one has to seek out maverick modifications of our mathematical models (e.g., Tice modification of the Gail model) in order to incorporate density levels into formal risk assessment. This, after hundreds of articles have confirmed the relationship of density and risk.

 

In my presentations about multi-modality screening, primarily MRI, I would offer the overwhelming evidence that mammographic density is a risk factor, then call it, “The Rodney Dangerfield of Risk Factors.” It doesn’t get any respect. Yes, it’s talked about all the time, but try to work the density level into your formal risk assessment program, and you’ll need to have those maverick models at your disposal. (Addendum: After publication of this book, in 2017, the Tyrer-Cuzick model released version 8.0 that included breast density in calculating breast cancer risk.)

 

In 2009, a study published in the Journal of the National Cancer Institute(4) described a meta-analysis of 47 studies of breast density related to breast cancer risk, involving 28,521 cancer patients and 3 different ways to categorize density levels – all 3 methods showed the same thing – a 4-fold risk for breast cancer when the top category is compared to the bottom category, and approximately a 2-fold risk when compared to the average patient. Once we add the possibility of invisible (or “missed”) cancers on top of the risk problem, the Rodney Dangerfield reference was really an understatement. Everything has changed now, with heightened awareness of density, not via 47 studies combined into one meta-analysis, but through one woman trying to reach 50 states.

 

The double jeopardy concept seems to trip up even the experts. When the American Cancer Society issued their 2007 guidelines for breast MRI screening, they treated mammographic density as a risk factor, placing it in the “insufficient evidence” category along with “15-20% lifetime risk” and other modest risk factors. Density was not considered as a determinant of “first tier” sensitivity for women, that is, an independent predictor of mammographic failure. Fortunately, the recommendation for adding ultrasound to screening high-density women was adopted by the Society of Breast Imaging, but then again, unfortunately, experts are no longer the voice of authority. The Society of Breast Imaging guidelines have gone largely unheeded, while we wait for “neutral epidemiologists” to weigh in.

 

No one has attempted an ultrasound screening trial where the endpoint is mortality reduction, given all the problems we’ve seen already with mammography, not the least of which is obsolete technology by the time the requisite follow-up is complete. Instead, we have relied on prospective trials (non-randomized) where participants undergo multi-modality imaging, and then cancer yields are recorded for each method individually – mammography, ultrasound and MRI – and in combination. Improved cancer yields are reflective of a mortality reduction, though purists will be quick to point out that we can’t rule out the impact of the Big Four biases.

 

We have very good offerings for imaging beyond mammography available now, barely utilized, though with much improved outcomes:

Mammographic density and no other risk factors = ultrasound

Mammographic density plus additional risks = ultrasound or MRI (perhaps alternating)

Women at very high risk, regardless of density = MRI

 

Nancy Cappello saw the problem shortly after her own diagnosis. Breast density is double jeopardy. It is a risk factor, and it is a predictor of mammographic failure.

Several questions often arise with regard to breast density. First, “How did I get it?” And, “what can I do about it?”

Baseline breast density is a product of your genes, though environmental influences do cause alterations. Interestingly, some of these alterations are associated with a similar change in breast cancer risk, raising the question as to whether or not density can be used as a surrogate measure for risk-reducing strategies. Each pregnancy lowers breast density a bit, and breast-feeding does as well, both known to be protective factors when occurring in the younger age groups.

 

Other factors don’t fit so well. Body fat (BMI = body mass index) is a confounding variable that doesn’t always match up with breast density risk. In fact, for many years, it slowed down acceptance of breast density as a risk factor. Ethnicity doesn’t always match either. Asian women, in general, have a much higher density level than African-American women, yet the breast cancer risks are higher for African-Americans. Postmenopausal hormone replacement therapy, especially estrogen and progesterone, can result in an increase in mammographic density. And, the use of SERMs (Selective Estrogen Receptor Modulators – tamoxifen or raloxifene) can lower breast density, perhaps reflecting when they are also preventing breast cancer.

 

Taking “statins” to prevent cardiovascular events and death is widely accepted, even though, as with any preventive health measure, it is the minority who benefit. Yet, the surrogates – blood lipid levels – have become endpoints unto themselves. “We’ve successfully treated your hyperlipidemia” ignores the fact that the goal is something else entirely – that is, reducing the probability of cardiovascular events and death. The lack of such a surrogate may be one of the reasons why so few women accept a recommendation to take the SERMs, which are FDA-approved to reduce the risk of breast cancer. Fairly good evidence suggests that the women who have a lowering of their breast density pattern while on SERM therapy are the same ones benefitting most from the drug. Perhaps someday, breast density will become an official surrogate, and pharmacologic risk reduction will enjoy greater popularity.

 

A common misconception is that “young women have dense tissue, while older women do not,” thus explaining the superior efficacy of mammography in older women. While these differences may be true in general, exceptions are quite common. Some women who start screening in their 30s will have low density mammograms, and we see 80 year-olds with mammograms where we can’t see a thing. Some women gradually become less dense after menopause, others do not, especially if they take estrogen-plus-progesterone hormone replacement therapy. And, there is no sharp loss of density at menopause. In fact, if you look at density in the 40s as a whole and compare it to density in the 50s, there is very little difference. The point is that breast density is a highly individual situation. I’ve included an assessment of density in my risk assessment program for 20 years. My reasoning for this is based on the double jeopardy issue, with a determination of the individual’s estimated sensitivity for breast cancer detection should it occur, a separate issue from density as a risk factor.

 

Now, one quibble with the Are You Dense? educational efforts. As often happens in medicine, we are confronted by continuums for which we must create artificial (and subjective) classification schemes. Are You Dense? has opted to use the dichotomy approach – two groups, dense and non-dense, with 50% density as the dividing line, an approach used in some clinical trials as well. The problem here is that the 50% point is where subjectivity is at its worst.

 

Picture a bell curve, which is what we nearly have in breast density, from 0 to 100%, with most women bunched in the middle. Then, you draw a line straight down from the peak of that curve, and you have the majority of women clustered right at the dividing line. Not only do radiologists routinely differ at this point in the dividing line, studies have shown that the same radiologist will call the density level differently from year to year in the same patient, even when the density level is unchanged. This is no one’s fault – it’s simply the nature of subjectivity applied in a quantitative fashion to a phenomenon that has a strong qualitative feature as well. In other words, it’s not merely how much density is present, but what is the nature of that density – small patches of white? Large patches of white? Diffuse haziness? Net-like strings of white? Net-like strings of white interconnecting small white patches with a diffusely hazy background?

 

Once breast density became an accepted risk factor and predictor of mammographic failure, an entire industry arose in order to quantify it in a meaningful fashion, using software that spits out exact percentages of density, a number that can be translated to both “risk levels” and “sensitivity levels.” The resilient Dr. Dan Kopans, surfacing again on this issue, has spent considerable effort trying to educate the world about the subjectivity of breast density, pointing out the extreme complexity that underscores all attempts to simplify the problem, that is, the qualitative issues are every bit as important as the quantitative.

 

For the past 20 years, the American College of Radiology has required that interpreters of mammography describe the degree of breast density, dividing into 4 categories. The definition for each category has changed slightly over time, but what used to be Levels 1-4 are now called Levels A-D, generally based on quartiles of density: 0-25%, 25-50%, 50-75%, 75%-100%, though with some modifiers. Unfortunately, these percentages have not until recently been listed in a straightforward fashion on the radiology report. Instead, radiologists were directed to dictate “in code” using a formal lexicon. For instance, “scattered fibroglandular densities” was code for a Level 2 (now called Level B) mammogram, or 25-50%. Interestingly, only Level A does not include a disclaimer about reduced sensitivity for the detection of cancer. Levels B, C and D each have different wording that describes increasing concern about lowered sensitivity with increasing density. Primary care physicians grew immune to the redundant terminology that was actually “code,” but you can imagine Nancy Cappello’s shock to discover that this density information rarely made it to the patient.

 

In my practice, as I am reviewing the mammograms for the double jeopardy, I attempt to incorporate the quality of density as well as the quantity, the latter having already been addressed by the radiologist with A, B, C and D levels. In my mind’s eye, I picture a small invasive cancer, about 1.0 to 1.5cm, and I move that imaginary cancer around both the MLO views and the CC views while asking myself a simple question: Is there anywhere this (imaginary) cancer could hide? Invasive lobular carcinoma can prove problematic to this approach, as the diffuse growth pattern in some lobulars allows them to hide anywhere they want. This approach is still subjective, but it’s not a false dichotomy. Generally speaking, the overall percentage of white reflects the imparted risk level, while the qualitative pattern of the white allows me to predict the probability that a small cancer will be detected (Sensitivity).

 

Importantly, this “roving” 1.0 to 1.5cm (imaginary) tumor approach to mammographic review is based on the premise from the previous chapter that overall density pattern is only an indirect predictor of invisible cancers – the real problem is the density immediately adjacent to the tumor that could completely encase the cancer. This is why even Level B density (scattered fibroglandular densities, or 25-50% dense) was originally accompanied by a disclaimer on radiology reports that sensitivity could be compromised, while the dichotomy approach would call these patients “non-dense.” When you add the possibility that the origin of cancer might be primarily within these patches, it can render some uncomfortable conclusions about mammography used alone.

 

I would be remiss if I didn’t briefly mention the controversial entity of DCIS (ductal carcinoma in situ) that is a product of screening mammography, usually presenting in the form of a calcium cluster, that is, tiny white dots clustered into a group on X-ray. Calcium appears on mammography better than ultrasound or even MRI, so mammography has held the top spot in the hierarchy of screening. But now that DCIS is dropping in popularity as a worthy goal, ultrasound has gained momentum in that a higher percentage of ultrasound-discovered cancers are small invasive cancers, rather than DCIS. To be specific, mammography-discovered cancers will be DCIS about 25-40% of the time, while ultrasound-discoveries are DCIS only 10-20% of the time. So, some observers claim superiority of ultrasound over mammography in the very high density groups, given one or the other. This introduces the speculative, but attractive, notion of screening with ultrasound alone in selected patients.

 

Certainly, mammograms alone in this highest density group (greater than 75%) have failed to deliver. I once encountered a breast center that had a disclaimer at the bottom of its reports: “4% to 8% of breast cancers are not visible on mammography.” I did not make any friends when I openly stated that the only way to make that statement true in a patient with extreme breast density was to add a zero to both numbers, that is “40% to 80% of breast cancers are not visible.” People thought I was kidding, or at least exaggerating. I was not. Years later, when DMIST demonstrated the sub-group of film screen technique in young women with dense breasts, and only 27% Sensitivity (a 73% miss rate), I was vindicated, although still not popular.

 

With the great advantage of lower cost and greater patient comfort when compared to breast MRI, whole breast screening ultrasound has some distinct advantages. The drawback is specificity, with more benign biopsies (note: I do not use the pejorative term, “unnecessary biopsies”) than occur with mammography or MRI. Most studies show that an ultrasound-generated biopsy will be malignant only 5-10% of the time (compared to 20% for mammography and 30-40% for MRI).

 

Furthermore, there is a bigger difference in technique and interpretations with ultrasound than mammography, very much related to experience and skill. To that end, a new development (albeit in research status for 40 years) is “automated whole breast ultrasound,” which gives a standardized picture of the entire breast that should be very desirable for the screening setting. While targeted, handheld ultrasound will still be better for diagnostic problems, the automated whole breast ultrasound approach should be a great addition for the screening tools available for women at average risk, but with dense mammograms. That said, preliminary data suggests that a physician using hand-held ultrasound will detect more cancers than the automated approach.

 

Although single-center studies and multi-center studies have revealed that screening ultrasound increases cancer yields by 3-4 per 1,000 above the 5-7 cancers found by mammography, I’m only going to review one of the most important studies – ACRIN 6666 (5).

 

The American College of Radiology Imaging Network designed a trial to study ultrasound as a complement to mammography in high-risk, high density patients. Notice that participants had to have both – traditional risk factors in addition to the inherent risk of breast density. Although there is not a control group in a study like this for cancer yields (a no-imaging-at-all group), care was taken to account for as many variables as possible. For instance, patients were randomized as to which study was performed first (mammography or ultrasound), and the radiologists were blinded as to the results of one study when interpreting the other.

 

Kudos to the trial designers who addressed every possible criticism that I can dream up concerning current multi-modality guidelines. First, the study designers realized the age-discrimination inherent when lifetime risks are used as a sole criterion. The 60-year-old patient with risk factors, and at peak short-term incidence of breast cancer, often won’t qualify for MRI because of fewer remaining years in her lifetime. So, the ACRIN 6666 team came up with short-term risk calculations, as well as standard long-term calculations. Then, designers also realized the “qualitative” problem of density, allowing certain patients to qualify for the trial if there were large patches of white even though overall density was less than 50%. And then, one of the most subtle, but perceptive, entry requirements allowed an adjustment in traditional risk requirements based on density level, that is, the extremely dense patient did not need as many traditional risk factors to qualify, and conversely, the lower density patients needed higher calculated risk. I can’t say it made any difference in the outcome, but I will offer personal testimony to this – these entry requirements were so sophisticated in design that they could be used for all forms of multi-modality imaging. Enough praise to Dr. Wendie Berg and her team, moving on to results.

 

From 2004 to 2006, 2,662 women (with both density and traditional risks) at 21 sites underwent 3 rounds of double screening with mammograms and bilateral, whole breast ultrasound, at 0, 12 and 24 months. From the boatloads of data generated, let’s go with this:

 

33 cancers were detected by mammography alone, 32 detected by ultrasound alone, and 26 were seen on both mammography and US. Clearly, the modalities are detecting different aspects of cancer, as fewer than a third of the cancers were seen on both modalities. This suggests a powerful complementary role to ultrasound. In fact, ultrasound appears equal to mammography, if picking one or the other.
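The overlap arithmetic behind that statement can be checked with a short Python sketch. The three counts come straight from the paragraph above, and the variable names are only illustrative.

```python
# Cancers by detecting modality, as quoted above from ACRIN 6666.
mammo_only = 33
us_only = 32
both = 26

total = mammo_only + us_only + both      # all cancers seen by at least one modality
share_both = both / total                # fraction visible on both modalities

seen_by_mammo = mammo_only + both        # every cancer mammography could see
seen_by_us = us_only + both              # every cancer ultrasound could see

print(total)                             # 91
print(f"{share_both:.1%}")               # 28.6%
print(seen_by_mammo, seen_by_us)         # 59 58
```

Of the 91 cancers, mammography saw 59 and ultrasound saw 58, which is the basis for calling the two roughly equal when only one is used.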

A breakdown of tumor size shows that mammography detected tumors with a mean size of 1.15cm whereas US detected tumors with a mean of 1.0cm. This close comparison in size indicates that US is not finding cancers much earlier than mammography; rather, it is capturing the cancers that ought to be large enough to be seen on mammography but were missed, presumably due to density. Yet, there is an important difference: 75% of the cancers detected by ultrasound were invasive, while only 52% of those detected by mammography were invasive. Because these ultrasound-discovered invasive cancers were small and usually node-negative (4% with positive nodes, compared to 33% node positivity for mammograms alone), ultrasound could again be declared the winner in head-to-head competition. That said, ACRIN 6666 was not looking for a “winner” when it came to these two modalities.

Beyond digital mammography, ultrasound generated an additional 5.3 cancers per 1,000 in the first year and 3.7 per 1,000 in each of the second and third screens. What about sensitivity, which documents the benefit from a different angle? Discounting the first screen (as a prevalence screen), the next two incidence screens revealed a sensitivity of 76%…combined! That’s right. Mammography and ultrasound together missed 24% of cancers. Sensitivity of mammography alone was 52%. Granted, this is not your average patient population, but “risk levels” have no influence on sensitivity levels. Density, on the other hand, is the primary determinant of sensitivity.

But that’s not where the story ends. ACRIN 6666 designed a sub-study, where women could undergo a single breast MRI at the conclusion of the 3 screens with mammography and ultrasound. Only 612 women of the 2,662 moved ahead with this aspect of the trial, so a separate set of statistics was created for this group. An additional 9 cancers were discovered on the single MRI (8/9 invasive with an average size of 0.85 cm, all node negative). Converted to our 1,000 standard, this was an additional cancer yield of 14.7 per 1,000. Had all 2,662 women participated, an extrapolation would indicate that 39 additional cancers would have been detected, in a group that had been cleared as “good to go and cancer-free” with 3 sets of negative mammograms and 3 sets of negative ultrasounds.
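The per-1,000 conversion and the extrapolation can be reproduced in a few lines. This is only a sketch: it assumes, as the extrapolation above does, that the 612 women who chose MRI are representative of all 2,662 participants, which may not hold for a self-selected subgroup. The function name is my own.

```python
def yield_per_1000(cancers, women_screened):
    """Cancers detected per 1,000 women screened."""
    return 1000 * cancers / women_screened


mri_cancers = 9
mri_participants = 612
all_participants = 2662

mri_yield = yield_per_1000(mri_cancers, mri_participants)

# Assumption: the MRI subgroup's yield applies to the full cohort.
extrapolated_cancers = mri_yield * all_participants / 1000

print(f"MRI yield: {mri_yield:.1f} per 1,000")
print(f"Extrapolated to full cohort: {extrapolated_cancers:.0f} cancers")
```

If the subgroup was higher or lower risk than the cohort as a whole, the extrapolated figure would shift accordingly.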

The smaller tumor size and negative nodes with MRI are a little more impressive than the difference between ultrasound and X-ray, now comparing 0.85cm with MRI to the 1.15cm of mammography. So, MRI has lowered the threshold of detection, which automatically hurts the other two forms of imaging when it comes to sensitivity. To the point, when sensitivity for mammography and ultrasound is recalculated to include the MRI-detected cancers, the combined sensitivity of mammography and ultrasound was only 44%. A 56% miss rate using clinical exam, mammography and ultrasound together three times over the course of 24 months!

These study results are both sobering and confusing, if those two descriptors can fit in the same sentence. As a result, the 2012 headlines surrounding this landmark study were mixed, as though the investigators interviewed were baffled by their own data, leaving the journalists bewildered, yet eager as ever to report.

Consequently, some headlines pronounced ACRIN 6666 a clear victory for screening ultrasound, while others proclaimed the superiority of breast MRI. But one headline was never used and never considered: “Mammograms alone are good enough.”

Wendie Berg, MD, PhD, breast radiologist, was the Study Chair and Principal Investigator for ACRIN 6666. In January 2014, two years after the trial results had been released, she underwent a digital mammogram with 3-D tomosynthesis, which showed Level C density (50-75%), but no cancer. Because of her family history, she decided to proceed with adjunct breast imaging, and in April 2015, she publicly announced that a 0.9cm invasive carcinoma had been discovered using breast MRI. A new activist group was formed – DENSE (Density Education National Survivors’ Effort) (6).

  1. Wolfe JN. Breast patterns as an index of risk for developing breast cancer. AJR 1976; 126:1130-1139. Dr. John Wolfe (1923-1993) was a professor of radiology at Wayne State University School of Medicine in Detroit when he published his landmark paper. His N1 pattern today would be called “predominantly fatty” (estimated lifetime risk for breast cancer – 2%), with the P-1 pattern being less than 25% prominent ducts, P-2 being greater than 25% prominent ducts, and DY (dysplastic) for dense fibro-glandular tissue. Acknowledgement for the bad pun goes to Karla Kerlikowske, MD who wrote a New England Journal of Medicine editorial in 2007, titled, “The Mammogram That Cried Wolfe.” I should have known a pun so obvious would not be original on my part, but in truth, I found Dr. Kerlikowske’s editorial in my files after I had written this chapter. So, it probably was not an original thought on my part, but instead, a subliminal repository from which I drew.
  2. While all pioneering authors on breast ultrasound screening deserve mention, space limits us to the first authors on papers that led to the Society of Breast Imaging recommendations: Paula Gordon, Thomas M. Kolb, W. Buchberger, Stuart S. Kaplan, Isabelle Leconte, Pavel Crystal, V. Corsetti, W. Berg (see #5), and Kevin Kelly. Dr. Thomas Stavros played a key role in the development of breast ultrasound, as did many others.
  3. Hollingsworth AB, Stough RG. Breast MRI screening for high-risk patients. Semin Breast Dis 2008; 11:67-75. In this article, we gave our initial experience with MRI screening, using a point system that selected patients for auxiliary imaging equally weighted for risk and density. The scoring system also delineated the MRI interval, that is, MRI performed annually, every 2 years, or every 3 years. While this was intended for MRI, the principles are the same for ultrasound. Our preliminary findings were unsettling in that our MRI-discovered cancers were almost entirely in patients who would later prove not to qualify for MRI based on American Cancer Society guidelines introduced in 2007.
  4. Cummings SR, Tice JA, Bauer S, et al. Prevention of breast cancer in postmenopausal women: approaches to estimating and reducing risk. J Natl Cancer Inst 2009; 101: 384-398.
  5. Berg WA, Zhang Z, Lehrer D, et al. for the ACRIN 6666 Investigators. Detection of breast cancer with addition of annual screening ultrasound or a single screening MRI to mammography in women with elevated cancer risk. JAMA 2012; 307:1394-1404.
  6. Dr. Berg joined with JoAnn Pushkin and Cindy Henke-Sarmento in forming DENSE, their educational website called DenseBreast-info.org.

Nothing Like a False Dichotomy to Promote Truth

I can’t recall the last time I read a pro-screening article in one of the major journals. It’s simply not cool. In contrast, the anti-screening militia is using every weapon imaginable in order to re-program the medical community and the American public to believe that “less is more” when it comes to the early detection of breast cancer. But no matter how fashionable it is to be an anti-screening iconoclast, or a “personalized medicine” advocate who favors less screening through miraculous prescience as to who is truly headed for breast cancer, the fact remains that less screening means more lives lost to breast cancer.

Recently, an editorial was published in JAMA (2016; 315:977-978), one of the Big Four journals, entitled “A Public Health Framework for Screening Mammography: Evidence-based vs. Politically Mandated Care,” written by a public health aficionado (Kenneth W. Lin, MD, MPH) and a lawyer (Lawrence O. Gostin, JD). The punchline seems straightforward enough – politics should play no role in medical guidelines. But there’s more than a punchline here.

The authors can’t say enough to extol the virtues of the U.S. Preventive Services Task Force and their evidence-based conclusion that we should screen with mammography every 2 years (rather than annually) starting at age 50 (rather than 40). That is, screening women in their 40s results in more harm than benefit. As I’ve stated on many occasions, evidence-based medicine can give you objective measures of benefits and harms, but that final step – weighting harms versus benefits – is 100% subjective. You can’t put “unnecessary biopsies” on one end of the teeter-totter and “lives saved” on the other end, using any semblance of science. That doesn’t stop evidence-based crusaders from doing so, however. In fact, it doesn’t seem to occur to them that the last step is totally subjective – instead, they speak of their scientific purity from beginning to end.

Authors Lin and Gostin simply gush about the high-quality evidence used in the Task Force process, then use the age-old false-dichotomy approach to point out the evils of political interference. It’s as though there’s no valid position that opposes the Task Force and its recommendations that is not politically motivated. In fact, quite a few physicians believe the Task Force is way out of line, and it has nothing to do with politics. Indeed, we can barely be civil about it (based on scientific evidence, by the way), and it has paradoxically hurt the cause of screening. Witness the total frustration of Dr. Dan Kopans, a pioneering mammography expert, who called the 2009 Task Force “idiots,” whereupon he was promptly blacklisted. Oh, you didn’t know that there are blacklists in medicine? Dr. Kopans can’t get a word published in mainstream journals anymore. He has been throttled to “preaching to the choir” in the friendly radiology journals where the readers agree that the Task Force are “idiots.”

Many of us believe that the Task Force is politically motivated to begin with, making the false dichotomy in this JAMA article even that much more bizarre – both sides are products of politics. I agree that we need to keep politics out of medicine, and a good starting place would be to dissolve the U.S. Preventive Services Task Force. Or, at least modify their structure so that they have to include members with expertise in the areas being governed. A committee of bean counters has no place telling the farmers how to grow those beans.

The epidemiologists and public health care aficionados have hijacked many areas of medicine, including breast cancer screening. They have no room for the experts. Experts are, by definition, motivated entirely by financial gain. To the Task Force, it is inconceivable that someone examined the evidence and chose a specialty based on that evidence. No, instead, doctors were once starry-eyed innocents until they picked a specialty whereupon they morphed into greedy, biased individuals who can no longer weigh evidence objectively. “You can’t have providers establishing policy!” is the battle cry of the hijackers who now control preventive medicine.

It’s not quick or simple to expose the shaky foundation of the Task Force guidelines. My favorite point of argument is that the 2009 guidelines – not screening during the decade of the 40s – were reversed from 2002 based on so-called new evidence. But the 2002 group who recommended screening in the 40s calculated a 15% relative reduction in mortality, which was completely unchanged when re-evaluated in 2009. Not a single percentage point difference. The 2009 group (the Task Force rotates members frequently) and their “new evidence” found nothing new with regard to benefit. What did change then? Well, harms. That’s right. In spite of better mammography, fewer call-backs, easier biopsies, and better pathology, the Task Force used new modeling methods to calculate greater harms than had been considered in the past. So, the benefits of mammography as performed 30-40 years ago in the historical trials were balanced (subjectively) against modern and mysterious harms, such as permanent psychological damage after a benign biopsy. Benefits minimized, harms inflated. It was an easy formula.

The more you know about this travesty of common sense, the more disturbing it is. And to make matters even more disgusting, the new 3-D mammography does exactly what mammography critics have been asking for: 1) it lowers the callback rate, and 2) it raises the detection rate. What does that do for that delicate balance of harms-to-benefit ratio? What does it do to the teeter-totter? Harms are lower, benefits are greater, so you’d think this 3-D development would be welcomed with open arms.

Well, the Task Force has already spoken in 2015 – there’s not enough evidence to rule on 3-D tomosynthesis mammography. Okay, then when will there be enough evidence? Answer: never. Why? Because the Task Force considers any breast imaging modality used in screening to be “unproven” unless it has been shown to lower mortality in a prospective, randomized trial. And, since there will NEVER be prospective, randomized trials for each and every new development in breast imaging, the Task Force is sitting in the catbird seat with their “Insufficient Evidence” ranking. So, they will always be playing catch-up, rendering their opinions on obsolete technology.

Is everyone who is against the Task Force simply motivated by politics? Hardly. There is a legitimate and vehement argument against the Task Force that isn’t about emotion, politics, or gamesmanship. It’s about a deep and abiding belief – held by those physicians who actually practice breast cancer screening – that the Task Force is dead wrong, and that their influence is going to result in thousands of breast cancer deaths each year. In my abiding opposition to the Task Force, I decided to write down all my objections – the result was a 400-page book, due out in late 2016 or 2017 from McFarland Publishing (Jefferson, NC). The working title is: Mammography and Early Breast Cancer Detection: How Screening Saves Lives. If you read that book, you won’t be calling the organized opposition to the Task Force “political.” Instead, you will appreciate that it’s the Task Force that’s the “political animal in the room.”

CLINICAL TRIALS – the Good, the Bad, and the Semi-Ugly

We all do it. And, we’ll keep doing it. What is “it?” Judging from the title above, you’d think I’m talking about clinical trials for medical research, but that’s not exactly what I’m getting ready to examine. “It” is the bragging about clinical trials. Not the bragging about the number and scope of available trials where I’m as guilty as anyone. Rather, I’m talking about the new trend of boasting about a high percentage of one’s patients entering clinical trials, be it an individual physician or an institution.

Somewhere along the line, the public came to equate clinical trials with “better care,” “more advanced care,” or “cutting edge care.” And from there, it was just a hop, skip and jump to design a marketing strategy based on that misconception. So, my first point is this – a physician can provide top quality, cutting edge patient care without participating in a single clinical trial! That said, all improvements in medical care are 100% dependent on clinical trials, so it is critical that we have these studies in place. And, it is critical to enroll patients. It is so critical, in fact, that the Commission on Cancer, the organization that accredits hospital-based cancer programs, sets a mandatory minimum for enrollment in clinical trials (lest we get lazy about it).

In spite of public perception, though, clinical trials are not to be confused with “better care.” In fact, the volunteers randomized to the study group are at risk that something unforeseen could go wrong, prompting a too-late wish to have been assigned to the safer placebo group. In the earliest days of Herceptin™ research, access to the clinical trials was difficult. It helped to be living in Southern California where the drug was developed. Nevertheless, I had one newly diagnosed breast cancer patient in her 30s who wanted to be part of the clinical trials that, in the end, proved Herceptin™ to be a miracle drug for many. But it wasn’t so miraculous for my patient who traveled to California to enroll in the trial where she was given Herceptin™. She proved to be in that small group of patients where researchers discovered the potential of cardiac damage, especially when Herceptin™ was combined with Adriamycin™. She did not make it back to Oklahoma alive; she died of the cardiac damage from the treatment regimen in her clinical trial. Her pioneering spirit is a great benefit to those women today for whom meticulous care is taken to avoid cardiac injury, and it can be argued that lives are saved accordingly. Yet, I still recall her unbridled enthusiasm to be part of a clinical trial where the experimental treatment ended her life, not the cancer.

The point is this — clinical trials are research trials, and some people are harmed in the study group, a fact often lost in the PR. “We turn more patients into guinea pigs than any other facility in the state” doesn’t quite have the same ring to it as what we hear on TV, does it?

Granted, in oncology research, terminal patients are given access to new treatments that are not otherwise available to others in the same boat. And in Phase 1 (is it safe?) and Phase 2 trials (is it effective?), a volunteer can usually receive the proposed new treatment rather than placebo. But in Phase 3 trials, where the new agent is further along in development, a volunteer is less likely to receive the experimental drug, given the mandate in late stage trials for a control group that receives the “standard of care.”

Yet, 50% of late stage trials fail, and 75% of early stage trials fail. Failure doesn’t necessarily mean “harm” was done; rather, it’s usually a simple lack of benefit. From the patient’s standpoint, that’s not a very attractive selling point. But from the researcher’s standpoint, it’s all good. Even a failed trial offers information. But what the public may not appreciate is that there are perks for the researchers, whether or not the trial is successful.

First and foremost, there are dollars to be made. Researchers earn salary support and institutional income with each patient enrolled. Most academic centers are highly dependent on this income. The more patients who enroll, the better. Pharmaceutical-sponsored research can be very lucrative, whether the trial is successful or not. Many years ago, I experienced an encounter I call my “The Graduate” moment, when a colleague was critical of my chosen line of research (it was non-income-producing), and told me to remember just one word – “Pharmaceuticals.” For Dustin Hoffman, it was “plastics,” but close enough. In a word, I was being told to consider switching my research focus to something else wherein a drug company could be pulled into play for the benefit of the university’s pocketbook.

Even if a researcher’s heart is not stirred by profit or job security, most scientists are happy with recognition. Thus, being the “top recruiter” to a clinical trial can bring such honors as “first author” on a paper, or a presentation at a meeting, or any of the other recognitions that define success in the academic setting.

On rare occasion, the desire for fame (with or without fortune) can be so strong that patient records are falsified in order to pump up the numbers recruited to clinical trials. One of the more famous examples occurred when, in the early 1990s, Dr. Roger Poisson at Saint-Luc Hospital in Montreal falsified data so that more women were admitted to NSABP clinical trials, most notably the B-06 trial that was the landmark study proving the equivalency of lumpectomy and mastectomy. Although falsified records were only documented in relatively few cases, his entire contribution of patients to the B-06 had to be tossed out, and all outcomes re-calculated. Because Dr. Poisson had been a top recruiter (personally adding 354 patients to the 2,163 in the trial), it threw the entire conclusion of the B-06 into doubt, prompting a nationwide panic about the safety of lumpectomy. And when NSABP head Dr. Bernard Fisher did not jump through the hoops fast enough to suit the federal government, which had helped sponsor the research, he was stripped of his position by the feds. Although Dr. Fisher was later exonerated, the scandal may have contributed to the fact that he never won the Nobel Prize (not yet anyway; he’s still alive in his 90s, while Dr. Poisson passed in 2013). Dr. Fisher has won every other award medicine has to offer, but the Nobel committee can be pretty stuffy when it comes to scandals.

But such extremes aside, there’s something semi-creepy about creating hierarchies based on enrollment in clinical trials wherein the researchers are guaranteed benefit, but the patients are not.

When an institution boasts 90% enrollment of its patients in clinical trials, hopefully it implies that the facility is participating in a large number of available trials so that there’s “something for everyone,” and that’s fine and dandy. But there’s another way to get 90% enrolled in clinical trials, even if only a few trials are open. Informed consent can be cast with an optimistic spin, such that the patient, at a highly vulnerable point in his or her life, may acquiesce to the pressure in order to please the doctor, a phenomenon well-documented. Here, the susceptible patient coupled to a convincing sales pitch does the trick.

The “clinical trial” has become so sacred that to proffer even a weak word of caution might be considered heresy. But remember, I stated at the beginning, “We all do it.” My facility brags about clinical trials just like everyone else. I suppose it’s better than bragging about nebulous qualities, like “the best doctors with the highest cure rates.” But make no mistake — boasting about clinical trials has an undeniable self-serving feature. The majority of researchers and basic scientists have their hearts in the right place, motivated independently of the perks from clinical trials. Still, there’s a used car salesman component here – you, the consumer, may or may not get a good deal, or you may even get a lemon, but you can rest assured the dealer and the dealership will not lose.

Double Trouble from the Task Force

In April 2015, we were treated to the “new” breast cancer screening guidelines from the U.S. Preventive Services Task Force. We learned back then that there would be no major changes from the 2009 Task Force, demonstrating remarkable consistency. Because the 16-member Task Force rotates its volunteers, there is a changing of the guard each time new guidelines are announced. In the past, this has translated to waffling on the screening recommendations, with each new Task Force changing what the last group had done, beginning in 1989. This time, not so much. Maybe the one physician who served as liaison to the new group made the difference.

The Big Two guidelines stayed the same: 1) Start screening at 50 (no apology rendered to the 25% of eventual breast cancer victims under age 50 who are thus excluded from screening), and 2) Screen every 2 years rather than annually.

Other controversies were settled with an “I” for “insufficient evidence.” Given that the Task Force insists on prospective, randomized trials with a proven mortality reduction before they can accept what all other rational thinkers accepted long ago, the Task Force will always be playing catch up. For instance, in 2009, the Task Force issued an “I” for digital mammography. In spite of this, essentially every breast center in the U.S. switched from the older film screen technique to digital, so in 2015, the Task Force simply deleted this technology from their list upon which to pass judgment. They were so late to the party that it would have been an embarrassment to admit that there was STILL no evidence to support digital.

They replaced the digital issue with an “I” for tomosynthesis, even after admitting the early data on tomo (3-D mammography) is exactly what critics have been demanding of mammography – better detection and fewer call-backs. That said, by the time the Task Force meets again on this issue, every breast center in the U.S. will have already replaced their 2-D units with 3-D. Tomosynthesis is the greatest single advance made in mammography technology since its introduction. The evidence is clear to those of us who don’t demand a prospective, randomized trial for each and every step we take.

As usual (since 2009 at least), the Task Force recommendations prompted a media storm, with many believing that the Task Force is the “official” policy even though few had even heard of them prior to 2009. In a way, they are “official.” The Affordable Care Act uses Task Force guidelines, so who cares if every other organization in the U.S. recommends screening earlier than 50? The Task Force “C” recommendation for (not) screening at 40-49 will mean no insurance coverage through ObamaCare. After the disastrous 2009 introduction of this “less is more” recommendation for screening 40-49, the Task Force softened the definition of “C” such that many thought they had reversed their position. They did not. The wording is kinder and gentler, but a C is a C is a C. Screening in the decade of the 40s is still a “C.” And C’s don’t count when it comes to the Affordable Care Act.

After the 2009 media brouhaha died down, we continued screening as we had before, based on the American Cancer Society guidelines, only to have that rug halfway pulled from beneath us as well when the ACS changed the starting age to 45 (covered in another blogatorial).

Then, in January 2016, it started all over. Double trouble from the Task Force. The Task Force announced their new guidelines – AGAIN – and we went through the same media storm as we’d done 9 months earlier. What had happened during this gestational period?

Pretty much nothing. As it turns out, the April 2015 announcement was only their “draft,” whereupon they invited public comment (getting more than they bargained for). Of course, this is largely for show, because they didn’t change a thing, nor would you expect them to after their expert analysis of numbers. In fact, it gave them the opportunity to field the criticisms in advance, and put their answers in writing, as part of the official 2016 guidelines. Good strategy – allow the critics to expose their best arguments ahead of time. Nothing like having the last word.

Yet, the sad truth is that the “less is more” approach may be applicable to many aspects of medicine and breast cancer in particular, but it translates to more breast cancer deaths when applied to screening. The benefit of screening, as calculated by mortality reductions, did not change for the Task Force from 2002 to 2009. What changed was a new way to calculate harms, such that the balance tipped away from screening.

Many of us believe, however, that the harms have been grossly exaggerated, while the benefits understated, and the result is going to be more breast cancer deaths. For women currently in their 30s, opting for Task Force guidelines, it has been calculated that 2,000 more women will die each year of breast cancer (currently at 40,000 per year). Drop in the bucket? Not if you’re one of the 2,000. Furthermore, women currently in their 30s are facing 50 years of breast cancer risk, on the average, so when you multiply 2,000 per year X 50 – ugh – 100,000 more breast cancer deaths over the next 50 years? Inconceivable?

Not really. The calculations were made by two prominent radiologists who have expertise in screening (RE Hendrick and MA Helvie in the February 2011 edition of the American Journal of Roentgenology), using the same database that the Task Force used. The “100,000 more deaths” tells us what would occur if all women were compliant with the standard “start at 40, then every year” guidelines, but then backed off according to Task Force recommendations. So, rest easy. If we look at a more practical number than 100% compliance, taking into account that many women don’t bother with mammograms at all, Drs. Hendrick and Helvie calculated “only” 64,889 more deaths over the next 50 years. Ah, that’s better. As Jimmy Fallon might say, “Thank you, Task Force, for reducing the number of call-backs, unnecessary biopsies, overtreatment of breast cancer, and for saving the system so much money…now, if we could figure out why more women are dying of breast cancer, we’d be even happier.”

Putting the Elderly Out to Pasture – at age 70

On December 17, 2015, JAMA Surgery published an opinion piece written jointly by a surgical oncologist (Dr. Ismail Jatoi) and the chief architect of the Canadian Screening Trials (Dr. Anthony B. Miller). To add gravity and authority to their cause, they invoke the timeworn admonition in their title (in Latin, of course) – Primum Non Nocere – “Above all, do no harm.” The irony is too much – if we follow their advice to quit screening at 70, then approximately 1,000 additional breast cancer deaths will occur each year, give or take a few. This is a conservative estimate, by the way, given that mammographic compliance isn’t that great to begin with. Primum Non Nocere?

This overworked “do no harm” is mostly used today when someone wants to preface their opinion that everyone should stop doing something the author believes is harmful. It’s a sure bet to strike a chord because there’s potential harm in everything we do in medicine. The phrase is attributed to the Hippocratic Oath, which actually stated, “to abstain from doing harm.” To follow this admonition literally, of course, one would have to cease practicing medicine.

So, what was Primum Non Nocere intended to mean? Originally, it stated that if you weren’t sure whether or not a therapeutic intervention would help your patient, then it’s better to do nothing than risk harm. Of course, when the Hippocratic Oath was widely adopted, doing harm was the norm. Very few medical practices of the day actually benefitted the patient, so it’s best we forget about the original meaning. Today, the phrase is meant to imply a weighing of risks vs. benefits. If the risks (harms) outweigh the benefits, “don’t do it.”

And this is where the controversy in mammography screening lies – harms vs. benefits – with anti-screening forces exaggerating harms to the point of publishing studies of permanent psychological damage after a benign biopsy. At the same time, benefits are minimized, such as the insistence on quoting mortality reductions based on “invitation to screen” rather than the benefit as measured in those who actually screen.

The authors in this opinion piece seem to have no restrictions on what they’re willing to call a “harm.” For instance, “lead time” is listed as a harm, complete with a reference. Lead time is NOT a harm, but a built-in bias. For the sake of clarity, “lead time” is one of the primary epidemiologic biases that can be controlled through prospective, randomized trials with mortality as the endpoint. Unchecked, lead time may exaggerate the benefit of screening, but that is not a direct harm to the patient. To list it as such is a remarkable display of anti-screening bias, the irony being that it is misusing a term involving bias.

Only one of the epidemiologic biases can actually result in harm, and this is “overdiagnosis bias,” which can result in overtreatment. As for the other biases – lead time bias, length bias, selection bias, etc. – they don’t harm anyone. Furthermore, these biases have been held in check with meta-analyses of all prospective, randomized trials of mammographic screening where there is a proven mortality reduction. It is bizarre that “lead time” would be newly tagged as a harm. What’s next? Mammographic screening contributes to global warming?

Dr. Anthony Miller can be forgiven for his opinions. After all, he crossed over to the Dark Side of the Force decades ago, and can only interpret the world of screening in one way – a good clinical exam is good enough. He has spent his entire career promoting clinical exam over mammography, in spite of the fact that there’s not a shred of evidence that lives are saved through physical exam alone. His own CNBSS studies are used to denigrate mammography, even though quality in the Canadian trial was so outrageously bad that only 32% of the breast cancers diagnosed in the mammography limb were actually discovered by mammography. No wonder clinical exam looks good to Dr. Miller – his mammograms missed most of the cancers! A typical screening trial will flip-flop the numbers seen in Canada, that is, 68% of the cancers will be found on mammography while 32% will be felt on exam. One can trace Dr. Miller’s writings on the superiority of clinical exam over mammography back over 4 decades, even before he knew the results from his own trial. Absolutely nothing is going to alter his opinion.

The authors have loaded this opinion piece with their own personal bias. For instance, the claim is made that the UK Age Trial “failed to show a significant benefit from screening.” The reference for this statement is an article Dr. Jatoi wrote in 2011. Yet, in the latest 2015 update from that trial, widely accessible through Lancet Oncology (2015; 16:1123-1132), the data show something else entirely. For background, the UK Age Trial was designed to study screening women in their 40s. (No other age groups, just 40-49, so I’m not sure why it was even included in this article about women 70 and older.) The trial had a far better design than the Canadian trial, and vastly superior mammography. When Dr. Jatoi wrote his 2011 article noting “no benefit,” there was already a clear trend toward benefit, even though it had not yet reached statistical significance. Then, in September 2015, the final results were announced:

For the first time ever for women in their 40s, a statistically significant 25% reduction in breast cancer mortality was demonstrated for the first 10 years after diagnosis for those in the mammography limb. Beyond 10 years, this difference gradually declined to 12% (and barely lost its statistical significance). It was totally predictable that the anti-screeners would call the trial a failure. But what really happened? Did mammography save lives?

In fact, women in the control group (no mammography), after a decade of no screening in the Age Trial, began their routine mammographic screening as done in the UK, starting at 50. Thus, the “no mammography” group became a “mammography” group. There was zero difference between the two groups from age 50 on, diluting the earlier benefit. No surprise…the screening phase of the study is over by then, and both groups are getting screened. This is a problem in long-term follow-up of all screening studies, in that controls screen anyway, and women in the screening group may stop screening. Oh, yes, another interesting feature of the Age Trial – there was no measurable overdiagnosis, in spite of the never-ending emphasis and exaggeration of this bias.

In another controversial statement, the authors make the claim that, with improved systemic therapies, we can back off on screening – “with modern adjuvant systemic therapy, a less sensitive screening modality such as CBE (clinical breast exam) may now prove as effective as mammography.” It is true that if the day comes when systemic therapy cures all breast cancers, we will no longer need screening. But it is disastrously premature to state that this day has come already, or even that we are easing into an era where we can back off on screening. The 40,000 women who died of breast cancer this year would not be nearly as enthusiastic about those systemic therapies as these authors seem to be.

I don’t have time or space here to go over the Harvard study that showed, overwhelmingly, that most breast cancer deaths occur in unscreened women. Systemic therapies and screening are complementary, and until that 40,000 number is dramatically reduced, we cannot back off on either screening or systemic therapies. In fact, we need to be doing more in both categories. One does not preclude the other, nor do they need to operate in an inverse proportion. I could easily claim that if we screened all women with breast MRI, we could back away from systemic therapies. And though I could defend that position to a degree, it would be every bit as misleading as the inverse, claiming that systemic therapies allow us to back off on screening.

Since there is very little data from the prospective, randomized trials on the mortality reduction for screening women in their 70s, what would rational thought tell us? If we divide screening into the decades, the benefit of mammography gains momentum over time. It takes fewer and fewer screens to save a life as women age. Screening in the 50s is more effective than in the 40s, while screening in the 60s is better than in the 50s (by quite a bit). Would any rational person on the planet truly believe that this accelerating benefit would come to an abrupt stop at age 70? In fact, barring co-morbidities, it is likely that the effectiveness is every bit as good as, if not better than, it is for women in their 60s. Yet, we’re told Primum Non Nocere, and to drop mammography at its peak performance level; instead, we are to adopt clinical exam, for which there is no evidence anywhere that lives can be saved with exam alone.

A full 25% of eventual breast cancer victims are over the age of 70. We’re already being told by some to stop screening under age 50, which is another 25%. So, if the anti-screeners get their way, screening only between 50 and 70, we will be excluding 50% of the eventual breast cancer victims from the potential of early detection. Compare this to the current 5% of eventual victims excluded by virtue of being under age 40 at the time of diagnosis. I submit that the leap from 5% to 50% will be deadly for many women, several thousand a year, whereupon the term Primum Non Nocere takes on a new twist.

It only takes a few pages of the written word to sling mud. In contrast, there are so many misunderstandings, false claims, misrepresentations of available data, and flat-out errors, that it takes a book to wipe the mud away. That’s why I wrote one – the working title is Breast Cancer Early Detection: Past, Present, Prologue (McFarland & Company, Inc., Publishers), with release estimated for late 2016.

Have No Fear: George Orwell is Here

One of the bureaucratic divisions in Orwell’s classic 1984 is the Ministry of Truth which, of course, is anything but. Instead, it drowns the public in propaganda.

Today, we’re experiencing an anti-screening campaign of Orwellian proportions. And it’s being led by physicians, no less, who claim to be rising above emotion, and dealing only in scientific truth. Indeed, they have appointed themselves as the Ministry of Truth.

They are supported in their efforts by well-meaning journalists who, of course, have no interest in “dog bites man.” “Mammography saves lives” is stale and boring. But pull a switcheroo, and make that “mammography doesn’t save lives,” and you’ve got yourself a story.

Witness the Oct. 12, 2015 issue of TIME magazine where the cover states: “What If I Decide To Just Do Nothing?” The article itself brings up many valid and controversial points, but let me direct you to the sidebars, which I offer here verbatim: 1 in 800 – The chances of a woman in the U.S. getting diagnosed with invasive breast cancer. Or, 1 in 4,566 – The chances that a woman in the U.S. will die from breast cancer.

Don’t you feel better now? After all, you’ve been frightened out of your wits for many years with scare tactics that claimed, “1 in 8 women will be diagnosed with breast cancer.” Isn’t it refreshing to know the number is a mere “1 in 800?”

The cardinal rule in quoting risks happens to be: don’t ever toss a risk into the ring unless you attach it to “time.” Risks expressed as percentages without a time frame are, by definition, reckless and misleading. Critics of breast cancer screening love to point out how we’ve been terrifying the public for years with overstated risks. “One in 8” is the risk of developing breast cancer over a lifetime calculated through age 85 or 90, yet this has not always been made clear. So, to undo the damage, we apparently have a respectable news magazine doing the opposite – understating risk without proper explanation as to length of time.

How does “1 in 800” make sense? In fact, it is true if you’re talking about the risk of being diagnosed with breast cancer over the course of one year. Even then, “1 in 800” is under-kill, as this would only apply to a woman in her 40s. So, here’s the truth: There’s a “1 in 800” chance of being diagnosed with breast cancer during the next 12 months, if you’re a woman in her 40s. TIME’s sidebar is a shameful misrepresentation of risk. The same is true for their “1 in 4,566,” which is only accurate if talking about one year. In fact, the “lifetime risk” for dying of breast cancer is “1 in 35.”
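
For readers who want to see why the time frame matters so much, here is a minimal sketch in Python. It is my own illustration, not anything taken from TIME or from the risk tables, and it rests on a loudly stated simplification: the annual risk is treated as constant and independent from year to year, which is not true in real life, since annual risk changes with age. The function names are hypothetical.

```python
def cumulative_risk(annual_risk: float, years: int) -> float:
    """Chance of at least one diagnosis within `years`.

    ASSUMPTION: the same annual risk applies every year, independently.
    Real breast cancer risk varies with age, so this is only a sketch.
    """
    if not 0.0 <= annual_risk <= 1.0:
        raise ValueError("annual_risk must be between 0 and 1")
    if years < 0:
        raise ValueError("years must be non-negative")
    # Probability of escaping diagnosis every single year, then invert.
    return 1.0 - (1.0 - annual_risk) ** years


def as_one_in(risk: float) -> float:
    """Express a probability as the N in '1 in N'."""
    if risk <= 0.0:
        raise ValueError("risk must be positive")
    return 1.0 / risk


if __name__ == "__main__":
    annual = 1 / 800  # the one-year figure for a woman in her 40s
    for years in (1, 10):
        risk = cumulative_risk(annual, years)
        print(f"{years:>2} year(s): about 1 in {as_one_in(risk):.0f}")
```

Under that simplification, the very same “1 in 800” works out to roughly 1 in 80 once the window is stretched to ten years. This toy calculation will not reproduce the lifetime “1 in 8,” which is built from age-specific rates, but it shows why a risk quoted without its time frame tells you almost nothing.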

Today’s counter-mantra seems clear: We are going to undo the sins of the past, even if it means misleading patients in the opposite direction. After all, “1 in 8” was also heavily promoted without specifying the duration of time to which this applied. So, we’re simply getting even.

The difference, however, is that lives will be lost by listening to the new Ministry of Truth.