New ACS Screening Guidelines – Are we falling into line…or for a line?

For many years, we have relied on the American Cancer Society (ACS) to uphold the familiar screening mammography guidelines – that is, “annual mammograms starting at age 40, continuing as long as you are healthy” – last updated in 2003, but with no real change since 1997. From 1983 to 1997, the 40-49 recommendation had been “every 1-2 years.”


The original recommendation from the ACS, by the way, was in 1976, when data were really slim – back then, it was “annual mammograms for women after 50, but for women in their 40s, there should be a family history of breast cancer in a first-degree relative.” The “baseline mammogram at 35-39” was adopted in 1980, but dropped in 1992.


In 2009, the U.S. Preventive Services Task Force recommended major cutbacks in screening – start at 50, and screen every 2 years thereafter. This launched a new battle between pro-screeners and screening minimalists. As in the Mammography Civil War of the 1990s, neither side was willing to budge.


On October 20, 2015, the American Cancer Society caved, ostensibly at least, to the anti-screening tsunami, announcing its new guidelines in JAMA – in brief, “begin at age 45, screen annually to 54, then switch to every 2 years (biennial).”


At least that’s the “all meat no potatoes” version of the new guidelines. Cleverly embedded in the new guidelines are the old guidelines, but you probably won’t hear about that. In fact, the ACS stated that women should still have the option to begin at 40, and that they should have the option to continue annually. These options are not buried in fine print – no, in fact, they are in the Abstract along with the same level of “Qualified Recommendation” as all but one of the other guidelines. The only recommendation listed as “Strong” was the 45 starting age. Switching to biennial screening was “Qualified”… the recommendation to screen as long as overall health is good and life expectancy of 10 or more years was “Qualified”… and no need for clinical exam in women without risk factors was “Qualified.” The old recommendations are listed right there as “Qualified” with all the others, but the old “option” was suppressed within minutes of the ACS announcement. And since most rely on the media for their medical updates, well, precious few will be able to quote the true ACS guidelines.


Why will the old guidelines, now “optional,” be suppressed? Because the War on Cancer is coming to an end. Today, there is a powerful contingent that has declared War on the Harms Associated with Cancer Detection and Treatment, no matter how many lives may be lost in the process. The politically correct focus is on doing no harm, an impossibility in medicine, by the way. More accurately, it’s about harm:benefit ratios, a completely subjective judgment call, balancing apples against oranges. But the screening minimalists do everything they can to attach numbers and justifications to their position. In an accompanying editorial to the new guidelines, Drs. Keating and Pace, who are deeply engaged in informing patients of the harms of screening, don’t even acknowledge the fact that the old guidelines are left as an “option” for women who want to continue what they have done for many years. Instead, they dwell on the harms, their favorite topic, as evidenced by their “informed consent,” which results in a statistically significant “improvement” in women opting to do less screening.


The American Cancer Society, for the first time ever, broke the data down into 5-year increments, rather than the usual 10-year approach. Highly sophisticated analyses, transparency and independent review, all the wonderful things that go into formulating guidelines are there, like never before. All of this was in line with the 2011 Institute of Medicine guidelines on how to create guidelines. And to help themselves, the ACS drew from several other guideline resources including the “GRADE” terminology to distinguish between “strong” and “qualified” recommendations. Guidelines on how to create Guidelines, then Guidelines on how to communicate these Guidelines to the People intended to benefit – this is “evidence-based medicine” in all its glory. But when it comes to the end – harms vs. benefits – it’s a total judgment call whether to screen or not, at what ages, and at what intervals.


On the bright side, the ACS included observational studies, something that makes purists like the Task Force sick to their stomachs. Benefits are greater in these studies because (pick one): 1) women actually get mammograms, rather than merely being “invited to screen,” and those mammograms are performed using modern technology, or 2) the extra benefit is due to lead time bias, length time bias, overdiagnosis, and selection bias.


In contrast to the Task Force, which didn’t utilize expert opinion, the ACS asked 22 experts and 26 relevant organizations to review the guidelines in advance. When all was said and done, with the world literature review performed by the Duke University Evidence Synthesis Group (a separate article in JAMA), the ACS recommendations appear to be bullet-proof.


In reviewing how the ACS arrived at age 45, they addressed the issue from multiple angles, pointing out that incidence and mortality from breast cancer during ages 45-49 are actually closer to the 50-54 group than to the 40-44 group (barely). Frankly, the whole thing smacks of a predetermined compromise, thus the early decision to use 5-year blocks of time. “Let’s shoot for peace and harmony here. We must stop the madness that prevails today in conflicting recommendations. We’ll address the problem in 5-year intervals, allowing us to justify the compromise age of 45. The rest doesn’t matter so much, since the greatest controversy is the starting age.” Or, so I imagine.


The lead author for the ACS, Dr. Kevin Oeffinger, claims that this is a great victory for personalized screening – “The evidence simply no longer supports one-size-fits-all,” he is quoted as saying in TIME magazine. A terribly proud moment, considering about 1,000 additional women will die each year if these guidelines are followed. And what a great month to make the announcement – Breast Cancer Awareness Month. When your 44-year-old friend is diagnosed with Stage IIB breast cancer, found by self-exam (since the ACS no longer endorses clinical exam), you have now been made aware that you can tell her: “Oh, but think of all the heartache of unnecessary biopsies you’ve spared other women!”


We already disenfranchise 5% of eventual breast cancer victims by offering nothing to women under 40 to help with early detection, unless they are at very high risk (and most who develop breast cancer are not at high risk). By moving to 45, another 7% will be disenfranchised, bringing our total to 12%. And if the Task Force were to get their way, starting at 50, the total number of eventual breast cancer victims left out in the cold without screening will be an astounding 23%.


Dr. Stephen Feig, pioneering breast radiologist, has warned that “personalized screening” is a euphemism for “restricted screening.” Where “personalized screening” can be beneficial is when risk factors and breast density become the foundation to do “something more.” To use it to “do less” is deceptive. Call it what it is – limited and restrictive screening – there’s nothing beneficial in it for the average-risk patient, which, by the way, is the population where most breast cancers occur.


Admittedly, the ACS took a leadership role in 2007 when they introduced high-risk screening guidelines using breast MRI, starting at age 30. This is “personalized screening” in the good sense. They will be releasing new guidelines for these women in the near future. My hope is that they don’t “personalize” high-risk screening to the point that it’s pointless.


As a final point, we don’t need “more screening” as much as we need “better screening,” and this is happening already. 3-D tomosynthesis mammography is in the first stage of implementation, and the numbers coming in are remarkably good, with improvements in both sensitivity and specificity (fewer false-negatives and fewer false-positives). So, the “informed consents” as proposed by screening minimalists are already “misinformed consents.” And, with the addition of whole breast ultrasound and breast MRI applied to a broader population, mortality reductions will be far superior to what is accomplished today. Too bad that the anti-screeners gained so much momentum at precisely the point in time when screening is getting an upgrade, or perhaps being revolutionized.


One thing is for sure: not a single additional life will be saved by switching to the new American Cancer Society policy of “personalized screening,” which is designed to spare women anxiety from false-positives and theoretical “overdiagnoses.” Oh yes, and to save money. We’ve finally “fallen into line” with other countries around the world that have committed to screening mammography. And what a line it is!


Informed or Misinformed Consent?

In the September 8, 2015 edition of JAMA, an editorial was offered by Harald Schmidt, PhD, MA, calling into question “incentives” for mammographic screening. Writing from the Center for Health Incentives and Behavioral Economics, Perelman School of Medicine, University of Pennsylvania, Dr. Schmidt raises an interesting question – should we be giving away free T-shirts or movie tickets to reward patients for having their screening mammograms?


I share some discomfort about certain promotional activities. Over the years, I have been repeatedly approached to offer a “free mammogram” at health fairs and silent auctions for fund-raising events. Others in the community were doing it, so “why don’t you?” Well, a mammogram can generate a false-positive, with a requisite biopsy that can backfire if a borderline lesion is found that ends up being called “cancer,” and the next thing you know, bilateral mastectomy or radiation therapy is being performed. That’s why. I’ve never been comfortable with such flippancy when it comes to a screening mammogram. It is a two-edged sword.


But Dr. Schmidt takes his position further – calling into question media-awareness campaigns, e-mail or phone call reminders, and the concept of default mammogram orders with electronic medical records. In his argument, Dr. Schmidt endorses the recommendations of the U.S. Preventive Services Task Force to the degree of, oh, roughly 100%. And that’s where we part.


Anti-screeners are taking great delight in performing studies that show how women – when “truly informed” about the risks versus the limited benefits of mammography – will opt for less screening, or no screening at all. But what they call a “thorough informed consent,” I call baloney. The benefits of mammography in these consents are based on a technology that doesn’t even exist anymore. ALL the prospective, randomized controlled trials that the Task Force holds in such dear esteem were accomplished with film-screen mammography during an era prior to quality assurance. Only the U.K.’s AGE trial made it into the 1990s with active screening. The other 8 prospective RCTs took place starting in the 1960s, peaking in the ’70s and ’80s. We are two generations past that now, and the only thing the Task Force can do is watch the tail end of progress pass them by as they pronounce “Insufficient Evidence” (as they recently did for 3-D tomosynthesis mammography). Because of their insistence on prospective RCTs (excluding all other evidence), no matter how absurd or antiquated the study, they were never able to endorse digital mammography. The technology came and went (or is leaving, at least, being replaced by 3-D tomosynthesis) while they awaited results from ethereal prospective RCTs that are never going to occur.


The benefits of screening are grossly minimized, often based on “invitation to screen” rather than actual screening, while the harms are greatly exaggerated (including lifetime psychological damage from a benign biopsy). It is demoralizing to read the informed consents that are being laid on unsuspecting women, where benefits are minimized and harms maximized. Dr. Schmidt emphasizes the very important need to make women aware of “overdiagnosis” with mammography. Fine. What number should we use, since the published range is from 0 to 50% in a highly contentious debate? If overdiagnosis is occurring at the same rate we see in prostate cancer, then explain how, at autopsy, over 50% of men will have occult prostate cancer, undiagnosed and subclinical, while only 1% of women will die with an undiagnosed invasive breast cancer.


Grossly exaggerated overdiagnosis rates are just one example of how women are being frightened out of mammography. These are not informed consents – they are propaganda. Not a single life will be saved by this social tsunami, but you can readily calculate the number of deaths as women choose to back away from screening after their “misinformed consents.” An additional 2,000 breast cancer deaths per year is a fairly accurate estimate of what will result if Task Force guidelines are followed rather than those of the American Cancer Society (and all other professional societies that address breast cancer).


Yes, we have a resource allocation problem, and this is why some clinicians are drawn to the concept behind (or membership on) the Task Force. It’s hard to argue with the basic philosophy that the U.S. has been throwing money at diseases for many years without a proven advantage in many instances. Task Force members and their ilk, while claiming to be unbiased judges of quality care, are not immune from human bias. In this case, their bias is such that when it comes to spending money for preventive health measures, “We have to put a lid on it.”


The challenge is making sure that the Task Force is not talking about the lid to your coffin.


Publish or Perish Has Morphed Into the H-Index

If you saw the movie Moneyball and the impact of mega-statistics on major league baseball, then you won’t be surprised that Big Data (and Little Data) has invaded academic medicine as well, not to mention science in general. No longer is that iffy word “judgment” to be utilized to assess competence as distinct from brilliance. Today, it’s done with numbers.

Once, a researcher could get away with racking up large numbers of publications, using all sorts of tricks. The goal was simple, that is, get as many papers in print as possible. But someone caught on that quality ought to count, too.

Now, we have the h-index, another approach to converting performance into numbers that can be used for promotion, to award grants, or simply for bragging rights.

The h-index is an attempt to reward not only the number of published papers, but also how often those papers are cited by others. (Did your mother ever tell you not to base your self-worth on others? Well, if you are a scientist, your mother was wrong.)

In its attempt to include both quality and quantity, the h-index reflects both the number of publications and the number of citations per publication. For example, an h-index of 10 means that among all publications by one author, 10 of these publications have received at least 10 citations each. The h-index has little value when used across different fields. That is, it works only when comparing scientists in the same field. At least, that’s the way it was intended.
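
That definition can be made concrete in a few lines of code. A minimal sketch, using a hypothetical citation record (the counts below are invented for illustration):

```python
def h_index(citations):
    """Largest h such that at least h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # this paper still supports a larger h
        else:
            break
    return h

# Hypothetical author: 9 papers with these citation counts.
print(h_index([25, 18, 12, 10, 10, 8, 7, 3, 1]))  # prints 7
```

Note that, by this definition, a tenth paper with zero citations would leave the number unchanged – quantity alone no longer helps.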

Of course, over the past 10 years since its introduction, it has been studied to the hilt, with statistics generated about the statistics, and at least some evidence that the h-index doesn’t help one whit – that is, that the old-fashioned method of counting papers worked just as well. But for some, it has become a goal unto itself. It certainly has become a science unto itself, with mathematical variations and versions out the wazoo.

The reason I bother to mention it on this web site is the fact that one way a researcher can pump their h-index number is through self-citation. Every time you publish a paper, you quote yourself to no end (my favorite variation here is the innate ability of some to not only quote themselves, but also to do so incorrectly, gradually building a case out of thin air).

Alternatively, you can promote your work directly to the media through various options, create a buzz about your work, then sit back and wait for others to cite your work, given its importance now confirmed.

Another way to pump numbers is to become embroiled in a controversy. This may explain why some experts enjoy generating controversial theories, sometimes oddball theories, just to get a reaction. For our purposes here, nothing will get you cited more frequently than announcing, “Screening mammography does not save lives.”

Then, of course, there are the inevitable rankings. You knew it had to happen. Is my number bigger than yours? Nobel laureate chemist Harry Kroto ranks a lowly 264th on the h-index for chemists, so he’s not too hot on the old h-index idea. Is this the route we really want to go? Is this how promotions and grants should be decided?

In the end, I think our mothers were right. It’s best not to worry about what others think about you. But still, you have to wonder if physicist Jorge E. Hirsch’s mother tried to teach him that lesson. Dr. Hirsch introduced the h-index in 2005 (h = Hirsch). Today, several software programs track your number, re-calculating on a regular basis as citations appear in the scientific literature. Appropriately enough, one of the programs is called Publish or Perish.

Will $2 million Revolutionize Breast Cancer Screening?

The breast cancer screening “industry” in the U.S. is sometimes pinned to a $6-8 billion figure that is supposed to reflect high cost and low gain. That is, few lives saved for such an impressive price tag. And when compared to $6 billion, $2 million seems piddly. But it’s $2 million that has been awarded by the National Cancer Institute in an R01 grant to my collaborators (“inventors”) and me to see if we can revolutionize how we screen for breast cancer.

First, some background. Here sits a breast MRI machine, less than 20 feet from my office. I have to consider, every day, that if we were able to screen the entire population with this device, very few women would ever die of breast cancer. So, while everyone waits for a “breakthrough” to radically alter the treatment of breast cancer, I’m looking at that breakthrough in the past tense. It’s already been done. Yet, many experts either ignore it or criticize it.

And for sure, it’s not the long term answer – the final answer will come when we are able to cure or control metastatic breast cancer 100% of the time. Or, alternatively, when we can immunize against cancer, so that the disease becomes an historical relic. Then, we won’t need to screen. Early diagnosis won’t be required. But that day is not yet on the horizon. For the next 50 years, at least, early diagnosis through screening will be important. And, we could do so much better than the status quo with the technology currently at our disposal. What a shame that it is nearly impossible to use. Strict guidelines and stricter insurance companies make it difficult to identify patients who qualify for MRI screening.

Currently, women qualify only when they are at very high risk for breast cancer. Sounds both obvious and justifiable, but in fact, it’s an inefficient approach piggy-backed onto inefficient baseline screening. First of all, it excludes the vast majority of women who are headed toward breast cancer, the 80% who do not have a family history. Then, under current guidelines, women qualify on the basis of “lifetime risk,” an unwieldy number that declines as you age, given that you are “passing through” your risk. So, young women with risks qualify more easily, but as time goes on, when individuals actually enter the danger zone where incidence peaks, their remaining lifetime risk may not allow them to be screened with MRI.

Risk-based screening is on the tip of everyone’s tongue, but for me, it leaves a bad taste even though my area of expertise is breast cancer risk assessment. Why? Because the difference in cancer yields between very high-risk women vs. normal-risk women is not great enough to warrant completely separate approaches. If you screen 100 women with MRI who have already had a normal mammogram, you will find 1 cancer in a population of “normal risk” women. If you screen 100 women at the very highest level of risk, e.g., BRCA-positive women, you will find 3 cancers. All that work to find 3 cancers instead of one.

My point is this: research should focus on how to identify women that have mammographically-occult breast cancer on the day of the negative mammogram, NOT using the surrogate of breast cancer risk spread out over a lifetime. For 20 years, my only idea to make this happen has been through a low cost screening blood test that would tell you to proceed with MRI if mammograms were negative. And, I am recently encouraged by new developments in this area, and will be writing about it more in the future.

But I’ve always been haunted by another fact, well-known to all breast radiologists, but seldom discussed. Often, when you diagnose breast cancer, you can look back one year earlier and see a subtle change in the density level in the area where cancer has recently been diagnosed. In fact, so many attorneys were taking advantage of this, successfully swaying juries into finding against radiologists, that a group of experts wrote an article on the topic, admitting that 58% of the time, you can detect “something” happening in a zone where cancer will be diagnosed 1-2 years later. Yet, these changes are too minor to hold the radiologist accountable. If breast radiologists called back everyone with such subtle changes for diagnostic work-ups, they would be calling back the majority of patients being screened.

The article was designed to address the absurd standard to which radiologists were being held by the courts, not simple perfection, but prescience beyond perfection. But when I first read the article, my “take home” message was entirely different. What if those women had undergone breast MRI? I suspect nearly 100% would have been positive for cancer. But again, you can’t do MRI on all those with such subtle changes. Or can you?

One year ago, a publication caught my eye in The Breast Journal, where an accomplished computer scientist (Bin Zheng, PhD) and his mentor (Hong Liu, PhD) had been working on computerized image analysis for the detection of subtle differences between one breast and the other, and over time. Using their invention, they reported being able to find women at nearly 10-fold short-term risk for breast cancer based on mammographic density changes. This is not CAD (computer-aided detection), already in current use, where specific lesions are identified by the computer. This is a “second line” computer analysis, an “ultra-CAD” if you must, identifying changes after routine CAD has “signed off” on normal mammograms.

It was a brilliant approach, and my mind raced back to the 58% who have subtle changes in the year(s) prior to diagnosis. This “ultraCAD” would serve the same purpose as a screening blood test, that is, in the efficient selection of patients for MRI, not based on future long-term risk, but based on the high probability of a current malignancy missed by mammography.

And to my surprise, these computer scientists were working out of an Advanced Cancer Imaging Lab located only 30 miles away at my alma mater, the University of Oklahoma, Norman campus, only a short walk to the basketball arena and a short jog to the football field. As always, basic scientists need clinical collaborators in order to lift their inventions from the laboratory into actual practice. I contacted Dr. Zheng, and we ran a quick pilot study that included 30 patients with normal mammograms, 5 of whom actually had cancer discovered on MRI. Dr. Zheng’s system identified 9 of the 30 as “very high short-term risk,” and all 5 of the cancers were included in the 9. Had we used his system to select patients for MRI in the first place, we would have performed only 9 MRIs instead of 30 to find the same 5 cancers. This is efficiency. I won’t take up a lot of space here describing what this means in terms of cancer yields on MRI, but in summary, if it works, 1) the cancer yields will dwarf anything ever accomplished through risk stratification, and 2) it will open up MRI screening to all women, not just the minority who have risk factors.
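
The pilot numbers translate directly into the standard screening metrics. A quick sketch using only the figures reported above – keeping in mind that a 30-patient pilot carries wide margins of error:

```python
# Pilot figures from above: 30 normal mammograms, 5 cancers found on MRI,
# 9 patients flagged as "very high short-term risk," all 5 cancers among the 9.
total, cancers, flagged, cancers_flagged = 30, 5, 9, 5

tp = cancers_flagged           # flagged and cancer: 5
fp = flagged - tp              # flagged, no cancer: 4
fn = cancers - tp              # cancer, not flagged: 0
tn = total - tp - fp - fn      # no cancer, not flagged: 21

sensitivity = tp / (tp + fn)        # 5/5  = 100%
specificity = tn / (tn + fp)        # 21/25 = 84%
yield_per_mri = tp / flagged        # 5/9  ≈ 56% cancer yield per MRI performed
mris_avoided = 1 - flagged / total  # 70% fewer MRIs than imaging everyone

print(f"sensitivity {sensitivity:.0%}, specificity {specificity:.0%}, "
      f"MRI yield {yield_per_mri:.0%}, MRIs avoided {mris_avoided:.0%}")
```

That 56% yield per MRI is what makes the approach look so efficient next to the 3% yield of risk-based selection; the 5-year NCI study is what will tell us whether it holds up.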

The National Cancer Institute seemed to agree that we may be onto something. In July 2015, the NCI awarded us over $2 million for a 5-year study that will involve approximately 10,000 breast images. In keeping with NIH policy, “Research reported in this publication was supported by the National Cancer Institute of the National Institutes of Health under Award Number R01CA197150. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.”

Our grant title is: “Increasing Cancer Detection Yield Using Breast MRI Screening Modality.” Breast MRI can detect more than 90% of breast cancers at an early stage, twice the number detected by mammography in head-to-head comparisons. While molecular biologists whittle away at the secrets locked inside the cancer cell, going for the eventual cure, it’s well past time to take full advantage of the miracle of multi-modality imaging, something invented in the past, yet presently offered only to a select few.

History in the Making…or Naught?

On June 17, 2015, the first patient (perhaps the first anywhere) was enrolled in a “proof of concept” trial that has the potential to alter forever how we screen for breast cancer – by using a blood test. The intent of a screening blood test would be to use it in tandem with mammography. Then, if the mammogram is negative, but the blood test positive, the patient would proceed to a breast MRI, the latter highly sensitive for the detection of breast cancer (capable of finding 2X to 3X the number of breast cancers found on mammograms in head-to-head trials of MRI and mammography).


Currently, screening with breast MRI is limited to high-risk patients only, but there are two problems here. Even at the highest known risks (mutations in BRCA or comparable genes), the MRI yield is only in the 3% range. That is, 100 MRIs to find 3 cancers. This is barely cost-effective, so risk-based screening is already at its limits of practicality. The other problem is that this approach excludes the vast majority of women who are headed for breast cancer, given that most who are eventually diagnosed have no major risk factors.


Yet, even so-called “tumor markers” are unreliable when it comes to diagnosing recurrent cancer, so how can we be discussing a blood test that detects breast cancer at its earliest stages? The answer comes through new technologies, one of which allows us to detect tiny quantities of autoantibodies that develop in response to the earliest forms of breast cancer. In fact, a blood test for early detection may not even work for detecting late stage disease. No one knows at this point.


The enrollment of this first patient in a trial expected to last several years is the culmination of 24 years of dedication to this single agenda. And, a special nod to the women of Oklahoma who let my team draw blood for basic scientists around the world, to whom we have shipped over 10,000 samples during my tenure at Mercy Hospital–OKC. University researchers as well as biotech companies have tried diligently to develop such a test, but have been cut short by failure. Out of many collaborations over the years, only 3 have made it to formal clinical trials, and 2 of those failed. One remains.


Provista Diagnostics, Inc (New York/Scottsdale) is completing an 11-site formal clinical trial using a blood test in a diagnostic role, to help radiologists decide whether or not to proceed with biopsy. Through a proprietary algorithm, 10 biomarkers, half of which were developed at Dana-Farber/Harvard by Dr. Karen S. Anderson et al, are combined into a “yes” or “no” report suggesting the need to proceed with biopsy or not. Mercy Breast Center contributed over 500 samples in the development phase of the test, and this multi-site diagnostic trial of 1,000 patients will draw to a close in August 2015, having included samples from the Cleveland Clinic, Mayo Clinic, Scripps and other busy sites.


Now, enter my “proof of concept” study being conducted at Mercy Hospital – OKC, which reflects a different use of the same blood test, for screening asymptomatic women rather than diagnostic help for the radiologist. Here, blood will be drawn on patients who are already scheduled for breast MRI in our high-risk screening program, under an approved Institutional Review Board protocol. Thus, there will be no further implications of a positive blood test result – the patient will already have had the definitive test, i.e., the MRI. After several hundred patients have been enrolled, we will be able to calculate what the outcomes would have been if we had acted only on the blood test results combined with mammography. Mathematical modeling already shows us that MRI yields, even with a modestly good blood test (say, 80% sensitivity and 90% specificity), should vastly improve upon the cancer yields of 3% currently being achieved with the risk stratification approach, with yields possibly reaching 10% or more – highly cost-effective. Plus, a screening blood test will open up the option of MRI screening to all women rather than only those at high risk.
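
The modeling here is just Bayes’ rule: the cancer yield among blood-test-positive patients sent on to MRI is the test’s positive predictive value. A small sketch – note that the 1% prevalence of mammographically occult cancer is my illustrative assumption, not a figure from the trial:

```python
def mri_yield(sensitivity, specificity, prevalence):
    """Positive predictive value of the blood test, i.e., the expected
    cancer yield among test-positive patients sent on to MRI."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Assumed prevalence of mammographically occult cancer: 1% (illustrative).
for sens, spec in [(0.80, 0.90), (0.95, 0.95)]:
    print(f"sens {sens:.0%}, spec {spec:.0%}: "
          f"MRI yield {mri_yield(sens, spec, 0.01):.1%}")
```

At that assumed 1% prevalence, the “modestly good” 80%/90% test yields roughly 7.5% – already more than double the 3% of risk stratification – and the yield climbs past 10% as the assumed prevalence rises toward 1.5%.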


Now, jump one step further into the future. With a very good blood test (95% sensitivity and 95% specificity), one could screen on the basis of the blood test alone. That is, no mammography at all, unless the blood test is positive. A positive test would then require mammography first to localize and diagnose the tumor. If mammograms were negative, then the patient would proceed to ultrasound or MRI. Even further into the future, the blood test could be improved to the point of detecting only those cancers that are biologically a threat to the patient, thus avoiding “overdiagnosis.”


Why hasn’t this been a top-line agenda item for research funding? Because of the false belief that mammography detects 90% of breast cancers. “Why search for a blood test when we already have mammography?” But in recent years, the oft-quoted 90% has been adjusted down to 80% and “lower in younger women with dense tissue.” Even that is not enough adjustment – the true sensitivity is lower still. Mammograms need help. The first time I heard of anyone else working on a blood test for the early diagnosis of breast cancer was in 2004, when the first government grant issued for this purpose was awarded to a researcher at the Fred Hutchinson Cancer Center in Seattle. I sent blood samples in collaboration with the PhD recipient of the grant, but the proposed line of study dead-ended. But I’m getting ahead of myself…


In 1991, I attended a breast conference in Dallas, Texas, where a young Dr. Steve Harms showed the audience his experimental images using MRI to detect breast cancer. One case of breast cancer after another, he reported, “and the mammograms were negative.” This was one of several turning points in my career, and I began a personal search to determine the incidence of missed cancers on mammography. My first academic paper was one of the early analyses of why some cancers don’t appear on mammography. Of course, in the back of my head was one overriding concern – once breast MRI becomes available, how are we going to recommend such a cumbersome test to the entire population? Even with upcoming improvements in 2016 that are going to shorten the MRI study into a 10-minute test, there’s still the IV to be started, injection of a contrast agent, etc.

 

While still in academics, I started a weekly research meeting where those scientists at the OU Health Sciences Center campus interested in breast cancer research met and exchanged ideas, an exercise that continued every Friday for the entire decade I was at that location. In 1991, shortly after I had seen the presentation by Dr. Harms on MRI, one of the senior scientists who attended that meeting (Paul McKay, PhD, who had been one of my professors in medical school) showed me an article about a proposed blood test for breast cancer. The light bulb turned on, and it’s never gone out.

 

Dr. McKay was based at the Oklahoma Medical Research Foundation, and he managed to get funding to bring the author of the paper (Dr. Chaya Moroz) from Tel Aviv to Oklahoma City so that I could assist her in validating her blood test and bringing it to the clinic. I had explained to Dr. McKay how the blood test could be the spark needed to select patients for additional imaging beyond mammography, especially MRI. In actuality, Dr. McKay and OMRF provided half the funding to launch this relationship with Dr. Moroz. The other half was raised by an army of volunteers, led by Cheryl Browne and the late Patricia Browne, who had devoted themselves wholeheartedly to the establishment of Oklahoma’s first multidisciplinary breast center, which was under construction when the blood test opportunity arose. It was a match made in heaven – a novel breast center with a novel research agenda – but with one problem. In spite of my trip to NYC to meet with the senior partner at Penny & Edmonds, the intellectual property law firm handling the rights to the test, the blood test did not pan out. Ten thousand samples and 24 years later, we are on the cusp of finding out whether we can revolutionize breast cancer screening…or naught.

 

New Evidence Clashes with New Evidence, While Old Wins

Today’s headlines from the AP were a bit misleading: “Study Sees Benefit from More Extensive Breast Cancer Surgery.” A presentation at the American Society of Clinical Oncology (ASCO) at the end of May described a prospective, randomized trial led by Yale Cancer Center, addressing breast conservation technique. “More extensive surgery” is the misleading part, conjuring up images of a return to the Halsted radical mastectomy. In fact, the clinical trial was merely adding “thin shavings of the lumpectomy cavity” to reduce the chance of returning to the operating room for “positive margins.” In the study, returns to the operating room for wider excision were cut in half by the “shaving” technique. Commentary indicated this was a “new” and revolutionary concept that will impact as many as 100,000 women per year.

 

Prospective randomized trials are the mainstay of evidence-based medicine, so we have a “new” standard of care proposed. But wait a minute! This very issue was recently addressed by a large consensus panel with multi-expert firepower, and for the first time since breast conservation was introduced, the verdict came loud and clear – You don’t need to do anything more than what was done in the original NSABP B-06 trial in the 1970s. Unless ink touches tumor cells, there’s no need to re-excise. This “new” standard (in fact, about 35 years old) was adopted by the Society of Surgical Oncology (SSO) and the American Society for Therapeutic Radiology and Oncology (ASTRO) in 2013 after a meta-analysis was performed on local recurrence rates as related to margin status.

 

So, we have high quality evidence about to appear in the New England Journal of Medicine that suggests one should do more at the time of surgery to get better margins, while high quality evidence also suggests a surgeon doesn’t need to do anything more than get the tumor out with no ink touching tumor cells on final pathology.

 

For those of us who were practicing surgeons at the time breast conservation was introduced, there was considerable caution used in patient selection. At the time, according to B-06 protocol, if you didn’t get clear margins on the first attempt, the patient was returned to the operating room, not for re-excision, but for mastectomy. This prompted many of us to be compulsive in our attention to detail to make sure we got clear margins the first time around.

 

Having done a year of surgical pathology, I was very comfortable with how tissue was processed and how it subsequently appeared on the microscope slide. So, early on, circa 1989, I adopted a compulsive approach where I would submit the primary lumpectomy specimen to the pathologist for inking of margins, but then I would take an additional 5mm (or so) shaving around the lumpectomy cavity at those margins where the lumpectomy specimen interfaced with breast tissue (as opposed to fat, muscle, or skin). I would then orient these shavings and place India ink myself on the new “outer rim” while in the operating room so that the pathologist could “bread-loaf” the shaving and give me a report on the second set of margins. Not only did this give me greater assurance of margin status, but it also afforded the opportunity to re-create the tumor and its extensions in a 3-dimensional pattern.

 

In the academic setting, we once debated whether a return to the operating room for wider excision should be listed as a surgical complication. I explained how I kept my returns to the OR to a minimum, and a colleague was astounded at the ‘wasted’ time in the operating room: “You can’t possibly be doing that on every lumpectomy patient.” He was wrong.

 

A few years later, at a breast cancer meeting, a prominent breast surgeon described how he was using this exact same technique. At the end of his presentation, he said, “There are four of us doing this now.” Wow. Certainly, he meant, “In my limited circle of academic friends who are breast surgeons, there are four of us doing it now.” But it came across as a terribly constricted view of the world. Nonetheless, it helped validate what I was doing at the time (even if only five of us were doing it instead of four).

 

My decision to perform breast conservation in this fashion was 100% rational thought, based on my experience in the pathology lab coupled with what I was learning about tumor growth patterns for different histologic types. Does it really take a prospective, randomized trial before we can practice medicine wisely?

 

I say it all too frequently – “I’m not opposed to evidence-based medicine. I’m opposed to the glorification of evidence-based medicine to the exclusion of rational thought.” And believe it or not, one sees such exclusion on a regular basis.

 

In this case, a prospective, randomized trial is being regarded as an historical moment in the annals of breast conservation surgery (okay, that’s a bit of a stretch) – a “new” concept – but in fact, common sense arrived at the same conclusion 25 years ago (for at least 5 of us).

 

And what do we do now with the 2013 recommendation from a Who’s Who list of experts who told us “no ink on tumor” is good enough? Paraphrased: “Don’t worry about close margins. Don’t worry about the method of pre-op imaging to design a roadmap for accurate lumpectomy. Go in, get the tumor out, and don’t be concerned about your margin status unless ink touches a tumor cell, exactly what the NSABP said in the 1970s.”

 

Those who want prospective randomized trials for everything we do in medicine are going to be sorely disappointed. We will never have “high quality” evidence for every situation. Rational thought is the tool that fills in the blanks, and while empiricists love to point out examples where rational thought failed miserably, I have no trouble coming up with examples where pure empiricism failed miserably. Good science is not synonymous with empiricism – it is a blend of empiricism and rationalism. Unfortunately, humans are not particularly good at balancing opposing concepts.

Circumstantial Evidence-based Medicine

I was reading a journal article recently, covering a topic for which there is precious little information available. The topic, for purposes here, doesn’t matter. The point is that at the end of the article, when the author included the obligatory self-criticism, he lamented, “the primary weakness of our study is that it is not evidence-based.” I wanted to throw up. The article had done a masterful job of filling an information void, yet political correctness, wielding the buzz phrase “evidence-based medicine,” has become so powerful that it turns bright-shining humans into dim-witted lackeys scrambling for acceptance.

What the author should have said is this: “the primary weakness of our study is that it was not a prospective, randomized controlled trial.” Clearly stated. Specifically stated. More accurately stated. Yet, not quite sufficient if one is looking for politically correct self-flagellation.

How did it come to this? What is evidence-based medicine? Haven’t we been doing this all along?

The buzz phrase emerged around 1990, but did not gain a foothold until a decade later, when it was linked to the sister concept of “guideline-based” medicine, sometimes referred to disparagingly as “cookbook medicine.” If you’ve ever wondered why your doctor spends more time in the exam room examining his or her computer than examining you, it’s because they are under pressure to document layers of trivia: 1) to allow more effective billing, 2) to meet oppressive accreditation guidelines, and 3) to weather the storm of criticism if someone (peer, attorney, patient) should question whether proper guidelines were followed.

How did this sociologic transition begin? Here are some reasons – Rising costs and limited resources. The decision of physician-led groups to police themselves rather than wait for the government to step in (of course, it’s always “other specialties” that are causing the problems). Accreditation organizations ballooning like any bureaucracy, generating more and more requirements to follow guidelines and, more importantly, to document that they were followed, whether true or not. The computer revolution that allowed full access to published literature, creating awareness of relative ignorance. Scientific developments occurring so rapidly that no human can keep up, again deferring to computer science. The rising status of epidemiologists (medicine without the blood) and public health specialists, who anointed themselves as the only neutral parties worthy of establishing guidelines (part of their ascension to the throne included studies demonstrating the inability of many physicians to take conclusions from randomized controlled trials and counsel an individual patient correctly). As usual, the causes are multi-factorial.

What is the definition of “evidence-based medicine” anyway? You may have heard the claim that “Medicine is both an art and a science.” When I heard this term growing up in the home of a “general practitioner,” I was told that “art” pertained to bedside manner. Today, the “art” means something different, in my view. Bedside manner is in its own class, better described as “ethics” or “humanity” or plain ol’ kindness and empathy. Instead, the “art” of medicine is filling in the blanks that are left by pure science, using logic and wisdom derived from available facts. Alternatively stated, the “art” is using reason to fill the gaps left by empiricism. It is impossible to settle every issue in medicine with a prospective, randomized trial. Therefore, there will always be blanks that need to be filled in, sometimes using that nebulous tool “judgement.”

Francis Bacon (1561-1626) is sometimes referred to as the “father of empiricism” (and/or “father of scientific method”) based on his philosophical stance that inductive reasoning should guide science, not the old-fashioned syllogism or rational deductive reasoning. But even Bacon cringed at schools of thought based on pure empiricism, claiming that this approach “gives birth to dogmas more deformed and monstrous than the Sophistical or Rational School.” His famous parable of the spider (pure rationalism, spewing forth silk from within to weave a web), the ant (pure empiricism, collecting grains of dirt, but nothing from within), and the ideal of the honey bee (a blend of both empiricism and rationalism, collecting pollen and offering honey in return) reveals that regardless of his devotion to inductive reasoning and empiricism, one should be well-grounded in reason, i.e., rational thought. It is the blend of empiricism and rationalism that generates honey.

Evidence-based medicine uses a process that is admirable, organizing what was already known about high-quality vs. low-quality evidence into systematic rankings, applied both to individual publications and to guidelines. While some claim the top of the pecking order is the prospective, randomized trial, there’s actually a qualifier that generates even higher-quality evidence – double-blinding – that is, both the patient and the researcher are blinded to the intervention, be it pill or placebo. It should be evident that some trials cannot be double-blinded, and I’m referring to the area where I practice – up-front medicine heavily focused on radiology and surgery. It’s difficult to ask a surgeon to perform a procedure blindfolded, or a radiologist to interpret an MRI without looking at the picture (though radiologists can be blinded to the final pathology).

Dropping down a notch on the evidence scale opens up all sorts of potential bias, too numerous to describe here. But even a well-designed clinical trial has one over-riding problem – it may not translate to the real world. The paradox here is that the greater the number of exclusion factors one uses to control for variables in the clinical trial (raising the quality of the data), the more restricted is the population to whom the results apply. While guidelines are sometimes careful to note these limitations, this does not always translate to actual practice.

Still, evidence-based medicine is inherently a worthy goal. The problem is going overboard. Academic departments in Evidence-based Medicine have emerged (maybe this is just a re-christening of Epidemiology), and organizations sprout more and more guidelines, which are much more than suggestions – they are very strong recommendations that put a physician on the defensive if not followed. Even though these guidelines may be physician-generated, insurers don’t necessarily follow suit, instead generating their own sets of guidelines, differing from one insurer to another.

Rigid devotion to empiricism has many untoward side effects, including the development of guidelines that are logically inconsistent. For instance, guidelines for SERM risk reduction (pharmacologic risk reduction) in high-risk women are based on the inclusion/exclusion criteria of the clinical trials that proved effectiveness. Fine. Then, the use of high-risk screening with breast MRI is based on the criteria used in different clinical trials. Fine. But now the bottom line: inclusion/exclusion criteria were markedly different for these two available interventions. As a result, women who qualify as high-risk for SERM risk reduction may not qualify for MRI, and vice versa. The illogical result? “Here, take this pill every day for 5 years to lower your risk of breast cancer, and here’s the host of side effects you need to know about, including uterine cancer or even death due to pulmonary embolus. And by the way, you’re not at high enough risk to warrant recommending a breast MRI.” Really?

In 2013, one of the nation’s pre-eminent breast oncologists, Harold Burstein, MD (Dana-Farber/Harvard), wrote an engaging editorial in The Breast about his experience at the St. Gallen (Switzerland) breast cancer conference, entitled: “Expert opinion vs. guideline based care: The St. Gallen Case Study.” He wrote, “In contrast to the current American craze for detailed guidelines and pathways, the St. Gallen meeting unabashedly seeks to find expert consensus. There are no checklists of tests. No defined pathways. No lists of preferred regimens. No arrows pointing one of three ways based on a decision node….The tenor is to provide a direction for care that covers most of the patients rather than to script the design of care to be given to all patients with few exceptions…”

Dr. Burstein realized that this approach can be viewed in a negative light, “The looseness of the St. Gallen (conference) process alternately charms and appalls many observers from the U.S. This is particularly the case for those who look to St. Gallen to define standards of care that are transmittable to third-party payors, hospital administrators, and programmers who write electronic health record templates.”

Perhaps Dr. Burstein was in the “charmed” group by virtue of his master’s degree in the history of science, where one is exposed to the many philosophical theories as to what constitutes “the scientific method,” along with the fact that many major scientific discoveries used no methodology at all other than rational thought. He closes his editorial with, “The current enthusiasm for guidelines and pathways has innumerable merits. But one necessary weakness is the assumption that clinical expertise can be fully bottled, packaged and shipped around the world. For those who cherish learning from wise colleagues and exploring the endless variations of clinical care, it is a delight that meetings such as St. Gallen continue to flourish.”

My thoughts on the topic are identical, but my spin a little more critical. Whereas these guidelines serve well to bring everyone up to a minimum standard, they do not encourage excellence above and beyond guidelines. Quite the contrary, the absence of a guideline can squash excellence. Witness the fact that it took Mel Silverstein, MD, arguably the most knowledgeable doctor in the U.S. on DCIS (Stage 0 breast cancer), 12 years to get his recommendation for “wide excision alone” into the NCCN guidelines. Why? His reasoning was superb, not to mention the cost-effectiveness of his approach, but his data was considered “low quality,” i.e., from non-randomized observational studies, even though he followed a strict protocol. As a breast surgeon, I adopted his system as supremely logical the first time I heard it. Those of us who accepted the Van Nuys protocol had to endure criticism from peers (for not irradiating everyone with DCIS) while Dr. Silverstein fought for recognition of his approach. Even after he managed to get his guidelines into print, there’s a notation that this is a “2b” recommendation, based on low level evidence — an asterisk, much like Roger Maris.

My beef is not with the concept of evidence-based medicine and its associated guidelines, in principle. My beef is with the by-product of obsessive preoccupation that seems to go hand-in-hand. I can offer many examples in my area of expertise (especially breast MRI) where excellence is squashed, and ignorance perpetuated, through slavish devotion to illogical guidelines.

Another twisted by-product of “neutral evidence-based medicine” is the fact that guidelines are no longer considered reliable when written by experts in an area who are also providers of health care in that same area. Now there’s a new concept. How do you find an expert to help establish guidelines who does not practice in that particular area? Answer: you don’t. You use experts in numbers and statistics, not experts who actually use the proposed guidelines.

Understanding that one purpose of evidence-based medicine is to eradicate human bias (an impossible task), I can agree to go as far as restricting experts to non-voting status on guideline committees, or, at an absolute minimum, allowing experts to testify at guideline meetings in order to put things into perspective by revealing nuance lost in raw statistics. Instead, some “think tanks” totally exclude practicing experts from the process. A good example is the U.S. Preventive Services Task Force on breast cancer screening, which not only refused to consider any observational studies of screening mammography, but also refused to hear testimony from radiologic experts on screening mammography, much less have one serve as a non-voting member of the committee.

My beef is with the fact that while, in principle, evidence-based medicine is a worthy goal to provide a stronger basis for science in medicine, in fact, it is evolving with a more extended goal, that is, science to the exclusion of art. It is “high-quality data,” which may or may not correlate with Reality, to the exclusion of logic and wisdom. Ultimately, it will serve to control medical practice by those who don’t do it.

Is the Diagnosis of Breast Cancer Subjective?

A recent article in J.A.M.A. (Journal of the American Medical Association) prompted national media coverage followed by fleeting anxiety in the breast cancer community. Why “fleeting”? Because the same problem has been exposed every few years since 1991, but the ramifications are so overwhelming that it’s easier to ignore the problem entirely. The title of the article was misleading – “Diagnostic Concordance Among Pathologists Interpreting Breast Biopsy Specimens.” A more accurate title would have replaced “Concordance” with “Discordance,” given that the findings were shocking (unless you’ve followed this controversy for the past 24 years). In brief summation of the study, pathologists don’t agree on which patients have atypical hyperplasia (AH) vs. ductal carcinoma in situ (DCIS) even though the clinical implications are huge. For the former (AH), at most, the recommendation is a wide excision at the site of the AH. A comprehensive breast center will also refer the AH patient for high-risk counseling and interventional options such as aggressive screening. But if the flipped coin lands on DCIS, it’s “cancer,” and that includes radiation therapy and possible endocrine therapy. Some women even opt for bilateral mastectomies.

 

In lieu of going to the animal lab during my “research year” of surgical residency, I opted to spend the academic year of 1977-78 in a surgical pathology fellowship at UCLA, in what turned out to be the pivotal year of my career. If I had to describe, in one word, my most lasting impression from that experience, I would choose “subjective.” Clinicians without pathology experience believe that the findings under the microscope are completely objective and as close to pure science as anything in medicine. Often, this is the case. But not always. And certain problems, such as ADH vs. DCIS, are highly subjective.

 

In the 1970s, David Page, MD (Vanderbilt) introduced breast cancer risk levels associated with various benign biopsy findings, which brought “atypical hyperplasia” out of the lab and into the clinic. Critics countered with a 1991 article by Rosai et al. in the American Journal of Surgical Pathology that revealed the classification system to be too subjective for clinical use, with wide disagreement in diagnoses among experts. Dr. Page and other experts responded in 1992 with an article in that same journal showing that strong agreement could be achieved after consensus training – among experts, that is. There was no extrapolation to general pathologists. Even then, the agreement was in the eye of the beholder. In the view of pathologists, concordance was excellent. But from the perspective of clinicians, not so much. When the distinction between AH and DCIS was considered, at least one expert disagreed with the other five in most of the cases. Yet, clinicians have been treating these diagnoses as black-and-white entities for decades.

 

So, in the 2015 article now making a splash, it’s a double whammy. If experts don’t agree, how did these researchers establish the “true” diagnosis for each biopsy against which the “average” pathologist was to be compared? The fine print reveals that the 3 experts were unanimous on the diagnosis in only 75% of cases on the first try, though differences were eventually hashed out to form a consensus-derived diagnosis. In the second whammy, the same biopsy slides were reviewed by the study group, that is, 115 pathologists who then proved to be disturbingly discordant from the consensus, especially when it came to differentiating AH from DCIS. (I won’t belabor here the shocking discordance in 5 of 72 cases of completely benign findings where a significant number of pathologists called the lesions “invasive cancer.” Nor will I address here the equally shocking finding that at least one pathologist labeled 22 of these 72 completely benign cases as DCIS. Those findings are a different problem than “subjectivity.”) In short, the 2015 conclusion is identical to what many have been saying all along, only using remarkably sophisticated techniques and statistics to add punctuation to a sentence that was written 24 years ago.
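Concordance studies like these typically report more than raw percent agreement, since two raters will agree some of the time by chance alone; Cohen’s kappa is the standard chance-corrected statistic. A minimal sketch, using invented ratings rather than data from the JAMA study:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters over the same cases."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    # Observed agreement: fraction of cases where the two raters match.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement by chance, from each rater's label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = set(counts_a) | set(counts_b)
    expected = sum((counts_a[l] / n) * (counts_b[l] / n) for l in labels)
    return (observed - expected) / (1 - expected)

# Invented example: two pathologists classifying 10 borderline biopsies.
a = ["AH", "AH", "DCIS", "AH", "DCIS", "AH", "AH", "DCIS", "AH", "AH"]
b = ["AH", "DCIS", "DCIS", "AH", "AH", "AH", "AH", "DCIS", "AH", "DCIS"]
print(round(cohens_kappa(a, b), 2))  # → 0.35
```

Here the raters agree on 70% of cases, yet the chance-corrected kappa is only about 0.35 – a reminder that headline “agreement” figures can flatter the underlying discordance.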

 

In 1991, still fresh from my subjective enlightenment at UCLA, I made a 35mm teaching slide for academia that claimed serious trouble would eventually brew if we didn’t acknowledge this problem and merge AH and low-grade DCIS into one diagnosis – call it “borderline” if you will. Treatment, I claimed, should be the same for both entities, i.e., wide excision alone. The gynecologists had the same issue going on with “severe dysplasia” of the cervix vs. “carcinoma in situ,” but they had already done the smart thing, recommending the same treatment (cervical cone) for both diagnoses. The distinction is subjective, the treatment is not. In 1993, results of the NSABP B-17 clinical trial indicated that all women with mammographically-discovered DCIS should undergo lumpectomy and radiation therapy. Other studies confirmed the same, always using the dichotomous approach of separating ADH from DCIS as distinct entities, treating the latter aggressively.

 

Dr. Mel Silverstein and Dr. Mike Lagios have done more than any of us with regard to this problem by introducing a scoring system (Van Nuys Prognostic Index in 1997) that ends up guiding treatment so that small areas of low grade DCIS are excised as you would AH, without radiation. But acceptance of “no radiation” came only after 12 years of mud-slinging conflict, and even today, the evidence-based medicine forces make sure that everyone knows that “excision alone” for selected cases of DCIS is only “2b” evidence (weak evidence, as opposed to prospective, randomized trials).

 

The controversy has far-reaching implications, not only with regard to correct diagnosis and treatment, but also when it comes to screening. Anti-screening activists love to parade this issue around in its nakedness when discussing the harms of screening. And, in fact, they are correct. As long as we stumble over the subjectivity of AH vs. DCIS, as long as we keep irradiating women with borderline lesions, as long as women undergo bilateral mastectomy for these marginal lesions, then this controversy is truly the greatest harm of screening, as it is mostly a by-product of widespread mammography followed by subjective pathology. Forget mammography call-backs, “unnecessary biopsies” and the like, highly overstated as harms by anti-screening critics. The real potential for harm is not with radiologic standards, but with our unwillingness to adopt a “borderline lesion” approach in this problem of AH vs. low-grade DCIS, thus avoiding overtreatment.

 

There is very little discordance when it comes to high-grade comedo-type DCIS. The problem is distinguishing low or moderate grade DCIS from atypical ductal hyperplasia (AH or ADH). A diagnosis in this category should be called a “borderline lesion,” and standard treatment should be wide excision followed by high-risk counseling. Then, the benefit of knowing about a significant risk factor is enhanced, the harms of screening minimized, and everyone is happy, sort of. Will it happen? Of course not. It’s far too sensible, and would require retractions from countless experts.

I’ve been harping on this controversy since Rosai’s 1991 article, even before results from the clinical trials that added radiation therapy to DCIS management. Considering all of the above, perhaps it’s not so strange that I limit my new patient practice to women with “tissue risks” found on breast biopsy, e.g., AH/DCIS. It’s considered “going the extra mile” when I opt for an additional pathology opinion from well-known experts. Yet, if the experts don’t agree…what next? A big part of my role is explaining the nature of this controversy while, at the same time, offering guidance, erring on the side of caution.