Vanessa's Thoughts

Bias in Evidence Based Medicine

June 22, 2022

I am sharing my insight into evidence based medicine (EBM), as I recognise that we have all believed it to be the gold standard and that, when we followed the guidance, we were doing good things. Yet bias has led to unintended consequences.

As a follower of evidence based medicine, I had not appreciated some of the detriment that may occur. I also believe that we should be kind to ourselves: our brains are designed as high-performance pattern recognisers to cope with all the information they are exposed to. This leads to thinking shortcuts, and strategies such as EBM enabled us to function in a world where it is impossible to be effective without some systems to help.

Bias at a system and organisational level can lead to systemic disadvantage, shape culture, and affect the functioning of organisations and communities.

This is a long blog but I believe understanding more about bias is critical to the solving of the wicked problems that we currently face in health and social care.

In research, the concept of bias is well defined and has led to close scrutiny of methodology and analysis. We have trusted that Randomised Controlled Trials (RCTs) and EBM should be able to give clear, definitive and reliable answers to questions.

A good-quality RCT must fulfil the following:

(a) the clinical question must be clearly stated,

(b) the statistical methods must be accurately chosen,

(c) the target sample must be carefully selected,

(d) the randomisation must happen in a clear, unbiased and blinded way,

(e) the collection of data must be rigorous and thorough,

(f) the analysis of data must be blinded and statistically correct,

(g) the evaluation of the results must be unbiased,

(h) the conclusions of the work must be a direct consequence of the statistical analysis and no room should be allowed for personal beliefs and unsupported opinions.

Wider evidence based practices mirror this process, utilising national audits, benchmarking systems and evaluation of key performance metrics to inform best practice.

However, critical to this process are the assumptions that:

  • the research question is appropriate for the problem
  • the statistical method is fit for purpose
  • the sample population is inclusive and representative of the real world
  • the analysis utilises multiple perspectives to consider what the outcomes mean

Our approach is to use simplistic measures, which do not take into account the complexity of ecosystems.

We use a narrow choice of participants and measure with demographic datasets that are not fit for purpose.

Then we analyse through a fixed mindset, often utilising the medical model and the efficacy of medicines.

New ways of framing evidence and generating innovative solutions require a fundamental change in perspective: abandoning deeply held principles and assumptions, and introducing ideas and methodologies from disciplines beyond EBM.

This will feel scary, as we have to acknowledge that we have practised medicine in a manner which may have caused harm – this causes me anxiety and distress, as I believe in the ethical principle of ‘Do No Harm’.

There are examples of positive outcomes through the use of EBM, such as the British Thoracic Society’s asthma guidelines, developed through consensus but based on a combination of randomised trials and observational studies. Subsequently, the use of personal care plans and stepwise prescription of inhaled steroids for asthma increased, and morbidity and mortality fell.

However, there are other cases where research has misled practice. Hormone Replacement Therapy studies wrongly suggested that the breast cancer risks outweighed the benefits in postmenopausal women. The use of opiate medications as first-line analgesia led to a global epidemic of dependency, because we believed that ‘when pain was treated, the person could not become addicted’ – the number of people who have died in relation to addiction is a tragedy.

Research Biases

In research there are always biases that need to be considered.

  • Statistical bias – the study design is poor.
  • Selection bias – systematic differences between the groups recruited.
  • Performance bias – differences in the care the patients receive.
  • Detection bias – the outcome is determined differently in the groups.
  • Contamination bias – the experimental and control groups become mixed.
  • Within the trial, interviewer bias, chronology bias (where the timing of trials affects results), recall bias, exposure misclassification and outcome misclassification can all influence results.
  • After trial completion, citation bias, confounding and confirmation bias all influence how results are interpreted.

However the most influential bias of all is that of the “conflict of interest bias”.

RCTs are often run or sponsored by the pharmaceutical industry, which has the money, the means and the knowledge to design and conduct large trials and to publish the results in influential journals.

EBM’s acceptance of industry-generated ‘evidence’ leads to bias in the choice of hypotheses tested, manipulation of study design, selective publication, little scrutiny of outliers and a lack of inclusion. Organisations are penalised for not following these artificial norms, and systemic disadvantage has been embedded.

Professional organisations can also perpetuate “conflict of interest bias” utilising their professional power to reinforce the status quo and further champion industry generated evidence.

The Care Quality Commission (CQC) is the independent regulator of health and adult social care in England. As part of its intelligence-driven approach to regulation, the CQC works closely with national clinical audit bodies to identify key metrics which reflect quality of care, and tracks the performance of providers against these metrics. However, this relies on the evidence-based metric being correct: the Shrewsbury maternity services are an example of caesarean section rates being used as a measure of quality.

KPIs, benchmarking and the achievement of standards measure the success of organisations and are linked to performance and payment mechanisms. Therefore, workaround practices develop so that financial penalties and reputational damage are avoided, without leading to public health gain. Examples include introducing a first point of contact rather than an assessment, or moving patients to a medical admissions unit rather than a specialist ward – mechanisms introduced to ensure targets are met.

Even assuming that trials are conducted rigorously and the analysis of data is statistically correct, negative results are often omitted, and results are cherry-picked to optimise the sale of medicines and products or to showcase success.

Negative results are as worthy of publication as positive ones, but may be hidden. Medicine based on such evidence is therefore likely to be less effective, if not unsafe, and focuses on medicines rather than wider interventions or systems.

This is a serious problem for the overall evidence base underpinning practice and we need to re-analyse this to take into account these flaws.

The Complexity of Patients forms a bias

Teams deal with people whose real-life circumstances provide context. Chronically ill patients usually take multiple medications over many years, often in a different manner to that in which they are prescribed. Comorbidity is a fact of life for our patients, often combined with socioeconomic deprivation and other personal experiences, including culture and identity.

Thus, the application of EBM to real life means that we are building on a flawed foundation.

Biases are caused by:

(1) the lack of patient input to the design;

(2) the low status given to experience (‘anecdote’) in the hierarchy of evidence;

(3) the limited attention given in EBM to power imbalances in healthcare;

(4) the over-emphasis on the clinician-patient relationship, which overlooks how we self-manage our conditions and the influence of wider social networks;

(5) the primary focus on people who seek and obtain care, rather than on the hidden denominator of those who do not seek or cannot access care;

(6) the avoidance of understanding how the ‘google effect’ is impacting on decisions;

(7) the exclusion of comorbidities from RCTs: by excluding people with more than one disease, trials study the effect of a single intervention on a single disease state.

RCTs derive proof of effectiveness by strictly controlling as many variables as possible, so that any differences between the intervention and control groups may be attributed to the intervention rather than some other factor. The presence of multiple diseases and their various treatments, alongside variation in how people actually take their medicines, weakens this process – hence the exclusion of such people in order to achieve statistical significance.
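The logic of attributing differences to the intervention can be illustrated with a toy simulation (entirely hypothetical numbers, not data from any real trial): when allocation is random, characteristics such as comorbidity end up roughly balanced between the arms, even though no one measured or controlled for them directly.

```python
import random

random.seed(0)  # reproducible toy example

# Hypothetical cohort: 30% carry a comorbidity (an unmeasured confounder)
n = 10_000
patients = [{"comorbid": random.random() < 0.30} for _ in range(n)]

# Random allocation to intervention or control
for p in patients:
    p["arm"] = random.choice(["intervention", "control"])

def comorbidity_rate(arm):
    """Proportion of patients in an arm who have the comorbidity."""
    group = [p for p in patients if p["arm"] == arm]
    return sum(p["comorbid"] for p in group) / len(group)

# With random allocation, the comorbidity rate is near 30% in both arms,
# so any outcome difference can be attributed to the intervention itself.
gap = abs(comorbidity_rate("intervention") - comorbidity_rate("control"))
```

This balance is only guaranteed on average across a large sample; a small trial can still be imbalanced by chance, which is one reason small trials are later pooled – and it also shows why excluding comorbid patients altogether, as the text describes, trades real-world relevance for statistical cleanliness.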

Similarly, service evaluations only include those who use services and rarely explore the hidden opinions of those who do not access them, once again reinforcing bias.

Many diseases such as hypertension, diabetes and epilepsy are multifactorial, not a single entity with a single variable, yet we apply the same principles as for a simple disease. Even our treatment of ‘an enlarged prostate’ is illogical, with large fleshy prostates receiving the same treatment recommendations as the tight muscular prostates we note on clinical examination.

Most published research has minimal patient input and the evidence gathered relates to options and outcome measures that patients themselves would not have chosen. The available menu of evidence-based choices reflects a biomedical framing and omits options that might be more acceptable and effective. The lack of study of exercise in relation to many diseases, is an example of the favouring of medicines as the intervention of choice.

EBM’s hierarchy of evidence tends to devalue the patient experience. The patient is effectively ‘regressed to the mean’ and offered the option that the average patient would benefit most from.      

Qualitative evidence, even when robust and relevant, is rarely used to its full potential.

Patient and public input to setting research priorities, study design, choice of outcome measures, and interpretation and dissemination of findings must be prioritised and effectively resourced.

Lack of Diversity as a bias

Much of the information about research participation reports either under-representation or a lack of data about ethnicity.

The most represented group in these trials is white, middle-class, highly educated men. Trial subjects are often younger and may differ significantly from the groups we are trying to treat. This misses ethnic diversity – or, more accurately, genetic ancestry – children, older populations, and those with poorer socio-demographic markers, to name a few.

This lack of reporting precludes the possibility of any subgroup analysis to identify any significant differences. More importantly, the lack of participation means there is a paucity of research evidence about which interventions are effective in disadvantaged groups.

The generalisability of the findings is limited to people who are sufficiently similar to the trial participants; assuming that the outcome will be the same misses physical, cultural and structural issues which may not translate to our wider populations.

Creating trials with lived experience as part of the research team is important, and strategies such as diverse and inclusive steering groups may help address these issues.

Age, ethnicity and gender variations may lead to changes in responses to drug treatments. These variations, due to differences in the metabolism of drugs, result in variable circulating concentrations, so that the same dose of a drug given to different people can have variable effects.

More difficult is the task of unravelling the cultural and structural issues around engaging with health care, but this is a crucial part of generating good evidence. Distrust of white-dominated institutions is a key factor: the experience of ‘being experimented on without their knowledge or consent’ or being ‘denied services’ has created different perspectives on accessing services, including public sector versus private services, and other commercial incentives which impact on choice, behaviours and outcomes.

We need to know not only that an intervention works in ideal trial circumstances with a well-defined population, but also that it works in the context of routine care, within the widest community.

Bias in diagnosis

The human brain is a complex organ with the wonderful power of enabling man to find reasons for continuing to believe whatever it is that he wants to believe. – Voltaire

Errors in cognition have been identified in all steps of the diagnostic process, including information gathering, interpretation of test results, clinical reasoning, diagnosis and treatment.

The causes of bias are varied, and include learned or innate biases, social and cultural biases, a lack of appreciation for statistics, and even environmental stimuli competing for our attention.

Our brains automatically move away from risk and towards others who create a feeling of belonging. Type 1 thinking is a fast, intuitive, pattern-recognition-driven method of problem solving, which places a low cognitive burden on the brain. Type 2 thinking is slower, more methodical and more thoughtful; it places a higher cognitive strain on the brain but allows data to be appraised more critically and looks beyond patterns. The left and right sides of our brains lean towards process and empathy respectively, and neurodiversity and our character create further differences in our perceptions. In addition, stress, fatigue, environment and cognitive overload affect our decision making. This has formed part of our new Patient Safety Strategy, with Human Factors being critical to understanding how decisions were made.

Examples of bias include availability bias, where the perceived likelihood of an event leads to a favoured choice, and diagnostic anchoring, which occurs when a plan is conceived before all the necessary information has been received. Confirmation bias interprets information to fit a preconceived diagnosis. Overconfidence bias and diagnostic momentum reinforce following a course of action without considering new information and changing the plan if required (particularly if the plan was identified by a senior, or we have invested heavily in that plan).

Group decision making can be seen as groupthink, where people agree and fail to challenge the status quo. Our alignment to our professional identity or cultural groups can impact on our decisions, and wider influences mean that intoxicated patients, repeat attenders or those who challenge may receive a different level of care.

We must use reflection and gain insight into our own thought processes and discuss openly human factors, patient safety and review the relationship between healthcare professionals and the systems with which they interact. By understanding our diversity of perspectives, we can co-create new shared understanding with our patients, colleagues and in the systems we work.

Meta-analysis reinforcing bias

Despite the risk of bias, the RCT remains the most reliable research design, provided that the size of the sample allows statistically significant conclusions. If individual trials have ‘low statistical power’, several small RCTs can be combined using an advanced statistical procedure called meta-analysis.

A meta-analysis consists of a systematic review of different studies in order to gather all the cases together and perform statistical tests on the pooled population. The main difference between a “simple” systematic review and a meta-analysis is that the former is a collective interpretation of the available studies, whereas the latter allows a proper statistical analysis, including evaluation of the probability of observing the pooled result if the null hypothesis were true (the p-value).
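The pooling step can be sketched with a minimal fixed-effect (inverse-variance) calculation – a simplified illustration using made-up effect sizes, not output from any real trial:

```python
import math

def pooled_estimate(effects, variances):
    """Fixed-effect (inverse-variance) meta-analysis.

    Each study is weighted by 1/variance, so large, precise trials
    dominate the pooled result.
    """
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))  # standard error of the pooled effect
    return pooled, se

# Three hypothetical small trials: log odds ratios and their variances
effects = [-0.30, -0.10, -0.25]
variances = [0.09, 0.16, 0.12]
est, se = pooled_estimate(effects, variances)
```

Because the pooled standard error is smaller than that of any individual study, the combined analysis gains statistical power – which is exactly why pooling can also amplify a shared flaw: if every included trial carries the same selection or publication bias, the meta-analysis reports that bias with greater confidence.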

Meta-analyses are powerful studies that constitute the basis of many of our guidelines.

The risks of meta-analysis are that it may wrongly estimate the statistical effect, fail to appreciate differences between populations, obscure the variable quality of included studies, and include only published results. Restrictions such as reviewing only articles in English can give different results from those that would be obtained if no linguistic restriction were applied.

Therefore, meta-analyses cannot be completely trusted as regards their significance, and may amplify problems, with less effective, harmful or more expensive treatments being identified, and wider public health or social investment poorly understood.

The Impact of Bias

Bias is depicted negatively, as something that distorts the truth, a cause of systematic error that can potentially be eliminated using technical procedures and checklists. However, bias can also be defined in terms of value-driven perspectives which lead to beliefs and behaviours. This kind of bias cannot be eliminated; it is unavoidable, as it is part of humanity. It is, however, potentially productive: sharing our perspectives can lead to creativity, empowerment and innovation.

Healthcare needs to build from statistical analysis and causation, towards shared understanding and co-production of next steps where empowerment and exploration of options are prioritised.

More investment in independent research is required, with a shift from RCTs and datasets to include narrative and qualitative perspectives. Independent bodies, with a diversity of opinion and voices, need to set research priorities that are co-produced with the whole population and reflect public health opportunities rather than organisational or commercial priorities.

A sensible addition is to require the registration of all clinical trials and the publication of all results, including negative findings, so we can learn from all the insight available.

What do people probably need to know?

  • Was the intervention accessible?
  • Was it delivered to the standards required?
  • Did it make a difference to the person?

What do those responsible for population health need to know?

  • What does the whole system look like to those who use services?
  • From this jigsaw, is it accessible by all?
  • Which elements are measurable and how can these be measured?
  • What difference does it make to the population and is it inclusive?