Evidence Soup
How to find, use, and explain evidence.

Tuesday, 13 March 2018

Biased instructor response → Students shut out

Definitely not awesome. Stanford’s Center for Education Policy Analysis reports Bias in Online Classes: Evidence from a Field Experiment. “We find that instructors are 94% more likely to respond to forum posts by white male students. In contrast, we do not find general evidence of biases in student responses…. We discuss the implications of our findings for our understanding of social identity dynamics in classrooms and the design of equitable online learning environments.”

“Genius is evenly distributed by zip code. Opportunity and access are not.” -Mitch Kapor

One simple solution – sometimes deployed for decision debiasing – is to make interactions anonymous. However, applying nudge concepts, a “more sophisticated approach would be to structure online environments that guide instructors to engage with students in more equitable ways (e.g., dashboards that provide real-time feedback on the characteristics of their course engagement).”

Prescribe antidepressants → Treat major depression

An impressive network meta-analysis – comparing drug effects across numerous studies – shows “All antidepressants were more efficacious than placebo in adults with major depressive disorder. Smaller differences between active drugs were found when placebo-controlled trials were included in the analysis…. These results should serve evidence-based practice and inform patients, physicians, guideline developers, and policy makers on the relative merits of the different antidepressants.” Findings are in the Lancet.

Thursday, 08 March 2018

Redefining ‘good data science’ to include communication.

Data science revised skillset on VentureBeat by Emma Walker

Emma Walker explains on VentureBeat The one critical skill many data scientists are missing. She describes the challenge of working with product people, sales teams, and customers: Her experience made her “appreciate how vital communication is as a data scientist. I can learn about as many algorithms or cool new tools as I want, but if I can’t explain why I might want to use them to anyone, then it’s a complete waste of my time and theirs.”

After school, “you go from a situation where you are surrounded by peers who are also experts in your field, or who you can easily assume have a reasonable background and can keep up with you, to a situation where you might be the only expert in the room and expected to explain complex topics to those with little or no scientific background.... As a new data scientist, or even a more experienced one, how are you supposed to predict what those strange creatures in sales or marketing might want to know? Even more importantly, how do you interact with external clients, whose logic and thought processes may not match your own?”

Sounds like the typical “no-brainer”: Obvious in retrospect. Walker reminds us of the now-classic diagram by Drew Conway illustrating the skill groups you need to be a data scientist. However, something is “missing from this picture — a vital skill that comes in many forms and needs constant practice and adaption to the situation at hand: communication. This isn’t just a ‘soft’ or ‘secondary’ skill that’s nice to have. It’s a must-have for good data scientists.” And, I would add, good professionals of every stripe.

Tuesday, 06 March 2018

Biased evidence skews poverty policy.

Decision bias: food-desert map

In Biased Ways We Look at Poverty, Adam Ozimek reviews new evidence suggesting that food deserts aren’t the problem; behavior is. His Modeled Behavior (Forbes) piece asks why the food desert theory got so much play, claiming “I would argue it reflects liberal bias when it comes to understanding poverty.”

So it seems this poverty-diet debate is about linking cause with effect - always dangerous, bias-prone territory. And citizen-data scientists, academics, and everyone in between are at risk of mapping objective data (food store availability vs. income) and subjectively attributing a cause for poor habits.

The study shows very convincingly that the difference in healthy eating is about behavior and demand, not supply.

Ozimek looks at the study The Geography of Poverty and Nutrition: Food Deserts and Food Choices Across the United States, published by the National Bureau of Economic Research. The authors found that differences in healthy eating aren’t explained by prices, concluding that “after excluding fresh produce, healthy foods are actually about eight percent less expensive than unhealthy foods.” Also, people who moved from food deserts to locations with better options continued to make similar dietary choices.

Food for thought, indeed. Rather than following behavioral explanations, Ozimek believes liberal thinking supported the food desert concept “because supply-side differences are more complimentary to poor people, and liberals are biased towards theories of poverty that are complimentary to those in poverty.” Meanwhile, conservatives “are biased towards viewing the behavioral and cultural factors that cause poverty as something that we can’t do anything about.”

Thursday, 01 March 2018

Why don't executives trust analytics?

Last year I spoke with the CEO of a smallish healthcare firm. He had not embraced sophisticated analytics or machine-made decision making, and had no comfort level about ‘what information he could believe’. He did, however, trust the CFO’s recommendations. Evidently, these sentiments are widely shared.

A new KPMG report reveals a substantial digital trust gap inside organizations: “Just 35% of IT decision-makers have a high level of trust in their organization’s analytics”.

Blended decisions by human and machine are forcing managers to ask Who is responsible when analytics go wrong? Of surveyed executives, 19% said the CIO, 13% said the Chief Data Officer, and 7% said C-level executive decision makers. “Our survey of senior executives is telling us that there is a tendency to absolve the core business for decisions made with machines,” said Brad Fisher, US Data & Analytics Leader with KPMG in the US. “This is understandable given technology’s legacy as a support service.... However, it’s our view that many IT professionals do not have the domain knowledge or the overall capacity required to ensure trust in D&A [data and analytics]. We believe the responsibility lies with the C-suite.... The governance of machines must become a core part of governance for the whole organization.”

Tuesday, 06 February 2018

Now cognitive bias is poisoning our algorithms.

Can we humans better recognize our cognitive biases before we turn the machines loose, fully automating them? Here’s a sample of recent caveats about decision-making fails: While improving some lives, we’re making others worse.

Yikes. From HBR, Hiring algorithms are not neutral. If you set up your resume-screening algorithm to duplicate a particular employee or team, you’re probably breaking the rules of ethics and the law, too. Our biases are well established, yet we continue to repeat our mistakes.

Amos Tversky and Daniel Kahneman brilliantly challenged traditional economic theory while producing evidence of our decision bias. Recently I gave a Papers We Love talk on behavioral economics and bias in software design. T&K’s early research famously identified three key, potentially flawed heuristics (mental shortcuts) commonly employed for decision-making: Representativeness, availability, and anchoring/adjustment. The implications for today’s software development must not be overlooked.

Algorithms might be making the poor even less equal. In Automating Inequality, Virginia Eubanks argues that the poor “are the testing ground for new technology that increases inequality.” She argues that our “moralistic view of poverty... has been wrapped into today’s automated and predictive decision-making tools. These algorithms can make it harder for people to get services while forcing them to deal with an invasive process of personal data collection. As examples, she profiles a Medicaid application process in Indiana, homeless services in Los Angeles, and child protective services in Pittsburgh.”

Prison-sentencing algorithms are also feeling some heat. “Imagine you’re a judge, and you have a commercial piece of software that says we have big data, and it says this person is high risk...now imagine I tell you I asked 10 people online the same question, and this is what they said. You’d weigh those things differently.” [Wired article] Dartmouth researchers claim that a popular risk-assessment algorithm predicts recidivism about as well as a random online poll. Science Friday also covered similar issues with crime sentencing algorithms.

Wednesday, 09 August 2017

How evidence can guide, not replace, human decisions.

Bad Choices book cover

1. Underwriters + algorithms = Best of both worlds.
We hear so much about machine automation replacing humans. But several promising applications are designed to supplement complex human knowledge and guide decisions, not replace them: Think primary care physicians, policy makers, or underwriters. Leslie Scism writes in the Wall Street Journal that AIG “pairs its models with its underwriters. The approach reflects the company’s belief that human judgment is still needed in sizing up most of the midsize to large businesses that it insures.” See Insurance: Where Humans Still Rule Over Machines [paywall] or the podcast Insurance Rates Set by ... Machine Intelligence?

Who wants to be called a flat liner? Does this setup compel people to make changes to algorithmic findings - necessary or not - so their value/contributions are visible? Scism says “AIG even has a nickname for underwriters who keep the same price as the model every time: ‘flat liners.’” This observation is consistent with research we covered last week, showing that people are more comfortable with algorithms they can tweak to reflect their own methods.

AIG “analysts and executives say algorithms work well for standardized policies, such as for homes, cars and small businesses. Data scientists can feed millions of claims into computers to find patterns, and the risks are similar enough that a premium rate spit out by the model can be trusted.” On the human side, analytics teams work with AIG decision makers to foster more methodical, evidence-based decision making, as described in the excellent Harvard Business Review piece How AIG Moved Toward Evidence-Based Decision Making.


2. Another gem from Ali Almossawi.
An Illustrated Book of Bad Arguments was a grass-roots project that blossomed into a stellar book about logical fallacy and barriers to successful, evidence-based decisions. Now Ali Almossawi brings us Bad Choices: How Algorithms Can Help You Think Smarter and Live Happier.

It’s a superb example of explaining complex concepts in simple language. For instance, Chapter 7 on ‘Update that Status’ discusses how crafting a succinct Tweet draws on ideas from data compression. Granted, not everyone wants to understand algorithms - but Bad Choices illustrates useful ways to think methodically, and sort through evidence to solve problems more creatively. From the publisher: “With Bad Choices, Ali Almossawi presents twelve scenes from everyday life that help demonstrate and demystify the fundamental algorithms that drive computer science, bringing these seemingly elusive concepts into the understandable realms of the everyday.”


3. Value guidelines adjusted for novel treatment of rare disease.
Like it or not, oftentimes the assigned “value” of a health treatment depends on how much it costs, compared to how much benefit it provides. Healthcare, time, and money are scarce resources, and payers must balance effectiveness, ethics, and equity.

Guidelines for assessing value are useful when comparing alternative treatments for common diseases. But they fail when considering an emerging treatment or a small patient population suffering from a rare condition. ICER, the Institute for Clinical and Economic Review, has developed a value assessment framework that’s being widely adopted. However, acknowledging the need for more flexibility, ICER has proposed a Value Assessment Framework for Treatments That Represent a Potential Major Advance for Serious Ultra-Rare Conditions.

In a request for comments, ICER recognizes the challenges of generating evidence for rare treatments, including the difficulty of conducting randomized controlled trials, and the need to validate surrogate outcome measures. “They intend to calculate a value-based price benchmark for these treatments using the standard range from $100,000 to $150,000 per QALY [quality adjusted life year], but will [acknowledge] that decision-makers... often give special weighting to other benefits and to contextual considerations that lead to coverage and funding decisions at higher prices, and thus higher cost-effectiveness ratios, than applied to decisions about other treatments.”
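
To make the arithmetic behind a per-QALY benchmark concrete, here is a minimal sketch. It assumes the simplest possible model (total incremental cost = treatment price + other incremental costs), and the function names and dollar figures are hypothetical illustrations, not ICER's actual methodology, which involves far more detailed modeling.

```python
def cost_per_qaly(incremental_cost, incremental_qalys):
    """Cost-effectiveness ratio: extra dollars spent per extra QALY gained."""
    if incremental_qalys <= 0:
        raise ValueError("incremental_qalys must be positive")
    return incremental_cost / incremental_qalys


def price_benchmark_range(incremental_qalys, other_incremental_costs=0.0,
                          thresholds=(100_000, 150_000)):
    """Treatment prices at which cost per QALY lands exactly on each threshold.

    Simplifying assumption (not from the source): total incremental cost is
    the treatment price plus other_incremental_costs.
    """
    if incremental_qalys <= 0:
        raise ValueError("incremental_qalys must be positive")
    return tuple(t * incremental_qalys - other_incremental_costs
                 for t in thresholds)


# Hypothetical treatment: 2.0 QALYs gained, $50,000 in other incremental costs.
low, high = price_benchmark_range(2.0, other_incremental_costs=50_000)

# Hypothetical list price of $400,000 for the same treatment.
ratio_at_list_price = cost_per_qaly(400_000 + 50_000, 2.0)
```

Under these made-up numbers the benchmark price range runs from $150,000 to $250,000, while the $400,000 list price works out to $225,000 per QALY. That is above the standard range, which is exactly the situation where the “special weighting” for other benefits and contextual considerations would come into play.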

Monday, 31 July 2017

Resistance to algorithms, evidence for home visits, and problems with wearables.

Kitty with laptop

I'm back, after time away from the keyboard. Yikes! Evidence is facing an uphill battle. Decision makers still resist handing control to others, even when new methods or machines make better predictions. And government agencies continue to, ahem, struggle with making evidence-based policy.  — Tracy Altman


1. Evidence-based home visit program loses funding.
The evidence base has developed over 30+ years: Advocates for home visit programs - where professionals visit at-risk families - cite immediate and long-term benefits for parents and for children. These include positive health-related behavior, fewer arrests, community ties, and lower substance abuse [Long-term Effects of Nurse Home Visitation on Children's Criminal and Antisocial Behavior: 15-Year Follow-up of a Randomized Controlled Trial (JAMA, 1998)]. They also cite Nobel Laureate-led findings that "Every dollar spent on high-quality, birth-to-five programs for disadvantaged children delivers a 13% per annum return on investment" [Research Summary: The Lifecycle Benefits of an Influential Early Childhood Program (2016)].

The Nurse-Family Partnership (@NFP_nursefamily), a well-known provider of home visit programs, is getting the word out in the New York Times and on NPR.

Yet this bipartisan, evidence-based policy is now defunded. @Jyebreck explains that advocates are “staring down a Sept. 30 deadline.... The Maternal, Infant and Early Childhood Home Visiting program, or MIECHV, supports paying for trained counselors or medical professionals” who establish long-term relationships with families.

It’s worth noting that the evidence on childhood programs is often conflated. AEI’s Katharine Stevens and Elizabeth English break it down in their excellent, deep-dive report Does Pre-K Work? They illustrate the dangers of drawing sweeping conclusions about research findings, especially when mixing studies about infants with studies of three- or four-year-olds. And home visit advocates emphasize that disadvantage begins in utero and infancy, making a standard pre-K program inherently inadequate. This issue is complex, and Congress’ defunding decision will only hurt efforts to gather evidence about how best to level the playing field for children.

AEI Does Pre-K Work

2. Why do people reject algorithms?
Researchers want to understand our ‘irrational’ responses to algorithmic findings. Why do we resist change, despite evidence that a machine can reliably beat human judgment? Berkeley J. Dietvorst (great name, wasn’t he in Hunger Games?) comments in the MIT Sloan Management Review that “What I find so interesting is that it’s not limited to comparing human and algorithmic judgment; it’s my current method versus a new method, irrelevant of whether that new method is human or technology.”

Job-security concerns might help explain this reluctance. And Dietvorst has studied another cause: We lose trust in an algorithm when we see its imperfections. This hesitation extends to cases where an ‘imperfect’ algorithm remains demonstrably capable of outpredicting us. On the bright side, he found that “people were substantially more willing to use algorithms when they could tweak them, even if just a tiny amount”. Dietvorst is inspired by the work of Robyn Dawes, a pioneering behavioral decision scientist who investigated the Man vs. Machine dilemma. Dawes famously developed a simple model for predicting how students will rank against one another, which significantly outperformed admissions officers. Yet both then and now, humans don’t like to let go of the wheel.

Wearables Graveyard by Aaron Parecki

3. Massive data still does not equal evidence.
For those who doubted the viability of consumer health wearables and the notion of the quantified self, there’s plenty of validation: Jawbone liquidated, Intel dropped out, and Fitbit struggles. People need a compelling reason to wear one (such as a fitness coach, or condition diagnosis and treatment).

Rather than a data stream, we need hard evidence about something actionable: Evidence is “the available body of facts or information indicating whether a belief or proposition is true or valid” (Google: define evidence). To be sure, some consumers enjoy wearing a device that tracks sleep patterns or spots out-of-normal-range values - but that market is proving to be limited.

But Rock Health points to positive developments, too. Some wearables demonstrate specific value: Clinical use cases are emerging, including assistance for the blind.

Photo credit: Kitty on Laptop by Ryan Forsythe, CC BY-SA 2.0 via Wikimedia Commons.
Photo credit: Wearables Graveyard by Aaron Parecki on Flickr.

Tuesday, 03 January 2017

Valuing patient perspective, moneyball for tenure, visualizing education impacts.

1. Formalized decision process → Conflict about criteria

It's usually a good idea to establish a methodology for making repeatable, complex decisions. But inevitably you'll have to allow wiggle room for the unquantifiable or the unexpected; leaving this gray area exposes you to criticism that it's not a rigorous methodology after all. Other sources of criticism are the weighting and the calculations applied in your decision formulas - and the extent of transparency provided.

How do you set priorities? In healthcare, how do you decide who to treat, at what cost? To formalize the process of choosing among options, several groups have created so-called value frameworks for assessing medical treatments - though not without criticism. Recently Ugly Research co-authored a post summarizing industry reaction to the ICER value framework developed by the Institute for Clinical and Economic Review. Incorporation of patient preferences (or lack thereof) is a hot topic of discussion.

To address this proactively, FasterCures has led creation of the Patient Perspective Value Framework to inform other frameworks about what's important to patients (cost? impact on daily life? outcomes?). They're asking for comments on their draft report; comment using this questionnaire.

2. Analytics → Better tenure decisions
New analysis in the MIT Sloan Management Review observes "Using analytics to improve hiring decisions has transformed industries from baseball to investment banking. So why are tenure decisions for professors still made the old-fashioned way?"

Ironically, academia often proves to be one of the last fields to adopt change. Erik Brynjolfsson and John Silberholz explain that "Tenure decisions for the scholars of computer science, economics, and statistics — the very pioneers of quantitative metrics and predictive analytics — are often insulated from these tools." The authors say "data-driven models can significantly improve decisions for academic and financial committees. In fact, the scholars recommended for tenure by our model had better future research records, on average, than those who were actually granted tenure by the tenure committees at top institutions."

3. Visuals of research findings → Useful evidence
The UK Sutton Trust-EEF Teaching and Learning Toolkit is an accessible summary of educational research. The purpose is to help teachers and schools more easily decide how to apply resources to improve outcomes for disadvantaged students. Research findings on selected topics are nicely visualized in terms of implementation cost, strength of supporting evidence, and the average impact on student attainment.

4. Absence of patterns → File-drawer problem
We're only human. We want to see patterns, and are often guilty of 'seeing' patterns that really aren't there. So it's no surprise we're uninterested in research that lacks significance, and disregard findings revealing no discernible pattern. Stashing away projects like this is called the file-drawer problem. It's a problem because this lack of evidence could be valuable to others who might otherwise have pursued a similar line of investigation. But Data Colada says the file-drawer problem is unfixable, and that’s OK.

5. Optimal stopping algorithm → Practical advice?
Discussing Algorithms to Live By (the book by Brian Christian and Tom Griffiths), Stewart Brand describes an innovative way to help us make complex decisions. "Deciding when to stop your quest for the ideal apartment, or ideal spouse, depends entirely on how long you expect to be looking.... [Y]ou keep looking and keep finding new bests, though ever less frequently, and you start to wonder if maybe you refused the very best you’ll ever find. And the search is wearing you down. When should you take the leap and look no further?"

Optimal Stopping is a mathematical concept for optimizing a choice, such as making the right hire or landing the right job. Brand says "The answer from computer science is precise: 37% of the way through your search period." The question is, how can people translate this concept into practical steps guiding real decisions? And how can we apply it while we live with the consequences?
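
One way to get a feel for the 37% answer is to simulate it. The sketch below is an illustration under textbook assumptions that real searches rarely meet: options arrive in random order, each can be ranked against those already seen, and a rejected option is gone for good. The function names and parameters are hypothetical. The strategy is "look, then leap": pass on roughly the first 37% (1/e) of options while noting the best, then take the first option that beats it.

```python
import math
import random


def pick_with_cutoff(scores, cutoff_fraction=1 / math.e):
    """Look-then-leap: observe the first cutoff_fraction of options without
    committing, then take the first later option that beats all of them.
    If nothing better shows up, we are stuck with the last option.
    Assumes scores is non-empty."""
    cutoff = int(len(scores) * cutoff_fraction)
    best_seen = max(scores[:cutoff], default=float("-inf"))
    for score in scores[cutoff:]:
        if score > best_seen:
            return score
    return scores[-1]


def success_rate(n_options=100, trials=5_000,
                 cutoff_fraction=1 / math.e, seed=0):
    """Fraction of simulated searches that end with the single best option."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        scores = list(range(n_options))  # n_options - 1 is the best
        rng.shuffle(scores)
        if pick_with_cutoff(scores, cutoff_fraction) == n_options - 1:
            wins += 1
    return wins / trials


rate_37 = success_rate()                      # leap after about 37%
rate_10 = success_rate(cutoff_fraction=0.10)  # leap too early
rate_90 = success_rate(cutoff_fraction=0.90)  # leap too late
```

With the 37% cutoff, the simulated searcher lands the very best option a bit more than a third of the time, and leaping much earlier or much later does worse. The sketch also shows how strong the assumptions are: you need to know how many options to expect, and you can never go back, which is part of why translating the rule into practical steps is hard.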

Tuesday, 20 December 2016

Choices, policy, and evidence-based investment.

1. Bad Arguments → Bad Choices
Great news. There will be a follow-on to the excellent Bad Arguments book by @alialmossawi. The book of Bad Choices will be released this April by major publishers. You can preorder now.

2. Evidence-based decisions → Effective policy outcomes
The conservative think tank Heritage Foundation is advocating for evidence-based decisions in the Trump administration. Their recommendations include resurrection of PART (the Program Assessment Rating Tool) from the George W. Bush era, which ranked federal programs according to effectiveness. "Blueprint for a New Administration offers specific steps that the new President and the top officers of all 15 cabinet-level departments and six key executive agencies can take to implement the long-term policy visions reflected in Blueprint for Reform." Read a nice summary here by Patrick Lester at the Social Innovation Research Center (@SIRC_tweets).

3. Pioneer drugs → Investment value
"Why do pharma firms sometimes prioritize 'me-too' R&D projects over high-risk, high-reward 'pioneer' programs?" asks Frank David at Pharmagellan (@Frank_S_David). "[M]any pharma financial models assume first-in-class drugs will gain commercial traction more slowly than 'followers.' The problem is that when a drug’s projected revenues are delayed in a financial forecast, this lowers its net present value – which can torpedo the already tenuous investment case for a risky, innovative R&D program." Their research suggests that pioneer drugs see peak sales around 6 years, similar to followers: "Our finding that pioneer drugs are adopted no more slowly than me-too ones could help level the economic playing field and make riskier, but often higher-impact, R&D programs more attractive to executives and investors."

Details appear in the Nature Reviews article, Drug launch curves in the modern era. Pharmagellan will soon release a book on biotech financial modeling.
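
The net-present-value point is easy to see with a few lines of arithmetic. This is a minimal sketch with made-up numbers: the revenue ramp, the two-year delay, and the 10% discount rate are hypothetical assumptions for illustration, not figures from the Pharmagellan analysis.

```python
def npv(cash_flows, discount_rate):
    """Net present value of yearly cash flows; the first flow arrives
    one year from now."""
    return sum(cf / (1 + discount_rate) ** year
               for year, cf in enumerate(cash_flows, start=1))


# Hypothetical revenue ramp in arbitrary units (not real drug data).
ramp = [10, 30, 60, 90, 100, 100]

on_time = ramp + [0, 0]   # follower-style forecast: revenue starts in year 1
delayed = [0, 0] + ramp   # same revenue, assumed to arrive two years later

rate = 0.10               # hypothetical discount rate
npv_on_time = npv(on_time, rate)
npv_delayed = npv(delayed, rate)
```

Both forecasts contain exactly the same total revenue, yet the delayed one is worth less today because every dollar is discounted for two extra years. If the slow-uptake assumption is wrong, as the launch-curve findings suggest, a model like this understates the pioneer program's value.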

4. Unrealistic expectations → Questioning 'evidence-based medicine'
As we've noted before, @EvidenceLive has a manifesto addressing how to make healthcare decisions, and how to communicate evidence. The online comments are telling: Evidence-based medicine is perhaps more of a concept than a practical thing. The spot-on @trishgreenhalgh says "The world is messy. There is no view from nowhere, no perspective that is free from bias."

Evidence & Insights Calendar.

Jan 23-25, London: Advanced Pharma Analytics 2017. Spans topics from machine learning to drug discovery, real-world evidence, and commercial decision making.

Feb 1-2, San Francisco: Advanced Analytics for Clinical Data 2017. All about accelerating clinical R&D with data-driven decision making for drug development.