Evidence Soup
How to find, use, and explain evidence.


Tuesday, 29 September 2015

Data blindness, measuring policy impact, and informing healthcare with baseball analytics.

This week's 5 links on evidence-based decision making.

Hear me talk October 14 on communicating messages clearly with data. Part of the HealthEconomics.com "Effective HEOR Writing" webinar series: Register here.

1. Creative statistics → Valuable insights → Reinvented baseball business Exciting baseball geek news: Bill James and Billy Beane appeared together for the first time. Interviewed by the Wall Street Journal at a NetSuite conference on business-model disruption, Beane said new opportunities include predicting and avoiding player injuries - so there's an interesting overlap with healthcare analytics. (Good example from Baseball Prospectus: "no one really has any idea whether letting [a pitcher] pitch so much after coming back from Tommy John surgery has any effect on his health going forward.")

2. Crowdsourcing → Machine learning → Micro, macro policy evidence Premise uses a clever combination of machine learning and street-level human intelligence; their economic data helps organizations measure the impact of policy decisions at a micro and macro level. @premisedata recently closed a $50M US funding round.

3. Data blindness → Unfocused analytics → Poor decisions Data blindness prevents us from seeing what the numbers are trying to tell us. In a ReadWrite guest post, OnCorps (@OnCorpsHQ) CEO Bob Suh recommends focusing on the decisions that need to be made, rather than on big data and analytics technology. OnCorps offers an intriguing app called Sales Sabermetrics.

4. Purpose and focus → Overcome analytics barriers → Create business value David Meer of PwC's Strategy& (@strategyand) talks about why companies continue to struggle with big data [video].

5. Health analytics → Evidence in the cloud → Collaboration & learning Evidera announces Evalytica, a SaaS platform promising fast, transparent analysis of healthcare data. This cloud-based engine from @evideraglobal supports analyses of real-world evidence sources, including claims, EMR, and registry data.

Tuesday, 08 September 2015

'What Works' toolkit, the insight-driven organization, and peer-review identity fraud.

This week's 5 links on evidence-based decision making.

1. Abundant evidence → Clever synthesis → Informed crime-prevention decisions The What Works Crime Toolkit beautifully synthesizes - on a single screen - the evidence on crime-prevention techniques. This project by the UK's @CollegeofPolice provides quick answers to what works (the car breathalyzer) and what doesn't (the infamous "Scared Straight" programs). Includes easy-to-use filters for evidence quality and type of crime. Just outstanding.

2. Insights → Strategic reuse → Data-driven decision making Tom Davenport explains why simply generating a bunch of insights is insufficient: "Perhaps the overarching challenge is that very few organizations think about insights as a process; they have been idiosyncratic and personal." A truly insight-driven organization must carefully frame, create, market, consume, and store insights for reuse. Via @DeloitteBA.

3. Sloppy science → Weak replication → Psychology myths Of 100 studies published in top-ranking journals in 2008, 75% of social psychology experiments and half of cognitive studies failed the replication test. @iansample delivers grim news in The Guardian: The psych research/publication process is seriously flawed. Thanks to @Rob_Briner.

4. Flawed policy → Ozone overreach → Burden on business Tony Cox writes in the Wall Street Journal that the U.S. EPA lacks causal evidence to support restrictions on ground-level ozone. The agency is connecting this pollutant to higher incidence of asthma, but Cox says new rules won't improve health outcomes, and will create substantial economic burden on business.

5. Opaque process → Peer-review fraud → Bad evidence More grim news for science publishing. Springer has retracted 64 papers from 10 journals after discovering the peer reviews were linked to fake email addresses. The Washington Post story explains that only nine months ago, BioMed Central - a Springer imprint - retracted 43 studies. @RetractionWatch says this wasn't even a thing before 2012.

Tuesday, 28 July 2015

10 Years After Ioannidis, speedy decision habits, and the peril of whether or not.

1. Much has happened in the 10 years since Why Most Published Research Findings Are False, the much-discussed PLOS essay by John P. A. Ioannidis offering evidence that "false findings may be the majority or even the vast majority of published research claims...." Why are so many findings never replicated? Ioannidis pointed to study power and bias, the number of studies, and the ratio of true relationships to no relationships among those probed in a scientific field. Also, "the convenient, yet ill-founded strategy of claiming conclusive research findings solely on... formal statistical significance, typically for a p-value less than 0.05."
Now numerous initiatives address the false-findings problem with innovative publishing models, prohibition of p-values, or study design standards. Ioannidis followed up with 2014's How to Make More Published Research True, noting improvements in credibility and efficiency in specific fields via "large-scale collaborative research; replication culture; registration; sharing; reproducibility practices; better statistical methods;... reporting and dissemination of research, and training of the scientific workforce." (A back-of-the-envelope sketch of the false-findings arithmetic appears at the end of this list.)

2. Speedy decision habits → Fastest in market → Winning Dave Girouard, CEO of personal-finance startup Upstart and former head of Google Apps, believes speedy decision-making is essential to competing - in product development and in other organizational functions. He explains how people can develop speed as a healthy habit. Relatively little is "written about how to develop the institutional and employee muscle necessary to make speed a serious competitive advantage." Key tip: Deciding *when* a decision will be made from the start is a profound, powerful change that speeds everything up.

3. Busy, a new book by Tony Crabbe (@tonycrabbe), considers why people feel overwhelmed and dissatisfied - and suggests steps for improving their personal and work lives. The book translates psychological and business research into practical tools and skills. It covers a range of perspectives; one worth noting is "The Perils of Whether or Not" (page 31): Crabbe cites classic decision research demonstrating the benefits of choosing from multiple options, vs. continuously (and busily) grinding through one alternative at a time. BUSY: How to Thrive in a World of Too Much, Grand Central Publishing, $28.

4. Better lucky than smart? Eric McNulty reminds us of a costly, and all-too-common, decision-making flaw: outcome bias, judging the quality of a decision by its final result. His strategy+business article explains we should be objectively assessing whether an outcome was achieved by chance or through a sound process - but it's easy to fall into the trap of positively judging only those efforts with happy endings (@stratandbiz).

5. Fish vs. Frog: It's about values, not just data. Great reminder from Denis Cuff @DenisCuff of @insidebayarea that the data won't always tell you where to place value. One SF Bay Area environmental effort to save a fish might be endangering a frog species.
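Coming back to item 1: to make the false-findings arithmetic concrete, here is a rough back-of-the-envelope sketch (mine, not code from the essay) of the positive predictive value formula Ioannidis uses - the post-study probability that a claimed finding is true, given study power, the significance threshold, and the pre-study odds R that a probed relationship is real. The power and R values below are hypothetical, and the sketch ignores bias and multiple competing teams, which the essay also models.

```python
def positive_predictive_value(power, alpha, R):
    """Post-study probability that a claimed finding is true.

    PPV = (1 - beta) * R / (R - beta * R + alpha), with beta = 1 - power,
    following the basic (bias-free) setup in Ioannidis (2005).
    """
    beta = 1 - power
    return (power * R) / (R - beta * R + alpha)

# Well-powered field, pre-study odds of 1:10 that a relationship is real:
# most claimed findings are true, but a third are still false positives.
print(positive_predictive_value(power=0.8, alpha=0.05, R=0.10))  # ~0.62

# Underpowered, exploratory field, pre-study odds of 1:20: most
# "statistically significant" findings are expected to be false.
print(positive_predictive_value(power=0.2, alpha=0.05, R=0.05))  # ~0.17
```

Even before adding bias or multiple teams chasing the same question, low power and long-shot hypotheses are enough to push most claimed findings into false territory.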

Monday, 20 July 2015

The Cardinal Sin of data science, Evidence for Action $, and your biases in 5 easy steps.

My 5 weekly links on evidence-based decision making.

1. Confusing correlation with causation is not the Cardinal Sin of data science, say Gregory Piatetsky (@kdnuggets) and Anmol Rajpurohit (@hey_anmol): It's overfitting. Oftentimes, researchers "test numerous hypotheses without proper statistical control, until they happen to find something interesting and report it. Not surprisingly, next time the effect, which was (at least partly) due to chance, will be much smaller or absent." This explains why it's often difficult to replicate prior findings. "Overfitting is not the same as another major data science mistake - confusing correlation and causation. The difference is that overfitting finds something where there is nothing. In case of correlation and causation, researchers can find a genuine novel correlation and only discover a cause much later." (A small simulation of this trap appears at the end of this list.)

2. July 22, RWJF (@RWJF) will host a webinar explaining its Evidence for Action program, granting $2.2M USD annually for Investigator-Initiated Research to Build a Culture of Health. "The program aims to provide individuals, organizations, communities, policymakers, and researchers with the empirical evidence needed to address the key determinants of health encompassed in the Culture of Health Action Framework. In addition, Evidence for Action will also support efforts to assess outcomes and set priorities for action. It will do this by encouraging and supporting creative, rigorous research on the impact of innovative programs, policies and partnerships on health and well-being, and on novel approaches to measuring health determinants and outcomes."

3. Your biases, in 5 tidy categories. We've heard it before, but this bears repeating: Our biases (confirmation, sunk cost, etc.) prevent us from making more equitable, efficient, and successful decisions. In strategy+business, Heidi Grant Halvorson and David Rock (@stratandbiz) present the SEEDS™ model, grouping the "150 or so known common biases into five categories, based on their underlying cognitive nature: similarity, expedience, experience, distance, and safety". Unfortunately, most established remedies and training don't overcome bias. But organizations/groups can apply correctional strategies more reliably than we can as individuals.

4. PricewaterhouseCoopers (@PwC_LLP) explains how four key stakeholders are pressuring pharma in 21st Century Pharmaceutical Collaboration: The Value Convergence. These four - government agencies, emboldened insurers, patient advocates, and new entrants bringing new evidence - are substantially shifting how medicine is developed and delivered. "Consumers are ready to abandon traditional modes of care for new ones, suggesting billions in healthcare revenue are up for grabs now. New entrants are bringing biosensor technology and digital tools to healthcare to help biopharmaceutical companies better understand the lives of patients, and how they change in response to drug intervention." These range from home diagnostic kits to algorithms that check symptoms and recommend treatments.

5. Remember 'Emotional Intelligence'? A 20-year retrospective study, funded by the Robert Wood Johnson Foundation (@RWJF) and appearing in July's American Journal of Public Health, suggests that "kindergarten students who are more inclined to exhibit “social competence” traits - such as sharing, cooperating, or helping other kids - may be more likely to attain higher education and well-paying jobs. In contrast, students who exhibit weaker social competency skills may be more likely to drop out of high school, abuse drugs and alcohol, and need government assistance."
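Back to the overfitting trap in item 1 - it's easy to reproduce. Below is a minimal simulation (mine, not from the KDnuggets piece, using made-up noise data): test a couple hundred candidate predictors against an outcome that is pure noise, report the strongest in-sample correlation, then check that same predictor on fresh data and watch the "effect" disappear.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs, n_hypotheses = 50, 200

# Discovery sample: pure noise, so no predictor is genuinely related to y.
X = rng.normal(size=(n_obs, n_hypotheses))
y = rng.normal(size=n_obs)

# "Test numerous hypotheses... until something interesting turns up":
# keep whichever predictor happens to correlate most strongly with y.
corrs = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(n_hypotheses)])
best = int(np.argmax(np.abs(corrs)))
print(f"best of {n_hypotheses} in-sample correlations: {corrs[best]:.2f}")

# Replication sample: same predictor index, new data - the effect shrinks
# toward zero, because it was chance all along.
X_new = rng.normal(size=(n_obs, n_hypotheses))
y_new = rng.normal(size=n_obs)
print(f"same predictor, fresh sample: {np.corrcoef(X_new[:, best], y_new)[0, 1]:.2f}")
```

Nothing here involves causation at all - the simulation just finds "something where there is nothing," which is exactly how Piatetsky and Rajpurohit distinguish overfitting from the correlation-versus-causation mistake.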

Tuesday, 07 July 2015

Randomistas fight poverty, nurses fight child abuse, and decision support systems struggle.

1. Jason Zweig tells the story of randomistas, who use randomized, controlled trials to pinpoint what helps people become self-sufficient around the globe. The Anti-Poverty Experiment describes several successful, data-driven programs, ranging from financial counseling to grants of livestock.

2. Can an early childhood program prevent child abuse and neglect? Yes, says the Nurse-Family Partnership, which introduces vulnerable first-time parents to maternal and child-health nurses. NFP (@NFP_nursefamily) refines its methodology with randomized, controlled trial evidence satisfying the Coalition for Evidence-Based Policy’s “Top Tier”, and producing a positive return on investment.

3. Do recommendations from decision support technology improve the appropriateness of a physician's imaging orders? Not necessarily. JAMA provides evidence of the limitations of algorithmic medicine. An observational study shows it's difficult to attribute improvements to clinical decision support.

4. Is the "data-driven decision" a fallacy? Yes, says Stefan Conrady, arguing that the good alliteration is a bad motto. He explains on the BayesiaLab blog that the concept doesn't adequately encompass causal models, necessary for anticipating "the consequences of actions we have not yet taken". Good point.

5. A BMJ analysis says the knowledge system underpinning healthcare is not fit for purpose and must change. Ian Roberts says poor-quality, published studies are damaging systematic reviews, and that the Cochrane system needs improvement. Richard Lehman and others will soon respond on BMJ.

Thursday, 17 October 2013

Got findings? Show us the value. And be specific about next steps, please.

Lately I've become annoyed with research, business reports, etc. that report findings without showing why they might matter, or what should be done next. Things like this: "The participants' biological fathers' chest hair had no significant effect on their preference for men with chest hair." [From Archives of Sexual Behavior, via Annals of Improbable Research.]

Does it pass the "so what" test? Not many of us write about chest hair. But we all need to keep our eyes on the prize when drawing conclusions about evidence. It's refreshing to see specific actions, supported by rationale, being recommended alongside research findings. As Exhibit A, I offer the PLOS Medicine article Use of Expert Panels to Define the Reference Standard in Diagnostic Research: A Systematic Review of Published Methods and Reporting (Bertens et al). Besides explaining how panel diagnosis has (or hasn't) worked well in the past, the authors recommend specific steps to take - and provide a checklist and flowchart. I'm not suggesting everyone could or should produce a checklist, flowchart, or cost-benefit analysis in every report, but more concrete Next Steps would be powerful.

PLOS Medicine: Panel Diagnosis research by Bertens et al

So many associations, so little time. We're living in a world where people need to move quickly. We need to be specific when we identify our "areas for future research". What problem can this help solve? Where is the potential value that could be confirmed by additional investigation? And why should we believe that?

Otherwise it's like simply saying "fund us some more, and we'll tell you more". We need to know exactly what should be done next, and why. I know basic research isn't supposed to work that way, but since basic research seems to be on life support, something needs to change. It's great to circulate an insight for discovery by others. But without offering a suggestion of how it can make the world a better place, it's exhausting for the rest of us.

Wednesday, 30 January 2013

A must-read: "Does the Language Fit the Evidence? Association Versus Causation."

Science is easy; explaining it is hard. Back in 2009, Evidence Soup recommended the excellent Health News Review, whose mission is to "hold health and medical journalism accountable" (more about them at the end of this post). They've published Tips for Understanding Studies (available for purchase here). One of their free online writeups is a must-read for anyone working with evidence. Yes, it's basic -- but depending on your level of experience, it will be a valuable refresher, an intro to research methods, or a guide to science writing.

Does The Language Fit The Evidence? – Association Versus Causation was put together by Mark Zweig, MD, and Emily DeVoto, PhD, "two people who have thought a lot about how reporters cover medical research". (I've been acquainted with Emily DeVoto for several years; I like that her catchphrase is "The plural of anecdote is not data.").

Passive language vs. active voice. The authors describe how an 'association' can be inadvertently misconstrued as a cause/effect relationship:

"A subtle trap occurs in the transition from the cautious, nondirectional, noncausal, passive language that scientists use in reporting the results of observational studies to the active language favored in mass media.... For example, a description of an association (e.g., associated with reduced risk) can become, via a change to the active voice (reduces risk), an unwarranted description of cause and effect. There is a world of difference in meaning between saying 'A was associated with increased B' and saying 'A increased B.'" [emphasis is mine]

These are subtle things with tremendous importance. When I was in school, I often heard "Correlation doesn't mean causation." Evidently, the difference is still a big hurdle for both experts and non-experts. Zweig and DeVoto illustrate how things can go awry in this helpful example:

Study design: Prospective cohort study of dietary fat and age-related maculopathy (observational).
Researchers’ version of results: A 40% reduction of incident early age-related maculopathy was associated with fish consumption at least once a week.
Journalist’s version of results: Eating fish may help preserve eyesight in older people.
Problem: "Preserve" and "help" are both active and causal; "may help" sounds like a caveat designed to convey uncertainty, but causality is still implied.
Suggested language: “People who ate fish at least once a week were observed to have fewer cases of a certain type of eye problem. However, a true experimental randomized trial would be required in order to attribute this to their fish consumption, rather than to some other factor in their lives. This was an observational study – not a trial.”


Abstracts stick to the facts. The authors note that the "language in a scientific publication is carefully chosen for the conclusion in the abstract or in the text, but not used so strictly in the discussion section. Thus, borrowing language from scientific papers warrants caution."

You can follow @HealthNewsRevu on Twitter. The project is led by Gary Schwitzer and funded by the Foundation for Informed Medical Decision Making.

Wednesday, 07 November 2012

What counts as good evidence? Alliance for Useful Evidence offers food for thought.

"What counts as good evidence?" is a great conversation starter. The UK-based Alliance for Useful Evidence / Nesta are hosting a seminar Friday morning to "explore what is realistic in terms of standards of evidence for social policy, programmes and practice." Details: What is Good Evidence? Standards, Kitemarks, and Forms of Evidence, 9 November 2012, 9:30-11:30 (GMT), London. The event is chaired by Geoff Mulgan, CEO of Nesta ; speakers include Dr. Gillian Leng, Deputy Chief Executive for Health and Social Care, NICE; and Dr. Louise Morpeth, Co-Director Dartington Social Research Unit.

Prompting that discussion is a 'provocation paper', What Counts as Good Evidence?, by Sandra Nutley, Alison Powell, and Huw Davies. They're with the University of St Andrews Research Unit for Research Utilisation (RURU). Let me know if you want me to send you a copy (I'm tracy AT evidencesoup DOT com).

The evidence journey. This paper doesn't break lots of new ground, but it's a useful recap of the state of evidence-seeking from a policy / program standpoint. While the authors do touch on bottom-up evidence schemes, the focus here isn't on crowdsourced evidence (such as recent health tech efforts). I love how they describe the effort to establish a basis for public policy as the evidence journey. Some highlights:

Hierarchies are too simple. We know the simple Level I/II labeling schemes, identifying how evidence is collected, are useful but insufficient. The authors explain that "study design has long been used as a key marker for evidence quality, but such ‘hierarchies of evidence’ raise many issues and have remained contested. Extending the hierarchies so that they also consider the quality of study conduct or the use of underpinning theory have... exposed new fault-lines of debate.... [S]everal agencies and authors have developed more complex matrix approaches for identifying evidence quality in ways that are more closely linked to the wider range of policy or practice questions being addressed."

Research evidence is good stuff. But the authors remind us "there are other ways of knowing things. One schema (Brechin & Siddell, 2000) highlights three different ways of knowing": empirical, theoretical, and experiential.

Do standards help? The authors provide a very nice list of evidence standards & rating schemes (GRADE, Top Tier Evidence, etc.) - that is reason enough to get your hands on a copy of the paper. And they note the scarcity of evidence on effectiveness of these rating schemes.

Incidentally, Davies and Nutley contributed to the 2000 book What Works? Evidence-Based Policy and Practice that I've always admired (Davies, Nutley and Smith, Policy Press).

Thursday, 01 December 2011

U.S. agencies aren't supposed to allow political interference with scientific evidence. Good luck with that.

Evidence Soup is back in business. These past 3 months, I've been distracted by a number of things, including a move from Denver, Colorado to the San Francisco Bay Area. So, where were we?

Will "4.37 Degrees of Separation" play at the multiplex? Awhile back I wrote about recent research to test the Six Degrees of Separation theory. New evidence suggests that people are separated by an average of 4.74 degrees (only 4.37 in the U.S.). Doesn't really roll off the tongue, and I wouldn't expect Will Smith to star in a sequel. But this latest research applies only to people on Facebook; a New York Times piece reminds us "the cohort was a self-selected group, in this case people with online access who use a particular Web site".

Desperately seeking scientific integrity. Early on, President Obama launched an effort to ensure that the public can "trust the science and scientific process informing public policy decisions". His March 9, 2009 memorandum made agencies responsible for "the highest level of integrity in all aspects of the executive branch's involvement". Great idea to de-politicize science, though it's much easier said than done -- as evidenced by the nearly-two-year delay in providing "guidelines" for agencies issuing this policy. Really. Those guidelines were published December 17, 2010 [pdf here].

Oh, and meanwhile the Obama Administration was widely criticized for using sloppy science to support its moratorium on offshore drilling after the BP disaster on the Gulf Coast. It's never simple: In real life it requires weighing risks, factoring in economic impacts, and making political tradeoffs (consider this analysis of the economic impact of that ban - [pdf here]). A scientific integrity rule won't give you all the answers you need when making such complex decisions: It's surely not as simple as determining whether scientific findings were manipulated for political purposes.

As explained in a recent National Public Radio story, now we're seeing the first challenge to federal evidence-gathering under this new regime: It's directed at the Bureau of Land Management (a branch of the Department of the Interior). More about that in a moment.

Why is this so hard? It's been rough going for agencies issuing scientific integrity policies. The basics are straightforward enough: Preventing people from twisting or quashing scientific evidence. But here are some reasons why the process is so problematic.

  1. Scope. There's no bright line identifying what "science" should be included in an assessment, and therefore subjected to integrity requirements. This complicates things enough when you're working with "objective" evidence. It's even trickier when you bring in fuzzier stuff, like the dismal science (economics), or risk assessment. How can we say for sure what should be considered? Our assumptions about what's important determine what evidence we recognize - whether consciously or subconsciously. Our choices are influenced by our values - but we may not be fully aware of our values, or we may not want to articulate them in a transparent way.
  2. Dissemination. Each agency's integrity policy is supposed to provide for open communication, and guide how evidence is presented to the public (see the 2010 guidelines mentioned earlier). It wouldn't serve anyone to have a free-for-all; consistent, controlled dissemination can improve usefulness and understanding. But not everyone agrees on the rights and responsibilities of scientists who want to discuss their findings with the public.
  3. Whistle blowing. Among the substantial hurdles is the handling of whistle blowers. (I suppose if such a policy is going to have teeth, the people who want to blow whistles need to feel they can do so without losing their heads.)
  4. Transparency. Scientific groups - such as the Union of Concerned Scientists - say they still want to see external accountability under these policies. So far, investigations of misconduct are internal.

12291 all over again? Thirty years ago, President Reagan signed Executive Order 12291, requiring cost-benefit analysis for 'major' federal regulations (those expected to impact the U.S. economy by $100 million or more). Clinton issued a similar order in 1993. In theory, this should have de-politicized some agency decision-making processes. The results (or lack thereof) were the subject of my doctoral dissertation.

As with the new mandate for scientific integrity, a requirement to weigh regulatory costs against benefits leaves lots of room for interpretation, and requires value judgments. When EPA issues a rule under the Clean Air Act, it's difficult enough to estimate how many hospital visits or early deaths are caused by a particular type of airborne particulate matter. But figuring out the social costs is harder still: Requiring businesses & governments to reduce those emissions can lead to job cuts and economic loss, which themselves cause poverty and negative health impacts.

Challenging BLM's process. Citing the new scientific integrity policy, a group called Public Employees for Environmental Responsibility (PEER) has filed a complaint against the BLM, saying "The U.S. Bureau of Land Management is carrying out an ambitious plan to map ecological trends throughout the Western U.S. but has directed scientists to exclude livestock grazing as a possible factor in changing landscapes....  [O]ne of the biggest scientific studies ever undertaken by BLM was fatally skewed from its inception by political pressure.... As a result, the assessments do not consider massive grazing impacts even though trivial disturbance factors such as rock hounding are included, [and they] limit consideration of grazing-related information only when combined in an undifferentiated lump with other native and introduced ungulates (such as deer, elk, wild horses and feral donkeys)." I didn't know we had a feral ungulate problem. But I digress.

This is a good example of how choices about collecting evidence can strongly influence the results. NPR explains that the Dept. of Interior has a scientific integrity officer who is responsible for investigating allegations of political interference. I wish him Godspeed.


Tuesday, 16 August 2011

Is 'six degrees of separation' fact or fiction? Social scientists collect evidence to find out.

We've always heard about Six Degrees of Separation. Now let's see if evidence backs it up. As explained in the Mercury News, "The world's population has almost doubled since social psychologist Stanley Milgram's famous but flawed 'Small World' experiment gave people a new way to visualize their interconnectedness with the rest of humanity. Something else has also changed - the advent of online social networks, particularly Facebook's 750 million members, and that's what researchers plan to use."

You can join in. Social scientists at Yahoo! and Facebook have launched the Small World Experiment, "designed to test the hypothesis that anyone in the world can get a message to anyone else in just 'six degrees of separation' by passing it from friend to friend. Sociologists have tried to prove (or disprove) this claim for decades, but it is still unresolved.

"Now, using Facebook we finally have the technology to put the hypothesis to a proper scientific test. By participating in this experiment, you'll not only get to see how you're connected to people you might never otherwise encounter, you will also be helping to advance the science of social networks."

Photo credit: Film-Buff Movie Reviews, where they play Six Degrees of Kevin Bacon. I chose this picture of young Kevin because this weekend I saw a trailer for the Footloose remake. Some things should probably stay un-remade. (I was there to see Crazy, Stupid, Love, which I highly recommend. Kevin's in that one, too.)