Saturday, April 13, 2019

How Hattie’s Research Helps (and Doesn’t Help) Improve Student Achievement


Hattie Discusses What to Consider, Not How to Implement It . . . More Criticisms, Critiques, and Contexts

[CLICK HERE for the full Blog message]

Dear Colleagues,

Introduction

   By the time you read this Blog, I will have just landed in Singapore where I am one of six presenters at the World EduLead 2019 Conference [CLICK HERE] sponsored by the International Association for Scholastic Excellence (INTASE).

   During my week here, I will be presenting two full-day Master Classes, two Keynotes, and a Symposium with Michael Fullan (School Leadership), Carol Ann Tomlinson (Differentiated Instruction), and three other international education greats.

   Altogether, I will be presenting the following:

  • Seven Evidence-Based Strategies to Systemic Success in Schools
  • The Seven C’s of Success: Strengthening Staff Relationships to Ensure Student Success

  • School Reform: Strategic Planning, Shared Leadership, and Student Success
  • Helping Hattie Work: Translating Meta-Analysis into Meaningful Student Learning Outcomes

   While re-researching John Hattie’s work for the last full-day presentation, I uncovered new “criticisms, critiques, and contexts” that motivated me to update at least two past Hattie Blog messages with this new one.

   In this Blog, then, we will describe the concerns in detail, and then discuss examples of how Hattie’s work can be used effectively and defensibly—from a science-to-practice perspective—for students, by staff, and in schools.

   To accomplish this, the full Blog message will (a) briefly overview the concerns; (b) present a primer on meta-analysis; (c) quote from the concerns of three notable researchers; (d) discuss how to go from “effect to effective practice;” and (e) describe the questions to ask the “outside” Hattie consultant—before you hire him or her.

[CLICK HERE for the full Blog message]
_ _ _ _ _ _ _ _ _ _

A Brief Overview of Concerns with Hattie’s Research 

   Over the past decade especially, John Hattie has become internationally known for his meta-meta-analytic research into the variables that best predict students’ academic achievement.  Indeed, some view his different Visible Learning books (which have now generated a “Hattie-explosion” of presentations, workshops, institutes, and “certified” Hattie consultants) as the books of an educational “Bible” that shows educators “the way” to succeed with students.

   As such, Hattie has assumed a “rock star” status. . . which creates an illusion that his work is “untouchable,” that it cannot be critiqued, and that it certainly can’t be wrong.

   As of this writing, Hattie’s research is based on the synthesis of over 1,500 meta-analyses comprising more than 90,000 studies involving more than 300 million students around the world.  In more statistical terms, Hattie takes others’ published meta-analyses—each investigating, for example, a specific educational approach (e.g., cooperative learning) or intervention (e.g., Reading Recovery)—and pools them together, statistically conducting a meta-meta-analysis.

   In doing this, he averages the effect sizes from many other meta-analyses that themselves have pooled research that investigated—once again—the effect of one psychoeducational variable, strategy, intervention, or approach on student achievement.
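
   For readers who like to see the arithmetic, here is a minimal sketch (in Python, with invented numbers) of what that second level of averaging amounts to.  Nothing below reproduces Hattie’s actual data or procedure; it simply shows the basic “average of averages” logic.

# A minimal sketch of meta-meta-analytic pooling, using made-up numbers.
# Each entry is the average effect size reported by one published
# meta-analysis of the same approach (say, cooperative learning).
meta_analysis_effect_sizes = [0.59, 0.41, 0.30, 0.55]

# The meta-meta step: average the already-averaged effect sizes
# into a single summary number.
pooled_d = sum(meta_analysis_effect_sizes) / len(meta_analysis_effect_sizes)

print(f"Pooled effect size: {pooled_d:.2f}")   # 0.46 with these invented numbers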
_ _ _ _ _

   While the magnitude and sheer effort of what Hattie has done are impressive. . . there are a number of major methodological problems with his statistical approaches and interpretations, as well as a number of additional major science-to-practice implementation problems.

   To foreshadow the more comprehensive discussion later in this Blog, below is an example of one of his primary methodological problems, and one of his primary implementation problems.

   Educators need to fully understand these problems in order to be able to benefit—especially on behalf of their students—from this research.
_ _ _ _ _

An Example of a Methodological Problem in Hattie’s Research

  One major methodological problem is that Hattie’s statistical analyses may be flawed.  

   More specifically, a number of notable statisticians (see the section on this below) have questioned whether the effect sizes from different independent meta-analyses can be averaged and pooled into a single meta-meta-analytical effect size—which is exactly what Hattie is doing.

   As such, they don’t believe that the statistical approach used by Hattie in his research is defensible. . . which means that some of his research results may be incorrect.

   Metaphorically, what Hattie is doing is akin to averaging the average daily temperatures for March across 100 years. . . and then saying that the 100-year average temperature for March in, say, Washington, D.C. is 48 degrees (it actually is—I looked it up).

   While you can statistically calculate this, the conclusion—regarding the 48-degree average temperature—may not be functionally accurate or, more importantly, meaningful (if you are planning a trip to DC). 

   First of all, in a typical year, Washington, D.C.’s March temperature may range from 37 degrees on one day to 59 degrees on another day—a range of 22 degrees.  So, even in looking at one year’s worth of March temperatures, you need to statistically address the temperature range during any specific month. . . and then you need to look at this variability over 100 years. 

   Given all of this, the 48-degree, 100-year average clearly does not tell the entire story.

   The problem with this “single” temperature is compounded by the fact that there may be different “micro-climates” in Washington, D.C.  Thus, the daily temperature on any one March 15th, for example, may be 45 degrees in the Northwest part of the city, but 52 degrees in the Southeast part.

   Finally, from year to year. . . over 100 years. . . there may be some seasons that are colder or warmer than others.  Not to get political, but if we were to factor in the impact of Global Warming, it may be that the most-recent 10-year March temperature is significantly warmer than the average temperatures for the 90 years before. . . and, therefore, more accurate and meaningful for our current needs.
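
   To put some (invented) numbers on the metaphor, here is a short Python sketch showing how two very different temperature patterns can share essentially the same average, which is exactly the information that a single pooled number throws away.

import statistics

# Invented March daily highs for two hypothetical "micro-climates."
northwest = [37, 40, 44, 48, 52, 55, 59]   # swings widely across the month
southeast = [46, 47, 48, 48, 48, 49, 50]   # barely moves at all

for name, temps in [("Northwest", northwest), ("Southeast", southeast)]:
    print(name,
          "mean:", round(statistics.mean(temps), 1),
          "range:", max(temps) - min(temps))

# Both means land at roughly 48 degrees, but one series varies by 22 degrees
# and the other by only 4.  The shared average says very little about what
# any particular day will actually feel like.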
_ _ _ _ _

   There is, at least, one additional embedded issue.  Measuring temperature is scientifically far more reliable and valid than measuring student achievement with the diverse instruments used in different studies (or at different times in a school).  A temperature is measured by a thermometer, and most thermometers will give basically the same reading because they are scientifically calibrated instruments.

   With the meta-analyses used by Hattie, different researchers operationalize “student achievement” (as the dependent, or outcome, measure) in different ways.  Even if a number of them operationalize student achievement the same way, they still may use different measurement tools or metrics. . . that provide significantly different results. 

   Thus, the measurement of achievement is going to have far more variability from Hattie study to study than a thermometer in Washington, D.C. in March.
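
   A small, purely hypothetical illustration of that last point: the same raw instructional gain can translate into very different effect sizes depending on the spread of the instrument used to measure “achievement.”

# Hypothetical numbers only: one instructional gain, two different
# "achievement" instruments, two different effect sizes.

def standardized_difference(raw_gain, pooled_sd):
    """Effect size as the raw gain divided by the pooled standard deviation."""
    return raw_gain / pooled_sd

# A narrow, curriculum-aligned quiz (small spread): the gain looks large.
print(round(standardized_difference(raw_gain=5.0, pooled_sd=6.0), 2))    # 0.83

# A broad standardized test (much larger spread): the same gain shrinks.
print(round(standardized_difference(raw_gain=5.0, pooled_sd=15.0), 2))   # 0.33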
_ _ _ _ _

An Example of an Implementation Problem in Hattie’s Research

  The one major implementation problem that we will discuss right now is that, in a specific effect size area, educators need to know the implementation methods that were used in the individual studies making up the original meta-analyses that Hattie pooled into his meta-meta-analyses.  

   The point here is that, unless a program or intervention has been standardized in a specific effect area, and the same program or same intervention implementation steps were used in every study included in a meta-analysis or Hattie’s meta-meta-analyses in that area, it is possible that one implementation approach contributed more to the positive effect size on student achievement than another approach.

   For example, given Hattie’s current data, “cognitive task analysis” has a 1.29 effect size relative to positively impacting student achievement.  It is unlikely, however, that every study in every meta-analysis pooled by Hattie used the same step-by-step implementation process representing “cognitive task analysis.”

   Thus, Hattie’s research tells us what to consider (i.e., cognitive task analysis), but not necessarily the specific research-validated steps in how to implement it.

   For an individual school to implement the cognitive task analysis approach or steps that contributed most to the positive effect size that Hattie reports, its leaders need to know—statistically and relative to their implementation steps—what individual studies were integrated into the meta-analyses and Hattie’s meta-meta-analysis.

   But they also need to know which studies were done with the same type of students (e.g., gender, socio-economic status, race, geographical location, type and quality of school, etc.) that they are currently teaching in their school.

   That is, it may be that the students involved in the meta-analytic studies used by Hattie do not match the students in the schools that we are working with.  Thus, while the research used by Hattie may be “good” research (for some students in some schools in some communities), it may not be the “right” research for our students, schools, and community.

   To summarize so far:  If schools are going to use Hattie’s research in the most effective way for their specific students, a multiple-gating process of decision-making must be used.

   This multiple-gating process should include the following steps (a rough sketch of the gates, written as a simple checklist, follows the list):

  • Step 1.  Identify your school’s history and status, resources and capacity, and current positive and needed outcomes relative to student achievement.
  • Step 2.  Determine which Hattie variables will most improve student achievement—with a constant awareness that many of these variables will interact or are interdependent.
  • Step 3.  Evaluate the methodological and statistical quality and integrity of the meta-analytic studies that Hattie included in his meta-meta-analyses.
NOTE:  If Hattie’s meta-meta-analysis has flaws or included flawed meta-analytic studies, identify the best separate meta-analysis studies and continue this multiple-gating process.
  • Step 4.  Evaluate the demographics and other background characteristics of the schools, staff, and students involved in the meta-analytic studies used by Hattie in his meta-meta-analyses to validate that they match the school demographics and background characteristics where you plan to implement the program, strategy, or intervention.
  • Step 5.  Using and analyzing Hattie’s best meta-meta-analytic study (or the best individual meta-analysis studies—as immediately above), identify what program(s) or strategy(ies), and what specific implementation approaches and steps, were most responsible for the positive effects on student achievement.
  • Step 6.  Finalize the selection of your program or strategy, and its implementation approaches and steps, and develop an Implementation Action Plan that identifies who will be involved in implementation, what training and resources they need, how you will engage the students (as well as staff and parents), how you will evaluate the short- and long-term student achievement outcomes, and what the implementation steps and timelines will be.
  • Step 7.  Resource, train, engage, implement, evaluate, fine-tune, implement, and evaluate.
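
   For those who think best in checklists, here is one rough, unofficial way to write the gates down as explicit logic.  This is only a Python sketch; the gate wording simply paraphrases Steps 1 through 7 above and is not a validated instrument.

# A rough sketch of the multiple-gating idea: each gate is a question that
# must be answered "yes" (with evidence) before the next gate opens.
GATES = [
    "School status, resources, capacity, and needed outcomes identified?",
    "Target variables chosen, with their interactions considered?",
    "Quality of the underlying meta-analyses verified?",
    "Study demographics match our students, staff, and school?",
    "Programs and implementation steps behind the effect size identified?",
    "Implementation Action Plan (training, resources, evaluation) written?",
    "Resourced, trained, implemented, evaluated, and fine-tuned?",
]

def next_open_gate(answers):
    """Return the first gate not yet passed, or None if all gates are passed."""
    for gate, passed in zip(GATES, answers):
        if not passed:
            return gate
    return None

# Example: a team that has completed the first three steps.
print(next_open_gate([True, True, True, False, False, False, False]))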
_ _ _ _ _

   As we proceed to the next section of this Blog, let me be clear.  This Blog was not written to criticize or denigrate, in any way, Hattie on a personal or professional level.  He is a prolific researcher and writer, and his work is quite impressive.

   However, the full Blog message will critique the statistical and methodological underpinnings of meta- and meta-meta-analytic research, and discuss its strengths and limitations.  But most essentially, the focus ultimately will be on delineating the research-to-practice implications of Hattie’s work, and how to implement it with students in the most effective and efficient ways.
_ _ _ _ _

   To this end, and once again, it is important that educators understand:
  • The strengths and limitations of meta-analytic research—much less meta-meta-analytic research;
  • What conclusions can be drawn from the results of sound meta-analytic research;
  • How to transfer sound meta-analytic research into actual school- and classroom-based instruction or practice; and
  • How to decide if an effective practice in one school, classroom, or teacher is “right” for your school, classrooms, and teachers.
[CLICK HERE for the full Blog message]

   While this all provides a “working outline,” let’s look at some more details.
_ _ _ _ _ _ _ _ _

A Primer on Meta-Analysis

What is it?

   A meta-analysis is a statistical procedure that combines the effect sizes from separate studies that have investigated common programs, strategies, or interventions.  The procedure results in a pooled effect size that provides a more reliable and valid “picture” of the program or intervention’s usefulness or impact because it involves more subjects, more implementation trials and sites, and (usually) more geographic and demographic diversity.  In Hattie’s framework, an effect size of 0.40 is used as the “cut-score,” where effect sizes above 0.40 reflect a “meaningful” impact.

   Significantly, when the impact (or effect) of a “treatment” is consistent across separate studies, a meta-analysis can be used to identify the common effect.  When effect sizes differ across studies, a meta-analysis can be used to identify the reason for this variability.
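
   As a point of reference, the textbook fixed-effect version of this pooling weights each study by the precision of its estimate (the inverse of its variance).  The Python sketch below uses invented study results and is only meant to show the basic idea; individual meta-analyses differ in the exact models and weights they use.

# Fixed-effect (inverse-variance) pooling with invented study results.
studies = [
    # (effect size d, variance of that estimate)
    (0.55, 0.020),
    (0.30, 0.045),
    (0.48, 0.010),
]

weights = [1 / var for _, var in studies]            # more precise studies count more
pooled = sum(w * d for (d, _), w in zip(studies, weights)) / sum(weights)

print(f"Pooled effect size: {pooled:.2f}")           # about 0.48 with these numbers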
_ _ _ _ _

How is it done?

   Meta-analytic research typically follows some common steps.  These involve:
  • Identifying the program, strategy, or intervention to be studied
  • Completing a literature search of relevant research studies
  • Deciding on the selection criteria that will be used to include an individual study’s empirical results
  • Pulling out the relevant data from each study, and running the statistical analyses
  • Reporting and interpreting the meta-analytic results
   As with all research, and as reflected in the steps above, there are a number of subjective decisions that those completing a meta-analytic study must make.  And, these decisions could be sound, or they could be not so sound.  They could be defensible, or they could be arbitrary and capricious.  They could be well-meaning, or they could be biased or self-serving. 

   Thus, there are good and bad meta-analytic studies.  And, educators are depending on the authors of each meta-analytic study (or, perhaps the journal reviewers who are accepting the study for publication) to include only those studies that are sound.

   By extension, educators also are depending on Hattie to include only those well-designed and well-executed meta-analytic studies in his meta-meta-analyses.

   But, unfortunately, this may not be the case.

   In his 2009 Visible Learning book, Hattie states (pg. 11), “There is. . . no reason to throw out studies automatically because of lower quality.”

   This suggests that Hattie may have included some lower quality meta-analytic studies in some (which ones?) of his many meta-meta-analyses.

   Indeed. . . What criteria did he use when including some lesser-quality meta-analytic studies?  How did he rationalize including even one lower quality study?  But—most importantly—how did these lower quality studies impact the resulting effect sizes and the functional implications of the research?

   These are all important questions that speak directly to the educators who are trying to decide which Hattie-endorsed approaches to use in their pursuit of improved student achievement scores.  These questions similarly relate to educators’ decisions on how to effectively implement the approaches that they choose.
_ _ _ _ _

How do you Interpret an Effect Size?

   As noted above, Hattie (and other researchers) use an effect size of 0.40 as the “cut-score” or “hinge point” where a service, support, strategy, program, or intervention has a “meaningful” impact on student achievement.
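
   As a quick reminder of what that number actually is: an effect size of the kind Hattie reports is typically a standardized mean difference (Cohen’s d), i.e., the difference between two group means divided by a pooled standard deviation.  Here is a minimal Python sketch with invented classroom numbers.

import math

def cohens_d(mean_treat, mean_ctrl, sd_treat, sd_ctrl, n_treat, n_ctrl):
    """Standardized mean difference using a pooled standard deviation."""
    pooled_sd = math.sqrt(((n_treat - 1) * sd_treat**2 +
                           (n_ctrl - 1) * sd_ctrl**2) /
                          (n_treat + n_ctrl - 2))
    return (mean_treat - mean_ctrl) / pooled_sd

# Invented example: a treatment class averages 78, a comparison class 72,
# both with standard deviations of about 12 and 30 students each.
d = cohens_d(78, 72, 12, 12, 30, 30)
print(round(d, 2), "(above the 0.40 'hinge point')" if d >= 0.40 else "(below 0.40)")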

   Visually, Hattie represents the continuum of effect sizes as a “Barometer” (see below).


   But this doesn’t tell the entire story.  In fact, some researchers are very uncomfortable with this barometer and how Hattie characterizes some of the effect sizes along the continuum.
_ _ _ _ _

   Matthew A. Kraft, from Brown University, is one such researcher.  In his December 2018 working paper, Interpreting Effect Sizes of Education Interventions, Kraft identified five guidelines for interpreting effect sizes in education.

[CLICK HERE for this paper]

   Kraft’s five guidelines are cited below.  For a detailed discussion of each—with their implications and practical examples—go to the complete Blog message.

[CLICK HERE for the full Blog message]
  • Guideline #1.  The results from correlational studies, when presented as effect sizes, are not causal effects.  Moreover, effect sizes from descriptive and correlational studies are often larger than those from causal studies.
  • Guideline #2.  The magnitude of effect sizes depends on what outcomes are evaluated and when these outcomes are measured.
  • Guideline #3.  Effect sizes are impacted by subjective decisions researchers make about the study design and analyses.
  • Guideline #4.  Strong program or intervention effect sizes must be weighed against how much it costs to implement the program or intervention—both the initial start-up costs and the ongoing maintenance costs.
  • Guideline #5.  The ease or difficulty in scaling-up a program or intervention also matters when evaluating the policy relevance of effect sizes.
_ _ _ _ _ _ _ _ _ _

Others’ Concerns with Hattie’s Research 

   To fully consider the concerns with Hattie’s research, it is important to include two additional voices.

   In a past Blog, we discussed the concerns of Dr. Robert Slavin from Johns Hopkins University.  These concerns are summarized in the full Blog message.

   In addition, we add the perspectives of Drs. Pierre-Jerome Bergeron and Lysanne Rivard (from the University of Ottawa and McGill University, respectively) who wrote a 2017 article in the McGill Journal of Education titled, “How to Engage in Pseudoscience with Real Data: A Criticism of John Hattie’s Arguments in Visible Learning from the Perspective of a Statistician.”

   In their article, they make the following points:
  • Hattie’s calculations produce (and then ignore) negative probabilities, which cannot exist; he also confounds correlation and causality.
  • Hattie believes that effect sizes from separate meta-analytic studies can be compared because Cohen’s d is a measure without a unit or metric; his averages, therefore, do not make sense.
  • In conducting meta-meta-analyses, Hattie is comparing Before Treatment versus After Treatment results, not (as in the original meta-analyses he uses) Treatment versus Control Group results.
  • Hattie pools studies that have different definitions (and measurements) of student achievement, and treats them as one and the same.
  • Hattie asserts that effects below zero are bad, that between 0 and 0.4 we go from “developmental” effects to “teacher” effects, and that above 0.4 is the desired effect zone.  There is no justification for this classification.
[CLICK HERE for the full Blog message with more details and quotes from Slavin, Bergeron, and Rivard]
_ _ _ _ _ _ _ _ _ _

How Do You Go from Effect to Effective Practice?

   In the most current (October 2018) version of Hattie’s Visible Learning effect sizes, Hattie has organized more than 250 variables into clusters that include: Student, Curricula, Home, School, Classroom, Teacher, and Teaching. 

   In the Figure below, I have listed the top eight effect sizes with their respective “Areas of Research Focus.”

   I have also added a descriptor identifying whether each variable can be changed through an external intervention.  Thus, I am saying that “Students’ Self-Reported Grades,” “Teacher Estimates of Student Achievement,” and a “Teacher’s Credibility with his/her Students” cannot be changed in a sustained way through some type of intervention, and that—even if they could—they would not causally change student achievement.

   Parenthetically, in most cases, these three variables were independent variables in the research investigated by Hattie.


   At this point, we need to discuss how to go from “effect to effective practice.”  To do this, we need to understand exactly what each of the variables in the Figure actually are.
  
   And . . . OK . . . I’ll admit it. 

   As a reasonably experienced school psychologist, I have no idea what the vast majority of these approaches actually involve at a functional school, classroom, teacher, or student level. . . much less what methods and implementation steps to use.

   To begin to figure this out, we would need to take the following research-to-practice steps:
  • Go back to Hattie’s original works and look at his glossaries that define each of these terms
  •  Analyze the quality of each Hattie meta-meta-analysis in each area
  • Find and analyze each respective meta-analysis within each meta-meta-analysis
  • Find and evaluate the studies included in each meta-analysis, and determine which school-based implementation methods (among the variety of methods included in each meta-analysis) are the most effective or “best” methods— relative to student outcomes
  • Translate these methods into actionable steps, while also identifying and providing the professional development and support needed for sound implementation
  • Implement and evaluate the short- and long-term results
   If we don’t do this, our districts and schools will be unable to select the best approaches to enhance their students’ achievement, or to implement these approaches in the most effective and efficient ways.

   This, I believe, is what the researchers are not talking about.
_ _ _ _ _

The Method is Missing

   To demonstrate the research-to-practice points immediately above, the full Blog message analyzes two high-effect-size approaches on Hattie’s list:
  • Response to Intervention (Effect Size: 1.09)
  • Interventions for Students with Learning Needs (Effect Size: 0.77)
[CLICK HERE for the full Blog message]
_ _ _ _ _ _ _ _ _ _

The Questions to Ask the Outside “Hattie Consultants”

   In order for districts and schools to know exactly what implementation steps are needed to implement effective “Hattie-driven” practices so that their students can benefit from a particular effect, we need to “research the research.”

   And yet, very few districts—much less schools—have the personnel with the time and skills to do this.

   To fill this gap, we now have a “cottage industry” of “official and unofficial” Hattie consultants who are available to assist.

   But how do districts and schools evaluate these consultants relative to their ability, experience, and skills to deliver effective services?

   With no disrespect intended, just because someone has been trained by Hattie, has heard Hattie, or has read Hattie—that does not give them the expertise, across all of the 250+ rank-ordered influences on student learning and achievement, to analyze and implement any of the approaches identified through Hattie’s research.

   And so, districts and schools need to ask a series of specific questions when consultants say that their consultation is guided by Hattie’s research.

   Among the initial set of questions are the following:

1.   What training and experience do you have in evaluating psychoeducational research as applied to schools, teaching staff, and students—including students who have significant academic and/or social, emotional, or behavioral challenges?

2.   In what different kinds of schools (e.g., settings, grade levels, socio-economic status, level of ESEA success, etc.) have you consulted, for how long, in what capacity, with what documented school and student outcomes—and how does this experience predict your consultative success in my school or district?

3.   When guided by Hattie’s (and others’) research, what objective, research-based processes or decisions will you use to determine which approaches our district or school needs, and how will you determine the implementation steps and sequences when helping us to apply the selected approaches?

4.   What will happen if our district or school needs an approach that you have no experience or expertise with?

5.   How do you evaluate the effectiveness of your consultation services, and how will you evaluate the short- and long-term impact of the strategies and approaches that you recommend be implemented in our district or school?
_ _ _ _ _ _ _ _ _ _

Summary

   Once again, none of the points expressed in this Blog are personally about John Hattie.  Hattie has made many astounding contributions to our understanding of the research in areas that impact student learning and the school and schooling process.

   However, many of my points relate to the strengths, limitations, and effective use of research reports using meta-analysis and meta-meta-analyses. 

   If we are going to translate this research to sound practices that impact student outcomes, educational leaders need to objectively and successfully understand, analyze, and apply the research so that they make sound system, school, staff, and student-level decisions.

   To do this, we have advocated for, and described above, a multiple-gating decision-making approach.

   In the end, schools and districts should not invest time, money, professional development, supervision, or other resources in programs or interventions that have not been fully validated for use with their students and/or staff. 

   Such investments are not fair to anyone—especially when they (a) do not deliver the needed results, (b) leave students further behind, and/or (c) fail and create staff resistance to “the next program”—which might, parenthetically, be the “right” program.
_ _ _ _ _

   I hope that this discussion has been useful to you.

   As always, I look forward to your comments. . . whether on-line or via e-mail.

   If I can help you in any of the areas discussed in this Blog, I am always happy to provide a free one-hour consultation conference call to help you clarify your needs and directions on behalf of your students, staff, school(s), and district.

[CLICK HERE for the full Blog message]

Best,

Howie