Task Force Report on Methodology and Empirically Supported Treatments
Donald Moss and Jay Gunkelman
In June 2001, Donald Moss, then President of the Association for Applied Psychophysiology and Biofeedback (AAPB), and Jay Gunkelman, then President of the Society for Neuronal Regulation (SNR), appointed a Task Force to develop standards on research methodology and on the empirical support of treatments. Theodore J. LaVaque represented AAPB as co-chair, and D. Corydon Hammond represented SNR as co-chair. The AAPB Neurofeedback and sEMG Divisions supported the Task Force and named delegates.
In several recent instances, researchers have made critical statements claiming that biofeedback lacks efficacy. The Association for the Advancement of Behavior Therapy newsletter (The Behavior Therapist) published an article critical of neurofeedback (Lohr, Meunier, Parker, & Kline, 2001). Reuters Health issued a press release reporting William Mullally's headache research and his statement that biofeedback is too expensive and not effective for headache. An AAPB response to the Mullally research is forthcoming (Moss, Andrasik, McGrady, Perry, & Baskin, 2001). The New England Journal of Medicine published a landmark article challenging the placebo effect (Hrobjartsson & Gotzsche, 2001). In a follow-up to the NEJM study, a science reporter highlighted a biofeedback hypertension study and stated that just entering a study was as effective as biofeedback in treating hypertension.
Practitioners announce new applications regularly, yet as a field we fail to distinguish first-line, well-documented treatments from experimental new applications. The current health care movements toward evidence-based medicine and "best practices" standards will leave biofeedback behind unless we better validate and rate our own treatment protocols.
The Task Force worked diligently for four months, reviewing a massive body of research reports on methodology and efficacy studies. The American Psychological Association addressed many similar issues in developing its guidelines on the empirical validation of psychological treatments (APA, 1995; Task Force on Promotion and Dissemination, 1995; Chambless et al., 1996, 1998). Review of the APA efforts provided significant guidance and some of the framework for the AAPB/SNR Task Force in developing guidelines for rating the efficacy of biofeedback and neurofeedback treatments. The Task Force also reviewed ethical issues regarding research on human subjects, addressed in two critical documents, the Declaration of Helsinki (World Medical Association, 2000) and the Belmont Report (Department of Health and Human Services, 1979).
The Task Force produced a "Template," which has now been approved as a policy guideline by both the AAPB and SNR Boards. This Template provides our field with a strong set of methodological standards, by which we can classify applications at one of five levels of efficacy, according to the quality and quantity of outcome research which has supported each application: Level 1. Not empirically supported, Level 2. Possibly efficacious, Level 3. Probably efficacious, Level 4. Efficacious, and Level 5. Efficacious and specific. Regular use of this new template to assess the efficacy of mind-body therapies will give credence to our better treatment protocols.
Both AAPB and SNR extend gratitude to the Task Force, its chairs, members, and reviewers, for providing guidelines for rating applications of biofeedback and neurofeedback.
Participants in the Task Force included:
Chairs: Theodore J. LaVaque, Ph.D., and D. Corydon Hammond, Ph.D.
Members: David Trudeau, M.D., Vincent Monastra, Ph.D., John Perry, Ph.D., Paul Lehrer, Ph.D.
Reviewers: Douglas Matheson, Ph.D., Richard Sherman, Ph.D.
American Psychological Association (1995). Template for developing guidelines: Interventions for mental disorders and psychosocial aspects of physical disorders. Policy document. Washington, DC: American Psychological Association.
Chambless, D.L., Sanderson, W. C., Shoham, V., Johnson, B., Pope, K. S., Crits-Christoph, P., et al. (1996). An update on empirically validated therapies. The Clinical Psychologist, 49 (2), 5-18.
Chambless, D.L., Baker, M. J., Baucom, D. H., Calhoun, K. S., Crits-Christoph, P., Daiuto, A., et al. (1998). Update on empirically validated therapies: II. The Clinical Psychologist, 51 (1), 3-16.
Department of Health and Human Services (1979). The Belmont report. http://ohrp.osophs.dhhs.gov/humansubjects/guidance/belmont.htm
Hrobjartsson, A., & Gotzsche, P. C. (2001). Is the placebo powerless: An analysis of clinical trials comparing placebo with no treatment. New England Journal of Medicine, 344 (21), 1594-1602.
Lohr, J. M., Meunier, S. A., Parker, L. M., & Kline, J. R. (2001). Neurotherapy does not qualify as an empirically supported behavioral treatment for psychological disorders. The Behavior Therapist, 24, 97-104.
Moss, D., Andrasik, F., McGrady, A., Perry, J. D., & Baskin, S. M. (2001). Biofeedback can help headache sufferers. Biofeedback Newsmagazine, 29, 11-13.
Task Force on Promotion and Dissemination of Psychological Procedures. (1995). Training in and dissemination of empirically validated psychological treatments: Report and recommendations. The Clinical Psychologist, 48 (1), 3-23.
World Medical Association (2000). The Declaration of Helsinki. 52nd WMA General Assembly, Edinburgh, Scotland. http://www.wma.net
Address correspondence to Donald P. Moss, Psychological Services Center, 1703 S. Despelder, Grand Haven, MI 49417; email: email@example.com or Jay Gunkelman, P.O. Box 152, Grizzly Flats, CA 95636, email: firstname.lastname@example.org
TEMPLATE FOR DEVELOPING GUIDELINES FOR THE EVALUATION OF THE CLINICAL EFFICACY OF PSYCHOPHYSIOLOGICAL INTERVENTIONS
Efficacy Template Taskforce
Co-chairs: Theodore J. La Vaque, Ph.D., and D. Corydon Hammond, Ph.D.
Committee: David Trudeau, M.D., Vincent Monastra, Ph.D., John Perry, Ph.D., and Paul Lehrer, Ph.D.
Reviewers: Douglas Matheson, Ph.D., Richard Sherman, Ph.D.
A: Preamble
The charge to this Task Force requires the development of a template that will assist the Efficacy and Practice Guideline Panels in their review of the literature related to the clinical efficacy of psychophysiological interventions. The Panels will be required to use accepted scientific and clinical standards for determining whether a beneficial effect of treatment can be demonstrated. This document is intended as the template that will serve as a guideline for the Panels' task. The ultimate goal is that of developing meaningful efficacy databases and practice guidelines for such interventions. This task force was created as a collaborative effort by two professional societies (Association for Applied Psychophysiology and Biofeedback or AAPB; Society for Neuronal Regulation or SNR) to assist in providing a systematic framework for comprehensiveness and consistency in that endeavor. The guidelines that eventuate will, to the best extent possible, recognize the interdisciplinary nature of clinical interventions, be developed by interdisciplinary panels, and be applicable to practitioners from all disciplines. The practice guidelines that are developed from this template are solely for the benefit of the individuals who seek intervention and assistance, whether they are referred to as patients, clients, or "consumers." The efficacy statements and practice guidelines that result from this template will be regarded as informational and educational rather than as criteria for criticism.
B: Panel Procedures
Treatment guidelines that eventuate from this important process will greatly influence the health and well being of consumers of the services. Determinations of efficacy will be undertaken with that heavy responsibility foremost in mind. Therefore, the panel procedures will be open to public examination, and members of the panels will be free of actual or apparent conflict of professional or financial interest.
I. Panel Membership:
1. The Panels will be formed by the boards of the professional societies (AAPB and SNR) responsible for creating and approving this Template.
2. The Panels will reflect a range of disciplines with documented expertise regarding delivery of the services under consideration, as well as individuals with expertise in the scientific and statistical methodology required to assess the intervention under examination. The Panels may call upon consultants with expertise in other relevant areas of efficacy and clinical utility such as health care economics, public health, public service, and clinical guideline construction. The inclusion of a consumer advocate familiar with the condition under examination may provide an invaluable contribution.
3. All nominees for Panel membership or consultation will be required to fully disclose any potential or actual conflicts of interest, financial or professional. Such disclosures will be evaluated by bodies assigned to that responsibility.
4. The Panel selection and qualification for membership and the Panel procedures and decision making process will be open to examination by members of the sponsoring professional organization(s).
II. Panel Process:
1. Each Panel will first define and agree upon its process and method, determining the target condition, patient population, interventions to be included and/or excluded, provider type, and service settings. Other concerns, such as the diagnostic specificity of the condition under consideration, may be included for review in the process.
2. Timelines for the Panel process will be established indicating approximately when the report will be released for review.
3. Strategies for reviewing evidence will be explicitly outlined by each Panel. All available data and evidence related to the stated goal(s) of the panel will be available to the panel and reviewers. The Panels will document the "highest level of evidence" available (e.g., Randomized Controlled Trial, or RCT) and determine whether there has been independent replication of the data.
4. Reports of adverse effects will be examined and included in the reports.
5. A full report will be prepared by each Panel documenting its findings and recommendations. The recommendations will be accompanied by documentation of the rationale and level of evidence available for developing the recommendations.
6. Each Panel report will specify areas related to efficacy and clinical utility that require further research before adequate recommendations can be made.
7. The external validity of the recommendations will be kept in mind by the Panel. Do the recommendations lead to improved therapeutic outcomes in the treatment of the condition reviewed? The validity of the recommendations will be examined retrospectively by public consideration of the substance and quality of the evidence cited.
8. Each Panel will make recommendations regarding the frequency with which the recommendations and/or guidelines will be revised.
9. The organizational Panel Development Boards may choose to create a panel for the purpose of broadly addressing the effectiveness of psychophysiologically-based operant procedures in modifying autonomic activity, muscle activity, and brainwave activity. These reviews could include evidence from different problem areas.
10. Panel recommendations will be publicized for review and comment prior to being formally adopted. Any concerns or comments will be fully and fairly considered prior to adoption. Thus, from its inception, each Panel will function with the awareness that the Panel processes will be open to review.
C: Efficacy: Having the desired effect.
1. The purpose of outcome research methodology is to evaluate the extent to which an intervention can be regarded as efficacious, and to evaluate the level of confidence professionals and consumers may place in such judgments. The efficacy reports and practice guidelines that result from the Panel activities will be based upon clinical expertise and comprehensive and systematic analysis of research data which appears in peer-reviewed literature.
2. The ability to meaningfully assess outcome (efficacy) studies assumes a basic knowledge of clinical experimental design and analysis methods as well as ethical standards for human research.
3. The ability to utilize practice guidelines assumes a fundamental level of training and education, possession of a foundation of relevant knowledge, and effective clinical assessment skills that enable the clinical practitioner to set priorities and make sound decisions regarding treatment method and focus. It is recognized that no set of guidelines can be regarded as absolute criteria. When practice guidelines are developed, it will be recognized that at certain times clinical imperatives may require a modification of the "best practice" guidelines justified by a compelling professional rationale. The judgment of an experienced clinician is important, especially when research data are limited.
4. Technological and clinical advances sometimes occur at a rapid rate, so it will be necessary to review the guidelines on a regular basis so they remain current and relevant. Reviews will be updated at least every three years.
5. This document itself will be reviewed on a regular (annual) basis to determine whether it provides adequate and effective guidance in light of changes that may occur regarding treatment standards, diagnostic standards, ethical standards, or research standards.
II. Scientific Considerations:
Clinical psychophysiology uses variables that are quantifiable. The diagnostic criteria, independent variables, dependent variables, and measurable intervening variables will be included in the Panel deliberations. In methodological terms, the independent variables are specific and subject to experimental manipulation (e.g., sensor placement, bandwidths and frequencies, reinforcement contingencies). The dependent variables (e.g., physiological event being measured, response to treatment) can also be clearly specified. Often, intervening variables (e.g., change in brainwave features, improved motor unit recruitment) can be quantified.
1. The "condition of interest" (COI) will be clearly identified and operationally defined. Most often, the COI will be a diagnostic entity recognized in either the most current Diagnostic and Statistical Manual (DSM) published by the American Psychiatric Association, or in the most current International Classification of Diseases handbook (ICD) published by the World Health Organization. If the COI is not a recognized diagnostic entity but addresses symptomatic, cognitive, or behavioral conditions (e.g., "cognitive brightening," "optimal functioning"), the condition being treated will be operationally defined in a manner that permits objective assessment and replication.
2. Guidelines developed by the Panels will reflect the technical features and parameters of the particular treatment modality under review.
3. Panels will evaluate the extent to which reported interventions are amenable to empirical analysis and replication by independent researchers.
4. If particular intervening variables (e.g., change in motor unit recruitment, shift in brainwave frequencies) have been hypothesized to be relevant to the clinical outcome, the Panel will evaluate the available empirical evidence for these mechanisms.
5. Relevant variables such as age, gender, comorbidity, and treatment history will be identified.
6. Outcome measures will be meaningfully related to the diagnostic criteria or COI operational definition.
While the primary function of the Panels is to examine evidence for efficacy, the Panels will also examine whether treatment specificity has been demonstrated.
1. Necessary or sufficient treatment components: The primary function of the various Panels is that of reviewing and reporting on evidence for treatment efficacy. Additionally, each Panel will report whether the study design permitted a determination as to whether a particular treatment component is necessary to the treatment outcome, rather than just sufficient to accomplish the treatment outcome. As an example, it may be questioned whether the manipulation of a particular brainwave frequency at a particular electrode location is necessary for successful treatment outcome. Would the manipulation of a different brainwave frequency achieve the same result? If not, then it would appear that manipulation of the particular frequency is a necessary component of the treatment protocol. If a different frequency serves as well, the variable may be sufficient to produce change, but other variable manipulations may facilitate symptomatic improvement as well. In that case, the presence of a particular variable in the clinical protocol may be sufficient to produce a desired result, but it is not a necessary component. If that question is not tested in the design, it cannot be assumed that the particular treatment variable is necessary to the treatment outcome.
A particular aspect of the intervention (independent variable) may be nonspecific as a component of the treatment protocol, but, at the same time, some element or characteristic of the class to which that independent variable belongs may be necessary to the desired outcome. For instance, a manipulation of a particular brainwave frequency may not be necessary to the desired outcome, but the manipulation of at least some brainwave frequency may be necessary to the desired outcome.
2. The Panel will determine whether "Placebo" nonspecificity can be ruled out as the dominant effect. A more general concern about specificity is whether there is simply some general characteristic of the treatment protocol that is responsible for the desired change. In that case, none of the particular treatment variables are really necessary for symptomatic improvement. Rather, simply participating in the study and receiving empathy and encouragement from the therapist may be sufficient to produce symptomatic improvement. It is known that in some areas of applied psychophysiology, very different protocols are reported to produce similar beneficial results. Thus, the protocols cannot be differentiated on the basis of outcome. Similar problems are encountered in other therapies.
IV. Clinical Utility:
Panels will recognize the distinction between "efficacy" and "clinical utility" and incorporate these distinctions into their deliberations and writings.
1. Efficacy ("having the desired effect") refers to the determination of treatment effect derived from a systematic evaluation obtained in a controlled clinical trial.
2. Clinical utility refers to the practical value of an intervention, or the extent to which it is practical and possible to translate the findings from efficacy studies into normal clinical practice. As an example, an intervention may be found to be efficacious, but meet such consumer resistance, require such expensive equipment or extensive training to be implemented, or be so cost prohibitive that clinical utility is compromised on a practical level. Additionally, a particular intervention may be found to be efficacious in a highly controlled study in which other factors (comorbidity, treatment frequency, and so forth) are effectively controlled. The same protocol may be ineffective in the "real world" of clinical practice when those variables are uncontrolled.
D: Hierarchy of Evidence: Clinical Trials and Efficacy Studies
Treatment efficacy is determined from reports of clinical trials and/or case studies. The scientific and clinical credibility regarding treatment efficacy derives from a combination of clinical experience and clinical trials. Interventions often begin as anecdotal reports or case studies and are ultimately subjected to formal and increasingly rigorous clinical trials, finally to be either accepted as "empirically supported" or rejected as useless. The particular design and structure of the clinical trials and the number of independent replications are important considerations for determining the degree to which a particular intervention can be claimed to be empirically supported or empirically validated. There is a generally accepted hierarchy of "scientific power" for each clinical trial design.
1. Anecdotal evidence: This type of report is generally regarded as being without scientific value, but is simply a narrative report about the observation that a particular treatment appears to have "worked". Anecdotal reports cannot be used as a basis for regarding a treatment as efficacious by scientifically-minded health care professionals, no matter how many anecdotal reports are identified. Anecdotal reports may serve as a reason to examine the intervention in more controlled settings.
2. Uncontrolled Case Study: This is also quite weak, scientifically. A good case study report will at least have the virtue of clear specification of the relevant independent and dependent variables that may assist in developing more effective controlled studies. Case studies that are regarded as "strong" will include quantified pre-post measures and, ideally, follow-up measures. A strong case report may be followed by a series of case reports demonstrating positive outcome leading to controlled clinical trials.
3. Historical control: The historical control assumes that the course and nature of the disorder or disease is so well documented that an intervention may be demonstrated to be effective without the necessity of controls internal to the study. This is rarely the case. An example of an effective historical control might be a disorder or disease that is known to be so universally lethal (e.g., Ebola virus hemorrhagic fever) or debilitating (e.g., Alzheimer's Disease) that any intervention that increases survival rate or prolongs quality of life is regarded as efficacious. Even then, the intervention will be subjected to increasingly rigorous studies to identify the effective independent and intervening variables.
4. Observational studies: Retrospective or prospective studies that are case-controlled but not randomized or blinded. Several meta-analyses have found no difference in outcome between case-controlled and prospective randomized studies, suggesting that the validity of observational studies may approach that of designs with randomized or blinded assignment.
5. Wait list or "intention to treat" controls: This is a frequent control design used in clinical settings. Subjects will ideally be randomly assigned to wait list or treatment conditions. It is a design that is intended to control for changes that may be attributed to the natural course of the disorder, passage of time, self determined "other treatments", subject expectations, and so forth. Subjects participating in this study receive the diagnostic evaluation both upon entry into the wait list group and after a predetermined amount of time has passed, usually equal to the time required for the investigational treatment to be accomplished. It is assumed that differences between the wait list controls and investigational outcomes are attributable to the investigational treatment variables. This design does not, however, control for the ubiquitous presence of "nonspecific variables" that are present in every clinical study of an investigational treatment, nor does it control for experimenter bias. There is some concern that merely participating in the initial diagnostic procedures may alter the repeated measures in the absence of any intervention. In a clinical setting, there may be ethical concerns about randomization into wait list or treatment conditions based upon the principle that the researcher has determined who will receive treatment and who will be deferred. Informed consent becomes critical.
6. Within-Subject and Intrasubject-Replication Designs: The most common example of this is the "A-B-A" design. This design can be a powerful demonstration of the specificity of the investigational independent variable. It does not, however, control for experimenter bias unless the clinician is "blind" to the reversal contingencies. There may be ethical concerns inherent in purposely attempting to demonstrate symptom reversal in the "B" condition for the sole purpose of experimental manipulation.
7. Single blind, random assignment control design, either sham or active (behavioral, psychological, or pharmacological) treatment controls: This design "blinds" the subject to the particular treatment condition to which they have been assigned, thus controlling for some of the nonspecific subject variables that may operate in the study (expectancy, acquiescent response bias, etc.). It does not control for experimenter bias. Again, there may be ethical considerations in the use of sham treatment controls if known and effective treatments are already available. This will be briefly discussed later. There are also significant statistical issues associated with the active treatment control design that the panel will consider.
8. Double blind control studies, sham or active controls, random assignment: This design "blinds" both the subject and the experimenter to the treatment condition, thus controlling for both subject variables and experimenter bias. The double blind sham controlled design is considered, from a purely scientific perspective, to be the sine qua non of clinical trial designs, although some have argued in favor of the double blind design which contains both active and sham controls as well as the investigational condition.
9. Treatment Equivalence or Treatment Superiority Designs: This design compares the investigational treatment to a known and accepted standard treatment. The ethical issue of a "no treatment" control is circumvented, but other design and analysis issues arise related to "assay sensitivity". The standard treatment will have a known placebo response profile for the COI. A demonstration of "equivalence" may not be meaningful if the study is poorly accomplished or if the standard does not "separate" from placebo or sham effectively.
10. Other Designs: Other, more sophisticated clinical trial designs are available (e.g., "Double Dummy", "Solomon Four Group") but will not be treated here. The panel will be prepared to evaluate each clinical trial report individually and as part of a "meta-analysis" in reaching a conclusion about the degree to which the intervention is empirically supported as an intervention for a particular COI.
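As an illustration only, a Panel documenting the "highest level of evidence" available could record each reviewed study's design and take the highest-ranked one. The sketch below assumes that the order in which designs 1 through 8 are listed above can be read as a ranking; the names Design and highest_level_of_evidence are hypothetical and are not part of this Template. Treatment equivalence designs and the "other designs" are left out because the text does not place them relative to the double blind design.

```python
from enum import IntEnum


class Design(IntEnum):
    """Designs 1-8 from Section D; treating list order as rank is an assumption."""
    ANECDOTAL = 1
    UNCONTROLLED_CASE_STUDY = 2
    HISTORICAL_CONTROL = 3
    OBSERVATIONAL = 4
    WAIT_LIST_CONTROL = 5
    WITHIN_SUBJECT_REPLICATION = 6
    SINGLE_BLIND_RANDOMIZED = 7
    DOUBLE_BLIND_RANDOMIZED = 8


def highest_level_of_evidence(designs):
    """Return the highest-ranked design among the reviewed studies, or None if empty."""
    return max(designs, default=None)


# Hypothetical review of three reports for one condition of interest.
reviewed = [Design.OBSERVATIONAL, Design.SINGLE_BLIND_RANDOMIZED, Design.ANECDOTAL]
best = highest_level_of_evidence(reviewed)
print(best.name)
```

A ranking of this kind records only the design; it says nothing about independent replication or study quality, which the Panels will assess separately.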
E. Criteria for Levels of Evidence of Efficacy
Level 1: Not empirically supported: Supported only by anecdotal reports and/or case studies in non-peer-reviewed venues.
Level 2: Possibly Efficacious: At least one study of sufficient statistical power with well identified outcome measures, but lacking randomized assignment to a control condition internal to the study.
Level 3: Probably Efficacious: Multiple observational studies, clinical studies, wait list controlled studies, and within subject and intrasubject replication studies that demonstrate efficacy.
Level 4: Efficacious
a.) In a comparison with a no-treatment control group, alternative treatment group, or sham (placebo) control utilizing randomized assignment, the investigational treatment is shown to be statistically significantly superior to the control condition or the investigational treatment is equivalent to a treatment of established efficacy in a study with sufficient power to detect moderate differences, and
b.) The studies have been conducted with a population treated for a specific problem, for whom inclusion criteria are delineated in a reliable, operationally defined manner, and
c.) The study used valid and clearly specified outcome measures related to the problem being treated and
d.) The data are subjected to appropriate data analysis, and
e.) The diagnostic and treatment variables and procedures are clearly defined in a manner that permits replication of the study by independent researchers, and
f.) The superiority or equivalence of the investigational treatment has been shown in at least two independent research settings.
Level 5: Efficacious and specific
a.) The investigational treatment has been shown to be statistically superior to credible sham therapy, pill, or alternative bona fide treatment in at least two independent research settings.
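The levels above can be read as a decision rule applied from the most demanding level downward. The following is a minimal sketch of that reading, not a procedure adopted by the Task Force. All field and function names are hypothetical, "multiple" is assumed to mean two or more studies, and Level 5 is assumed to presuppose the Level 4 criteria; each count or flag stands for a judgment the Panel has already made from the literature.

```python
from dataclasses import dataclass


@dataclass
class EvidenceSummary:
    """Hypothetical record of a Panel's judgments for one treatment and one COI."""
    # Studies with sufficient power and well identified outcome measures (Level 2).
    adequately_powered_studies: int = 0
    # Observational, clinical, wait list, or within-subject studies showing efficacy (Level 3).
    supportive_nonrandomized_studies: int = 0
    # Criteria a.) through e.) judged to be satisfied (Level 4).
    meets_criteria_a_to_e: bool = False
    # Independent settings showing superiority or equivalence (criterion f.).
    independent_settings_randomized: int = 0
    # Independent settings showing superiority to credible sham, pill,
    # or alternative bona fide treatment (Level 5).
    independent_settings_specific: int = 0


def efficacy_level(e: EvidenceSummary) -> int:
    """Return the efficacy level (1-5) under the assumptions stated above."""
    if e.meets_criteria_a_to_e and e.independent_settings_randomized >= 2:
        # Assumption: Level 5 adds specificity on top of the Level 4 criteria.
        return 5 if e.independent_settings_specific >= 2 else 4
    if e.supportive_nonrandomized_studies >= 2:
        return 3
    if e.adequately_powered_studies >= 1:
        return 2
    return 1


example = EvidenceSummary(meets_criteria_a_to_e=True, independent_settings_randomized=2)
print(efficacy_level(example))
```

Checking the most demanding level first keeps the rule simple: a treatment receives the highest level whose criteria it fully meets.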
F: Other Considerations that Contribute to Confidence in Efficacy Studies
1. Outcome measures will be relevant to the disorder as diagnosed or operationally defined. Studies using multiple outcome measures are considered to be stronger demonstrations of efficacy than studies using single outcome measures.
2. Measures of any changes in life functioning, such as occupational, social, family function and subjective well being will be evaluated by the panel.
3. Iatrogenic complications reported in the literature will be reported by the Panel.
4. The Panel will differentiate between measures that produce mere statistical significance vs. those that also produce demonstrable clinically significant changes.
5. As discussed earlier, the intervention variables will be reported in sufficient detail that the procedure could be replicated consistently across clinical settings.
6. The Panel will examine "intent to treat" data specifying attrition due to drop out or refusal.
7. Ultimately, there will be long-term follow-up studies of the intervention effects.
8. Replication, using appropriate designs and analysis, by two or more independent sites contributes significantly to the credibility of the reports. While randomized assignment to treatment conditions may represent the currently accepted scientific ideal, recent meta-analyses have indicated that non-randomized observational studies can produce similar results (Benson & Hartz, 2000; Britton et al., 1998; Concato, Shah, & Horwitz, 2000).
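Two of the considerations above lend themselves to simple arithmetic: the difference between statistical and clinical significance (item 4) and attrition in "intent to treat" data (item 6). The sketch below uses Cohen's d, one common standardized effect size, as an example of a quantity a Panel might examine alongside a p-value. The Template does not prescribe this measure, the function names are hypothetical, and the scores are made up for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(treated, control):
    """Standardized mean difference using the pooled sample standard deviation."""
    n1, n2 = len(treated), len(control)
    s1, s2 = stdev(treated), stdev(control)
    pooled = sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))
    return (mean(treated) - mean(control)) / pooled


def attrition_rate(enrolled, completed):
    """Fraction of enrolled subjects lost to drop out or refusal."""
    return (enrolled - completed) / enrolled


# Made-up improvement scores for illustration only.
d = cohens_d([2, 4, 6], [1, 3, 5])
loss = attrition_rate(enrolled=40, completed=34)
print(round(d, 2), round(loss, 2))
```

A large trial can yield a statistically significant result with a very small effect size, so reporting both helps the Panel judge whether a change is clinically meaningful. Whether a given effect size is clinically significant remains a judgment for the Panel.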
G: Ethical Standards for Research:
The panels will examine the research reports in light of published ethical standards governing human subject research. Two well-known documents are the Declaration of Helsinki, published by the World Medical Association (2000), and the Belmont Report, published by the Department of Health and Human Services (1979). Ethically critical issues such as informed consent, protections for vulnerable populations (e.g., children, developmentally disabled, cognitively impaired, severe psychopathology), and appropriate use of placebo or sham controls will be examined. The final reports will discuss, in a general sense, any ethical lapses or problems identified. Studies may be scientifically sound but ethically unacceptable.
Benson, K., & Hartz, A. J. (2000). A comparison of observational studies and randomized, controlled trials. New England Journal of Medicine, 342(25), 1878-1886.
Britton, A., McPherson, K., McKee, M., Sanderson, C., Black, N., & Bain, C. (1998). Choosing between randomized and non-randomized studies: a systematic review. Health Technology Assessment, 2(13), 1-124.
Concato, J., Shah, N., & Horwitz, R. I. (2000). Randomized, controlled trials, observational studies, and the hierarchy of research designs. New England Journal of Medicine, 342(25), 1887-1892.
Department of Health and Human Services (1979). The Belmont Report. http://ohrp.osophs.dhhs.gov/humansubjects/guidance/belmont.htm
World Medical Association (2000). The Declaration of Helsinki. 52nd WMA General Assembly, Edinburgh, Scotland. http://www.wma.net