Real Time Quality Assurance of Depression Ratings in Psychiatric Clinical Trials
10 Pages Posted: 16 Aug 2023
Date Written: August 10, 2023
Abstract
Rationale: Decades of experience with antidepressant clinical trials suggest that careful standardization of psychometric raters and of the administration of standardized rating scales is critical to minimizing variance and demonstrating statistical significance on the primary regulatory endpoints. Although careful selection of human subjects remains a primary focus, minimizing intra-rater and inter-rater variation is key. Failing to monitor study site ratings in real time can lead to clinical trial failure. However, the challenge of maintaining a 3-point maximum allowable difference on the Montgomery-Åsberg Depression Rating Scale (MADRS) between site raters and central “master raters” has led some to call for allowing a 6-point difference in congruence for registrational trials. The purpose of this report is to demonstrate that, with accountable rater training, real-time electronic transmission of rating sessions, real-time measurement of inter-rater reliability, and ongoing rater-improvement training, a 3-point standard for congruence is feasible and can yield rater concordance well in excess of 90% in challenging patient populations.
Setting: Patients who suffer from Bipolar Disorder (BP) experience a suicide rate that is 10- to 30-fold higher than the population incidence.1 Because 20% of untreated BP patients complete suicide, and 20-60% may attempt suicide, new pharmacological treatments are necessary to address this significant unmet medical need. The primary objective of the NRx100-001 Study is to test the hypothesis that treatment with NRX-101 is superior to lurasidone (a standard of care medicine) in improving symptoms of depression, as measured by the Montgomery-Åsberg Depression Rating Scale (MADRS-10) total score, in adults with severe bipolar depression and subacute suicidal ideation or behavior (SSIB).
Methods: Audio files of all MADRS rating sessions were collected in real time by the study sponsor in a blinded protocol. Real-time MADRS data review was conducted by a team of psychometric experts with an average of 20 years of psychometric rating experience in a clinical research setting. Congruence was defined as a difference in MADRS total score of 3 points or less (i.e., 5% of a 60-point scale).
Results: Using a strict 3-point requirement for congruence, in a sample of 113 ratings (37 patients), the Inter-Rater Reliability (IRR) between site ratings and blinded independent Sponsor ratings was 93.4%. The absolute mean difference in MADRS rating pairs was 1.78 points (95% CI: 1.50-2.06). When the standard for congruence was relaxed to 6 points, IRR increased to 97%.
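As an illustration of how the congruence rate and the mean absolute difference reported above could be computed from paired ratings, the following is a minimal sketch and not the study's actual analysis. The function name, the example scores, and the normal-approximation confidence interval are all assumptions introduced here; the abstract does not state how the interval was derived, and this sketch treats rating pairs as independent even though several ratings may come from the same patient.

```python
import math
from statistics import mean, stdev


def concordance_summary(site_scores, central_scores, threshold=3):
    """Summarize agreement between paired MADRS total scores.

    A pair is congruent when the absolute difference between the site
    score and the central score is at most `threshold` points.
    Hypothetical helper for illustration only.
    """
    if len(site_scores) != len(central_scores):
        raise ValueError("site and central scores must be paired")
    if len(site_scores) < 2:
        raise ValueError("need at least two rating pairs")

    abs_diffs = [abs(s - c) for s, c in zip(site_scores, central_scores)]
    n = len(abs_diffs)

    # Share of rating pairs that fall within the allowed difference.
    n_congruent = sum(1 for d in abs_diffs if d <= threshold)
    percent_congruent = 100.0 * n_congruent / n

    # Mean absolute difference with a normal-approximation 95% CI
    # (an assumption; it ignores clustering of ratings within patients).
    mad = mean(abs_diffs)
    half_width = 1.96 * stdev(abs_diffs) / math.sqrt(n)

    return {
        "n_pairs": n,
        "percent_congruent": percent_congruent,
        "mean_abs_diff": mad,
        "ci95": (mad - half_width, mad + half_width),
    }


# Made-up scores, not study data.
site = [30, 25, 18, 40]
central = [28, 25, 22, 39]

strict = concordance_summary(site, central, threshold=3)
relaxed = concordance_summary(site, central, threshold=6)
print(strict["percent_congruent"])   # 75.0
print(relaxed["percent_congruent"])  # 100.0
print(strict["mean_abs_diff"])       # 1.75
```

Running the same function with threshold=3 and threshold=6 mirrors the comparison between the strict and relaxed congruence standards described in the Results.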
Conclusions: Rigorous rater training and standardization, together with real-time monitoring of site raters by central “master raters,” can achieve a high degree of rating reliability under a strict 3-point standard for rating concordance.
Note:
Clinical Trial Registration Details: The underlying clinical trial is registered at www.clinicaltrials.gov NCT03395392.
Funding Information: Research was funded by NRx Pharmaceuticals, Inc.
Conflict of Interests: JCJ and MK own equity in NRx Pharmaceuticals. MK, MTS, IRS, CK, and JCJ are compensated by NRx Pharmaceuticals. Lavin Statistical Associates is compensated for independent statistical analysis by NRx Pharmaceuticals.
Ethical Approval: All treatment of human subjects was approved by the Advarra IRB.
Keywords: Montgomery-Åsberg Depression Rating Scale (MADRS), Interrater Reliability (IRR), Concordance, Congruence, Psychometric Testing, Clinical Trial Design, Bipolar Disorder