Assessing Interobserver Variability of Cosmetic Outcome Assessment in Breast Cancer Patients Undergoing Breast-Conservation Surgery

Anees B. Chagpar, Elizabeth Berger, Michael Alperovich, Gregory Zanieski, Tomer Avraham, Donald R. Lannin
DOI: https://doi.org/10.1245/s10434-021-10442-y
IF: 4.339
2021-07-15
Annals of Surgical Oncology
Abstract:

Background: Inter-rater reliability between breast surgical oncologists and reconstructive surgeons using cosmesis scales, and the correlation between their observations and patients' own subjective assessments, are poorly understood.

Methods: Patients undergoing breast-conservation surgery (BCS) in a prospective trial rated their cosmetic outcome on a Likert scale (poor/fair/good/excellent) at the postoperative and 1-year time points; photographs were also taken. Three breast surgical oncologists (not involved in these cases) and two reconstructive surgeons were asked to independently rate cosmesis using the Harvard/NSABP/RTOG scale.

Results: Overall, 55 and 17 patients had photographs and Likert self-evaluations at the postoperative and 1-year time points, respectively. There was poor agreement between surgeon and patient ratings postoperatively [kappas −0.042 (p = 0.659), 0.069 (p = 0.226), and 0.076 (p = 0.090) for the breast surgical oncologists; and 0.018 (p = 0.689) and 0.112 (p = 0.145) for the reconstructive surgeons], and poor interobserver agreement between surgeons of the same specialty (kappa −0.087, 95% confidence interval [CI] −0.091 to −0.082, p = 0.223 for breast surgical oncologists; and kappa −0.150, 95% CI −0.157 to −0.144, p = 0.150 for reconstructive surgeons). At 1 year, the interobserver agreement between breast surgical oncologists was better (kappa 0.507, 95% CI 0.501–0.512, p < 0.001); however, agreement between the reconstructive surgeons was still poor (kappa −0.040, 95% CI −0.049 to −0.031, p = 0.772). Agreement between surgeon and patient ratings remained poor at this time point [kappas −0.115 (p = 0.477), 0.177 (p = 0.245), and 0.101 (p = 0.475) for breast surgical oncologists; and 0.335 (p = 0.037) and −0.118 (p = 0.221) for reconstructive surgeons].

Conclusion: Despite gradation scales for measuring cosmesis after BCS, high levels of agreement between surgeons are lacking, and surgeons' ratings do not always reflect patients' subjective assessments.
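For readers less familiar with the kappa statistic reported in the results, the sketch below shows how an unweighted Cohen's kappa could be computed for one surgeon-patient pair on a four-point scale. The abstract does not say which kappa variant the authors used (unweighted, weighted, or a multi-rater form such as Fleiss' kappa for the three breast surgical oncologists), so this is an assumption made for illustration only. The function name `cohen_kappa` and the example ratings are hypothetical and are not data from the study.

```python
from collections import Counter

# Four-point scale named in the abstract.
SCALE = ("poor", "fair", "good", "excellent")


def cohen_kappa(rater_a, rater_b, categories=SCALE):
    """Unweighted Cohen's kappa for two raters scoring the same cases.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed proportion
    of agreement and p_e is the agreement expected by chance from each
    rater's marginal category frequencies.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of cases")

    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical category; kappa is undefined.
        return float("nan")
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings for eight patients (illustrative, NOT study data).
patient_ratings = ["good", "good", "excellent", "fair",
                   "good", "excellent", "poor", "good"]
surgeon_ratings = ["fair", "good", "good", "fair",
                   "excellent", "good", "fair", "good"]

kappa = cohen_kappa(patient_ratings, surgeon_ratings)
print(f"kappa = {kappa:.3f}")
```

A kappa near 0 means agreement is no better than chance, a negative value means agreement is worse than chance, and 1 means perfect agreement. Because the cosmesis scale is ordinal, a weighted kappa that gives partial credit for near-misses (for example, good versus excellent) may be a more natural choice in practice.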
oncology,surgery