Development of a keyword library for capturing PRO-CTCAE-focused "symptom talk" in oncology conversations

Brigitte N Durieux; Samuel R Zverev; Elise C Tarbi; Anne Kwok; Kate Sciacca; Kathryn I Pollak; James A Tulsky; Charlotta Lindvall

doi:10.1093/jamiaopen/ooad009

JAMIA Open. 2023 Feb 9;6(1):ooad009. doi: 10.1093/jamiaopen/ooad009. eCollection 2023 Apr.

Authors

Brigitte N Durieux¹, Samuel R Zverev¹,², Elise C Tarbi¹,³, Anne Kwok¹, Kate Sciacca¹,⁴, Kathryn I Pollak⁵,⁶, James A Tulsky¹,⁷, Charlotta Lindvall¹,⁷,⁸

Affiliations

¹ Department of Psychosocial Oncology and Palliative Care, Dana-Farber Cancer Institute, Boston, Massachusetts, USA.
² NYU School of Medicine, New York University, New York, New York, USA.
³ Department of Nursing, University of Vermont, Burlington, Vermont, USA.
⁴ Department of Palliative Medicine, Brigham and Women's Hospital, Boston, Massachusetts, USA.
⁵ Department of Population Health Sciences, Duke University School of Medicine, Duke University, Durham, North Carolina, USA.
⁶ Cancer Prevention and Control Program, Duke Cancer Institute, Duke University, Durham, North Carolina, USA.
⁷ Department of Medicine, Brigham and Women's Hospital, Boston, Massachusetts, USA.
⁸ Harvard Medical School, Harvard University, Boston, Massachusetts, USA.

Abstract

Objectives: Computational methods for detecting symptoms can help us better attend to patient suffering. The objectives of this study were to develop a natural language processing keyword library for detecting symptom talk and evaluate its performance, and to describe symptom communication within our dataset to generate insights for future model building.

Materials and methods: This was a secondary analysis of 121 transcribed outpatient oncology conversations from the Communication in Oncologist-Patient Encounters trial. Through an iterative process of identifying symptom expressions via inductive and deductive techniques, we generated a library of keywords relevant to the Patient-Reported Outcome version of the Common Terminology Criteria for Adverse Events (PRO-CTCAE) framework from 90 conversations, and tested the library on 31 additional transcripts. To contextualize symptom expressions and the nature of misclassifications, we qualitatively analyzed 450 mislabeled and properly labeled symptom-positive turns.
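To illustrate what turn-level keyword matching of this kind can look like, the following is a minimal Python sketch. It is not the study's implementation: the four keywords, the example turns, and the function names `compile_library` and `label_turns` are hypothetical, and the actual library's matching rules are not described in this abstract.

```python
import re

# Hypothetical mini-library for illustration only; the study's library
# is far larger and its actual terms are not listed in this abstract.
KEYWORDS = ["nausea", "short of breath", "tired", "pain"]


def compile_library(terms):
    """Build one case-insensitive pattern that matches any term as a whole word or phrase."""
    # Longest terms first so that multiword phrases are tried before shorter ones.
    ordered = sorted(terms, key=len, reverse=True)
    pattern = r"\b(?:" + "|".join(re.escape(t) for t in ordered) + r")\b"
    return re.compile(pattern, re.IGNORECASE)


def label_turns(turns, matcher):
    """Return True for each conversation turn containing at least one keyword."""
    return [bool(matcher.search(turn)) for turn in turns]


turns = [
    "I've been really tired since the last infusion.",
    "Let's schedule the scan for next week.",
    "It was painted last year.",
]
labels = label_turns(turns, compile_library(KEYWORDS))
```

The word-boundary anchors keep "pain" from matching inside "painted", but a rule of this kind cannot tell whether a matched word is actually being used to describe a symptom.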

Results: The final library, comprising 1320 terms, identified symptom talk among conversation turns with an F1 of 0.82 against a PRO-CTCAE-focused gold standard, and an F1 of 0.61 against a broad gold standard. Qualitative observations suggest that physical symptoms are more easily detected than psychological symptoms (eg, anxiety), and ambiguity persists throughout symptom communication.
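For readers less familiar with the metric, F1 is the harmonic mean of precision and recall. The sketch below computes it over turn-level binary labels; the function name `f1_score` and the five example labels are made up for illustration and are not data from the study.

```python
def f1_score(gold, predicted):
    """Compute F1 for binary turn-level labels (True = symptom talk)."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g and p)        # correctly flagged turns
    fp = sum(1 for g, p in pairs if not g and p)    # flagged, but not in the gold standard
    fn = sum(1 for g, p in pairs if g and not p)    # gold-standard turns that were missed
    if tp == 0:
        return 0.0  # precision and recall are zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up labels for five turns, for illustration only.
gold = [True, True, False, True, False]
predicted = [True, False, False, True, True]
score = f1_score(gold, predicted)
```

Here there are two true positives, one false positive, and one false negative, so precision and recall are both 2/3 and F1 is 2/3.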

Discussion: This rudimentary keyword library captures most PRO-CTCAE-focused symptom talk, but the ambiguity of symptom speech limits the utility of rule-based methods alone, and limits to generalizability must be considered.

Conclusion: Our findings highlight opportunities for more advanced computational models to detect symptom expressions from transcribed clinical conversations. Future improvements in speech-to-text could enable real-time detection at scale.

Keywords: computing methodologies; natural language processing; patient-reported outcome measures; signs and symptoms; symptom assessment.