Leveraging sentiment analysis for classifying patient complaints
Abstract
Unsolicited patient complaints, expressed in patients' own words, provide an evidential basis for identifying and mitigating the patient safety and financial risks associated with physicians and other practitioners. Classifying patient complaints is complicated by the complexity of their linguistic representation. Current practice relies on manual classification, which limits scalability. An automatic approach could improve response time and scale, thereby enhancing opportunities to promote physician accountability for safe and respectful care. This research seeks to automate the classification of patient complaints to improve triage and response. We process a corpus of patient complaint data collected by the Patient Advocacy Reporting System (PARS), developed at Vanderbilt and associated institutions. Our method maps each complaint to a vector based on enhanced Linguistic Inquiry and Word Count (LIWC) lexicons and trains a Naïve Bayes classifier over those vectors. We compare this classifier against both Term Frequency-Inverse Document Frequency (TF-IDF) features and the best-case result of any classifier over bag-of-words features. Our classifier outperforms these traditional complaint analysis approaches, which disregard sentiment, yielding 3% greater accuracy overall. For the SAFETY OF ENVIRONMENT label, our classifier achieved an accuracy of 84% (compared to 50% for traditional approaches) and a sensitivity of 96% (compared to 0% for traditional approaches). We conclude that patient sentiments conveyed in complaints are often overlooked, yet they can be valuable in analyzing such complaints to identify and mitigate patient safety and provider financial risks. We demonstrate that inferring complaint sentiment leads to improved classification accuracy.
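To make the pipeline described above concrete, the following is a minimal sketch of lexicon-based featurization followed by a multinomial Naïve Bayes classifier, written with the Python standard library only. It is an illustration under stated assumptions, not the implementation used in this work: the three-category lexicon is a hypothetical stand-in for the enhanced LIWC lexicons, the four training complaints are invented toy examples rather than PARS data, and the OTHER label and all function and class names are assumptions. The helper at the end computes accuracy and sensitivity (true positives divided by all actual positives), the two measures reported for the SAFETY OF ENVIRONMENT label.

```python
import math
import re
from collections import Counter

# Hypothetical toy lexicon (category -> words); NOT the real LIWC dictionaries.
LEXICON = {
    "anger": {"rude", "angry", "yelled", "dismissive"},
    "anxiety": {"afraid", "worried", "scared", "unsafe"},
    "environment": {"floor", "wet", "dirty", "room", "hallway"},
}
CATEGORIES = sorted(LEXICON)  # fixed feature order: anger, anxiety, environment


def featurize(text):
    """Map a complaint to a vector of per-category lexicon word counts."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [sum(tok in LEXICON[cat] for tok in tokens) for cat in CATEGORIES]


class MultinomialNB:
    """Multinomial Naive Bayes with additive (Laplace) smoothing."""

    def fit(self, X, y, alpha=1.0):
        n_features = len(X[0])
        self.classes = sorted(set(y))
        label_counts = Counter(y)
        self.log_prior = {c: math.log(label_counts[c] / len(y)) for c in self.classes}
        # Sum the feature counts of all training vectors in each class.
        totals = {c: [0.0] * n_features for c in self.classes}
        for vec, label in zip(X, y):
            for j, value in enumerate(vec):
                totals[label][j] += value
        # Smoothed log-likelihood of each feature given the class.
        self.log_lik = {}
        for c in self.classes:
            denom = sum(totals[c]) + alpha * n_features
            self.log_lik[c] = [math.log((t + alpha) / denom) for t in totals[c]]
        return self

    def predict(self, vec):
        def score(c):
            return self.log_prior[c] + sum(v * w for v, w in zip(vec, self.log_lik[c]))
        return max(self.classes, key=score)


def accuracy_and_sensitivity(y_true, y_pred, positive):
    """Return (accuracy, sensitivity); assumes at least one actual positive."""
    pairs = list(zip(y_true, y_pred))
    correct = sum(t == p for t, p in pairs)
    true_pos = sum(t == positive and p == positive for t, p in pairs)
    false_neg = sum(t == positive and p != positive for t, p in pairs)
    return correct / len(pairs), true_pos / (true_pos + false_neg)


# Invented toy complaints for illustration only (not PARS data).
TRAIN = [
    ("The floor in the hallway was wet and I felt unsafe", "SAFETY OF ENVIRONMENT"),
    ("The room was dirty and the floor was wet", "SAFETY OF ENVIRONMENT"),
    ("The doctor was rude and dismissive", "OTHER"),
    ("The nurse yelled at me and I was angry", "OTHER"),
]
model = MultinomialNB().fit(
    [featurize(text) for text, _ in TRAIN],
    [label for _, label in TRAIN],
)
prediction = model.predict(featurize("I was scared because the hallway floor was dirty"))
print(prediction)
```

On this toy data the unseen complaint is assigned to SAFETY OF ENVIRONMENT because its environment and anxiety counts resemble those of the two training complaints with that label. A realistic system would replace the toy lexicon with the full lexicon categories and evaluate on held-out complaints.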