Text-based Classification as a Data Mining Technique

The Case of Consumer Complaints Dataset

Authors

  • Yara Salman, Department of Computer Engineering and Automatic Control, Faculty of Mechanical and Electrical Engineering, Latakia University (formerly Tishreen University), Latakia, Syria. https://orcid.org/0009-0008-2496-3043

Keywords:

Data mining, consumer complaint database, text classification, SVM classifier.

Abstract

Text-based classification is a data mining technique that assigns documents to categories based on their textual content. Research into methods for identifying the categories present in a set of input data is ongoing. In this paper, we implement a support vector machine (SVM) model to classify the text of consumer complaints in the Consumer Financial Protection Bureau (CFPB) database.

The data was preprocessed and then split into training and test sets, after which features were extracted in preparation for model building. A set of tests was conducted to validate the selected classifier, and experimental results showed that the SVM classifier achieved an accuracy of 99.576% on the training data and 82.72% on the test data. The proposed model offers an incremental approach to text classification because it dynamically trains the classifier on new sets of user-provided data.
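
The workflow summarized above (preprocessing, a train/test split, feature extraction, SVM training, and incremental updates from new data) can be sketched in a few lines of Python. The abstract does not state the tokenizer, the feature representation, the SVM variant, or the software used, so everything in the sketch is an assumption made for illustration: the toy complaints and product labels, the STOPWORDS list, the term-count features, the hypothetical IncrementalLinearSVM class, and its lr, reg, and epochs parameters. It trains a one-vs-rest linear SVM by stochastic subgradient descent on the L2-regularized hinge loss and uses only the standard library.

```python
import random
import re

# Assumed, deliberately tiny stopword list (not taken from the paper).
STOPWORDS = {"a", "an", "the", "my", "on", "was", "during", "by"}

def preprocess(text):
    """Lowercase, keep alphabetic tokens, drop stopwords, return term counts."""
    counts = {}
    for token in re.findall(r"[a-z]+", text.lower()):
        if token not in STOPWORDS:
            counts[token] = counts.get(token, 0) + 1
    return counts

class IncrementalLinearSVM:
    """Hypothetical one-vs-rest linear SVM fitted by hinge-loss subgradient steps."""

    def __init__(self, lr=0.1, reg=0.001):
        self.lr, self.reg = lr, reg
        self.weights = {}  # label -> {token: weight}

    def _score(self, label, feats):
        w = self.weights[label]
        return sum(w.get(tok, 0.0) * val for tok, val in feats.items())

    def partial_fit(self, texts, labels, epochs=20, seed=0):
        """Update the model with a new batch; unseen labels get a new classifier."""
        for label in labels:
            self.weights.setdefault(label, {})
        batch = [(preprocess(t), y) for t, y in zip(texts, labels)]
        rng = random.Random(seed)
        for _ in range(epochs):
            rng.shuffle(batch)
            for feats, true_label in batch:
                for label, w in self.weights.items():
                    y = 1.0 if label == true_label else -1.0
                    violated = y * self._score(label, feats) < 1.0
                    for tok in w:  # L2 regularization: shrink every weight
                        w[tok] *= 1.0 - self.lr * self.reg
                    if violated:  # margin below 1: take a hinge subgradient step
                        for tok, val in feats.items():
                            w[tok] = w.get(tok, 0.0) + self.lr * y * val
        return self

    def predict(self, text):
        feats = preprocess(text)
        return max(self.weights, key=lambda label: self._score(label, feats))

def accuracy(model, samples):
    """Fraction of (text, label) pairs that the model labels correctly."""
    hits = sum(model.predict(text) == label for text, label in samples)
    return hits / len(samples)

# Toy, hand-split data standing in for CFPB complaint narratives.
TRAIN = [
    ("Unauthorized purchase on my credit card statement", "Credit card"),
    ("The credit card interest rate was raised", "Credit card"),
    ("Mortgage servicer misapplied my escrow payment", "Mortgage"),
    ("Foreclosure started during my mortgage modification", "Mortgage"),
]
TEST = [
    ("Credit card company raised my rate", "Credit card"),
    ("Escrow shortage on my mortgage", "Mortgage"),
]
# A later, user-provided batch that introduces a new category.
NEW_BATCH = [
    ("My student loan deferment request was denied", "Student loan"),
    ("Student loan repayment plan was calculated wrong", "Student loan"),
]

model = IncrementalLinearSVM()
model.partial_fit([t for t, _ in TRAIN], [y for _, y in TRAIN])
initial_test_accuracy = accuracy(model, TEST)

# Incremental step: train further on the new batch without refitting from scratch.
model.partial_fit([t for t, _ in NEW_BATCH], [y for _, y in NEW_BATCH])
print(initial_test_accuracy, model.predict("Loan deferment denied"))
```

On real complaint narratives a library implementation would be the more usual choice, for example scikit-learn's TfidfVectorizer with LinearSVC, or SGDClassifier with a hinge loss and partial_fit for incremental updates. The sketch only mirrors the shape of the workflow and says nothing about how the reported accuracies were obtained.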

Published

2025-10-05