Klasifikasi Pertanyaan Berbahasa Indonesia Menggunakan Algoritma Support Vector Machine dan Seleksi Fitur Mutual Information

Syechky Al Qodrin Aruda; Novi Yusliani; Alvi Syahrini

doi:10.5281./4796/5.jupiter.2022.10

Classification of Indonesian Questions Using the Support Vector Machine Algorithm and Mutual Information Feature Selection

Authors

Syechky Al Qodrin Aruda, Universitas Sriwijaya
Novi Yusliani, Universitas Sriwijaya
Alvi Syahrini, Universitas Sriwijaya

DOI:

https://doi.org/10.5281./4796/5.jupiter.2022.10

Abstract

Text classification is used to organize, arrange, and categorize text. It can be applied to any text document, even one with a large number of features. However, a large number of features can reduce the accuracy of a classification system, because some features have little relevance to a text category. In this study, the Mutual Information feature selection method is combined with the Support Vector Machine (SVM) algorithm to improve classification performance on Indonesian question documents by eliminating features whose weights fall below a threshold. The results show that applying Mutual Information feature selection to the SVM classifier produced the best performance, with an accuracy of 0.92, precision of 0.93, recall of 0.89, F-measure of 0.9, a computation time of 7 s, and 240 features.
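The thresholding step described in the abstract can be illustrated with a minimal sketch. It assumes a common definition of mutual information for text classification, I(T; C), where T indicates whether a term occurs in a question and C is the question's class. The abstract does not give the exact formula, the threshold value, or the question categories used in the paper, so the function names, the toy questions, the labels `PERSON` and `LOCATION`, and the threshold of 0.5 are all hypothetical.

```python
import math
from collections import Counter


def mutual_information(docs, labels):
    """Score each term by I(T; C) in bits.

    T = 1 if the term occurs in a question, else 0; C = the class label.
    This is one standard formulation, assumed here for illustration only.
    """
    n = len(docs)
    class_count = Counter(labels)
    present = Counter()  # (term, label) -> number of questions containing term
    vocab = set()
    for tokens, label in zip(docs, labels):
        for term in set(tokens):
            present[(term, label)] += 1
            vocab.add(term)

    scores = {}
    for term in vocab:
        n_term = sum(present[(term, c)] for c in class_count)
        mi = 0.0
        for c, n_c in class_count.items():
            n_with = present[(term, c)]   # questions of class c containing term
            n_without = n_c - n_with      # questions of class c lacking term
            for n_joint, n_t in ((n_with, n_term), (n_without, n - n_term)):
                if n_joint:  # a zero joint count contributes nothing
                    mi += (n_joint / n) * math.log2(n_joint * n / (n_t * n_c))
        scores[term] = mi
    return scores


def select_features(scores, threshold):
    """Keep only the terms whose score reaches the threshold."""
    return {term for term, score in scores.items() if score >= threshold}


# Hypothetical toy data: tokenized Indonesian questions with made-up labels.
docs = [
    ["siapa", "presiden", "indonesia"],
    ["siapa", "penemu", "listrik"],
    ["dimana", "ibu", "kota", "indonesia"],
    ["dimana", "letak", "candi", "borobudur"],
]
labels = ["PERSON", "PERSON", "LOCATION", "LOCATION"]

scores = mutual_information(docs, labels)
selected = select_features(scores, threshold=0.5)  # threshold is an assumption
```

In this toy set the question words "siapa" and "dimana" separate the two classes perfectly and are kept, while "indonesia" occurs equally in both classes and is dropped. In a full pipeline the retained terms would form the feature vectors passed to an SVM classifier, which this sketch does not include.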

Keywords: Text Classification, Feature Selection, Support Vector Machine, Mutual Information

Published

2022-10-26

How to Cite

al qodrin aruda, syechky, Yusliani, N., & Syahrini, A. (2022). Classification of Indonesian Questions Using the Support Vector Machine Algorithm and Mutual Information Feature Selection. JUPITER: Jurnal Penelitian Ilmu Dan Teknologi Komputer, 14(2-a), 44–52. https://doi.org/10.5281./4796/5.jupiter.2022.10