Efficient Parameter Selection for SVM: The Case of Business Intelligence Categorization

Keywords

Business intelligence; Gaussian kernels; Intelligence categorization; Parameter selection; Percentiles; Support Vector Machines; SVM

Abstract

Support Vector Machines (SVM) is a widely used technique for classifying high-dimensional data, especially in security and intelligence categorization. However, the performance of SVM can be adversely affected by poorly selected parameter values. Current approaches to SVM parameter selection mainly rely on extensive cross-validation or anecdotal information, which can be inefficient and ineffective. In this research, we propose an efficient algorithm called Percentile-SVM (P-SVM) for selecting the parameter pair, (γ, C), of SVM with Gaussian kernels on metric data. P-SVM searches only a handful of percentiles of the squared Euclidean distances of data points to select the best pair of parameter values. To validate the algorithm, we applied P-SVM to categorizing business intelligence factors extracted from 6,859 sentences of 231 online news articles about four major companies in the information technology sector. The results show that P-SVM achieved a significant improvement in precision, recall, F-measure, and AUC over the LibSVM package (with default parameter values) used in WEKA, a widely used data mining software. These findings provide useful implications for relevant research and security informatics applications.
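
The abstract does not state which percentiles P-SVM examines or how a percentile is turned into a value of γ. As a rough illustration of the general idea only, the sketch below computes the pairwise squared Euclidean distances of a small data set, takes a few percentiles of them, and maps each percentile d to a candidate γ = 1/d for a Gaussian kernel of the form exp(-γ‖x − y‖²). The percentile choices (10, 50, 90), the 1/d mapping, and the function names are all assumptions made for this example, not details taken from the paper.

```python
from itertools import combinations


def squared_distances(points):
    """Return all pairwise squared Euclidean distances between points."""
    return [
        sum((a - b) ** 2 for a, b in zip(p, q))
        for p, q in combinations(points, 2)
    ]


def percentile(sorted_values, pct):
    """Linear-interpolation percentile of an already sorted list (0 <= pct <= 100)."""
    if not sorted_values:
        raise ValueError("need at least one value")
    rank = (len(sorted_values) - 1) * pct / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = rank - lower
    return sorted_values[lower] * (1 - frac) + sorted_values[upper] * frac


def candidate_gammas(points, percentiles=(10, 50, 90)):
    """Map each chosen percentile d of the squared distances to gamma = 1 / d.

    Assumption: both the percentile choices and the 1/d mapping are
    illustrative; the abstract does not specify them.
    """
    # Drop zero distances (duplicate points) so 1 / d is always defined.
    dists = sorted(d for d in squared_distances(points) if d > 0)
    return {p: 1.0 / percentile(dists, p) for p in percentiles}


# Hypothetical toy data: four points in the plane.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 4.0)]
gammas = candidate_gammas(points)
for pct, gamma in sorted(gammas.items()):
    print(f"percentile {pct}: gamma = {gamma:.4f}")
```

In a full parameter search, each candidate γ would then be paired with a few values of C and scored on held-out data. That step is omitted here because the abstract gives no detail on it.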

Publication Date

8-8-2017

Publication Title

2017 IEEE International Conference on Intelligence and Security Informatics: Security and Big Data, ISI 2017

Number of Pages

158-160

Document Type

Article; Proceedings Paper

Personal Identifier

scopus

DOI Link

https://doi.org/10.1109/ISI.2017.8004897

Scopus ID

85030265113 (Scopus)

Source API URL

https://api.elsevier.com/content/abstract/scopus_id/85030265113
