Journal of Computer and Communications

Volume 5, Issue 7 (May 2017)

ISSN Print: 2327-5219   ISSN Online: 2327-5227

Google-based Impact Factor: 1.34

Quality Assessment of Training Data with Uncertain Labels for Classification of Subjective Domains

PP. 152-168
DOI: 10.4236/jcc.2017.57014
ABSTRACT

To improve the performance of classifiers in subjective domains, this paper defines a metric, the quality of subjectively labelled training data (QoSTD), computed by means of K-means clustering. The QoSTD is then used as a weight on the predicted class scores to adjust the likelihoods of instances. In addition, two measurements are defined to assess the performance of classifiers trained on subjectively labelled data. To verify the effectiveness of the proposed method, binary classifiers of Traditional Chinese Medicine (TCM) Zhengs are trained and retrained on a real-world data set using support vector machine (SVM) and discriminant analysis (DA) models. The experimental results show that the consistency of the likelihoods of instances with the corresponding observations increases notably for the classes, especially when the training data set has a relatively low QoSTD. The results also indicate how mislabelled instances can be eliminated from the training data set in order to retrain classifiers in subjective domains.
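
The abstract does not give the definition of the QoSTD, so the following is only an illustrative sketch of the general idea of scoring label quality by clustering and then weighting class scores. It assumes cluster purity (the fraction of instances whose label matches the majority label of their K-means cluster) as a stand-in quality measure, and a simple multiplicative weighting. The names `cluster_purity` and `weight_scores`, the toy data, and both of these choices are hypothetical and are not taken from the paper.

```python
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm with deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next centroid: the point farthest from all centroids chosen so far.
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    assign = []
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        assign = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        # Move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return assign


def cluster_purity(points, labels, k=2):
    """ASSUMED quality measure: share of labels agreeing with their cluster majority."""
    assign = kmeans(points, k)
    agree = 0
    for j in range(k):
        cluster_labels = [y for y, a in zip(labels, assign) if a == j]
        if cluster_labels:
            agree += Counter(cluster_labels).most_common(1)[0][1]
    return agree / len(points)


def weight_scores(scores, quality):
    """ASSUMED adjustment: scale predicted class scores by the quality value."""
    return [quality * s for s in scores]


# Toy data: two well-separated groups, with one label that disagrees with its group.
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
toy_labels = [1, 1, 0, 0, 0, 0]
toy_quality = cluster_purity(toy_points, toy_labels, k=2)
toy_adjusted = weight_scores([0.9, 0.3], toy_quality)
```

In this toy case, five of the six labels agree with the majority label of their cluster, so the stand-in quality value is 5/6 and the example scores are scaled down accordingly. A lower value would indicate that the labels follow the cluster structure less closely.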

Cite this paper

Dai, Y. (2017) Quality Assessment of Training Data with Uncertain Labels for Classification of Subjective Domains. Journal of Computer and Communications, 5, 152-168. doi: 10.4236/jcc.2017.57014.

Cited by

[1] Complexity perception classification method for tongue constitution recognition
Artificial Intelligence in Medicine, 2019

Copyright © 2020 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.