ABSTRACT
Rule-based classification systems are widely used in real-world applications because rules are easy to interpret. Many traditional rule-based classifiers prefer small rule sets to large ones, but small classifiers are sensitive to missing values in unseen test data. In this paper, we present a larger classifier that is less sensitive to missing values in unseen test data. We show experimentally that it is more accurate than some benchmark classifiers when the unseen test data have missing values.
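To illustrate why a small rule set can be sensitive to missing values, here is a minimal toy sketch. It is not the classifier presented in the paper: the rule representation, the attribute names, the confidence values, and the helper functions `matches` and `classify` are all assumptions made up for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical representation: (conditions, predicted class, confidence).
Rule = Tuple[Dict[str, str], str, float]
# A record maps attribute names to values; None marks a missing value.
Record = Dict[str, Optional[str]]


def matches(rule: Rule, record: Record) -> bool:
    """A rule fires only if every attribute it tests is present and equal."""
    conditions, _, _ = rule
    return all(record.get(attr) == value for attr, value in conditions.items())


def classify(rules: List[Rule], record: Record, default: str) -> str:
    """Predict with the highest-confidence matching rule, else the default class."""
    candidates = [rule for rule in rules if matches(rule, record)]
    if not candidates:
        return default
    return max(candidates, key=lambda rule: rule[2])[1]


# Made-up rules: the small set keeps one rule, the larger set keeps an
# additional rule that reaches the same class through other attributes.
small_rules: List[Rule] = [({"outlook": "sunny"}, "play", 0.90)]
large_rules: List[Rule] = small_rules + [
    ({"humidity": "normal", "wind": "weak"}, "play", 0.85),
]

complete: Record = {"outlook": "sunny", "humidity": "normal", "wind": "weak"}
missing: Record = {"outlook": None, "humidity": "normal", "wind": "weak"}

print(classify(small_rules, complete, default="no_play"))
print(classify(small_rules, missing, default="no_play"))
print(classify(large_rules, missing, default="no_play"))
```

In this toy setting, the small rule set falls back to the default class once `outlook` is missing, while the larger set still has a rule that fires on the remaining attributes. How the paper actually selects the additional rules is not stated in the abstract.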
Index Terms
- Using association rules to make rule-based classifiers robust