New Attribute Construction In Mixed Datasets Using Clustering Algorithms

Research Article
Sagunthaladevi.S and Bhupathi Raju Venkata Rama Raju
DOI: xxx-xxxxx-xxxx
Subject: science
KeyWords: Clustering, Classification, Prediction, Clustering Algorithm for Mixed Dataset (CAMD), Clustering Algorithm for Categorical Dataset (CACD), Clustering Algorithm for Numerical Dataset (CAND).
Abstract: 

Classification is a challenging task in data mining. Its main aim is to build a classifier from a set of cases described either by several attributes of the objects or by one attribute for the group of the objects. Similar data are then grouped, and the items in a collection are assigned to target categories or classes. Finally, the classifier is used to predict the group attribute of new cases from the domain based on the values of their other attributes. Various classification algorithms have been developed for this purpose. However, those algorithms work effectively on either purely numerical or purely categorical data, and most of them perform poorly on data that mixes categorical and numerical types. Previous classification algorithms also do not handle outliers well. To overcome these disadvantages, this paper presents the Clustering Algorithm for Mixed Dataset (CAMD). CAMD is divided into two algorithms, CACD for categorical data and CAND for numerical data, which process the two kinds of attributes separately to improve clustering performance.
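
The idea of splitting a mixed dataset by attribute type and clustering each part separately can be illustrated with a minimal sketch. This is not the CAMD, CACD, or CAND procedure itself, whose details the abstract does not give. As assumptions, a basic k-means loop stands in for the numerical part, a k-modes style loop with simple matching distance stands in for the categorical part, and the two resulting cluster labels are appended to each record as new attributes. All function names and the sample records are hypothetical.

```python
from collections import Counter

def split_by_type(rows):
    """Split each record into its numerical part and its categorical part."""
    first = rows[0]
    num_cols = [i for i, v in enumerate(first)
                if isinstance(v, (int, float)) and not isinstance(v, bool)]
    cat_cols = [i for i in range(len(first)) if i not in num_cols]
    num = [tuple(float(r[i]) for i in num_cols) for r in rows]
    cat = [tuple(r[i] for i in cat_cols) for r in rows]
    return num, cat

def cluster(points, k, distance, center_of, iterations=10):
    """Assign each point to its nearest center, then recompute the centers."""
    centers = []
    for p in points:  # deterministic start: the first k distinct points
        if p not in centers:
            centers.append(p)
        if len(centers) == k:
            break
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [min(range(len(centers)), key=lambda c: distance(p, centers[c]))
                  for p in points]
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = center_of(members)
    return labels

def sq_euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(members):
    return tuple(sum(col) / len(col) for col in zip(*members))

def mismatches(a, b):
    # Simple matching distance: number of attributes whose values differ.
    return sum(x != y for x, y in zip(a, b))

def mode(members):
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*members))

def add_cluster_attributes(rows, k=2):
    """Append one numerical and one categorical cluster label to each record."""
    num, cat = split_by_type(rows)
    num_labels = cluster(num, k, sq_euclidean, mean)    # stand-in for the numerical part
    cat_labels = cluster(cat, k, mismatches, mode)      # stand-in for the categorical part
    return [tuple(r) + (n, c) for r, n, c in zip(rows, num_labels, cat_labels)]

# Hypothetical mixed records: two numerical and two categorical attributes.
rows = [
    (1.0, 2.0, "red", "small"),
    (1.2, 1.9, "red", "small"),
    (8.0, 9.0, "blue", "large"),
    (8.3, 9.1, "blue", "large"),
]
for record in add_cluster_attributes(rows):
    print(record)
```

Each printed record carries its original four attributes followed by the two constructed labels. In this toy data the first two records share both labels, and the last two share a different pair, so the new attributes separate the two groups.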