Abstract
Several approaches to outlier detection are used across many fields of study, of which distance-based and density-based techniques have attracted the most attention from researchers. We therefore use a hybrid of these two methods. The existing system uses a distance-based method for outlier detection and k-means for clustering. The distance-based method, however, fails on non-uniform datasets, and k-means requires the number of clusters as input, which is difficult to choose for real-life datasets containing millions of attributes and rows. The proposed model addresses both limitations by combining distance-based and density-based outlier detection with the weighted Squeezer method for clustering. Most existing models handle only datasets of a single attribute type, whereas this project deals with mixed datasets. Future work will extend the model to handle dynamic data.
Keywords: outlier detection, weighted Squeezer clustering, hybrid, distance-based, density-based, k-means, mixed datasets.
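As an illustration of what a hybrid of distance-based and density-based outlier detection can look like, the following minimal sketch scores each point in two ways and averages the ranks. It is not the model proposed in this paper: the function names, the choice of the k-th nearest-neighbour distance as the distance score, the simplified LOF-style density ratio, and the rank-averaging step are all assumptions made for the example. It also handles numeric attributes only and leaves out the weighted Squeezer clustering step.

```python
import math


def knn(points, k):
    """For each point, return its k nearest neighbours as sorted (distance, index) pairs."""
    result = []
    for i, p in enumerate(points):
        pairs = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        result.append(pairs[:k])
    return result


def ranks(scores):
    """Rank position of each score (0 = smallest); ties are broken by index."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    position = [0] * len(scores)
    for pos, i in enumerate(order):
        position[i] = pos
    return position


def hybrid_outlier_scores(points, k=3, eps=1e-12):
    """Hypothetical hybrid score in [0, 1]; higher means more outlying.

    Assumes len(points) > k >= 1 and purely numeric coordinates.
    """
    neigh = knn(points, k)
    # Distance-based score: distance to the k-th nearest neighbour.
    dist_score = [pairs[-1][0] for pairs in neigh]
    # Density-based score: a point's mean k-NN distance divided by the
    # average of its neighbours' mean k-NN distances (simplified LOF-style ratio).
    mean_d = [sum(d for d, _ in pairs) / k for pairs in neigh]
    dens_score = [
        mean_d[i] / (sum(mean_d[j] for _, j in pairs) / k + eps)
        for i, pairs in enumerate(neigh)
    ]
    # Combine the two views by averaging their ranks and normalising to [0, 1].
    r_dist, r_dens = ranks(dist_score), ranks(dens_score)
    n = len(points)
    return [(r_dist[i] + r_dens[i]) / (2 * (n - 1)) for i in range(n)]


# Example: a tight cluster plus one isolated point (index 5).
data = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (10, 10)]
scores = hybrid_outlier_scores(data, k=2)
# The isolated point ranks highest under both criteria, so it has the top score.
```

Rank averaging is used here only because it makes the two scores comparable without tuning weights; a weighted sum or a two-stage filter would be equally plausible ways to combine them.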