Research Article

A STUDY ON DETERMINATION OF OUTLIER OBSERVATIONS BY USING CHI-SQUARE THRESHOLD VALUE

Year 2019, Volume: 2 Issue: 1, 7 - 10, 01.01.2019

Abstract

Outlier observations are observations that deviate from the general tendency of the other observations in a data set. They arise in situations such as faulty observation or incorrect data entry. Identifying them is important because the results of statistical analyses, for example multiple regression analysis, can be quite sensitive to them. Outlier observations are mostly determined with distance-based, statistical-test-based, and density-based approaches. In this study, the distance of each observation vector to the center of the data was calculated as the Mahalanobis distance using the R program. For this purpose, hematocrit (htc), hemoglobin (hgb), mean platelet volume (mpv), platelet distribution width (pdw), nonbacterial prostatitis (nbp), and pulse pressure values measured in the blood of 315 heart patients were examined as the data set. As a result of the research, sixteen observations were identified as outliers. The results of this study are expected to help researchers who need to identify outlier observations.
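
The abstract does not spell out the computation, so the following is only a minimal sketch of one common way to pair the Mahalanobis distance with a chi-square threshold, not the authors' R code. It assumes that the squared distance of each observation from the sample mean is compared with a chi-square quantile whose degrees of freedom equal the number of variables, and that the quantile level is 0.975; both are assumptions, since the abstract states neither. All function names (`flag_outliers`, `chi2_quantile`, and so on) are hypothetical, the data are synthetic rather than the study's patient data, and only the Python standard library is used.

```python
import math
import random


def mean_vector(rows):
    """Column means of a list of equal-length observation vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def covariance_matrix(rows, center):
    """Sample covariance matrix (divisor n - 1)."""
    n, p = len(rows), len(center)
    cov = [[0.0] * p for _ in range(p)]
    for row in rows:
        dev = [x - m for x, m in zip(row, center)]
        for i in range(p):
            for j in range(p):
                cov[i][j] += dev[i] * dev[j] / (n - 1)
    return cov


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting.

    Assumes the matrix is nonsingular (no variable is an exact linear
    combination of the others).
    """
    p = len(rhs)
    aug = [list(matrix[i]) + [rhs[i]] for i in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, p):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, p + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * p
    for i in reversed(range(p)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, p))
        x[i] = (aug[i][p] - tail) / aug[i][i]
    return x


def squared_mahalanobis(row, center, cov):
    """(x - m)^T S^{-1} (x - m) for one observation vector."""
    dev = [x - m for x, m in zip(row, center)]
    return sum(d * s for d, s in zip(dev, solve(cov, dev)))


def chi2_cdf(x, df):
    """Chi-square CDF via the series for the regularized lower incomplete gamma."""
    if x <= 0:
        return 0.0
    a, half = df / 2.0, x / 2.0
    term = total = 1.0 / a
    n = 0
    while term > total * 1e-15:
        n += 1
        term *= half / (a + n)
        total += term
    return total * math.exp(a * math.log(half) - half - math.lgamma(a))


def chi2_quantile(prob, df):
    """Invert chi2_cdf by bisection (prob must lie strictly between 0 and 1)."""
    lo, hi = 0.0, float(df)
    while chi2_cdf(hi, df) < prob:
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def flag_outliers(rows, prob=0.975):
    """Indices of rows whose squared distance exceeds the chi-square cutoff.

    prob=0.975 is an assumed level, not a value taken from the study.
    """
    center = mean_vector(rows)
    cov = covariance_matrix(rows, center)
    cutoff = chi2_quantile(prob, len(center))
    return [i for i, row in enumerate(rows)
            if squared_mahalanobis(row, center, cov) > cutoff]


# Synthetic stand-in data (NOT the study's patient data): 200 correlated
# bivariate points plus one planted point far from the bulk (index 200).
random.seed(0)
data = []
for _ in range(200):
    u = random.gauss(0.0, 1.0)
    data.append([u, 0.8 * u + random.gauss(0.0, 0.6)])
data.append([6.0, -6.0])

flagged = flag_outliers(data)
print(len(flagged), 200 in flagged)
```

With six variables, as in the study, the same `flag_outliers` call would use six degrees of freedom for the cutoff. In R, the built-in `mahalanobis()` and `qchisq()` functions cover the distance and the quantile, so the hand-written helpers above would not be needed there.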


Details

Primary Language English
Subjects Engineering
Section Research Articles
Authors

Fahrettin Kaya 0000-0003-1666-4859

Esra Yavuz

Şeyma Koç

Ömer Faruk Karaokur

Publication Date January 1, 2019
Submission Date October 13, 2018
Acceptance Date November 18, 2018
Published in Issue Year 2019 Volume: 2 Issue: 1

Cite

APA Kaya, F., Yavuz, E., Koç, Ş., Karaokur, Ö. F. (2019). A Study on Determination of Outlier Observations by Using Chi-Square Threshold Value. Black Sea Journal of Engineering and Science, 2(1), 7-10.
AMA Kaya F, Yavuz E, Koç Ş, Karaokur ÖF. A Study on Determination of Outlier Observations by Using Chi-Square Threshold Value. BSJ Eng. Sci. January 2019;2(1):7-10.
Chicago Kaya, Fahrettin, Esra Yavuz, Şeyma Koç, and Ömer Faruk Karaokur. “A Study on Determination of Outlier Observations by Using Chi-Square Threshold Value”. Black Sea Journal of Engineering and Science 2, no. 1 (January 2019): 7-10.
EndNote Kaya F, Yavuz E, Koç Ş, Karaokur ÖF (01 January 2019) A Study on Determination of Outlier Observations by Using Chi-Square Threshold Value. Black Sea Journal of Engineering and Science 2 1 7–10.
IEEE F. Kaya, E. Yavuz, Ş. Koç, and Ö. F. Karaokur, “A Study on Determination of Outlier Observations by Using Chi-Square Threshold Value”, BSJ Eng. Sci., vol. 2, no. 1, pp. 7–10, 2019.
ISNAD Kaya, Fahrettin et al. “A Study on Determination of Outlier Observations by Using Chi-Square Threshold Value”. Black Sea Journal of Engineering and Science 2/1 (January 2019), 7-10.
JAMA Kaya F, Yavuz E, Koç Ş, Karaokur ÖF. A Study on Determination of Outlier Observations by Using Chi-Square Threshold Value. BSJ Eng. Sci. 2019;2:7–10.
MLA Kaya, Fahrettin et al. “A Study on Determination of Outlier Observations by Using Chi-Square Threshold Value”. Black Sea Journal of Engineering and Science, vol. 2, no. 1, 2019, pp. 7-10.
Vancouver Kaya F, Yavuz E, Koç Ş, Karaokur ÖF. A Study on Determination of Outlier Observations by Using Chi-Square Threshold Value. BSJ Eng. Sci. 2019;2(1):7-10.
