Machine Learning, Support Vector Machine, Text Mining, Standard File Plan, Document Classification
Technological advances and growth in the production of records have made new management methods necessary. Records produced in public institutions in Turkey are organized and managed according to the Standard File Plan. Under the relevant law, it is mandatory to determine the subject of official correspondence from the File Plan and to add the corresponding codes to the records. Selecting these codes correctly is essential for the sound operation of research and investigation processes and for successful access to records. However, incorrect codes are assigned because of institutional, personal, or managerial conditions, and this interrupts the life cycle of records. Artificial intelligence applications can be used to minimize such errors and to make records classification more robust.
This study aims to assign Standard File Plan codes automatically, using a machine learning approach, to the records produced in electronic records management systems. It consists of two parts, one theoretical and one analytical. First, the difficulties of automatic record classification using the Standard File Plan are discussed in theory. Then the classification of records with machine learning is analyzed. The failure to overcome various administrative barriers and prejudices, together with the absence of an authoritative unit such as an institutional archive responsible for records management, training, and auditing, creates a gap that hampers automatic classification. The resulting need to reclassify records made it necessary to work with a small dataset. For this reason, the records analyzed in the study are those sent to the researcher within the institution over the last six months. After a total of 265 records were reclassified, records on unique subjects were excluded. Applying text mining techniques to the body and subject fields of the records yielded a dataset of 169 records. From this dataset, one-third (1/3) of the records were randomly selected, keeping each subject proportionally represented. The Support Vector Machine (SVM) algorithm, which is used in machine learning and has recently become popular in the information sector, was run on this dataset of 112 classified records and 57 unclassified records. When the manual and automatic classifications were compared, the accuracy rate was 87.72%. In other words, 87.72% of the records were classified correctly with the machine learning approach.
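The proportional one-third selection and the accuracy comparison described above can be illustrated with a minimal Python sketch. It is not the study's implementation: the function names `stratified_holdout` and `accuracy`, the file plan codes, and the rounding rule for each subject's share are assumptions made for illustration, and the SVM training step itself is left out. As a side note, an accuracy of 87.72% on 57 records is arithmetically consistent with 50 correct assignments (50/57 ≈ 0.8772), although the abstract does not state that count.

```python
import random
from collections import defaultdict


def stratified_holdout(labels, test_fraction=1 / 3, seed=0):
    """Split record indices so each subject contributes about test_fraction
    of its records to the held-out set (hypothetical helper)."""
    rng = random.Random(seed)
    by_subject = defaultdict(list)
    for index, label in enumerate(labels):
        by_subject[label].append(index)

    train, test = [], []
    for label in sorted(by_subject):
        indices = by_subject[label][:]
        rng.shuffle(indices)
        # Assumption: round each subject's share, holding out at least one record.
        n_test = max(1, round(len(indices) * test_fraction))
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


def accuracy(manual_codes, predicted_codes):
    """Share of records whose predicted code matches the manual code."""
    correct = sum(m == p for m, p in zip(manual_codes, predicted_codes))
    return correct / len(manual_codes)


# Hypothetical file plan codes for 18 records across three subjects.
codes = ["A"] * 6 + ["B"] * 3 + ["C"] * 9
train_idx, test_idx = stratified_holdout(codes)
print(len(train_idx), len(test_idx))

# Illustrative comparison: 50 matching codes out of 57 held-out records.
manual = ["A"] * 57
predicted = ["A"] * 50 + ["B"] * 7
print(f"{accuracy(manual, predicted):.2%}")
```

In a full pipeline, the records at `train_idx` would be used to fit a classifier and the records at `test_idx` would be compared against its predictions with `accuracy`.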
Machine Learning, Support Vector Machine, Text Mining, Standard File Plan, Record Classification
Primary Language | Turkish |
---|---|
Subjects | Library and Information Studies |
Journal Section | Peer-Reviewed Articles |
Authors | |
Publication Date | December 31, 2019 |
Submission Date | December 3, 2019 |
Published in Issue | Year 2019 |