Research Article
Year 2024, Volume: 7 Issue: 2, 220 - 225, 18.12.2024
https://doi.org/10.54565/jphcfum.1501651

Extracting Meaningful Information from Turkish Chemistry and Physics Texts with Machine Learning

Abstract

This study addresses the processing of Turkish chemistry and physics texts by artificial intelligence systems, using a dataset compiled by the authors. A model is proposed to pave the way for artificial intelligence-based analysis and discovery in these basic sciences. Chemistry and physics are critical to many industrial, medical, and environmental applications, but accessing and understanding information in these fields requires substantial data analysis. The study aims to demonstrate the effectiveness of machine learning methods in extracting meaningful information from Turkish chemistry and physics texts. To this end, the texts are first tokenized, and features are then extracted with the term frequency-inverse document frequency (TF-IDF) and Bag-of-Words (BOW) methods. The combined features are classified separately with the Support Vector Machine (SVM), Naive Bayes (NB), Quadratic Discriminant Analysis (QDA), k-Nearest Neighbors (KNN), Logistic Regression (LR), Random Forest (RF), and Gradient Boosting (GB) algorithms. Among these classifiers, NB gives both the shortest computation time and the highest accuracy, at 95%. Accurate classification of this kind is a prerequisite for artificial intelligence systems to understand and process such information correctly. The results indicate that scientists and researchers can access information faster and accelerate scientific discovery using Turkish sources. Such models can also play a role in education by providing students with a more effective and personalized learning experience. Processing Turkish chemistry and physics texts with artificial intelligence systems therefore helps bring work conducted in this language into global scientific research, education, and industrial applications.
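The pipeline described in the abstract (tokenization, TF-IDF and BOW feature extraction, feature combination, classification) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the four-sentence toy corpus is hypothetical and is not the authors' dataset, the smoothed IDF formula and the multinomial Naive Bayes variant with Laplace smoothing are choices the abstract does not specify, and only NB is shown out of the seven classifiers. The function names are likewise hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus for illustration only; NOT the authors' dataset.
TRAIN = [
    ("asit ve baz tepkimesi sonucu tuz oluşur", "kimya"),
    ("molekül içindeki atomlar kovalent bağ kurar", "kimya"),
    ("kuvvet kütle ile ivmenin çarpımına eşittir", "fizik"),
    ("ışık hızı boşlukta sabittir", "fizik"),
]

def tokenize(text):
    """Lowercase and split into word tokens; \\w is Unicode-aware for str."""
    return re.findall(r"\w+", text.lower())

def build_vocab(docs):
    """Sorted list of distinct tokens across all tokenized documents."""
    return sorted({tok for doc in docs for tok in doc})

def idf_weights(docs, vocab):
    """Smoothed IDF (an assumed variant): ln((1 + n) / (1 + df)) + 1."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    return [math.log((1 + n) / (1 + df[t])) + 1 for t in vocab]

def combined_features(tokens, vocab, idf):
    """Concatenate BOW counts with TF-IDF weights into one feature vector."""
    counts = Counter(tokens)
    bow = [counts[t] for t in vocab]
    tfidf = [c * w for c, w in zip(bow, idf)]
    return bow + tfidf

def train_nb(X, y, alpha=1.0):
    """Multinomial Naive Bayes with Laplace smoothing (assumed variant)."""
    log_prior, log_lik = {}, {}
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        log_prior[c] = math.log(len(rows) / len(X))
        totals = [sum(col) + alpha for col in zip(*rows)]
        denom = sum(totals)
        log_lik[c] = [math.log(t / denom) for t in totals]
    return log_prior, log_lik

def predict_nb(model, x):
    """Return the class with the highest log-posterior score."""
    log_prior, log_lik = model
    scores = {
        c: log_prior[c] + sum(v * w for v, w in zip(x, log_lik[c]))
        for c in log_prior
    }
    return max(scores, key=scores.get)

# Fit the sketch on the toy corpus.
train_tokens = [tokenize(text) for text, _ in TRAIN]
labels = [label for _, label in TRAIN]
vocab = build_vocab(train_tokens)
idf = idf_weights(train_tokens, vocab)
X = [combined_features(toks, vocab, idf) for toks in train_tokens]
model = train_nb(X, labels)

def classify(text):
    """Classify a new sentence as 'kimya' (chemistry) or 'fizik' (physics)."""
    return predict_nb(model, combined_features(tokenize(text), vocab, idf))

if __name__ == "__main__":
    print(classify("asit ve baz"))
    print(classify("kütle ile kuvvet"))
```

In this sketch, words that never occurred in training are simply ignored at prediction time, and Python's default `str.lower()` does not apply Turkish casing rules for dotted and dotless I, so a real Turkish pipeline would likely need locale-aware normalization.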

Details

Primary Language: English
Subjects: Physical Chemistry (Other)
Journal Section: Articles
Authors:

Mücahit Karaduman 0000-0002-8087-4044

Muhammed Yıldırım 0000-0003-1866-4721

Publication Date: December 18, 2024
Submission Date: June 15, 2024
Acceptance Date: July 26, 2024
Published in Issue: Year 2024, Volume: 7 Issue: 2

Cite

APA Karaduman, M., & Yıldırım, M. (2024). Extracting Meaningful Information from Turkish Chemistry and Physics Texts with Machine Learning. Journal of Physical Chemistry and Functional Materials, 7(2), 220-225. https://doi.org/10.54565/jphcfum.1501651
AMA Karaduman M, Yıldırım M. Extracting Meaningful Information from Turkish Chemistry and Physics Texts with Machine Learning. Journal of Physical Chemistry and Functional Materials. December 2024;7(2):220-225. doi:10.54565/jphcfum.1501651
Chicago Karaduman, Mücahit, and Muhammed Yıldırım. “Extracting Meaningful Information from Turkish Chemistry and Physics Texts With Machine Learning”. Journal of Physical Chemistry and Functional Materials 7, no. 2 (December 2024): 220-25. https://doi.org/10.54565/jphcfum.1501651.
EndNote Karaduman M, Yıldırım M (December 1, 2024) Extracting Meaningful Information from Turkish Chemistry and Physics Texts with Machine Learning. Journal of Physical Chemistry and Functional Materials 7 2 220–225.
IEEE M. Karaduman and M. Yıldırım, “Extracting Meaningful Information from Turkish Chemistry and Physics Texts with Machine Learning”, Journal of Physical Chemistry and Functional Materials, vol. 7, no. 2, pp. 220–225, 2024, doi: 10.54565/jphcfum.1501651.
ISNAD Karaduman, Mücahit - Yıldırım, Muhammed. “Extracting Meaningful Information from Turkish Chemistry and Physics Texts With Machine Learning”. Journal of Physical Chemistry and Functional Materials 7/2 (December 2024), 220-225. https://doi.org/10.54565/jphcfum.1501651.
JAMA Karaduman M, Yıldırım M. Extracting Meaningful Information from Turkish Chemistry and Physics Texts with Machine Learning. Journal of Physical Chemistry and Functional Materials. 2024;7:220–225.
MLA Karaduman, Mücahit and Muhammed Yıldırım. “Extracting Meaningful Information from Turkish Chemistry and Physics Texts With Machine Learning”. Journal of Physical Chemistry and Functional Materials, vol. 7, no. 2, 2024, pp. 220-5, doi:10.54565/jphcfum.1501651.
Vancouver Karaduman M, Yıldırım M. Extracting Meaningful Information from Turkish Chemistry and Physics Texts with Machine Learning. Journal of Physical Chemistry and Functional Materials. 2024;7(2):220-5.

© 2018 Journal of Physical Chemistry and Functional Materials (JPCFM). All rights reserved.
For inquiries, submissions, and editorial support, please get in touch with nbulut@firat.edu.tr or visit our website at https://dergipark.org.tr/en/pub/jphcfum.

Published by DergiPark. Proudly supporting the advancement of science and innovation. https://dergipark.org.tr/en/pub/jphcfum