Evaluating ChatGPT’s Effectiveness in Providing Medical Information for Pregnant Women with Rheumatic Diseases

Bahar Özdemir Ulusoy; Can Ozan Ulusoy

doi:10.38136/jgon.1581349

Clinical Research

Year 2025, Volume: 22 Issue: 1, 38 - 44, 22.03.2025

Abstract

Objective:
The growing use of ChatGPT as a source of health information highlights the need to assess its accuracy and adequacy. This study evaluated the accuracy and adequacy of ChatGPT (version 3.5) in responding to frequently asked questions from pregnant women with rheumatic diseases in both Turkish and English, aiming to assess its potential as a reliable source of patient information across languages in rheumatology and maternal-fetal medicine.
Material and Methods:
A total of 36 questions related to pregnancy and rheumatic diseases were obtained from Google and divided into seven subgroups. The questions were posed to ChatGPT in both Turkish and English, and the responses were evaluated on a 4-point scale by a rheumatologist (Expert 1) and a perinatologist (Expert 2). The Mann-Whitney U test was used for statistical analysis (p < 0.05 was considered significant).
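As an illustration of the kind of comparison described above, the following is a minimal sketch of a two-sided Mann-Whitney U test in plain Python. It uses the normal approximation with a tie correction and no continuity correction, which is an assumption; the abstract does not say which software or variant the authors used. The function name `mann_whitney_u` and the rating lists are hypothetical and are not the study's data.

```python
import math
from collections import Counter


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic for sample x, two-sided p-value).
    Assumes at least two observations in total.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    counts = Counter(list(x) + list(y))

    # Assign each distinct value the average of the ranks it occupies.
    # A 4-point scale produces many ties, so this step matters.
    rank_of, next_rank = {}, 1
    for value in sorted(counts):
        t = counts[value]
        rank_of[value] = next_rank + (t - 1) / 2
        next_rank += t

    # U for x is its rank sum minus the smallest possible rank sum.
    u1 = sum(rank_of[v] for v in x) - n1 * (n1 + 1) / 2

    # Variance of U under the null hypothesis, reduced for ties.
    tie_term = sum(t**3 - t for t in counts.values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:  # every observation identical: no evidence of a difference
        return u1, 1.0

    z = (u1 - n1 * n2 / 2) / math.sqrt(var_u)
    return u1, math.erfc(abs(z) / math.sqrt(2))


# Hypothetical ratings for 36 questions per language; NOT the study's data.
turkish_scores = [4] * 26 + [3] * 9 + [2] * 1
english_scores = [4] * 32 + [3] * 4

u, p = mann_whitney_u(turkish_scores, english_scores)
print(f"U = {u:.1f}, p = {p:.3f}")
```

Statistical packages may apply a continuity correction or an exact method by default, so their p-values can differ slightly from this sketch.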
Results:
ChatGPT’s English responses were more accurate and complete than its Turkish responses. In English, 91.6% of answers were rated as correct, compared with 75.0% in Turkish. Expert 1 gave mean scores of 3.64 ± 0.54 for Turkish responses and 3.89 ± 0.31 for English responses, a statistically significant difference (p = 0.023). Expert 2 gave mean scores of 3.83 ± 0.37 for Turkish responses and 3.94 ± 0.23 for English responses, a difference that was not statistically significant (p = 0.136).
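Summary figures of the kind reported above (a mean ± standard deviation and a percentage of top-rated answers) can be computed from a list of ratings as in this minimal sketch. It assumes that "correct" corresponds to the top score of 4 and that the sample standard deviation is used; the abstract states neither. The helper `summarize` and the ratings are hypothetical and are not the study's data.

```python
from statistics import mean, stdev


def summarize(ratings, top_score=4):
    """Return (mean, sample SD, percentage of ratings equal to top_score)."""
    share_top = 100 * sum(r == top_score for r in ratings) / len(ratings)
    return mean(ratings), stdev(ratings), share_top


# Hypothetical ratings for 36 questions; NOT the study's data.
example_ratings = [4] * 30 + [3] * 5 + [2] * 1

m, sd, pct = summarize(example_ratings)
print(f"{m:.2f} ± {sd:.2f}; {pct:.1f}% of answers received the top score")
```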
Conclusion:
ChatGPT shows promise as an accessible source of information for pregnant women with rheumatic diseases, but its non-English responses have limitations. This highlights the need to improve the language-specific training of language models. Further research is recommended to explore the performance of ChatGPT across multiple languages and medical specialties.

Keywords

ChatGPT, Rheumatic diseases, Pregnancy, Language models, Patient education

Ethical Statement

This study did not involve human subjects and was therefore determined to be exempt from IRB review.

Supporting Institution

none

Details

Primary Language	English
Subjects	Obstetrics and Gynaecology, Health Services and Systems (Other)
Journal Section	Research Articles
Authors	Bahar Özdemir Ulusoy (0000-0003-4711-4921); Can Ozan Ulusoy (0009-0005-7931-5172)
Publication Date	March 22, 2025
Submission Date	November 7, 2024
Acceptance Date	December 25, 2024
Published in Issue	Year 2025 Volume: 22 Issue: 1

Cite

Vancouver	Özdemir Ulusoy B, Ulusoy CO. Evaluating ChatGPT’s Effectiveness in Providing Medical Information for Pregnant Women with Rheumatic Diseases. JGON. 2025;22(1):38-44.
