Use of Large Language Models for Medical Synthetic Data Generation in Mental Illness

dc.contributor.author Aygün İ.
dc.contributor.author Kaya M.
dc.date.accessioned 2024-07-22T08:03:04Z
dc.date.available 2024-07-22T08:03:04Z
dc.date.issued 2023
dc.description.abstract Data quantity and quality are very important for the development of medical artificial intelligence research. Nowadays, thanks to easier access to data, studies in this field produce very successful results. However, many factors, such as the protection of patient rights and the confidentiality of personal data, prevent researchers from accessing medical data directly. For this reason, synthetic data generation is often needed both to expand training and test sets and to create sample cases for use in the relevant field. In this study, synthetic patient data are created to be presented to a language model that detects psychological disorders from patient text. Synthetic datasets were produced from 200 artificial patient records created with the popular LLMs ChatGPT and Google Bard. The quality of the synthetic data was measured with a pre-trained BERT model using these datasets. In the experiments, chatbots that generate data on demand, such as ChatGPT and Google Bard, achieved success rates of 89% and 86% with the language representation model. The experimental results suggest that LLMs can provide more successful results than advanced language models in various medical text production tasks. © The Institution of Engineering & Technology 2023.
dc.identifier.DOI-ID 10.1049/icp.2024.1033
dc.identifier.issn 27324494
dc.identifier.uri http://akademikarsiv.cbu.edu.tr:4000/handle/123456789/12106
dc.language.iso English
dc.publisher Institution of Engineering and Technology
dc.subject Computational linguistics
dc.subject Deep learning
dc.subject Diseases
dc.subject ChatGPT
dc.subject Google Bard
dc.subject Google+
dc.subject Language model
dc.subject Large language model
dc.subject Mental illness
dc.subject Patient data
dc.subject Synthetic data
dc.subject Synthetic data generations
dc.subject Hospital data processing
dc.title Use of Large Language Models for Medical Synthetic Data Generation in Mental Illness
dc.type Conference paper