Bosniak classification of renal cysts using large language models: a comparative study

dc.contributor.author: Hacıbey, İbrahim
dc.contributor.author: Kaba, Esat
dc.date.accessioned: 2025-11-17T07:41:18Z
dc.date.issued: 2025
dc.department: RTEÜ, Faculty of Medicine, Department of Internal Medical Sciences
dc.description.abstract: Background: The Bosniak classification system is widely used to assess malignancy risk in renal cystic lesions, yet inter-observer variability poses significant challenges. Large language models (LLMs) may offer a standardized approach to classification when provided with textual descriptions, such as those found in radiology reports. Objective: This study evaluated the performance of five LLMs, namely GPT-4 (ChatGPT), Gemini, Copilot, Perplexity, and NotebookLM, in classifying renal cysts based on synthetic textual descriptions mimicking CT report content. Methods: A synthetic dataset of 100 diagnostic scenarios (20 cases per Bosniak category) was constructed using established radiological criteria. Each LLM was evaluated using zero-shot and few-shot prompting strategies, while NotebookLM employed retrieval-augmented generation (RAG). Performance metrics included accuracy, sensitivity, and specificity. Statistical significance was assessed using McNemar's and chi-squared tests. Results: GPT-4 achieved the highest accuracy (87% zero-shot, 99% few-shot), followed by Copilot (81-86%), Gemini (55-69%), and Perplexity (43-69%). NotebookLM, tested only under RAG conditions, reached 87% accuracy. Few-shot learning significantly improved performance (p < 0.05). Classification of Bosniak IIF lesions remained challenging across models. Conclusion: When provided with well-structured textual descriptions, LLMs can accurately classify renal cysts. Few-shot prompting significantly enhances performance. However, persistent difficulties in classifying borderline lesions such as Bosniak IIF highlight the need for further refinement and real-world validation.
dc.identifier.citation: Hacibey, I., & Kaba, E. (2025). Bosniak classification of renal cysts using large language models: a comparative study. Bosniak-Klassifikation von Nierenzysten unter Verwendung von Large-Language-Modellen: Vergleichsstudie. Radiologie (Heidelberg, Germany), 10.1007/s00117-025-01499-x. Advance online publication. https://doi.org/10.1007/s00117-025-01499-x
dc.identifier.doi: 10.1007/s00117-025-01499-x
dc.identifier.issn: 2731-7048
dc.identifier.issn: 2731-7056
dc.identifier.pmid: 40851045
dc.identifier.uri: https://doi.org/10.1007/s00117-025-01499-x
dc.identifier.uri: https://hdl.handle.net/11436/11487
dc.identifier.wos: WOS:001556457600001
dc.identifier.wosquality: Q4
dc.indekslendigikaynak: Web of Science
dc.indekslendigikaynak: PubMed
dc.institutionauthor: Kaba, Esat
dc.language.iso: en
dc.publisher: Springer Heidelberg
dc.relation.ispartof: Die Radiologie
dc.relation.publicationcategory: Article - International Peer-Reviewed Journal - Institutional Faculty Member
dc.rights: info:eu-repo/semantics/closedAccess
dc.subject: Bosniak classification
dc.subject: Few-shot learning
dc.subject: Large language models
dc.subject: Renal cysts
dc.subject: Synthetic data
dc.title: Bosniak classification of renal cysts using large language models: a comparative study
dc.type: Article

Files

Original bundle

Name: hacıbey-2025.pdf
Size: 642.45 KB
Format: Adobe Portable Document Format