Bosniak classification of renal cysts using large language models: a comparative study

Hacıbey, İbrahim; Kaba, Esat

dc.contributor.author	Hacıbey, İbrahim
dc.contributor.author	Kaba, Esat
dc.date.accessioned	2025-11-17T07:41:18Z
dc.date.issued	2025
dc.department	RTEÜ, Faculty of Medicine, Department of Internal Medical Sciences
dc.description.abstract	Background: The Bosniak classification system is widely used to assess malignancy risk in renal cystic lesions, yet inter-observer variability poses significant challenges. Large language models (LLMs) may offer a standardized approach to classification when provided with textual descriptions, such as those found in radiology reports. Objective: This study evaluated the performance of five LLMs, namely GPT-4 (ChatGPT), Gemini, Copilot, Perplexity, and NotebookLM, in classifying renal cysts based on synthetic textual descriptions mimicking CT report content. Methods: A synthetic dataset of 100 diagnostic scenarios (20 cases per Bosniak category) was constructed using established radiological criteria. GPT-4, Gemini, Copilot, and Perplexity were evaluated using zero-shot and few-shot prompting strategies, while NotebookLM was evaluated only with retrieval-augmented generation (RAG). Performance metrics included accuracy, sensitivity, and specificity. Statistical significance was assessed using McNemar's and chi-squared tests. Results: GPT-4 achieved the highest accuracy (87% zero-shot, 99% few-shot), followed by Copilot (81-86%), Gemini (55-69%), and Perplexity (43-69%). NotebookLM, tested only under RAG conditions, reached 87% accuracy. Few-shot learning significantly improved performance (p < 0.05). Classification of Bosniak IIF lesions remained challenging across models. Conclusion: When provided with well-structured textual descriptions, LLMs can accurately classify renal cysts. Few-shot prompting significantly enhances performance. However, persistent difficulties in classifying borderline lesions such as Bosniak IIF highlight the need for further refinement and real-world validation.
dc.identifier.citation	Hacibey, I., & Kaba, E. (2025). Bosniak classification of renal cysts using large language models: a comparative study. Bosniak-Klassifikation von Nierenzysten unter Verwendung von Large-Language-Modellen: Vergleichsstudie. Radiologie (Heidelberg, Germany), 10.1007/s00117-025-01499-x. Advance online publication. https://doi.org/10.1007/s00117-025-01499-x
dc.identifier.doi	10.1007/s00117-025-01499-x
dc.identifier.issn	2731-7048
dc.identifier.issn	2731-7056
dc.identifier.pmid	40851045
dc.identifier.uri	https://doi.org/10.1007/s00117-025-01499-x
dc.identifier.uri	https://hdl.handle.net/11436/11487
dc.identifier.wos	WOS:001556457600001
dc.identifier.wosquality	Q4
dc.indekslendigikaynak	Web of Science
dc.indekslendigikaynak	PubMed
dc.institutionauthor	Kaba, Esat
dc.language.iso	en
dc.publisher	Springer Heidelberg
dc.relation.ispartof	Die Radiologie
dc.relation.publicationcategory	Article - International Refereed Journal - Institutional Faculty Member
dc.rights	info:eu-repo/semantics/closedAccess
dc.subject	Bosniak classification
dc.subject	Few-shot learning
dc.subject	Large language models
dc.subject	Renal cysts
dc.subject	Synthetic data
dc.title	Bosniak classification of renal cysts using large language models: a comparative study
dc.type	Article

Files

Original bundle

Name: hacıbey-2025.pdf
Size: 642.45 KB
Format: Adobe Portable Document Format

Collections

WoS Indexed Publications Collection
PubMed Indexed Publications Collection
Faculty of Medicine, Department of Internal Medical Sciences Collection