LLMs for Diabetes Prediction
Benchmarking large language models against classical ML classifiers in a safety-critical clinical setting.
Published as "From Chat to Checkup: Can Large Language Models Assist in Diabetes Prediction?" in IEEE Xplore, this study examines the reliability and interpretability of generative models for decision support.
What it does
- Benchmarks LLM-based inference against traditional machine-learning classifiers on the Pima Indians Diabetes Dataset.
- Analyses model consistency and failure conditions, evaluating how trustworthy generative models are in a safety-critical context.
- Contributes empirical evidence to the broader question of trustworthy AI in decision-support systems.
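The benchmarking and consistency analysis described above could be sketched roughly as follows. This is an illustrative outline, not the study's actual pipeline: `stub_llm_predict` is a hypothetical placeholder for a real LLM call, the data is synthetic (generated with 8 features to mirror the shape of the Pima Indians Diabetes Dataset), and the choice of logistic regression, five repeated runs, and majority voting are all assumptions made for the example.

```python
import random
from collections import Counter

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def stub_llm_predict(features, rng):
    """Hypothetical stand-in for an LLM call; returns 0 or 1.

    A real version would format the features into a prompt and parse the
    model's reply. The noise term imitates run-to-run variation.
    """
    score = features[0] + 0.5 * features[1] + rng.gauss(0, 0.5)
    return int(score > 0)


def vote_with_consistency(features, n_runs, rng):
    """Query the predictor n_runs times; return (majority label, agreement share)."""
    votes = [stub_llm_predict(features, rng) for _ in range(n_runs)]
    label, count = Counter(votes).most_common(1)[0]
    return label, count / n_runs


# Synthetic stand-in data (assumption): 8 features, binary outcome.
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Classical baseline: scaled logistic regression.
baseline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
baseline.fit(X_train, y_train)
baseline_pred = baseline.predict(X_test)

# LLM-style predictor: repeated queries per record, majority vote.
rng = random.Random(0)
llm_results = [vote_with_consistency(row, 5, rng) for row in X_test]
llm_pred = [label for label, _ in llm_results]
mean_consistency = sum(share for _, share in llm_results) / len(llm_results)

report = {
    "baseline": {
        "accuracy": accuracy_score(y_test, baseline_pred),
        "f1": f1_score(y_test, baseline_pred),
    },
    "llm_stub": {
        "accuracy": accuracy_score(y_test, llm_pred),
        "f1": f1_score(y_test, llm_pred),
    },
    "llm_mean_consistency": mean_consistency,
}
print(report)
```

Scoring both predictors on the same held-out split keeps the comparison like for like, and the agreement share gives a simple per-record signal of how stable the generative predictor is across repeated queries. Records with low agreement are natural candidates for a closer look at failure conditions.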
Tech stack: Python · Large Language Models · scikit-learn · model benchmarking