Automated Requirements Conformance Evaluation

An extension of my thesis work that treats requirement generation as a conformance problem: how far do LLM-generated user stories deviate from standard requirement specifications, and can those deviations be detected automatically?

What it does

Systematically classifies structural conformance failures in LLM-generated user stories, building a taxonomy of the ways they deviate from standard requirement formats.
Provides Python evaluation scripts that automate structural and semantic comparison between generated artefacts and reference baselines.
Connects generative-model consistency failures to the broader problem of detecting semantic divergence in evolving software artefacts.
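
As a rough illustration of the kind of check the evaluation scripts automate, here is a minimal sketch rather than the project's actual code. It assumes the common "As a <role>, I want <goal>, so that <benefit>" template as the standard format. The deviation labels and the function names `structural_deviations` and `token_overlap` are hypothetical, and word-set overlap is only a crude lexical stand-in for a real semantic comparison.

```python
import re

# Assumed template (not taken from the project itself):
#   "As a <role>, I want <goal>, so that <benefit>."
# Each hypothetical deviation label maps to the clause whose absence it reports.
CLAUSES = {
    "missing_role": re.compile(r"\bas an?\s+\S+", re.IGNORECASE),
    "missing_goal": re.compile(r"\bI want\s+\S+", re.IGNORECASE),
    "missing_benefit": re.compile(r"\bso that\s+\S+", re.IGNORECASE),
}


def structural_deviations(story: str) -> list[str]:
    """Return the labels of template clauses that the story does not contain."""
    return [label for label, pattern in CLAUSES.items() if not pattern.search(story)]


def token_overlap(generated: str, reference: str) -> float:
    """Jaccard overlap of lowercase word sets (a lexical proxy, not true semantics)."""
    a = set(re.findall(r"[a-z0-9]+", generated.lower()))
    b = set(re.findall(r"[a-z0-9]+", reference.lower()))
    if not a and not b:
        return 1.0  # two empty texts are treated as identical
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    generated = "As an admin, I want to export reports."
    reference = "As an admin, I want to export reports, so that I can audit usage."
    print(structural_deviations(generated))
    print(round(token_overlap(generated, reference), 2))
```

A story that satisfies all three clauses yields an empty list, so the labels can be tallied across a corpus to see which deviation types dominate. In practice the overlap function would likely be swapped for an embedding-based similarity measure.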

Tech stack: Python · NLP · automated artefact validation