Automated Requirements Conformance Evaluation
A taxonomy and automated tooling for detecting structural conformance failures in LLM-generated requirements.
This project extends my thesis work by treating requirement generation as a conformance problem: how far do LLM-generated user stories deviate from standard requirement formats, and can those deviations be detected automatically?
What it does
- Systematically classifies structural conformance failures in LLM-generated user stories, building a taxonomy of the ways they deviate from standard requirement formats.
- Provides Python evaluation scripts that automate structural and semantic comparison between generated artefacts and reference baselines.
- Connects generative-model consistency failures to the broader problem of detecting semantic divergence in evolving software artefacts.
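As a rough illustration of the structural and comparison checks described above, here is a minimal sketch. It is not the project's actual code. It assumes the "standard format" is the common "As a <role>, I want <goal>, so that <benefit>" user-story template, and the deviation labels, function names, and the token-overlap score (a crude lexical stand-in for real semantic comparison) are all hypothetical.

```python
import re
from difflib import SequenceMatcher

# Hypothetical deviation labels; the real taxonomy is not specified here.
MISSING_ROLE = "missing_role"
MISSING_GOAL = "missing_goal"
MISSING_BENEFIT = "missing_benefit"

# Assumed template: "As a <role>, I want <goal>, so that <benefit>".
ROLE_RE = re.compile(r"\bas an? \S", re.IGNORECASE)
GOAL_RE = re.compile(r"\bI (?:want|need|would like)\b", re.IGNORECASE)
BENEFIT_RE = re.compile(r"\bso that\b", re.IGNORECASE)


def classify_deviations(story: str) -> list[str]:
    """Return the structural deviation labels found in one user story."""
    deviations = []
    if not ROLE_RE.search(story):
        deviations.append(MISSING_ROLE)
    if not GOAL_RE.search(story):
        deviations.append(MISSING_GOAL)
    if not BENEFIT_RE.search(story):
        deviations.append(MISSING_BENEFIT)
    return deviations


def lexical_similarity(generated: str, reference: str) -> float:
    """Token-level overlap in [0, 1]; a lexical proxy, not true semantics."""
    gen_tokens = generated.lower().split()
    ref_tokens = reference.lower().split()
    return SequenceMatcher(None, gen_tokens, ref_tokens).ratio()


if __name__ == "__main__":
    stories = [
        "As a user, I want to reset my password so that I can regain access.",
        "As an admin, I want to export reports.",
        "The system shall reset passwords.",
    ]
    for story in stories:
        print(classify_deviations(story), "|", story)
```

A story that matches the assumed template yields an empty list, while a "shall"-style statement is flagged on all three slots. In practice the lexical score would likely be swapped for an embedding-based measure; the interface (generated text in, reference text in, score out) could stay the same.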
Tech stack: Python · NLP · automated artefact validation