Because while testing software is difficult, testing LLMs is much, much more difficult, and we're only just getting started.
This might become an even bigger issue in models trained on synthetic data, which can cause a larger degree of hallucination in the model's output.