reliability-eval

Calibration-first LLM evaluation

Not just "did the model get it right?" but "can you trust how confident it was?"

Released under the MIT License.