Benchmarking LLMs via Uncertainty Quantification Paper