Large Language Models Are Not Fair Evaluators