Can Large Language Models Be An Alternative To Human Evaluation?