Can Large Language Models Understand Spatial Audio?