Fine Tuning Large Vision Language Models