Multi-Head Cross-Modal Attention Issues