Fine-Tuning the Vision-Language Model PaliGemma on Custom Captions