affiliation not provided to SSRN
Diffusion model, Attention mechanism, Multimodal generation, Fine-tuning