China
Foshan University
Adaptive multi-text union learning; text-to-image synthesis; cross-modal generation