Sast: Semantic-Aware Stylized Text-to-Image Generation

15 Pages Posted: 12 Feb 2025

XINYUE SUN

affiliation not provided to SSRN

Yongzhen Ke

Tiangong University

Jing Guo

Tiangong University

Shuai Yang

Tiangong University

Kai Wang

Tiangong University

Abstract

Pre-trained text-to-image diffusion probabilistic models achieve excellent image quality, and their visual results have attracted many users who control generation with creative text prompts. When users have detailed generation requirements that limited language cannot fully express, it is common to use reference images to "stylize" text-to-image generation. However, images generated by existing methods deviate in style from the style reference images, contrary to the human perception that semantically similar object regions in two images of the same style should share that style. To solve this problem, this paper proposes a semantic-aware style transfer method (SAST) that strengthens semantic-level style alignment between the generated image and the style reference image. First, we introduce language-driven semantic segmentation trained on the COCO dataset into a general style transfer model to capture the mask of the region in the style reference image that the text focuses on.
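The abstract does not say how the style statistics of matching semantic regions are aligned, so the following is only an illustrative sketch, not the authors' method. It assumes an AdaIN-style statistic match restricted to masked regions: feature values inside the content mask are rescaled to the mean and standard deviation of the style features inside the style mask. The function names, the flat single-channel feature lists, and the `eps` constant are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def masked_mean_std(
    values: Sequence[float], mask: Sequence[bool], eps: float = 1e-5
) -> Tuple[float, float]:
    """Mean and standard deviation of the values selected by the mask."""
    selected = [v for v, m in zip(values, mask) if m]
    if not selected:
        raise ValueError("mask selects no positions")
    mean = sum(selected) / len(selected)
    var = sum((v - mean) ** 2 for v in selected) / len(selected)
    # eps keeps the division in masked_adain stable for flat regions.
    return mean, math.sqrt(var + eps)


def masked_adain(
    content: Sequence[float],
    content_mask: Sequence[bool],
    style: Sequence[float],
    style_mask: Sequence[bool],
) -> List[float]:
    """Match masked content statistics to masked style statistics.

    Assumed stand-in for semantic-level style alignment: positions
    outside the content mask are returned unchanged.
    """
    c_mean, c_std = masked_mean_std(content, content_mask)
    s_mean, s_std = masked_mean_std(style, style_mask)
    return [
        (v - c_mean) / c_std * s_std + s_mean if m else v
        for v, m in zip(content, content_mask)
    ]
```

In a real pipeline the masks would come from a segmentation model queried with the text, and the statistics would be computed per channel over feature maps; here one channel is flattened to a list to keep the idea visible.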

Keywords: Computing methodologies, Artificial Intelligence, Computer Vision, Computer vision representations, Image representations

Suggested Citation

SUN, XINYUE and Ke, Yongzhen and Guo, Jing and Yang, Shuai and Wang, Kai, Sast: Semantic-Aware Stylized Text-to-Image Generation. Available at SSRN: https://ssrn.com/abstract=5134301 or http://dx.doi.org/10.2139/ssrn.5134301

XINYUE SUN

affiliation not provided to SSRN

Yongzhen Ke (Contact Author)

Tiangong University

China

Jing Guo

Tiangong University

China

Shuai Yang

Tiangong University

China

Kai Wang

Tiangong University

China
