RD-FGM: A Novel Model for High-Quality and Diverse Food Image Generation and Ingredient Classification

14 Pages Posted: 17 Apr 2024

Jing Wang

Shandong Normal University

Yuanjie Zheng

Shandong Normal University

Junxia Wang

Shandong Normal University

Xiao Xiao

affiliation not provided to SSRN

Jing Sun

Weifang University

Sujuan Hou

Shandong Normal University

Abstract

Food image generation plays a crucial role in evaluating multiple food ingredients, predicting dietary preferences, recommending food, and computing dietary nutrition. However, the task is challenging because of the large variation in the appearance of recipe components, the difficulty of aligning multi-modal features, and the lack of diversity in generated data. To address these challenges, we propose a novel RecipeCLIP-Diffusion Food Generation Model (RD-FGM), which generates high-quality, diverse images while aligning multi-modal features. Specifically, the RecipeCLIP model implements a multi-ingredient embedding of image-text pairs to align contextual features. Additionally, we devise a multi-conditional guided diffusion model that learns the data distribution and controls generation. We evaluate RD-FGM on both the large-scale Recipe1M dataset and the VIREO Food-172 Chinese dataset, and our results demonstrate its effectiveness and versatility. We further assess its effectiveness in ingredient classification on the VIREO Food-172 and ETH Food-101 datasets. The multi-ingredient embedding used in RD-FGM aligns contextual features, improving ingredient classification performance over the baselines. The capability to generate realistic food images from textual recipes opens up new avenues for exploring culinary creations and for food and ingredient classification, promising various applications in the food industry and beyond.

Keywords: Food computing, Food image generation, Diffusion model, Multi-modal joint embedding, Computer vision

Suggested Citation

Wang, Jing and Zheng, Yuanjie and Wang, Junxia and Xiao, Xiao and Sun, Jing and Hou, Sujuan, RD-FGM: A Novel Model for High-Quality and Diverse Food Image Generation and Ingredient Classification. Available at SSRN: https://ssrn.com/abstract=4798524 or http://dx.doi.org/10.2139/ssrn.4798524

Jing Wang (Contact Author)

Shandong Normal University

Jinan
China

Yuanjie Zheng

Shandong Normal University

Jinan
China

Junxia Wang

Shandong Normal University

Jinan
China

Xiao Xiao

affiliation not provided to SSRN

No Address Available

Jing Sun

Weifang University

5147 Dongfeng E St
Kuiwen
Weifang
China

Sujuan Hou

Shandong Normal University

Jinan
China
