Jiaxin Zhu

Southwestern University of Finance and Economics (SWUFE)

Chengdu

China

SCHOLARLY PAPERS

2

DOWNLOADS

178

TOTAL CITATIONS

15

Scholarly Papers (2)

1.

Fishllm: Multimodal Instruction Tuning with Large Language Models for Fish Classification and Detection

Number of pages: 39 Posted: 07 Oct 2024
Southwestern University of Finance and Economics (SWUFE), Southwestern University of Finance and Economics (SWUFE), Southwestern University of Finance and Economics (SWUFE), Southwestern University of Finance and Economics (SWUFE) and University of Alberta
Downloads 144 (442,008)
Citations 15

Keywords: Multimodal LLM, fish detection, fish classification, ChatGPT

2.

V-Sparse: Temporal-Spatial Visual Compression and Coarse-to-Fine Alignment for Text-Video Retrieval

Number of pages: 48 Posted: 10 Feb 2025
affiliation not provided to SSRN, affiliation not provided to SSRN, Southwestern University of Finance and Economics (SWUFE), Southwestern University of Finance and Economics (SWUFE), Southwestern University of Finance and Economics (SWUFE), Southwestern University of Finance and Economics (SWUFE) and University of Alberta
Downloads 34 (1,011,955)

Keywords: Cross-modal retrieval, text-video retrieval, video semantic compression, granularity alignment