Video understanding, Temporal action detection, Long-term motion representation
Image restoration, Transformer, Encoder-decoder, Latent layer, Attention
Image quality assessment, Vision Transformer, Visual prompt tuning