affiliation not provided to SSRN
Keyword spotting, masked autoencoders, self-supervised learning, visual transformers, Siamese neural networks, PHOC embedding