Abstract: Given a user query, video moment retrieval and highlight detection aim to locate the relevant moments in a video and to estimate a saliency score for each video clip. Although models based on audio ...