Smart Crop and Smart Preview via Video Understanding
Understanding video content has long been a focus for video-sharing platforms. It is one of the most important driving forces for growth in distribution, discovery, user experience, and monetization. In-stream video understanding is the technology area in which we analyze and use fine-grained video signals in the spatial and temporal domains. These signals can power consumer-facing products directly or serve as inputs to downstream models and pipelines. For example, in the spatial domain, we identify the salient regions inside each frame, which enables a system to automatically reframe a horizontal (landscape) video into a vertical (portrait) one. In the temporal domain, we assign a highlight score to each frame, which lets us locate the highlight moments in a video and assemble them into a video trailer.
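To make the two uses of these signals concrete, here is a minimal sketch in Python with NumPy. It is not the production method. It assumes a per-frame saliency map is already available as a 2D array and that per-frame highlight scores are available as a 1D array. The function names (`sliding_sums`, `portrait_crop_left`, `highlight_start`) are hypothetical and chosen only for illustration. The sketch picks the full-height crop window that covers the most saliency, and the contiguous run of frames with the highest total highlight score.

```python
import numpy as np


def sliding_sums(values, width):
    """Return the sum of every contiguous window of `width` entries."""
    values = np.asarray(values, dtype=float)
    if not 1 <= width <= values.size:
        raise ValueError("width must be between 1 and len(values)")
    # Prefix sums let each window sum be computed with one subtraction.
    prefix = np.concatenate(([0.0], np.cumsum(values)))
    return prefix[width:] - prefix[:-width]


def portrait_crop_left(saliency, crop_width):
    """Left edge (in pixels) of the full-height crop with the most saliency.

    Assumption: `saliency` is a 2D array of shape (height, width) with
    non-negative values, higher meaning more salient.
    """
    column_mass = np.asarray(saliency, dtype=float).sum(axis=0)
    return int(np.argmax(sliding_sums(column_mass, crop_width)))


def highlight_start(frame_scores, clip_len):
    """Index of the first frame of the highest-scoring clip of `clip_len` frames.

    Assumption: `frame_scores` holds one highlight score per frame.
    """
    return int(np.argmax(sliding_sums(frame_scores, clip_len)))
```

For a landscape frame of height `h`, a 9:16 portrait crop would use a `crop_width` of roughly `h * 9 / 16` pixels. A real system would also need to smooth the crop position across frames to avoid jitter and could stitch several highlight clips into one trailer; both are outside the scope of this sketch.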