: For temporal features (actions and movements).
: For multimodal features that link video content to text descriptions.
The video is broken down into individual images (frames).
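As a rough illustration of the frame-splitting step, the sketch below picks which frame indices to keep when a video is reduced to a lower sampling rate. The function name `sample_frame_indices` and its parameters are hypothetical and do not come from the source. Decoding the frames themselves would be handled by a video library and is not shown.

```python
import math
from typing import List


def sample_frame_indices(total_frames: int, native_fps: float, target_fps: float) -> List[int]:
    """Return the indices of frames to keep when sampling at target_fps.

    Hypothetical helper: the source only says the video is split into
    frames; the sampling strategy here is an assumption.
    """
    if native_fps <= 0 or target_fps <= 0:
        raise ValueError("frame rates must be positive")
    if total_frames <= 0:
        return []

    # Never sample more densely than the source video provides.
    step = max(native_fps / target_fps, 1.0)

    indices = []
    i = 0
    while True:
        idx = math.floor(i * step)
        if idx >= total_frames:
            break
        indices.append(idx)
        i += 1
    return indices


if __name__ == "__main__":
    # A 10-frame clip recorded at 30 fps, sampled at 10 fps.
    print(sample_frame_indices(10, 30.0, 10.0))
```

Keeping every frame is the special case where `target_fps` equals (or exceeds) `native_fps`.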