Amazon Video Rekognition
How it Works
Video Rekognition actions work in two steps: one action sends the job to AWS, and another retrieves the results later. AWS keeps the results for 7 days, so you can run the retrieval action anytime within that period without being charged again. If you try to retrieve the results after 7 days, the action will fail because the results are no longer available.
How you set up these actions will depend on your workflow, but here’s a simple example: trigger a video Rekognition job using a checkbox or pick list in CatDV, then have a second worker action run every 2 minutes to check for results. Usually it only needs to check once or twice before the results are ready. To monitor the progress of a job, use the field aws.ai.rekognition.job.status: if it says IN_PROGRESS, the job is still running. For more technical details, see the ‘Troubleshooting & Advanced Notes’ section.
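The plugin handles this handshake for you, but it can help to see the underlying pattern. Below is a minimal sketch of the same submit-then-poll cycle against the Rekognition API directly, using Python and boto3; the bucket, object key, and region are hypothetical, and the plugin’s internal implementation may differ.

    import time
    import boto3

    rekognition = boto3.client("rekognition", region_name="us-east-1")

    # Step 1: submit the job (the plugin's Submit Job action does the equivalent).
    job = rekognition.start_label_detection(
        Video={"S3Object": {"Bucket": "my-catdv-bucket", "Name": "media/clip01.mp4"}},
        MinConfidence=55.0,
    )
    job_id = job["JobId"]

    # Step 2: poll for results (the plugin's Retrieve Job action, run periodically).
    while True:
        result = rekognition.get_label_detection(JobId=job_id, SortBy="TIMESTAMP")
        if result["JobStatus"] != "IN_PROGRESS":  # SUCCEEDED or FAILED
            break
        time.sleep(120)  # mirrors a worker action that re-checks every 2 minutes

    print(result["JobStatus"], len(result.get("Labels", [])), "label detections")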
Videos submitted to Amazon Rekognition can be scanned for the same types of content as images, and can additionally be analyzed for Technical Cues. Videos are analyzed at 500ms intervals, and clip markers are created showing the ranges in which particular items appear. For example, you might see a marker from 0s - 5s which says “Forest, Bear” and then a marker from 5s - 8s which just says “Forest”, meaning that the Bear (thankfully) left the frame at the 5s mark.
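The plugin builds these range markers for you, but as a rough illustration of the idea, here is a Python sketch that collapses per-sample label detections into time ranges; the input format and merging rule are assumptions for illustration only.

    from collections import defaultdict

    SAMPLE_MS = 500  # Rekognition video labels arrive at roughly 500ms granularity

    def label_ranges(detections):
        # detections: list of (timestamp_ms, label_name) pairs.
        # Consecutive samples of the same label merge into one range,
        # which is how markers like "Forest 0s - 5s" arise.
        by_label = defaultdict(list)
        for ts, name in detections:
            by_label[name].append(ts)
        ranges = []
        for name, stamps in by_label.items():
            stamps.sort()
            start = prev = stamps[0]
            for ts in stamps[1:]:
                if ts - prev > SAMPLE_MS:  # a gap: the item left the frame
                    ranges.append((name, start, prev))
                    start = ts
                prev = ts
            ranges.append((name, start, prev))
        return sorted(ranges, key=lambda r: r[1])

    # A forest shot lasting 8s where a bear is visible for the first 5s:
    dets = [(t, "Forest") for t in range(0, 8000, 500)] + \
           [(t, "Bear") for t in range(0, 5000, 500)]
    print(label_ranges(dets))  # [('Forest', 0, 7500), ('Bear', 0, 4500)]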
Import Worker Actions
While you're free to define your own workflows, pre-defined workflows are included in the WorkerActions folder to help you get started quickly. Simply drag and drop these files into the Worker GUI to import them. With minimal configuration, your workflow will be ready to use.
WorkerActions/Amazon AI – Submit Rekognition Jobs.catdv
WorkerActions/Amazon AI – Retrieve Rekognition Jobs.catdv
Worker Action Settings
The settings available in the Worker Plugin for Submit Rekognition Job are:
Setting | Description |
S3 Volume | The S3 remote volume identifier is required to access the S3 bucket. If you haven't set up a remote volume yet, please refer to the Authentication section. Note: The identifier must be enclosed in square brackets []. |
AWS Access Key (optional) | Define only if you wish to use a different account to perform the operation. |
AWS Secret Key (required) | The secret key used together with the AWS Access Key for authentication. |
Action | The operation to perform: Submit Job (submit a video Rekognition request to AWS) or Retrieve Job (retrieve video Rekognition results from AWS). |
Set Parameters | Define the request parameters using CatDV fields or set them manually on the worker. |
Detection Types | Specifies what Amazon Rekognition looks for in a video—such as objects, faces, text, or unsafe content—to generate useful metadata for tagging and analysis. Label: Detects objects, scenes, and concepts in the video (e.g., car, tree, wedding). Faces: Detects faces and facial attributes like age range, emotions, or gender. Celebrities: Identifies well-known people in the video using Amazon’s celebrity database. Text: Detects and extracts text from video (printed or handwritten). Unsafe Images: Flags potentially inappropriate or unsafe content (e.g., nudity or violence). Technical Cues: Identifies content structure like black frames, color bars, or end credits. Scene Description: Generates natural-language descriptions of scenes to improve content understanding. Custom Faces: Matches detected faces against a custom face collection you’ve indexed. See Index Faces. |
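Each detection type corresponds to one of Rekognition’s asynchronous video operations. For reference, here is an approximate mapping in Python/boto3; this reflects our reading of the AWS API rather than the plugin’s internal wiring, and Scene Description is omitted because it has no single Rekognition video operation.

    import boto3

    rekognition = boto3.client("rekognition", region_name="us-east-1")

    # Approximate mapping from Detection Types to Rekognition "start" operations.
    START_OPERATIONS = {
        "Label": rekognition.start_label_detection,
        "Faces": rekognition.start_face_detection,
        "Celebrities": rekognition.start_celebrity_recognition,
        "Text": rekognition.start_text_detection,
        "Unsafe Images": rekognition.start_content_moderation,
        "Technical Cues": rekognition.start_segment_detection,  # pass SegmentTypes
        "Custom Faces": rekognition.start_face_search,  # needs an indexed CollectionId
    }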
The settings available in the Worker Plugin for Retrieve Rekognition Job are:
Setting | Description |
S3 Volume | The S3 remote volume identifier is required to access the S3 bucket. If you haven't set up a remote volume yet, please refer to the Authentication section. Note: The identifier must be enclosed in square brackets []. |
AWS Access Key (optional) | Define only if you wish to use a different account to perform the operation. |
AWS Secret Key (required) | The secret key used together with the AWS Access Key for authentication. |
Action | The operation to perform: Submit Job (submit a video Rekognition request to AWS) or Retrieve Job (retrieve video Rekognition results from AWS). |
Set Parameters | Define the request parameters using CatDV fields or set them manually on the worker. |
Detection Types | Specifies what Amazon Rekognition looks for in a video—such as objects, faces, text, or unsafe content—to generate useful metadata for tagging and analysis. Label: Detects objects, scenes, and concepts in the video (e.g., car, tree, wedding). Faces: Detects faces and facial attributes like age range, emotions, or gender. Celebrities: Identifies well-known people in the video using Amazon’s celebrity database. Text: Detects and extracts text from video (printed or handwritten). Unsafe Images: Flags potentially inappropriate or unsafe content (e.g., nudity or violence). Technical Cues: Detects structural elements such as black frames, color bars, end credits, and distinct shots or scene changes. For this detection type, the Minimum Marker Interval and Allow Repeat Matches settings are ignored to ensure full coverage of detected cues. Scene Description: Generates natural-language descriptions of scenes to improve content understanding. Custom Faces: Matches detected faces against a custom face collection you’ve indexed. See Index Faces. |
Minimum Confidence | Sets the threshold (0–100) to filter results based on Amazon Rekognition's confidence score. |
Minimum Marker Interval (secs) | Displays only the first detected instance within each marker interval. To show every instance, set the interval to 0. |
Allow Repeat Matches | When enabled, allows multiple instances of the same detection to appear, even if they occur close together. |
Show Confidence in Marker Name | Uses the confidence score as the marker name, e.g. ‘Confidence 83.55’. |
Draw Bounding Boxes | Bounding boxes are used to visually highlight detected elements—such as objects, faces, or celebrities—by drawing rectangles around them in the video. For the bounding box to appear, a color must be specified using a valid CSS color name (e.g., “lightblue”) or hex code (e.g., “#3698CF”). This feature is supported for detection types like Labels, Faces, Celebrities, and Custom Faces and is available in the CatDV Web Client as part of its marker annotation feature. Each detected instance is given its own marker with a bounding box, making it easier to identify and track movement within the video. Note: Bounding boxes are not yet supported in the CatDV Desktop Client. Editing clip markers in the Desktop Client (13.0.12 as of this writing) will remove any bounding box data. |
Bounding Box Colour | Sets the color of the rectangle used to highlight detections. Choose a distinct color to ensure visibility against the video content. |
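To make the interaction between Minimum Marker Interval and Allow Repeat Matches concrete, here is a small Python sketch of one plausible reading of those two settings for a single detection name; the plugin’s exact logic is internal and may differ.

    def filter_detections(timestamps_ms, min_interval_secs, allow_repeats):
        # timestamps_ms: sorted detection times (ms) for one label/face/etc.
        if allow_repeats or min_interval_secs == 0:
            return timestamps_ms  # keep every instance
        kept, last = [], None
        for ts in timestamps_ms:
            # keep only the first instance within each interval window
            if last is None or ts - last >= min_interval_secs * 1000:
                kept.append(ts)
                last = ts
        return kept

    print(filter_detections([0, 400, 900, 2600, 2700], 2, False))  # [0, 2600]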
Advanced Properties
For system builders and power users, here are some additional plugin options which can be set in the Worker Node Advanced Properties box:
Property key | Valid range (default) | Description |
amazonAI.outputRawResponseDataToField | <field identifier> | Writes the raw JSON response to the field of your choosing, e.g. clip[ai.json]. Note that these responses can be half a megabyte or larger, which can crowd your database if you’re not careful, so it is recommended to use this feature only transiently and remove the extra data when finished. |
amazonAI.outputRawResponseDataToFile | <path> | Writes the raw JSON response to an external text file of your choosing (e.g. /Volumes/MyData/myResponse.txt), which you can parse separately and then discard rather than keeping all of the raw response data in the CatDV database. |
amazonAI.transcodeSizeThreshold | Any number (250) | The size, in megabytes, above which media is transcoded before upload; set this to 0 to always transcode media for analysis. |
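As a sketch of the “parse separately and then discard” approach, the following Python snippet reads a raw response file written via amazonAI.outputRawResponseDataToFile, extracts the label names (assuming the file holds a Rekognition label-detection response), and deletes the file afterwards; the path is the hypothetical one from the example above.

    import json
    import os

    RESPONSE_PATH = "/Volumes/MyData/myResponse.txt"

    with open(RESPONSE_PATH) as f:
        response = json.load(f)

    # Key names follow Rekognition's label response shape; other detection
    # types use different top-level keys.
    names = {item["Label"]["Name"] for item in response.get("Labels", [])}
    print(sorted(names))

    os.remove(RESPONSE_PATH)  # discard the bulky raw data once parsed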
Best Practices
When submitting video analysis actions, CatDV will automatically upload the original media file if it is H.264 and less than 250MB in size. Media with other encodings or larger than 250MB will be transcoded at 540p, 30fps for analysis, to avoid excessive upload and storage costs. See Advanced Properties for more options around this.
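For illustration, the decision described above could be reproduced outside the plugin with ffprobe/ffmpeg, as in this Python sketch; the file name is hypothetical, and the exact transcode settings the plugin uses internally may differ.

    import os
    import subprocess

    SRC = "clip01.mov"
    THRESHOLD_MB = 250  # see amazonAI.transcodeSizeThreshold

    codec = subprocess.check_output(
        ["ffprobe", "-v", "error", "-select_streams", "v:0",
         "-show_entries", "stream=codec_name", "-of", "csv=p=0", SRC],
        text=True).strip()
    size_mb = os.path.getsize(SRC) / (1024 * 1024)

    if codec == "h264" and size_mb < THRESHOLD_MB:
        upload = SRC  # small H.264 file: upload the original as-is
    else:
        # otherwise create a 540p / 30fps proxy for analysis
        upload = "proxy.mp4"
        subprocess.run(["ffmpeg", "-y", "-i", SRC, "-vf", "scale=-2:540",
                        "-r", "30", "-c:v", "libx264", upload], check=True)
    print("Uploading", upload)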
Video analysis in particular can become expensive at scale. For this reason it is recommended that you do not automatically run video analysis on all of your content, but rather selectively curate your video analysis jobs. Amazon limits each account to 20 concurrent jobs; the plugin does not currently include a queueing mechanism for running more than 20 jobs at once, but that functionality may be added in a future update.
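Until such a queueing mechanism exists, a simple client-side throttle is one way to stay under the limit. The Python sketch below is a hypothetical helper, not part of the plugin: it submits clips only while fewer than 20 jobs are in flight and leaves the rest for the next pass.

    MAX_CONCURRENT = 20  # Rekognition's per-account concurrent video job limit

    def submit_with_throttle(pending_clips, in_progress_job_ids, submit_fn):
        # submit_fn(clip) should start a job and return its JobId.
        free_slots = max(MAX_CONCURRENT - len(in_progress_job_ids), 0)
        for clip in pending_clips[:free_slots]:
            in_progress_job_ids.append(submit_fn(clip))
        return pending_clips[free_slots:]  # clips still waiting for a slot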
Usage Cost
With Amazon Rekognition Video you pay only for what you use. There are no resources to provision, no upfront costs, and no minimum fees.
Amazon AI USD pricing as of Summer 2020 is:
· 12 months free - 1000 minutes per month
· then $0.10 per minute, per category (Labels, Faces, Celebrities, Text, Unsafe Images)
· or $0.05 per minute for technical analysis (Shots, Technical Cues)
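As a worked example at those rates: a 60-minute video analyzed for both Labels and Celebrities would cost 60 × $0.10 × 2 = $12.00, and requesting Technical Cues on the same video would add 60 × $0.05 = $3.00.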
For the latest pricing information, refer to the official Amazon pricing page: https://aws.amazon.com/rekognition/pricing/
Please be mindful of the AWS service costs associated with each AI operation, especially as you scale up usage of the plugin. Quantum Corp. does not accept any liability for AWS service costs incurred as a result of using this software, regardless of whether the usage was intentional.