Video Task Tracking Tool

Thousands of Hours of Footage Become Insights in Seconds

Innovation & Automation

Vid.Supervisor is a machine learning model that runs over a video to identify and tag behaviours. The goal is to reduce the manpower needed to codify a video, using software flexible enough to fit multiple retail business cases. In the complete solution, human input will be minimal, with only the role of reviewer remaining.

As the monotonous tasks are completed by AI, the client team confirms the automatically assigned tags (which improves the algorithm’s accuracy), while client employees can concentrate their energy on work that requires thinking and analysis.

Vid.Supervisor is ready for use in a variety of retail projects, first easing the creation of codebooks and ultimately reducing the time needed to codify a video by over 90%.

Requirements → Key Features

Person Detection

For staff tracking, we will use a people detector to place each person at a pre-defined location (e.g. at a workstation) at any moment in time. The system will track every person in frame, optionally displaying a bounding box around each one for easy visualization. It will also assign each person a unique ID (called here a detection ID) for future compatibility.
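The two ideas in this feature, placing a person at a pre-defined location and keeping a stable detection ID, can be illustrated with a minimal sketch. It assumes the detector returns boxes as (x1, y1, x2, y2) pixel tuples and that locations are configured as rectangles; the names `locate`, `DetectionIdTracker`, and the 0.3 overlap threshold are hypothetical, and the greedy overlap matching shown here is a simple stand-in rather than the tracker Vid.Supervisor actually uses.

```python
from itertools import count
from typing import Optional

Box = tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels, assumed format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def locate(box: Box, zones: dict[str, Box]) -> Optional[str]:
    """Return the first zone containing the bottom-centre of the box (roughly the feet)."""
    foot_x, foot_y = (box[0] + box[2]) / 2, box[3]
    for name, (zx1, zy1, zx2, zy2) in zones.items():
        if zx1 <= foot_x <= zx2 and zy1 <= foot_y <= zy2:
            return name
    return None


class DetectionIdTracker:
    """Greedy frame-to-frame matching: reuse an ID when boxes overlap enough."""

    def __init__(self, min_iou: float = 0.3) -> None:
        self.min_iou = min_iou
        self._next_id = count(1)
        self._last_boxes: dict[int, Box] = {}

    def update(self, boxes: list[Box]) -> list[int]:
        unmatched = dict(self._last_boxes)  # IDs from the previous frame not yet reused
        assigned: list[int] = []
        for box in boxes:
            best = max(unmatched, key=lambda i: iou(unmatched[i], box), default=None)
            if best is not None and iou(unmatched[best], box) >= self.min_iou:
                del unmatched[best]           # keep the existing detection ID
            else:
                best = next(self._next_id)    # new person entering the frame
            assigned.append(best)
        self._last_boxes = dict(zip(assigned, boxes))
        return assigned
```

Calling `update` once per frame returns one detection ID per box, and `locate` can then be applied to each box to log where that ID was at that moment.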

People Recognition

To extend the capabilities of the person detection feature, facial recognition will be incorporated to link a person’s activity across any span of time (day, week, year …). Provided we see a person’s face at some point while they are in frame, we can retroactively tag their entire time in the frame, which makes video sewing and per-person identification of activities possible.
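The retroactive tagging step can be sketched as follows. This is an illustration only: it assumes each observation carries a detection ID plus an optional identity from a face recogniser, and it resolves disagreements by majority vote. The `Observation` type and `tag_tracks` function are hypothetical names, not part of Vid.Supervisor.

```python
from collections import Counter, defaultdict
from typing import NamedTuple, Optional


class Observation(NamedTuple):
    frame: int
    detection_id: int
    face_match: Optional[str]  # identity from a face recogniser; None when no face is visible


def tag_tracks(observations: list[Observation]) -> list[tuple[int, int, Optional[str]]]:
    """Spread any recognised identity over every frame of the same detection ID."""
    votes: dict[int, Counter] = defaultdict(Counter)
    for obs in observations:
        if obs.face_match is not None:
            votes[obs.detection_id][obs.face_match] += 1

    # One identity per detection ID: the most frequently recognised face.
    identity = {det_id: c.most_common(1)[0][0] for det_id, c in votes.items()}

    return [
        (obs.frame, obs.detection_id, identity.get(obs.detection_id))
        for obs in observations
    ]
```

A detection ID whose face is never seen keeps `None` as its identity, so it stays available for a human reviewer to tag.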

Sewing Videos

From this process, a separate video can be created for each individual staff member, tag, or location, containing only the activity of that person, tag, or location. The video would be sewn together either from the entire frames in which that person, tag, or location appears, or from frames cropped to show only that person’s activity, tag, or location.
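Before any frames are sewn together, the frames in which a person, tag, or location appears have to be grouped into continuous segments. A minimal sketch of that grouping is shown below, assuming frame indices start at 0 and the video has a constant frame rate; `frames_to_segments` and its `max_gap` parameter are hypothetical names introduced for illustration.

```python
def frames_to_segments(
    frames: list[int], fps: float, max_gap: int = 0
) -> list[tuple[float, float]]:
    """Merge frame indices into (start_seconds, end_seconds) segments.

    Runs of frames separated by at most `max_gap` missing frames are merged,
    so a brief missed detection does not split a clip in two.
    """
    runs: list[list[int]] = []
    for f in sorted(set(frames)):
        if runs and f - runs[-1][1] <= max_gap + 1:
            runs[-1][1] = f          # extend the current run
        else:
            runs.append([f, f])      # start a new run
    # The end time is the end of the last frame in the run, hence end + 1.
    return [(start / fps, (end + 1) / fps) for start, end in runs]
```

The resulting time ranges could then be handed to a video editing tool to cut and concatenate the clips; that step is outside the scope of this sketch.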

Requirements → Technologies

  • DynamoDB – fully managed NoSQL database in which the video logs will be stored as time series
  • AWS Lambda – serverless service that will handle ingestion and serving of the video logs
  • API Gateway – aggregates all backend service APIs into a single point of contact for the web application
  • AWS S3 – cloud storage for the video files
  • Machine Learning – evolving automatic processing that learns from, and later automates, the human tasks
  • CloudFront – CDN (Content Delivery Network) that caches content closer to the app user’s geographical location, for faster streaming
  • Web Application – the frontend user interface, in which users log in and perform actions such as uploading videos, generating reports, and reviewing videos
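As an illustration of how the video logs could be laid out as a time series in a key-value store such as DynamoDB, the sketch below builds one log item per tagged moment. The attribute names (`video_id`, `ts_det`, `tag`, `location`) and the key layout are assumptions, not the product's actual schema, and the range filter is plain Python standing in for a sort-key range query.

```python
def make_log_item(
    video_id: str, timestamp_ms: int, detection_id: int, tag: str, location: str
) -> dict[str, str]:
    """Build one video-log record (hypothetical schema)."""
    return {
        "video_id": video_id,  # assumed partition key: one partition per video
        # Assumed sort key: zero-padded so string order matches time order.
        "ts_det": f"{timestamp_ms:012d}#{detection_id}",
        "tag": tag,
        "location": location,
    }


def items_between(
    items: list[dict[str, str]], video_id: str, start_ms: int, end_ms: int
) -> list[dict[str, str]]:
    """In-memory stand-in for a range query on the sort key (inclusive bounds)."""
    low = f"{start_ms:012d}#"
    high = f"{end_ms:012d}#\uffff"
    matches = (
        item
        for item in items
        if item["video_id"] == video_id and low <= item["ts_det"] <= high
    )
    return sorted(matches, key=lambda item: item["ts_det"])
```

Keying on the video and ordering by time would let a reviewer's page fetch only the log entries for the stretch of video currently being played.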

