Video Data Annotation: What it is and Real Life Use Cases

Category: AI Insights

Published date: 11.03.2021

Read time: 9 min

Just like image annotation, video annotation is one of the major processes researchers rely on to help machines recognize objects in their surroundings through computer vision. During annotation work, annotators mark the moving objects using a range of techniques so that machines can recognize them. In this article, we will take a deep dive into video annotation: the industries where its importance is growing, the main annotation methods, the challenges involved, and some real-life use cases.

What is Video Annotation? 

Video annotation is the process of marking each object of interest in a video frame by frame, so that moving objects become recognizable to computers or machines. It is somewhat more difficult than image annotation because the object of interest is in motion.

Another challenge is usually the amount of data that has to be annotated. Since each video clip needs to be annotated frame by frame, the amount of data adds up quickly. This is why many companies developing machine learning projects tend to outsource this type of work to a data annotation service provider like Mindy Support.

What Industries Are Increasingly Relying On Video Annotation? 

Video annotation is often used in the automotive sector to train the machine learning algorithms that power autonomous vehicles. This is what allows self-driving cars to recognize things like street lights, other cars, pedestrians, street signs, and anything else they might encounter on the road. Human activity tracking and pose estimation are also used by video game companies to create the games we all love. This involves accurately annotating things like people’s facial expressions and how they pose while performing various actions. Later on, we will provide some interesting use cases where Mindy Support annotated videos to create hockey and soccer games, but before we get to that, let’s take a look at the various types of data annotation.

Types of Data Annotation 

There are many different types of data annotation and the choice of which one to use will depend on each individual project. The most common video data annotation techniques include: 

  • Landmark annotation – This is when you put points, or landmarks, on people’s faces in videos to identify facial features and expressions.
  • Semantic segmentation – The goal of semantic segmentation is to label each pixel of an image with the class of what it represents. This is one of the most detailed forms of data annotation.
  • 3D Cuboid annotation – The data annotator draws a cuboid (a 3D box) around the object, which allows the system to recognize the object’s length, width, and height.
  • Polygon annotation – Since cuboids are limited to right angles, polygon annotation is useful for irregularly shaped objects that need more lines and angles. The annotator outlines the object’s boundary by placing vertices along each side.
  • Polyline annotation – This method is often used to annotate training datasets for autonomous vehicles so they can accurately detect street lanes and road markings. Annotators trace lanes, bicycle lanes, directions, divergences, and opposite-direction traffic with polylines so the system can perceive its surroundings for safe and trouble-free driving.
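
To make the differences between these techniques more concrete, here is a minimal Python sketch of how each annotation type might be stored. The class names, field names, and helper methods are hypothetical and chosen only for illustration; real annotation tools define their own formats.

```python
from dataclasses import dataclass
from math import hypot

Point = tuple[float, float]  # (x, y) in pixel coordinates


@dataclass
class Landmark:
    label: str   # hypothetical label, e.g. "left_eye"
    point: Point


@dataclass
class Polygon:
    label: str
    vertices: list[Point]  # closed outline; the first vertex is not repeated

    def area(self) -> float:
        # Shoelace formula for the area enclosed by the outline.
        n = len(self.vertices)
        total = 0.0
        for i in range(n):
            x1, y1 = self.vertices[i]
            x2, y2 = self.vertices[(i + 1) % n]
            total += x1 * y2 - x2 * y1
        return abs(total) / 2.0


@dataclass
class Polyline:
    label: str
    points: list[Point]  # open path, e.g. a lane marking

    def length(self) -> float:
        # Sum of the straight segments between consecutive points.
        pairs = zip(self.points, self.points[1:])
        return sum(hypot(x2 - x1, y2 - y1) for (x1, y1), (x2, y2) in pairs)


@dataclass
class Cuboid:
    label: str
    length: float
    width: float
    height: float

    def volume(self) -> float:
        return self.length * self.width * self.height


def class_pixel_counts(mask: list[list[int]]) -> dict[int, int]:
    # A semantic segmentation mask stores one class id per pixel.
    counts: dict[int, int] = {}
    for row in mask:
        for class_id in row:
            counts[class_id] = counts.get(class_id, 0) + 1
    return counts
```

A landmark is a single point, a polyline is an open path, a polygon is a closed outline, a cuboid adds depth, and a segmentation mask covers every pixel, which is why it tends to be the most detailed of the five.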


The techniques above can be applied to several kinds of video annotation tasks. Possible video data annotation types include:

  • Video with object tracking – This is annotating a video with labels and spatial locations for the entities detected in the video or in the video segments provided.
  • Broken into frames – Sometimes you will need to label objects that are not in motion in each given frame, as opposed to the object tracking mentioned above.
  • Points of action – This involves placing points to annotate all of the motions so the system can understand how the people or objects in the video are moving.
  • Labeling – This is labeling all of the objects and other items of interest that the system would need to recognize.
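
As a rough illustration of how object tracking differs from labeling isolated frames, the sketch below groups per-frame boxes into tracks by a shared identifier. The `FrameBox` record, the `track_id` field, and the sample values are assumptions made for this example, not a format used by any particular tool.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FrameBox:
    frame: int                       # frame index within the clip
    track_id: int                    # same id = same object across frames
    label: str
    box: tuple[int, int, int, int]   # x, y, width, height in pixels


def group_into_tracks(boxes: list[FrameBox]) -> dict[int, list[FrameBox]]:
    # Collect the boxes of each object in frame order, so one track
    # describes where that object is over time.
    tracks: dict[int, list[FrameBox]] = defaultdict(list)
    for b in sorted(boxes, key=lambda b: b.frame):
        tracks[b.track_id].append(b)
    return dict(tracks)


# Hypothetical annotations for a three-frame clip.
annotations = [
    FrameBox(0, 1, "car", (10, 20, 50, 30)),
    FrameBox(1, 1, "car", (14, 20, 50, 30)),
    FrameBox(0, 2, "pedestrian", (200, 40, 20, 60)),
    FrameBox(2, 1, "car", (18, 21, 50, 30)),
]
tracks = group_into_tracks(annotations)
```

With frame-only labeling, each `FrameBox` would stand alone. With tracking, the shared `track_id` tells the system that the car in frame 0 is the same car in frames 1 and 2.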

Challenges of Video Annotation 

Video data annotation presents several specific challenges:

  • Simply getting the annotation done – One of the challenges presented by video data annotation is that the objects are not still, and annotators have to capture the moving object on the computer screen. This is why videos are usually converted into smaller clips, such as GIF files, and then individual objects are identified for annotation.
  • Maintaining a high accuracy level – Data annotation is a very tedious and monotonous task, and if the data annotator is not completely focused on the job, it can be difficult to maintain a high accuracy level.
  • The large volume of data – Then we need to consider the sheer size of the datasets. A lot of video training data is necessary to train a machine learning system, and this video is further broken down into segments, so the amount of data that needs to be annotated increases quickly.
  • Choosing a service provider – All of this brings us to finding the right outsourcing provider who can handle all of your video data annotation needs, since it would not be very efficient to do such work in-house. The provider you ultimately choose should already have a lot of data annotators on staff, since this allows them to launch your project faster and scale it as the amount of data grows.
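
To see how quickly the volume of data grows, here is a small back-of-the-envelope sketch in Python. The clip length, the frame rate, and the `sample_every` option (annotating only every n-th frame) are hypothetical values and assumptions for illustration, not figures from a real project.

```python
def frames_to_annotate(duration_seconds: float, fps: float,
                       sample_every: int = 1) -> int:
    # Total frames in the clip, then keep every `sample_every`-th frame.
    total_frames = int(duration_seconds * fps)
    # Ceiling division: a partial final group still contributes one frame.
    return -(-total_frames // sample_every)


# A hypothetical 10-minute clip recorded at 30 frames per second.
every_frame = frames_to_annotate(10 * 60, 30)        # 18000 frames
every_fifth = frames_to_annotate(10 * 60, 30, 5)     # 3600 frames
```

Even a single short clip produces thousands of frames, and a training dataset is usually made up of many such clips.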


Now that we know about the various techniques, types, and challenges of video data annotation, let’s take a look at some use cases. 

Video Annotation Use Cases From Mindy Support

Earlier we mentioned that video data annotation is used to create video games. We recently started working on some interesting projects to create hockey and soccer games:

  • Video annotation and tagging: Hockey games – From video games to live sports activities, every action can be monitored to make it usable as training data for AI and machine learning-based models in the gaming industry. For this project, we had to annotate live hockey games according to the client’s requirements and specify all the events during the match. We allocated 50 trained, English-speaking specialists to this project.
  • Video annotation and tagging for a soccer game – We are working with a company that provides software for the analysis of sporting matches. The project involves watching games and annotating the events during the game, such as passes, outs, goals, etc. During the process, we had to specify the timestamp of each event, the team name, the event, a comment, and other specific attributes. A team of 100 annotators was trained for this project.
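
An event annotation like the one described for the soccer project could be recorded roughly as follows. This is only a sketch: the `MatchEvent` class, the "MM:SS" timestamp format, the CSV layout, and the sample events are assumptions for illustration and do not reflect the client’s actual schema.

```python
import csv
import io
from dataclasses import asdict, dataclass


@dataclass
class MatchEvent:
    timestamp: str      # assumed "MM:SS" from the start of the match
    team: str
    event: str          # e.g. "pass", "out", "goal"
    comment: str = ""


def to_seconds(timestamp: str) -> int:
    # Convert "MM:SS" into seconds so events can be ordered in time.
    minutes, seconds = timestamp.split(":")
    return int(minutes) * 60 + int(seconds)


def events_to_csv(events: list[MatchEvent]) -> str:
    # Write the events in chronological order, one row per event.
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["timestamp", "team", "event", "comment"],
        lineterminator="\n",
    )
    writer.writeheader()
    for e in sorted(events, key=lambda e: to_seconds(e.timestamp)):
        writer.writerow(asdict(e))
    return buffer.getvalue()


# Hypothetical events, entered out of order by an annotator.
events = [
    MatchEvent("12:05", "Home", "goal", "header from corner"),
    MatchEvent("03:40", "Away", "pass"),
]
report = events_to_csv(events)
```

Keeping each event as a structured record with a timestamp makes it straightforward to sort, filter, and hand over the annotations to analysis software.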

Mindy Support Can Help You With All of Your Video Data Annotation Needs

Mindy Support has extensive experience delivering video data annotation projects of various sizes and complexities. We are one of the largest BPO providers in Eastern Europe, with more than 2,000 employees across six locations all over Ukraine. Our size and location allow us to source and recruit the necessary number of candidates within a short time frame, and we can scale your team quickly without sacrificing the quality of the data annotation work. Our rigorous QA processes are designed to ensure that the work is done correctly the first time and that the necessary accuracy is maintained throughout the project. Contact us today to learn more about how we can help you.

