Video Annotation Services

Video annotation is an important process in preparing training datasets for deep learning and machine learning models in the automotive, gaming, AR/VR development and many other industries.


What is Video Annotation?

Video annotation for machine learning involves breaking a video down into frames and annotating those frames with various techniques. The number of frames that need to be annotated depends on the length of the video and its frame rate in frames per second (fps). For example, a clip that is only 60 seconds long at 60 fps contains 3,600 static images that need to be annotated. This is very time-consuming, which is why many companies outsource the work to a video annotation company.
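The frame arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name and the `sample_every` parameter (annotating every n-th frame instead of all of them) are assumptions made for the example, not part of any specific annotation workflow.

```python
import math


def frames_to_annotate(duration_s: float, fps: float, sample_every: int = 1) -> int:
    """Estimate how many frames need annotation.

    duration_s   -- clip length in seconds
    fps          -- frame rate in frames per second
    sample_every -- hypothetical option: keep only every n-th frame
    """
    if duration_s < 0 or fps <= 0 or sample_every < 1:
        raise ValueError("duration must be >= 0, fps > 0, sample_every >= 1")
    total_frames = round(duration_s * fps)
    # Keeping every n-th frame, starting with the first one.
    return math.ceil(total_frames / sample_every)


print(frames_to_annotate(60, 60))  # 3600, the 60-second clip at 60 fps
```

A five-minute clip at the same frame rate gives `frames_to_annotate(300, 60)`, or 18,000 frames, which shows how quickly the workload grows with video length.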

How Mindy Support can help companies with video annotation challenges

There are several reasons why video data annotation is challenging. First, the object of interest is in motion, which makes it harder to label objects correctly and get precise outcomes. Second, the volume of annotation required is usually huge. The example above counted the static images that need to be annotated in a 60-second video. Now imagine a video that is several minutes long: the workload grows in proportion to the length of the video and quickly becomes very time-consuming, which is why video annotation outsourcing is such an attractive option for many companies. Finally, the many events that need to be tracked in a video can overlap. This is challenging because the annotation must be accurate down to the millisecond, which is difficult and requires the right technical approach.

Types of Video Annotation Services We Provide

2D Bounding Boxes

This annotation method involves superimposing a rectangular 2D box over the object of interest in each frame, which helps the system identify the objects in the real world. This method is often used for annotating video for the automotive, security, and media and entertainment industries.
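As a rough illustration, a 2D box for one object in one frame can be stored as four pixel coordinates plus a label. The field names below are assumptions chosen for this sketch, not a specific tool's export format. The `iou` helper computes intersection over union, a standard measure of how closely two boxes agree; it is shown here only as one possible way to compare two annotations of the same object.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box2D:
    """One rectangular annotation in one frame (hypothetical schema)."""
    frame: int
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def __post_init__(self):
        if self.x_max <= self.x_min or self.y_max <= self.y_min:
            raise ValueError("box must have positive width and height")

    @property
    def area(self) -> float:
        return (self.x_max - self.x_min) * (self.y_max - self.y_min)


def iou(a: Box2D, b: Box2D) -> float:
    """Intersection over union of two boxes, between 0.0 and 1.0."""
    inter_w = max(0.0, min(a.x_max, b.x_max) - max(a.x_min, b.x_min))
    inter_h = max(0.0, min(a.y_max, b.y_max) - max(a.y_min, b.y_min))
    intersection = inter_w * inter_h
    union = a.area + b.area - intersection
    return intersection / union


first = Box2D(frame=0, label="car", x_min=0, y_min=0, x_max=10, y_max=10)
second = Box2D(frame=0, label="car", x_min=5, y_min=0, x_max=15, y_max=10)
print(round(iou(first, second), 3))  # 0.333
```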

3D Boxes

3D boxes (cuboids) give the system more information about the objects in the image, specifically their length, width and height, so they describe an object more completely than the 2D boxes mentioned earlier. They are often used to annotate videos for the automotive sector to give the system an understanding of the traffic situation. Cuboids are also used to create algorithms for the operation of robots and drones, since these need to analyze not only the objects themselves and their sizes, but also their placement in three-dimensional space and the distance between them.
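A minimal sketch of what a cuboid annotation adds over a 2D box, assuming an axis-aligned cuboid described by a center point and three dimensions in meters. Real projects often also store rotation, which is left out here, and all field names are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Cuboid:
    """Axis-aligned 3D box (hypothetical schema, no rotation)."""
    label: str
    center: Tuple[float, float, float]  # x, y, z position in meters
    length: float
    width: float
    height: float

    @property
    def volume(self) -> float:
        return self.length * self.width * self.height


def center_distance(a: Cuboid, b: Cuboid) -> float:
    """Straight-line distance between the centers of two cuboids."""
    return math.dist(a.center, b.center)


car = Cuboid("car", (0.0, 0.0, 0.0), length=4.0, width=2.0, height=1.5)
truck = Cuboid("truck", (3.0, 4.0, 0.0), length=8.0, width=2.5, height=3.0)
print(car.volume)                   # 12.0
print(center_distance(car, truck))  # 5.0
```

Size and distance both fall out of the annotation directly, which is the extra information a robot or drone needs beyond a flat rectangle.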

Lines and Splines

This annotation method is used to delineate boundaries between one part of an image and another. It is often used in the automotive industry to delineate all of the various road lines. However, it can also be used for annotations where a particular region needs to be annotated as a boundary.


Polygons

Polygons are very useful for annotating irregularly shaped objects that do not fit well into rectangular frames. They capture the exact shape and size of the object and ensure more precise localization. This type of annotation is used in the automotive sector to annotate all of the objects on the road.
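To see why a polygon localizes an irregular object more precisely than a rectangle, compare the area enclosed by the polygon with the area of the smallest upright box around the same points. The sketch below uses the shoelace formula for the polygon area; the function names and the sample triangle are made up for the example.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_area(points: List[Point]) -> float:
    """Area of a simple polygon via the shoelace formula."""
    n = len(points)
    if n < 3:
        raise ValueError("a polygon needs at least three vertices")
    twice_area = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to the first vertex
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2


def bounding_box_area(points: List[Point]) -> float:
    """Area of the smallest upright rectangle containing all vertices."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))


outline = [(0, 0), (4, 0), (0, 3)]  # a triangular object
print(polygon_area(outline))        # 6.0
print(bounding_box_area(outline))   # 12
```

In this example the rectangle covers twice the area of the object, so half of the pixels inside a bounding box would belong to the background.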


Keypoints

This method involves placing keypoints over the area of interest to precisely detect shape variations for motion tracking, facial landmark detection, and hand gesture recognition. It is often used for facial recognition in security systems and in video games for tracking the movements of characters.

Labeling / Tagging

Annotators tag or label the objects in the frames, which trains the machine learning system to identify objects in the real world. This is also useful for tracking people's movements by labeling the sequence of events in which those movements take place.

Classification / Categorization

This method involves classifying or categorizing certain events in the video. This method is useful if you need your product to identify specific movements or actions. It is often used in the gaming, VR and security industries. Classification can be applied to the entire video and it can describe the quality, usefulness or compliance of the video with the stated message.

Event Tracking

Event tracking does not involve annotating the frames themselves. Instead, annotators work on the video tracks, localizing and labeling events of interest in time. This method is used to detect all events of interest in a video. Fragments containing events can overlap, and video tracks can be duplicated for annotation if the project involves a multi-class group of labels.
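A small sketch of how time-based event annotations might be represented and checked for overlap. The `Event` structure, the millisecond fields and the sample labels are assumptions for illustration only. Intervals are treated as half-open, so an event that ends exactly when another starts does not count as overlapping.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import List, Tuple


@dataclass(frozen=True)
class Event:
    """One labeled time segment on a video track (hypothetical schema)."""
    label: str
    start_ms: int
    end_ms: int


def overlaps(a: Event, b: Event) -> bool:
    """True if the two segments share any span of time."""
    return a.start_ms < b.end_ms and b.start_ms < a.end_ms


def overlapping_pairs(events: List[Event]) -> List[Tuple[str, str]]:
    """Label pairs of all events that overlap in time."""
    return [(a.label, b.label) for a, b in combinations(events, 2) if overlaps(a, b)]


track = [
    Event("walk", 0, 5000),
    Event("wave", 4000, 6000),
    Event("run", 6000, 9000),
]
print(overlapping_pairs(track))  # [('walk', 'wave')]
```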

Video Annotation Use Cases

Helping Doctors and Researchers Facilitate Immunology Research

Our data annotators needed to track the movements of various cells and account for the trajectory of each cell. It was very important that everything was done correctly, since the results would later be used for immunology research.

Results Delivered to the Client

  • 98% Quality Rate
  • 5-6 Videos were annotated every week
  • We assembled an entire team in 4 days and annotated 10,000+ images

Video Annotation for IT & Computer Software

Managers and stakeholders need to be able to monitor the pace of construction and make sure safety precautions are being followed. We helped the client lower costs and keep the project on schedule by annotating more than 300 hours of video.

Results Delivered to the Client

  • 98% Quality Rate
  • 5-6 Videos were annotated every week
  • We assembled an entire team in 4 days and annotated 10,000+ images

Tracking People’s Movements in Videos Using Labeling

The client was working on a very large project that required tracking people’s movements in videos. These included running, walking, bike riding and many other activities. We needed to label the sequence of events in which all of these movements took place.

Results Delivered to the Client

  • 15 action classes annotated
  • >95% accuracy
  • 30 full-time data annotators working on project

Video Annotation for Sports

We are working with a company that provides software for the analysis of sporting matches. We had to specify the timestamps of the events, the team name, event, comment, and other specific attributes.

Results Delivered to the Client

  • 98% Quality Rate
  • 5-6 Videos were annotated every week
  • We assembled an entire team in 4 days and annotated 10,000+ images

Tagging the Makes and Models of Cars in Videos

The client was working on a product that would use computer vision to distinguish between all of the various makes and models of cars. We needed to annotate a dataset of 50,000 short videos that were taken in various parking lots.

Results Delivered to the Client

  • 50,000 videos annotated
  • Classified cars into 750 classes
  • Everything was done in 15 days
  • >95% quality score

Video Annotation for Human Emotions

We needed to classify the emotional tone of each whole video based on the dialog in it, annotate the time periods in the video that contain different sounds, and choose the best-matching label for each type of sound. We hired data annotators with knowledge of German to do the work.

Results Delivered to the Client

  • 98% Quality Rate
  • 5-6 Videos were annotated every week
  • We assembled an entire team in 4 days and annotated 10,000+ images



    We have a minimum threshold for starting any new project, which is 735 productive man-hours a month (equivalent to 5 graphic annotators working on the task monthly).