New Six-Week Weather Forecast Can Help Companies Better Understand Climate Change
A couple of months ago, we told you about how researchers and meteorologists are using AI to predict extreme weather. However, AI can also be used to predict regular weather forecasts. This can be useful for things like helping companies understand the impacts of climate change and rising CO2 emissions. Farmers can also use this information to better prepare their crops for the upcoming weather patterns. In this article we will take a look at how scientists are using AI to predict weather patterns and the data annotation required to train the ML algorithms.
Background Information
The world’s leading weather forecasting institutions currently rely on computationally expensive weather models running on massive supercomputers. In order to have predictive skill for forecasts two to six weeks in the future, large ensembles of many nearly identical runs of these models are required, but the computational resources needed for these ensembles scales with the number of forecasts run. Since the resources needed rapidly approaches modern-day computing limits, researchers explored the possibility of using computationally cheap weather models based on machine learning algorithms which learn to reproduce the evolution of weather.
As a result, they created a Deep Learning Weather Prediction (DLWP) model that can predict a six-week forecast at 1.4° resolution.The machine-learning model is capable of running 320 forecasts in three minutes on a single workstation, while the state-of-the-art model from the European Center for Medium-Range Weather Forecasts (ECMWF) utilizes supercomputers to run 50 forecasts. The model produces realistic forecasts of weather events and is even capable of nearly matching the performance of the ECMWF ensemble for forecasts of temperature four to six weeks in the future.
How Was the DLWP Model Trained?
The model was trained on gridded reanalysis data from the Climate Forecast System (CFS) Reanalysis. Researchers used data from January 1, 1979 to December 31, 2010. They chose this particular data set because its low resolution is ideal for computationally efficiency. To further increase computational efficiency, they decided to subset only the northern hemisphere, as cross-equatorial transport is usually not significant on time scales of a few days. Data from 2007–2010 were set aside for the test set used in final model performance evaluation. We used the time periods from 1979–2002 for model training and 2003–2006 for model validation. Distinct periods for training, validation, and testing were selected to avoid including in the evaluation data times that have high correlation with neighboring times in the training data.
It should be noted that the satellite data contains a lot more information than weather patterns. There is also information such as cloud patterns, soil moisture and drought stress in plants. In fact, it is the capability of an AI system to handle such a wealth of data coming in from satellites that makes it such an improvement over existing complex equations. The AI system can crunch thousands of forecasts simultaneously to deliver a clearer statistical picture with radically fewer resources than conventional equations. Some suggest the performance advances will be measured in as many as five orders of magnitude and use a fraction of the power.
What Types of Data Annotation are Necessary to Train the Model?
In the previous section we talked about the training datasets that were used. This raw data needs to be prepared using various data annotation methods to allow the ML algorithms to learn what they need. Since the system needs to understand all kinds of cloud formations since this can be a sign that there will be rain, data annotators would need to contour such formations with polygons. Also, if there are specific weather patterns the system needs to know since they are a tell-tale sign of a certain weather event, this could be done through simple labeling, or it could be done through more specific types of annotation such as semantic segmentation.
Trust Mindy Support With All of Your Data Annotation Needs
Mindy Support is a global company for data annotation and business process outsourcing, trusted by several Fortune 500 and GAFAM companies, as well as innovative startups. With 9 years of experience under our belt and offices and representatives in Cyprus, Poland, Romania, The Netherlands, India, and Ukraine, Mindy Support’s team now stands strong with 2000+ professionals helping companies with their most advanced data annotation challenges.