A new geospatial foundation AI model designed to improve carbon emission tracking and monitor the impact of climate change has been built by IBM. Using satellite data from Nasa, the open source model is being published on the Hugging Face platform. It is the first time Nasa has collaborated with a tech vendor on such a model.
IBM says the model was trained on a year of imagery from Nasa's Harmonized Landsat Sentinel-2 (HLS) satellite data. The geospatial foundation model is built on top of enterprise technologies developed by IBM as part of its watsonx.ai platform, launched in May. The platform is being presented by IBM as a solution to a lack of labelled data in both the scientific and enterprise fields.
The model has since been refined on labelled data for flood and burn scar mapping over the same period. With additional fine-tuning, the base model can be redeployed for tasks like tracking deforestation, predicting crop yields, or detecting and monitoring greenhouse gases. IBM and Nasa researchers are also working with Clark University to further adapt the open source model for a wider range of applications including time-series segmentation and similarity research.
One of the biggest problems facing companies and climate scientists is a lack of labelled data, or data in an accessible format. A study published earlier this year by Microsoft and Tata Consultancy Services found 80% of businesses were failing to disclose operational emissions targets. This is in part due to a lack of data throughout the supply chain and on global trends. IBM argues that AI can help simplify this process.
While foundation models are trained on large datasets of unlabelled data, they can be fine-tuned for specific use cases and deployed using labelled data. This means the geospatial model published by IBM can be re-tuned based on company information, or data for a specific scientific use to improve analysis.
The model is initially being published to Hugging Face, which allows developers to freely share AI models, but a commercial version will also be available on watsonx later this year. This, the company says, will allow enterprises to use the information in emission tracking and net zero targets.
IBM AI’s role in emission data and tracking
“The essential role of open-source technologies to accelerate critical areas of discovery such as climate change has never been clearer,” said Sriram Raghavan, vice president, IBM Research AI. “By combining IBM’s foundation model efforts aimed at creating flexible, reusable AI systems with Nasa’s repository of Earth-satellite data and making it available on Hugging Face, we can leverage the power of collaboration to implement faster and more impactful solutions that will improve our planet.”
One of the fine-tuned models already published by Nasa and IBM looks at burn scars across the US. These are marks remaining from wildfires. IBM says the model could be trained with 75% less labelled data than the current state-of-the-art model due to the pre-trained foundation model acting as a base. This will significantly improve tracking and predictions for wildfires and allow the model itself to run more efficiently.
“AI remains a science-driven field, and science can only progress through information sharing and collaboration,” said Jeff Boudier, head of product and growth at Hugging Face. “This is why open source AI and the open release of models and datasets are so fundamental to the continued progress of AI, and making sure the technology will benefit as many people as possible.”
Nasa’s chief science data officer Kevin Murphy said foundation models have the potential to change the way observational data is analysed. “By open sourcing the model and making it available to the world, we hope to multiply its impact.”