The promising potential of vision language models for the generation of textual weather forecasts

December 03, 2025

Reading time: 5 minutes

📝 Original Info

Title: The promising potential of vision language models for the generation of textual weather forecasts
ArXiv ID: 2512.03623
Date: 2025-12-03
Authors: Edward C. C. Steele, Dinesh Mane, Emilio Monti, Luis Orus, Rebecca Chantrill-Cheyette, Matthew Couch, Kirstine I. Dale, Simon Eaton, Govindarajan Rangarajan, Amir Majlesi, Steven Ramsdale, Michael Sharpe, Craig Smith, Jonathan Smith, Rebecca Yates, Holly Ellis, Charles Ewen

📝 Abstract

Despite the promising capability of multimodal foundation models, their application to the generation of meteorological products and services remains nascent. To accelerate aspiration and adoption, we explore the novel use of a vision language model for writing the iconic Shipping Forecast text directly from video-encoded gridded weather data. These early results demonstrate promising scalable technological opportunities for enhancing production efficiency and service innovation within the weather enterprise and beyond.

📄 Full Content

THE PROMISING POTENTIAL OF VISION LANGUAGE MODELS FOR THE GENERATION OF TEXTUAL WEATHER FORECASTS

A PREPRINT

Edward C. C. Steele*, Met Office
Dinesh Mane, Amazon Web Services
Emilio Monti, Amazon Web Services
Luis Orus, Amazon Web Services
Rebecca Chantrill-Cheyette, University of East Anglia†
Matthew Couch, Amazon Web Services
Kirstine I. Dale, Met Office
Simon Eaton, Met Office
Govindarajan Rangarajan, Amazon Web Services
Amir Majlesi, Amazon Web Services
Steven Ramsdale, Met Office
Michael Sharpe, Met Office
Craig Smith, Amazon Web Services
Jonathan Smith, Met Office
Rebecca Yates, Met Office
Holly Ellis, Amazon Web Services
Charles Ewen, Met Office

ABSTRACT

Despite the promising capability of multimodal foundation models, their application to the generation of meteorological products and services remains nascent. To accelerate aspiration and adoption, we explore the novel use of a vision language model for writing the iconic Shipping Forecast text directly from video-encoded gridded weather data. These early results demonstrate promising scalable technological opportunities for enhancing production efficiency and service innovation within the weather enterprise and beyond.

Main

We are presently experiencing a revolution in artificial intelligence (AI), with pioneering developments in machine learning weather prediction (MLWP) models demonstrating skill comparable to physics-based numerical weather prediction (NWP) models across a range of attributes at substantially lower computational cost [Lam et al., 2023, Allen et al., 2025, Bodnar et al., 2025]. Despite these advances, most of the early AI applications within the meteorological discipline have focused on tasks related to the generation of raw weather predictions rather than on the generation of weather products. This is significant as a weather forecast, however skillful, has no intrinsic value unless its user can derive a benefit from it [Mylne, 2002].
Consequently, it is reasonable to expect that the benefit of alleviating the similarly resource-intensive effort typically required for the production of purposeful textual forecast bulletins, narratives and warnings that are useful, usable and used will match – if not exceed – the impact of AI use in other parts of the science-to-services value chain. While simple rules-based approaches provide an alternative means of automation in instances where data-to-text conversion can be coded explicitly, wider product and service transformation on an enterprise scale requires solutions that can be developed and improved more efficiently and effectively, without causing an unnecessary proliferation of isolated microservices. Against the background of an ever-expanding demand from forecast consumers for the multimodal provision of personalized weather data/intelligence across an increasing range of required applications, the advent of Vision Language Models (VLMs; Bordes et al. [2024]) – that extend the capabilities of Large Language Models (LLMs; Vaswani et al. [2023], Minaee et al. [2025]) by combining computer vision and natural language processing – offers new possibilities for the scalable generation of textual meteorological products and services directly from gridded NWP or MLWP model output. Indeed, the increased availability of foundation models, combined with the opportunities for leveraging the latest advances more accessibly, affords new approaches for extracting meaning from complex scientific data – with LLMs/VLMs offering a different way to deliver information in a less labor-intensive manner – potentially leading to a whole new generation of products and services.

* edward.steele@metoffice.gov.uk
† Work completed while at the Met Office
arXiv:2512.03623v1 [cs.LG] 3 Dec 2025

In this paper we are, therefore, motivated to share our experiences from a targeted prototyping activity exploring the novel use of a VLM fine-tuned for meteorological data-to-text conversion using the writing of the iconic Shipping Forecast as an example. Celebrating its BBC broadcast centenary in 2025, the Shipping Forecast (the oldest and longest running weather forecast in the world) is a British cultural institution and cornerstone of regional maritime safety. Issued by the Met Office on behalf of the Maritime & Coastguard Agency (MCA), the forecast requires that the predicted conditions for each of its 31 constituent sea areas out to 24 hours ahead are analyzed and condensed into text sentences with a very specific length and format according to a strict set of rules. The output contains a general (pressure) synopsis followed by the area bulletins themselves. These use an 8-point compass for the wind direction, the Beaufort Scale for the wind strength, the Douglas Scale for wave height, a four-category classification of visibility and a maximum of five words to highlight any high impact weather, with key timing phrases within the final text breaking the
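
To make the target vocabulary described above concrete, the following is a minimal sketch of how a wind direction and speed could be mapped onto an 8-point compass and the Beaufort Scale. It illustrates the explicit rules-based style of data-to-text conversion that the paper contrasts with, not the VLM approach the authors explore. The function names (`compass_point`, `beaufort_force`, `wind_phrase`) and the output phrasing are hypothetical; the knot thresholds are the commonly published Beaufort bounds, and the direction is assumed to follow the meteorological convention of naming where the wind blows from.

```python
from bisect import bisect_left

# 8-point compass, clockwise from north (each sector spans 45 degrees).
COMPASS_8 = [
    "north", "northeast", "east", "southeast",
    "south", "southwest", "west", "northwest",
]

# Upper bound, in whole knots, of Beaufort forces 0..11.
# Anything above the last bound is force 12.
BEAUFORT_UPPER_KNOTS = [0, 3, 6, 10, 16, 21, 27, 33, 40, 47, 55, 63]


def compass_point(direction_deg: float) -> str:
    """Return the 8-point compass name for a direction in degrees."""
    # Shift by half a sector so that, e.g., 350 degrees maps to north.
    sector = int(((direction_deg % 360) + 22.5) // 45) % 8
    return COMPASS_8[sector]


def beaufort_force(speed_knots: float) -> int:
    """Return the Beaufort force (0..12) for a wind speed in knots."""
    if speed_knots < 0:
        raise ValueError("wind speed must be non-negative")
    # The scale is defined on whole knots, so round before the lookup.
    return bisect_left(BEAUFORT_UPPER_KNOTS, round(speed_knots))


def wind_phrase(direction_deg: float, speed_knots: float) -> str:
    """Illustrative phrase only; not the official bulletin wording."""
    name = compass_point(direction_deg).capitalize()
    return f"{name} {beaufort_force(speed_knots)}"


if __name__ == "__main__":
    print(wind_phrase(225.0, 30.0))
```

A real bulletin also has to summarise a whole sea area over a 24-hour window, including ranges and timing phrases, which is where hand-coded rules of this kind become harder to maintain.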

