Title: The promising potential of vision language models for the generation of textual weather forecasts
ArXiv ID: 2512.03623
Date: 2025-12-03
Authors: Edward C. C. Steele, Dinesh Mane, Emilio Monti, Luis Orus, Rebecca Chantrill-Cheyette, Matthew Couch, Kirstine I. Dale, Simon Eaton, Govindarajan Rangarajan, Amir Majlesi, Steven Ramsdale, Michael Sharpe, Craig Smith, Jonathan Smith, Rebecca Yates, Holly Ellis, Charles Ewen
📝 Abstract
Despite the promising capability of multimodal foundation models, their application to the generation of meteorological products and services remains nascent. To accelerate aspiration and adoption, we explore the novel use of a vision language model for writing the iconic Shipping Forecast text directly from video-encoded gridded weather data. These early results demonstrate promising scalable technological opportunities for enhancing production efficiency and service innovation within the weather enterprise and beyond.
📄 Full Content
THE PROMISING POTENTIAL OF VISION LANGUAGE MODELS FOR
THE GENERATION OF TEXTUAL WEATHER FORECASTS
A PREPRINT
Edward C. C. Steele*
Met Office
Dinesh Mane
Amazon Web Services
Emilio Monti
Amazon Web Services
Luis Orus
Amazon Web Services
Rebecca Chantrill-Cheyette
University of East Anglia†
Matthew Couch
Amazon Web Services
Kirstine I. Dale
Met Office
Simon Eaton
Met Office
Govindarajan Rangarajan
Amazon Web Services
Amir Majlesi
Amazon Web Services
Steven Ramsdale
Met Office
Michael Sharpe
Met Office
Craig Smith
Amazon Web Services
Jonathan Smith
Met Office
Rebecca Yates
Met Office
Holly Ellis
Amazon Web Services
Charles Ewen
Met Office
Main
We are presently experiencing a revolution in artificial intelligence (AI), with pioneering developments in machine
learning weather prediction (MLWP) models demonstrating skill comparable to physics-based numerical weather
prediction (NWP) models across a range of attributes at substantially lower computational cost [Lam et al., 2023, Allen
et al., 2025, Bodnar et al., 2025]. Despite these advances, most of the early AI applications within the meteorological
discipline have focused on tasks related to the generation of raw weather predictions rather than on the generation of
weather products. This is significant as a weather forecast, however skillful, has no intrinsic value unless its user can
derive a benefit from it [Mylne, 2002]. Consequently, it is reasonable to expect that alleviating the similarly
resource-intensive effort typically required to produce purposeful textual forecast bulletins, narratives and
warnings (ones that are useful, usable and used) will match – if not exceed – the impact of AI use in other parts of the
science-to-services value chain. Simple rules-based approaches provide an alternative means of automation
where data-to-text conversion can be coded explicitly. However, enabling wider product and service transformation
on an enterprise scale requires solutions that can be developed and improved more efficiently and effectively, without
causing an unnecessary proliferation of isolated microservices. Against the background of an ever-expanding demand
from forecast consumers for the multimodal provision of personalized weather data/intelligence across an increasing
range of required applications, the advent of Vision Language Models (VLMs; Bordes et al. [2024]) – that extend the
capabilities of Large Language Models (LLMs; Vaswani et al. [2023], Minaee et al. [2025]) by combining computer
vision and natural language processing – offers new possibilities for the scalable generation of textual meteorological
products and services directly from gridded NWP or MLWP model output. Indeed, the increased availability of
foundation models, combined with the opportunities for leveraging the latest advances more accessibly, affords new
approaches for extracting meaning from complex scientific data – with LLMs/VLMs offering a different way to deliver
information in a less labor-intensive manner – potentially leading to a whole new generation of products and services.
In this paper we are, therefore, motivated to share our experiences from a targeted prototyping activity exploring the
novel use of a VLM fine-tuned for meteorological data-to-text conversion, using the writing of the iconic Shipping
Forecast as an example.
*edward.steele@metoffice.gov.uk
†Work completed while at the Met Office
arXiv:2512.03623v1 [cs.LG] 3 Dec 2025
Celebrating its BBC broadcast centenary in 2025, the Shipping Forecast (the oldest and longest-running weather forecast
in the world) is a British cultural institution and cornerstone of regional maritime safety. Issued by the Met Office on
behalf of the Maritime & Coastguard Agency (MCA), the forecast requires that the predicted conditions for each of its
31 constituent sea areas out to 24 hours ahead are analyzed and condensed into text sentences with a very specific length
and format according to a strict set of rules. The output contains a general (pressure) synopsis followed by the area
bulletins themselves. These use an 8-point compass for the wind direction, the Beaufort Scale for the wind strength, the
Douglas Scale for wave height, a four-category classification of visibility and a maximum of five words to highlight any
high-impact weather, with key timing phrases within the final text breaking the