Generating wholesale natural language


As voice interfaces become ubiquitous in our daily lives, the Met Office faces a challenge: not only to make our data available, but also to add the context and meaning necessary to make it useful.

A Natural Language Generation API could bridge the gap between raw data and user understanding at a hyperlocal resolution. We designed this one with Amazon’s Alexa in mind initially, but any other voice device could interface with the service.

What do we do now?

The Met Office is already well equipped to get text forecasts out to the public. When you look at a weather forecast on either the website or the app, you can already see a text summary of the data being presented. These summaries are handwritten daily by our expert meteorologists and provide an overview of the weather trend at regional and national resolution for up to 30 days ahead.

What’s wrong with that?

Well, what if I want to know more specifically about what’s actually going to affect me: ‘Will it rain in Exeter tomorrow?’ or ‘What’s the forecast for Plymouth on Wednesday?’ Of course, I could infer from the regional text forecast that a front is moving in from the west, or look at the weather symbols and surmise for myself that no rain is forecast, but neither of these methods seems entirely satisfactory. Why should we ask users to read between the lines? Can’t the Met Office just tell me what I need to know? The short answer is that we could! However, hiring professional forecasters to handwrite bespoke forecasts for every location in the UK is just not feasible.

The solution

One way to move toward addressing this issue is to (attempt to) generate text forecasts dynamically. As a first step, we aim to design a service capable of simply replicating the responses given by Alexa’s built-in weather skill. This format is proven to deliver weather information effectively via natural language, with weather requests accounting for ~40% of daily voice UI interactions.

As an additional benefit to users, we can also integrate the national severe weather warnings.

We have set this up as an HTTP API that returns GeoJSON, running on AWS Lambda behind an AWS API Gateway proxy. This ensures the service is not only scalable but also decoupled from Alexa, so it could provide a foundation for any other service requiring natural language forecasts.
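As a rough sketch of what such an endpoint might look like, the handler below follows the API Gateway Lambda proxy integration contract: the event carries `queryStringParameters`, and the response has a `statusCode`, `headers` and a string `body`. The `lat` and `lon` parameter names, the `summary` property and the `generateForecastText` helper are illustrative assumptions, not the service's actual interface.

```javascript
"use strict";

// Hypothetical stand-in for the real text generation; returns a fixed sentence.
function generateForecastText(lat, lon) {
    return "It will be sunny and is highly likely to remain dry.";
}

// Wrap a text forecast in a GeoJSON Feature for the requested point.
// GeoJSON orders coordinates as [longitude, latitude].
function toGeoJson(lat, lon, summary) {
    return {
        type: "Feature",
        geometry: { type: "Point", coordinates: [lon, lat] },
        properties: { summary }
    };
}

// Lambda proxy integration handler; in a deployed function this would be
// exposed as `exports.handler`.
const handler = async (event) => {
    const params = (event && event.queryStringParameters) || {};
    const lat = Number.parseFloat(params.lat);
    const lon = Number.parseFloat(params.lon);

    if (!Number.isFinite(lat) || !Number.isFinite(lon)) {
        return {
            statusCode: 400,
            headers: { "Content-Type": "application/json" },
            body: JSON.stringify({ message: "lat and lon query parameters are required" })
        };
    }

    return {
        statusCode: 200,
        headers: { "Content-Type": "application/geo+json" },
        body: JSON.stringify(toGeoJson(lat, lon, generateForecastText(lat, lon)))
    };
};
```

Because the handler only deals in plain HTTP and GeoJSON, an Alexa skill is just one possible client; anything that can make an HTTP request could consume the same response.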

Generating a text forecast

When we broke this down, it became clear that Alexa’s built-in weather skill always responds in a set format: first the sky/rain state (‘It will be cloudy with heavy rain’), followed by a summary of the temperature (‘There will be a high of 15°C and a low of 10°C’).

Internally, we decided to model each of these conditions as a ‘State’ object, seeded with the relevant forecast attributes as inputs. Each ‘State’ is concerned only with generating a fully formed text summary of its assigned condition. A ‘State’ summary can stand alone or be combined with other States to form a meaningful paragraph. Here’s what the State interface looks like:

"use strict";

// Base class that every weather condition (rain, temperature, ...) extends.
module.exports.State = class State {
    /**
     * @returns {string} - a single, fully formed sentence summarising the state of the phenomenon
     */
    getSummary() {
        throw new Error('you must implement the method getSummary');
    }
};

To replicate Alexa’s forecasts we initially have two State implementations: RainState and TemperatureState.
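To make this concrete, here is a minimal sketch of how the two implementations might look. The constructor inputs (`sky`, `rainProbability`, `maxCelsius`, `minCelsius`), the probability thresholds and the alternative phrasings are all assumptions made for illustration; the real classes may be seeded and worded quite differently.

```javascript
"use strict";

// Minimal copy of the State interface so the sketch runs on its own.
class State {
    getSummary() {
        throw new Error('you must implement the method getSummary');
    }
}

// Sky/rain condition. `sky` is a description such as "sunny" and
// `rainProbability` is a percentage; both names are assumptions.
class RainState extends State {
    constructor(sky, rainProbability) {
        super();
        this.sky = sky;
        this.rainProbability = rainProbability;
    }

    getSummary() {
        let outlook;
        if (this.rainProbability < 10) {         // assumed threshold
            outlook = "is highly likely to remain dry";
        } else if (this.rainProbability < 50) {  // assumed threshold
            outlook = "there is a chance of rain";
        } else {
            outlook = "rain is likely";
        }
        return `It will be ${this.sky} and ${outlook}.`;
    }
}

// Temperature condition, seeded with the forecast maximum and minimum.
class TemperatureState extends State {
    constructor(maxCelsius, minCelsius) {
        super();
        this.maxCelsius = maxCelsius;
        this.minCelsius = minCelsius;
    }

    getSummary() {
        const high = Math.round(this.maxCelsius);
        const low = Math.round(this.minCelsius);
        return `There will be a high of ${high} degrees Celsius and a low of ${low}.`;
    }
}

console.log(new RainState("sunny", 5).getSummary());
// prints "It will be sunny and is highly likely to remain dry."
console.log(new TemperatureState(16.2, 11.8).getSummary());
// prints "There will be a high of 16 degrees Celsius and a low of 12."
```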

Additionally, we have added WindState and WarningsState. By default the wind state is not mentioned, but we felt it had the potential to be highly impactful, so we assigned a threshold (mapped to the Beaufort Wind Force Scale) above which wind information is appended to the generated forecast. Similarly, any currently issued warnings are prepended.
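The threshold check could be sketched roughly as follows. The mph bands are the standard Beaufort scale limits, but the cut-off of force 6, the `isSignificant` method name and the sentence wording are assumptions for illustration only. In the real service this class would presumably extend State; that is omitted here to keep the sketch short.

```javascript
"use strict";

// Upper bound in mph for Beaufort forces 0 to 11; anything higher is force 12.
const BEAUFORT_UPPER_MPH = [0, 3, 7, 12, 18, 24, 31, 38, 46, 54, 63, 72];

// Assumed cut-off: only mention wind at force 6 (strong breeze) or above.
const WIND_MENTION_THRESHOLD = 6;

// Map a wind speed in mph to its Beaufort force (0 to 12).
function beaufortForce(speedMph) {
    const rounded = Math.round(speedMph);
    const force = BEAUFORT_UPPER_MPH.findIndex((upper) => rounded <= upper);
    return force === -1 ? 12 : force;
}

class WindState {
    constructor(speedMph) {
        this.speedMph = speedMph;
        this.force = beaufortForce(speedMph);
    }

    // Hypothetical helper: should the wind be mentioned in the forecast at all?
    isSignificant() {
        return this.force >= WIND_MENTION_THRESHOLD;
    }

    getSummary() {
        const mph = Math.round(this.speedMph);
        return `Winds will reach Beaufort force ${this.force}, at around ${mph} miles per hour.`;
    }
}

console.log(new WindState(10).isSignificant()); // prints false
console.log(new WindState(28).getSummary());
// prints "Winds will reach Beaufort force 6, at around 28 miles per hour."
```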

Here is an example of the text these state objects generate:

"It will be sunny and is highly likely to remain dry."
"There will be a high of 16 degrees Celsius and a low of 12."

Here is an example of how these states are combined into a concise forecast:

“In Exeter it will be sunny and is highly likely to remain dry. There will be a high of 16 degrees Celsius and a low of 12.”
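One way the combination step might work is sketched below: warnings go first, the location is folded into the first condition sentence, and wind is appended only when it is significant. The `composeForecast` function, its argument shape and the `isSignificant` method are hypothetical names chosen for this sketch, and plain objects stand in for real State instances.

```javascript
"use strict";

// Combine state summaries into one forecast paragraph.
// Every state only needs to expose getSummary(); the wind state is also
// assumed to expose isSignificant().
function composeForecast(location, { warnings = [], conditions = [], wind = null } = {}) {
    const sentences = conditions.map((state) => state.getSummary());

    if (sentences.length > 0) {
        // "It will be sunny..." becomes "In Exeter it will be sunny..."
        const first = sentences[0];
        sentences[0] = `In ${location} ${first.charAt(0).toLowerCase()}${first.slice(1)}`;
    }

    if (wind && wind.isSignificant()) {
        sentences.push(wind.getSummary());
    }

    const warningSentences = warnings.map((warning) => warning.getSummary());
    return [...warningSentences, ...sentences].join(" ");
}

// Stand-in for a State instance that always returns the same sentence.
const fixed = (text) => ({ getSummary: () => text });

const forecast = composeForecast("Exeter", {
    conditions: [
        fixed("It will be sunny and is highly likely to remain dry."),
        fixed("There will be a high of 16 degrees Celsius and a low of 12.")
    ],
    wind: { isSignificant: () => false, getSummary: () => "Winds will be light." }
});

console.log(forecast);
// prints "In Exeter it will be sunny and is highly likely to remain dry. There will be a high of 16 degrees Celsius and a low of 12."
```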

Read this blog post for more information on how the data is turned into natural language. 

Future Work

When modelling temperature we felt there was an opportunity to add context describing how warm or cold it will feel: “It will feel very warm with highs of 24°C”. Temperatures fluctuate not only diurnally but also seasonally and regionally. Ideally we would be able to identify scenarios such as a very warm December morning in Cornwall, relative to what a December morning in Cornwall usually feels like. This is of course theoretically possible, but it would rely on a system that gave API access to high resolution climate data.
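If such climate data were available, one possible approach would be to compare the forecast with the climatological mean for that place, month and time of day, and pick a descriptor from the size of the anomaly. The sketch below assumes a hypothetical `climate` object holding a mean and standard deviation; the numbers in the usage example are made up for illustration and are not real climate figures, and the cut-offs are arbitrary.

```javascript
"use strict";

// Describe how unusual a temperature is relative to the local climatology.
// `climate` is a hypothetical record: { meanC, stdDevC }.
function describeAnomaly(forecastC, climate) {
    if (!(climate.stdDevC > 0)) {
        return null; // cannot judge the anomaly without a usable spread
    }
    const z = (forecastC - climate.meanC) / climate.stdDevC;
    if (z >= 2) return "very warm";   // assumed cut-offs
    if (z >= 1) return "warm";
    if (z <= -2) return "very cold";
    if (z <= -1) return "cold";
    return null;                      // close to normal, so add no context
}

function temperatureContext(forecastMaxC, climate) {
    const descriptor = describeAnomaly(forecastMaxC, climate);
    const high = Math.round(forecastMaxC);
    return descriptor
        ? `It will feel ${descriptor} with highs of ${high}°C.`
        : `There will be highs of ${high}°C.`;
}

// Illustrative values only: a 14°C maximum against an assumed 9°C mean.
console.log(temperatureContext(14, { meanC: 9, stdDevC: 2 }));
// prints "It will feel very warm with highs of 14°C."
```

The same 14°C maximum would get no descriptor at all against a climatology whose mean is already close to 14°C, which is the seasonal and regional sensitivity described above.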

To create a more human-sounding response we also considered ranking our state objects against each other, allowing us to mention the most impactful conditions first. If it’s a particularly cold day, that may be more important to the user than whether it’s going to rain. However, ranking conditions in this way is not as easy as it sounds! It would require a nuanced understanding of the impact of each weather type, combined with climatologies to understand how rare a given event is.
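Purely as an illustration of the ranking idea, the sketch below scores each condition as severity multiplied by rarity and sorts on that score. The scoring formula, the 0 to 1 scales and the example values are all assumptions; working out sensible values for them is exactly the hard part described above.

```javascript
"use strict";

// Hypothetical score: both inputs are assumed to be normalised to 0..1.
function impactScore(severity, rarity) {
    return severity * rarity;
}

// Return a new array with the most impactful conditions first.
// The input array is left untouched.
function rankByImpact(states) {
    return [...states].sort((a, b) => b.impact - a.impact);
}

// Made-up example: light rain is common, a hard frost is unusual.
const states = [
    {
        name: "rain",
        summary: "It will be cloudy with light rain.",
        impact: impactScore(0.4, 0.2)
    },
    {
        name: "temperature",
        summary: "There will be a high of 1 degree Celsius and a low of minus 4.",
        impact: impactScore(0.7, 0.9)
    }
];

const ranked = rankByImpact(states);
console.log(ranked.map((state) => state.summary).join(" "));
// prints "There will be a high of 1 degree Celsius and a low of minus 4. It will be cloudy with light rain."
```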

In the future, we hope to continue our research and development on these more experimental methods of natural language generation!