Early results for Charley and Frances

What a week we had! We had envisioned many classifications, but received far more: so far, more than 11,000 classifications from nearly 2,000 users in June. These storms had never before been analyzed on Cyclone Center, and Hurricane Charley was completed on the first day! Hurricane Frances is now nearly complete. We will likely have more completely new storms this month.

Learning algorithms

There are numerous crowdsourced science projects out there, and each has the same goal: to better understand an issue (hurricanes, bats, animal populations, etc.) based on the many clicks and selections of citizen scientists. Beyond the Zooniverse, there are many other crowdsourced projects, and the concept of learning from a crowd is not new. Many mathematical and statistical papers provide a means to accurately learn the best possible answer based on everyone's input.

In our analysis, we have used an approach that estimates the probability of each answer based on the selections from individuals, given what those individuals tend to select. It is a fairly complex algorithm that took me a while to understand, so I won't belabor the details; instead, I provide links to the papers below. The method described by Raykar et al. is an Expectation-Maximization (E-M) algorithm.
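In essence, this kind of E-M algorithm alternates between estimating each image's true type from the weighted votes and estimating each classifier's tendencies (a per-person confusion matrix) from those type estimates. Here is a minimal sketch of that idea in Python, a simplified Dawid-Skene-style model rather than our actual code; the vote layout and function name are hypothetical:

```python
import numpy as np

def em_aggregate(votes, n_classes, n_iter=50):
    """Estimate per-image class probabilities from crowd votes.

    votes: list of (image_index, worker_index, label) triples.
    Returns (probs, pi) where probs[i, k] is the estimated
    probability that image i is of class k, and pi[w, k, l] is the
    estimated probability that worker w picks label l when the
    true class is k (the worker's "tendencies").
    """
    n_images = max(v[0] for v in votes) + 1
    n_workers = max(v[1] for v in votes) + 1

    # Initialize image probabilities with simple vote fractions.
    probs = np.full((n_images, n_classes), 1e-6)
    for i, w, l in votes:
        probs[i, l] += 1.0
    probs /= probs.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # M-step: estimate each worker's confusion matrix from the
        # current soft labels (small constant avoids zero rows).
        pi = np.full((n_workers, n_classes, n_classes), 1e-6)
        for i, w, l in votes:
            pi[w, :, l] += probs[i]
        pi /= pi.sum(axis=2, keepdims=True)

        # Class prior from the current image estimates.
        prior = probs.mean(axis=0)

        # E-step: recompute image probabilities, weighting each
        # vote by how reliable that worker tends to be.
        log_p = np.tile(np.log(prior), (n_images, 1))
        for i, w, l in votes:
            log_p[i] += np.log(pi[w, :, l])
        log_p -= log_p.max(axis=1, keepdims=True)
        probs = np.exp(log_p)
        probs /= probs.sum(axis=1, keepdims=True)

    return probs, pi
```

The key difference from simple majority voting is the confusion matrix: a classifier who reliably confuses, say, Curved band with Shear contributes less weight to that distinction than one who does not.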

Our initial analysis looks at what type of storm the cyclone is, based on the broad categories available: No storm, Curved band, Embedded Center, Eye, Shear, or Post-tropical. Later, we plan to use this information to estimate the storm's intensity.

Hurricane Charley

Hurricane Charley was relatively short-lived: only 6 days, so only about 48 images. This means it was completed relatively quickly; contrast that with Frances, which has nearly 150 images.

The following figure shows the basic selections for Hurricane Charley. The selections (or votes) by citizen scientists are shown in the lower graph. Each column represents the selections for a given image of the storm, and the percentages show what fraction of the citizen scientists chose each type for that image. The upper graph shows the probability of each image type based on the selections and the tendencies of the citizen scientists. These probabilities are most often 100% for one type, but can sometimes be a "toss-up" with no clear winner, as in the first two images of Charley.
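The lower graph's percentages are simple per-image tallies. A small sketch of that bookkeeping, with invented image names and labels for illustration:

```python
from collections import Counter

def vote_fractions(labels):
    """Return the fraction of votes each storm type received."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {storm_type: n / total for storm_type, n in counts.items()}

# Hypothetical classifications: image id -> storm-type labels
# chosen by citizen scientists for that satellite image.
classifications = {
    "charley_img01": ["Curved band", "Shear", "Curved band", "No storm"],
    "charley_img02": ["Embedded Center", "Embedded Center", "Curved band"],
}

for image, labels in classifications.items():
    print(image, vote_fractions(labels))
```

The upper graph then replaces these raw fractions with the E-M algorithm's probability estimates, which also account for each classifier's tendencies.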

Early results of citizen scientist votes and the combined storm type aggregation from a learning algorithm for Hurricane Charley (2004).

Also, there is quite a bit of variance in the selections, and no clear time period when the storm had an eye. This is partly an artifact of the satellite imagery: each pixel is about 8 km across, while the operational data available to forecasters can be as fine as 1 km per pixel. Such resolution helps identify small eyes.

Hurricane Frances

Even though Hurricane Frances is still available for classifying, the early results are very good and show a bit more consistency in the selections. Since it isn't done yet, some images have fewer than 10 classifications, but the picture looks consistent so far.

Same as above, except for Hurricane Frances (2004).

The graph shows strong agreement on storm type at the various stages of hurricane development. The storm rapidly developed an eye by about day 3 and maintained it for most of days 4 through 9. The primary type then became Embedded Center, with some selections of other types (e.g., Shear). By day 12, the storm had begun to dissipate and was largely being classified as Post-tropical or No storm.


Most of the users this month are new, so these results certainly aren't final. The learning algorithm needs many more samples from all the new classifiers to more accurately understand their tendencies. As time goes on and those who were active on these storms classify other storms, the E-M algorithm will refine the results for these storms.

Nonetheless, the results are very encouraging. In fact, we’ve made more than 180 of these plots for all storms that are complete (or nearly complete). The next step will be to further analyze the results and see how best to estimate storm intensity from these classifications.


The following papers were crucial in our initial analysis of the CycloneCenter data.

Raykar, V. C., S. Yu, L. H. Zhao, G. H. Valadez, C. Florin, L. Bogoni, and L. Moy, 2010: Learning from crowds. The Journal of Machine Learning Research, 11, 1297-1322.

This article is the basis for our current algorithm. At first I used the binary approach to determine which images had eyes; then I applied the multi-class approach (Section 3) to all storm types.

Raykar, V. C., S. Yu, L. H. Zhao, A. Jerebko, C. Florin, G. H. Valadez, L. Bogoni, and L. Moy, 2009: Supervised learning from multiple experts: whom to trust when everyone lies a bit. Proceedings of the 26th Annual International Conference on Machine Learning.

This is basically the same method, but with a bit more explanation of some aspects of the algorithm. Also, it has a great title.

About K Knapp

I am a meteorologist at NOAA’s National Climatic Data Center in Asheville, NC. My research interests include using satellite data to observe hurricanes, clouds, and other climate variables. *******Disclaimer******* The opinions expressed in these blogs are mine only. They do not necessarily reflect the official views or policies of NOAA, Department of Commerce, or the US Government.
