What a week we had! We had envisioned many classifications, but received so many more! So far we have received more than 11,000 classifications from nearly 2000 users in June. These storms had never been analyzed on CycloneCenter and Hurricane Charley was completed on the first day! Hurricane Frances is nearly complete now. We will likely have more completely new storms this month.
There are numerous crowdsourced science projects out there and each has the same goal: to better understand an issue (hurricanes, bats, animal populations, etc.) based on input from numerous clicks and selections from citizen scientists. In addition to the Zooniverse, there are other crowdsourced projects. The concept of learning from a crowd is not new. There are many mathematical and statistical papers available that provide a means to accurately learn the best possible answer based on everyone’s input.
In our analysis, we have used an approach that estimates the probability of each storm type for an image based on the selections from individuals, given what those individuals tend to select. It is a pretty complex algorithm that took me a while to understand, so I won’t belabor the point, but I provide links to the papers below. The method described by Raykar et al. is an Expectation Maximization (E-M) algorithm.
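To give a feel for the idea, here is a minimal Python sketch of an E-M loop of this kind. It is a simplified, Dawid-Skene-style version written for illustration only. It is not necessarily the code behind our plots, and it leaves out the part of Raykar et al. that also trains a classifier on features. The function name `em_storm_types`, the `smooth` parameter, and the `(image, user, type)` vote format are all assumptions made for this example.

```python
import math
from collections import defaultdict

CLASSES = ["No storm", "Curved band", "Embedded center",
           "Eye", "Shear", "Post tropical"]

def em_storm_types(votes, classes=CLASSES, n_iter=50, smooth=0.01):
    """votes: list of (image_id, user_id, chosen_class) tuples.
    Returns {image_id: {class: probability}}."""
    images = sorted({v[0] for v in votes})
    users = sorted({v[1] for v in votes})
    k = len(classes)
    idx = {c: i for i, c in enumerate(classes)}

    by_image = defaultdict(list)
    for img, user, chosen in votes:
        by_image[img].append((user, idx[chosen]))

    # Start from the (lightly smoothed) raw vote fractions for each image.
    post = {}
    for img in images:
        counts = [smooth] * k
        for _, c in by_image[img]:
            counts[c] += 1.0
        total = sum(counts)
        post[img] = [x / total for x in counts]

    for _ in range(n_iter):
        # M-step: re-estimate how common each type is, and each user's
        # tendencies (a confusion matrix: true type -> selected type).
        prior = [smooth] * k
        conf = {u: [[smooth] * k for _ in range(k)] for u in users}
        for img in images:
            for t in range(k):
                prior[t] += post[img][t]
                for user, c in by_image[img]:
                    conf[user][t][c] += post[img][t]
        total = sum(prior)
        prior = [p / total for p in prior]
        for u in users:
            for t in range(k):
                row_sum = sum(conf[u][t])
                conf[u][t] = [x / row_sum for x in conf[u][t]]

        # E-step: update each image's type probabilities given the votes
        # and the users' tendencies (log space avoids underflow).
        for img in images:
            logw = [math.log(p) for p in prior]
            for user, c in by_image[img]:
                for t in range(k):
                    logw[t] += math.log(conf[user][t][c])
            top = max(logw)
            w = [math.exp(x - top) for x in logw]
            total = sum(w)
            post[img] = [x / total for x in w]

    return {img: dict(zip(classes, post[img])) for img in images}
```

Calling `em_storm_types([("img1", "alice", "Eye"), ("img1", "bob", "Eye"), ("img2", "alice", "Shear"), ("img2", "bob", "Shear")])` returns, for each image, a probability for every storm type. With real data, users who often disagree with the consensus end up with less influence on the result.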
Our initial analysis looks at what type of storm the cyclone is, based on the broad categories available: No storm, Curved band, Embedded Center, Eye, Shear or Post tropical. Later, we plan to use this information to estimate the storm’s intensity.
Hurricane Charley was relatively short-lived: only 6 days, so only about 48 images. This means it was completed relatively quickly. Contrast that with Frances, which has nearly 150 images.
The following graph shows the basic selections for Hurricane Charley. The selections (or votes) by citizen scientists are shown in the lower graph. Each column shows the selections for a given image of the storm. The percentages show what fraction of the citizen scientists selected each type for an image. The upper graph shows the probability of each image type based on the selections and the tendencies of the citizen scientists. These are most often 100% of one type, but can sometimes be a “toss-up” (i.e., no clear winner, as in the first two images of Charley).
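For the curious, the percentages in the lower graph are just vote fractions. A minimal Python sketch, assuming the selections for one image arrive as a plain list of type names (the function name `vote_fractions` is made up for this example):

```python
from collections import Counter

def vote_fractions(selections):
    """Fraction of classifiers who picked each storm type for one image."""
    counts = Counter(selections)
    total = sum(counts.values())
    return {storm_type: n / total for storm_type, n in counts.items()}

# Four classifiers looked at one image; three picked "Eye".
print(vote_fractions(["Eye", "Eye", "Shear", "Eye"]))
```

The upper graph differs from these raw fractions because it also weighs each person's tendencies, so a simple majority does not always win.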
Also, there is quite a bit of variance in the selections and no clear time period when the storm had an eye. This is partly an artifact of the satellite imagery. Each pixel is about 8 km across, while operational data available to forecasters can have a resolution as fine as 1 km per pixel. Such fine resolution helps identify small eyes.
Even though Hurricane Frances is still available for classifying, the early results are very good. They show a bit more consistency in the selections. Since it isn’t done yet, there are some images with fewer than 10 classifications, but it looks consistent so far.
The graph shows strong agreement in storm type at various stages of hurricane development. The storm rapidly developed an eye by about day 3. It maintained an eye for most of the time between days 4 and 9. Then the primary type became embedded center with some selections of other types (e.g., shear). By day 12, the storm had begun to dissipate and was largely being classified as post-tropical or No storm.
Most of the users this month are new, so these results certainly aren’t final. The learning algorithm needs many more samples from all the new classifiers to understand their tendencies more accurately. As time goes on and those who were active on these storms classify other storms, the E-M algorithm will refine the results for these storms.
Nonetheless, the results are very encouraging. In fact, we’ve made more than 180 of these plots for all storms that are complete (or nearly complete). The next step will be to further analyze the results and see how best to estimate storm intensity from these classifications.
The following papers were crucial in our initial analysis of the CycloneCenter data.
VC Raykar, S Yu, LH Zhao, GH Valadez, C Florin, L Bogoni, L Moy (2010): Learning from crowds. The Journal of Machine Learning Research 11, 1297-1322.
This article is the basis for our current algorithm. At first I used the binary approach to determine which images had eyes. Then I applied the multi-class approach (section 3) for all storm types.
With 15,000+ citizen scientists contributing to CycloneCenter.org, we have more than thirty thousand eyes searching through satellite data.
So far, everyone has provided input on almost 50,000 images. As we begin to sift through all the responses, one task is to determine the storm type (eye, shear, embedded center or curved band) of each image from all the responses.
The eye images seem to make up about 8% of our images so far. The image below is a collection of some of the images identified as eye scenes by the citizen scientists. This is only a small portion of what we have, but it shows great progress.
This contains only 391 of the ~4500 eye images identified. So, 30,000 human eyes have found 4500 storm eyes.
Well, after getting 100,000+ classifications, we thought it was time to let you – the citizen scientists – know that you’re doing a great job!
Q: What did you find?
A: During our preliminary analysis, we have observed some results that encouraged us, the science team, so we thought they would encourage you, the citizen scientists, too.
Q: What did you analyze?
A: So far, we have not looked at detailed classifications (which should give the best estimate of intensity) but just the initial scene type classification. That is, we looked at how each scientist answered “Pick the cyclone type, then choose the closest match.”
Q: How did you analyze that data?
A: We assigned a number based on the scene selected – what I’ll call an Image Scene Number, or ISN. We then averaged the ISN for each image of a storm and did some analysis to get what we think is a current best estimate of an ISN. The results were surprisingly good based solely on this Image Scene Number. Further statistical analysis is needed (these initial results were obtained using simpler methods).
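For those who like to see the arithmetic, here is a minimal Python sketch of that averaging step. Everything named here is an assumption for illustration: the function `average_isn`, the `(image, scene)` input format, and above all the scene-to-number values in the usage example, which are placeholders and not our actual ISN values. The spread is computed as a sample standard deviation, which is one reasonable choice and may differ from what the plots use.

```python
from collections import defaultdict
from statistics import mean, stdev

def average_isn(classifications, isn_by_scene):
    """classifications: list of (image_id, scene) pairs.
    isn_by_scene: mapping from scene name to its number (hypothetical).
    Returns {image_id: {"mean": ..., "spread": ..., "n": ...}}."""
    by_image = defaultdict(list)
    for image_id, scene in classifications:
        by_image[image_id].append(isn_by_scene[scene])

    summary = {}
    for image_id, values in by_image.items():
        # With a single classification the spread cannot be estimated,
        # so report None rather than a misleading zero.
        spread = stdev(values) if len(values) > 1 else None
        summary[image_id] = {"mean": mean(values),
                             "spread": spread,
                             "n": len(values)}
    return summary

# Placeholder numbers only; the real scene-to-ISN values are not given here.
placeholder_isn = {"Curved band": 2.0, "Eye": 5.0}
picks = [("day1_00Z", "Curved band"), ("day1_00Z", "Eye"), ("day1_03Z", "Eye")]
print(average_isn(picks, placeholder_isn))
```

Keeping the count `n` alongside the mean makes it easy to see which images still need more classifications.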
Q: What do we learn from the initial analysis?
A: Well, remember the analysis is still very preliminary. This is a real result for one cyclone in our record of 3000+ cyclones. What we want others to know is:
- Citizen scientists are doing a great job, but much more needs to be done. Every click counts!
- The science team is working with the data and is encouraged that some more initial results could be shown soon (within a year).
- The initial results show a pattern similar to the best track data. We are in the right ballpark! Further analysis and better statistical methods should refine the results.
Q: Can we see the results?
A: Yes! Now remember, these are still very preliminary, but here is the analysis so far for 1990 Hurricane Trudy.
- The time axis has the same reference start date.
- The boxes are the average ISN.
- The vertical lines represent the variation in ISN between multiple classifications. The larger the bar, the less certain we are of that value.
- The numbers along the bottom are the number of classifications averaged.
- When only one classification was available, the variation is plotted as zero (square with a dot in it). The dot does not mean there is no variation, only that we don’t have enough classifications to estimate it.
- The ISN pattern captures the peaks in intensity near days 4 and 11!
- So far, there seem to be more classifications in the early portion of the storm’s lifetime than later (lots of images with only 1 classification after day 7). This will change as more classifications are made.
- The valley (the weaker intensity around day 7) appears in both the best track and the initial analysis.
Q: Why 1990 Trudy?
A: Well, the results here were good and the storm shows an interesting intensity pattern: it peaks as a strong hurricane, weakens, then strengthens again before finally ending. Some other cyclones showed similar results. Much more analysis is needed before we can say how well the scientists and the analysis website are performing. But regardless, these results are promising!
Q: Why use ISN (Image Scene Number)? Why not something from the Dvorak method?
A: The technique used here is based on the Dvorak Technique for analyzing tropical cyclone intensity. In that method, there is a measure of intensity called the Pattern T-Number (PT). While our method is based on Dvorak, it is not an exact representation of the Dvorak values. So ISN (Image Scene Number) is analogous to the Pattern T-number, but is not an exact match. Rather than confuse the two methods, we will try to define our values clearly and, where possible, will state if there are analogous parameters in the Dvorak method.