How to Classify a Million Galaxies in Three Weeks

Do you have an Internet connection, some free time and a penchant for staring off into space? Then Galaxy Zoo needs you.

Among the most ambitious and successful online "citizen science" projects to date, Galaxy Zoo asks its participants to help classify galaxies by studying images of them online and answering a standard set of questions about their features. For instance: Is the galaxy smooth or bulging? Is it elliptical or spiral? If it's spiral, how many arms does it have, and are they tightly wound or thrown open wide?

Galaxy Zoo was first launched in 2007 by astronomers and astrophysicists from the U.S. and U.K. The goal was to get the public to identify the shapes of 1 million galaxies in the Sloan Digital Sky Survey (SDSS), which were photographed between 2000 and 2008 by a telescope at the Apache Point Observatory in New Mexico. Because every feature of each galaxy had to be categorized by at least 20 people — having multiple classifications of the same object is important because it helps scientists assess how reliable each one is — astronomers estimated it would take three to five years to categorize all million galaxies.
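To see why repeated classifications matter, here is a minimal sketch in Python of how many volunteers' answers about one galaxy might be combined. It is an illustration only, not Galaxy Zoo's actual method: the function name `aggregate_votes`, the simple majority rule and the sample votes are all assumptions made for this example. Only the threshold of 20 classifiers comes from the article.

```python
from collections import Counter

# The article says each feature needed at least 20 classifiers.
MIN_VOTES = 20


def aggregate_votes(votes, min_votes=MIN_VOTES):
    """Combine volunteer answers for one galaxy (hypothetical helper).

    Returns (most_common_label, agreement_fraction), or None when
    fewer than `min_votes` answers have been collected so far.
    """
    if len(votes) < min_votes:
        return None
    counts = Counter(votes)
    # most_common(1) returns the label with the highest count;
    # on a tie, the label seen first wins.
    label, n = counts.most_common(1)[0]
    return label, n / len(votes)


# Made-up example: 16 of 20 volunteers call the galaxy a spiral.
sample = ["spiral"] * 16 + ["elliptical"] * 4
print(aggregate_votes(sample))
```

The agreement fraction is the point of the exercise: a galaxy on which 16 of 20 volunteers agree can be treated with more confidence than one that splits them evenly, which is the kind of reliability check that a single classification cannot provide.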

It took three weeks. In the first year, 50 million classifications were made by 150,000 people. Galaxy Zoo became the world's largest database of galaxy shapes. There are now German- and Polish-language versions, and a Chinese one is scheduled to launch sometime in April.

So successful was the project that it spawned Galaxy Zoo 2 in February 2009 to classify another 250,000 SDSS galaxies. To date, more than 57 million classifications have been made by some 265,000 volunteers (this reporter's contribution is, so far, a meager 267); another 5 million classifications will finish the job. There's also an entire "Zooniverse" of related citizen-science projects, which include simulating galaxy collisions to study mergers; hunting for supernovae and hypervelocity stars, incredibly rare stars that are so "fast," they escape the gravitational pull of a galaxy; and compiling collections of "irregulars," galaxies that defy classification.

It might be tempting to dismiss Galaxy Zoo as just an amusing diversion — fun in an I-play-a-scientist-on-TV kind of way. But astronomers — and volunteers — have made real discoveries by mining its crowd-sourced data. Among them: red spiral galaxies (most spirals are blue), green peas (small but energy-packed, star-spewing galaxies) and Hanny's Voorwerp, an amorphous blue blob spotted by Dutch schoolteacher Hanny Van Arkel, who learned about Galaxy Zoo on the website of Brian May, the former Queen guitarist turned astrophysicist.

Galaxy Zoo discoveries have been important enough that astronomers have investigated some of them separately, using Earth-bound telescopes as well as the Hubble Space Telescope. The project's findings have led to 10 scientific articles, which appeared in peer-reviewed journals, and six more are on the way. And a few "zooites," as the volunteers refer to themselves, hope to publish their own citizen-science research projects someday, with help from professional astronomers.

Citizen science ("the involvement of nonprofessionals in the scientific process," according to University of Oxford astronomer Chris Lintott, one of Galaxy Zoo's founders) is not a new concept. Distributed-computing projects like SETI@home, which hunts for radio signals that might indicate intelligent life in the universe, and a similar effort that tests the accuracy of global climate models have long tapped volunteers' home computers to help process data. Galaxy Zoo and its inspiration, Stardust@home, which asks volunteers to search electron-microscope images for interstellar dust particles collected in space, differ from these projects in one respect: they interface not with the volunteer's PC but with his or her mind.

This model — their data, your brain — may represent an increasingly common way to handle large data sets. Relatively cheap technology and bandwidth have made data collection almost too easy. Many scientists are now drowning in massive amounts of data, which they don't have the time, resources or brain power to analyze. "In many parts of science, we're not constrained by what data we can get," says Lintott, who is also the co-host of the long-running BBC series The Sky at Night. "We're constrained by what we can do with the data we have. Citizen science is a very powerful way of solving that problem."

He estimates that the perfect graduate student (essentially, a human computer that never eats, sleeps or takes a bathroom break), spending 24 hours a day, seven days a week analyzing Galaxy Zoo's data, would have needed three to five years to match what Galaxy Zoo's volunteers collectively accomplished in the project's first six months.

So who are these overachieving zooites? According to a 2008 survey of 11,000 Galaxy Zoo users, 80% are men and two-thirds live in the U.S. or U.K. They are primarily people who want to "contribute to original scientific research," says Jordan Raddick, education director of the Institute for Data-Intensive Engineering and Science at Johns Hopkins University, who helped conduct the survey. For some Galaxy Zoo volunteers, the draw is somewhat more philosophical. Contemplating a galaxy that exists at an almost unimaginable distance, in both space and time, and contributing a bit of knowledge about it can be humbling and satisfying. "Every galaxy has a story to tell. They are beautiful, mysterious, and show how amazing our universe is," says Aida Berges, a homemaker in Puerto Rico who has classified 150,000 galaxies — at one point putting in 16-hour days. "It was love at first sight when I started in Galaxy Zoo ... It is a magical place, and it feels like coming home at last."

Researchers from other disciplines have begun approaching the Galaxy Zoo team for help sorting their own masses of information. With Galaxy Zoo's assistance, the Royal Observatory Greenwich just launched Solar Stormwatch, which asks volunteers to track solar explosions captured on video by NASA's STEREO spacecraft. The idea is eventually to be able to predict these flare-ups, which interfere with satellites and endanger astronauts. Another project will task volunteers with translating the famous Oxyrhynchus Papyri, a cache of 50,000 Ptolemaic-era manuscript fragments from Egypt. Yet another will analyze footage of the New Caledonian crow in the wild. (It's one of the few nonprimate species to create and even modify tools.)

Yale astronomy graduate student Carolin Cardamone, who has published research on green pea galaxies, says astronomers were alerted to their existence by Galaxy Zoo volunteers who posted hundreds of images to the site's busy discussion forum. She says the project is an enormous boon to her field. "This is all hard, rigorous science," she says. "We're not giving [the volunteers] busywork to do. We're not doing this so they can have fun with science, but so they can participate in real science."