Sept. 10, 2014

Jason Gallant: The Rogue Butterfly

Jason Gallant is an assistant professor of zoology in the College of Natural Science and the first author on a recent paper suggesting that when it comes to evolving some traits—especially simple ones—there may be a shared gene that’s the source.

What follows is a little parable that I tell all of my students in the laboratory about two “Eureka!” moments. It emphasizes the importance of never ignoring data, being creative, staying organized, and above all else, taking copious notes.

It’s 2 a.m. on a Tuesday in the winter of 2013. And so begins the third month of staring at the blue and grey lines that represent the culmination of my first scientific contribution since graduate school, and it’s not going well.

My task: to find the proverbial needle in the haystack. In this case, the needle is a tiny stretch of DNA only a few letters long, and the haystack is a region of butterfly genome approximately 150,000 base pairs long. Though this might seem an arduous task, much of the work has already been done—the project began as a search for a needle in a hayfield.

My colleagues, Vance Imhoff and Sean Mullen at Boston University, began combing the ‘hayfield’ (the entire butterfly genome, almost 500 million base pairs long), trying to identify the right ‘haystack’ to check. After years, they finally succeeded in identifying a region of the butterfly genome, approximately 150,000 base pairs long, that contains the gene that turns the normally stark white bands of the white admiral butterfly dark purple in the southern United States, where the same butterfly is known as the red-spotted purple.

The only problem is, it doesn’t seem to be the case. Clicking around the data again for what feels like it might be the 100th or 1000th time reveals a stretch of about 20,000 bases that would otherwise be a perfect candidate—all of the bases are in one configuration in all of the white admirals, and a second configuration in almost all of the red-spotted purples we’ve sequenced. One red-spotted purple looks genetically identical to a white admiral.

This one butterfly, dead since 2005, has literally kept me up at night for weeks and refuses to give up its secrets. But unfortunately for this butterfly, I like working late almost as much as I like solving puzzles.

I think of myself as a bit of a musician when I’m working late nights, riffing on ideas and trying things that may have seemed too crazy to do during the day. It’s my quiet, creative time.

Today’s data riff happens to be scanning the butterfly genome sequence that I’ve been staring at against a database of DNA sequences. It’s meant to be a distraction from my current troubles, but if it works it might offer clues as to what is going on. Paging through the results, I come across a striking finding. I have inadvertently found evidence of a retrotransposon, something like a DNA parasite, sitting right in the middle of my nearly perfect stretch.

Retrotransposons are little bits of DNA that can move haphazardly through the genome, landing in the middle of otherwise normal genes, sometimes causing mutations, oftentimes causing no discernible effect, but every now and then creating a big opportunity for evolution to occur.

It strikes me immediately that this retrotransposon could cause the switch between wing patterns. This new finding convinces me that we have found what we are looking for, but it also deepens my frustration that one butterfly still bucks the rule. I switch off the computer to let this new information incubate.

Weeks pass. Sean and I discuss potential explanations for our rogue butterfly in light of this new data. Whiteboards and papers with complex genetic models to explain the data surround us, but none seem to be satisfying. Sean raises the possibility of a thought too horrible in my mind to say aloud—a labeling error. The numerous steps between the lab and the final genome sequence mean that a labeling error would be essentially impossible to sort out. We spend weeks checking anyway, but can find nothing to indicate such an error occurred.

Three more months pass. The whiteboards become indecipherable from arrows and old dry erase ink and the stack of doodles and notes grows, as does the frustration.

One day, Sean returns to the labeling error explanation and I throw my hands up in frustration. Deciding to cool off, we agree to go our separate ways for an hour and I continue to stare at my data, as has become habit. My mind, desperate to make a connection in any way, somehow recalls my first observation on the project: the vials of DNA had stickers on the top. A walk to the freezer down the hall suddenly reveals that the stickers had been applied, in a misguided attempt to stay organized, over the original labels! Flipping open a laboratory notebook by the original collector reveals that the rogue butterfly, which we had originally thought was a red-spotted purple, was in fact a white admiral all along!

After numerous other checks to confirm this, the rest, as they say, is history. Once we had discovered the labeling error, our data showed us the exact region responsible for the difference in wing patterns between red-spotted purples and white admirals.

We used this as a way of moving forward to identify the same cause in a distantly related group of butterflies, last sharing a common ancestor 65 million years ago. Because the data was so clean and unambiguous, we were able to spot a labeling error just by looking at the raw data—it just took us months to identify where the error took place!

Despite this ‘nightmare’ scenario, I always remind my students of the most important part of the story: because we made this error, our desperate attempts to explain it and the painful review of the data for months and months forced us to look at the data from many new perspectives. One perspective, on a late night, led to the discovery of an elegant mechanism for how this mutation may have appeared in the genome of a red-spotted purple. The moral of the story: every piece of data, even when it is inconvenient, is incredibly important!

Photo courtesy Jason Gallant