June 19, 2016 / jpschimel

How I learned to hate statisticians

OK, I don’t hate statisticians. But have you ever gotten so sick from eating something once that you haven’t been able to look at that dish for years afterward? So how would you feel if an experimental design was foisted upon you on the basis of “statistical perfection” that wasted >$1 million and an entire year’s effort by many, many people on a nationally important study? That was my experience on the Exxon Valdez coastal habitat damage assessment study.

I started as an Assistant Professor at the University of Alaska Fairbanks in January 1989. It was quite the welcome to Alaska—that winter I saw the thermometer read -60 F and my mother was sure that I was going to freeze to death. The ice fog in Fairbanks was so thick that I was stranded on campus for weeks, and with my impressive skills at driving on ice, it was taking many people’s lives into my hands anytime I tried to drive to the supermarket.

But then on March 24, the Exxon Valdez ran aground on Bligh Reef. Everyone with any scientific expertise, it seemed, got caught up in the effort of trying to figure out how to assess the damage to the magical environment of Prince William Sound. How do you assess such damage? The animal people had it “easy”—everyone agreed that you could set a cash value on a dead sea otter: $10,000 per animal, say? But how do you assess the damage to the habitat that supports those sea otters? How much is a dead barnacle worth? How much are a few fronds of dead Fucus worth? The obvious answer would be that on their own, they would be worth awfully close to zero. But these organisms are the base of the food chains that support the otters, the murrelets, and the herring. Clearly the value of the ecosystem is far, far from zero—rather, it’s mammoth!

So we put together a damage assessment strategy that focused on foodweb concepts, targeting the quantity, quality, and composition of key trophic levels: The Coastal Habitat Damage Assessment. A large group of us developed the core approach over several meetings in Juneau and Anchorage, with a plan to get research teams into the field by August. We called it the “Coastal Habitat” study to emphasize that we were studying basic ecosystem members not for their own sake necessarily, but because they created the habitat for the more charismatic members.

We developed a sampling strategy that would compare heavily oiled sites to lightly oiled or unoiled sites of different habitat types (e.g. exposed rocky shores, sheltered rocky shores, sandy beaches, estuaries), and would have three separate teams spread across the coast of Alaska: one in Prince William Sound itself, one in Kenai, and the third in the Shelikof Strait area of Kodiak and Katmai.

The biologists on the study wanted to make it a paired design, in which we would use GIS to classify the degree of oiling and the habitat type along all the shorelines of Prince William Sound and of the other sampling areas. We would randomly select heavily oiled sites in each habitat type, and then pick the nearest available lightly oiled or unoiled site of the same habitat type to use as a paired control. We felt this would balance the need for random sampling against ensuring meaningful biological reality.

But this was a huge effort, coordinated by State and Federal agencies, and the Management Team had contracted a biometrician who, I understood, was well known and respected for work on wildlife, but I’ll leave his name and affiliation anonymous. He insisted that such a paired design was imperfect, since it meant selecting control sites non-randomly. He insisted that we select the oiled and control sites independently, and randomly, to create a stronger statistical design. We argued extensively about the alternative designs: paired vs. random. He won that battle.

As a result of his winning that battle, we all lost the “war”—it destroyed the first year of the study. It wasted the efforts of about 10 research staff working out of two 50-foot charter vessels continuously in the field for over a month (I think the two boats together cost $5,000 per day), plus those of the people back in Fairbanks analyzing samples and data. All for naught. Wasted.

It was wasted in part because of one other decision that seemed trivial: how to define the sampling “universe” from which sites were selected. The site-selection and marking group was based out of the Alaska Dept. of Fish and Game (if I remember right). Their job was to a) do the GIS work to map out coastal habitat types and overlay that with the level of oiling, b) randomly select sections of coast—5 oiled and 5 unoiled in each habitat type, no section longer than 1 km, and then c) send a team to the Sound to mark the sites for the research team that would be heading out a few weeks later. The “trivial” decision was to include any map quad with oiled sites in it as part of the sampling universe.
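The consequence of that quad rule is easy to see in a toy sketch. The following Python snippet uses entirely hypothetical site data (not the actual GIS records) to mimic steps a–c in miniature: any map quad containing at least one oiled site joins the sampling universe, and oiled and control sites are then drawn independently at random, with no pairing, as the biometrician required.

```python
import random

def make(quad, habitat, oiled, n):
    """Generate n hypothetical site records for one quad/habitat/oiling class."""
    return [{"quad": quad, "habitat": habitat, "oiled": oiled} for _ in range(n)]

# Toy inventory: oiled shoreline is concentrated in the central quad, but the
# northwest quad has just enough oiling to enter the sampling universe.
sites = (make("central", "rocky", True, 8) + make("central", "rocky", False, 2)
         + make("northwest", "rocky", True, 1) + make("northwest", "rocky", False, 9))

# The "trivial" rule: any map quad containing at least one oiled site
# becomes part of the sampling universe.
oiled_quads = {s["quad"] for s in sites if s["oiled"]}
universe = [s for s in sites if s["quad"] in oiled_quads]

def select_sites(universe, habitat, n=5, seed=0):
    """Independently and randomly draw n oiled and n control sites of one
    habitat type, with no pairing between oiled sites and controls."""
    rng = random.Random(seed)
    oiled = [s for s in universe if s["habitat"] == habitat and s["oiled"]]
    controls = [s for s in universe if s["habitat"] == habitat and not s["oiled"]]
    return rng.sample(oiled, n), rng.sample(controls, n)

oiled_pick, control_pick = select_sites(universe, "rocky")
print(sum(s["quad"] == "northwest" for s in control_pick), "of 5 controls are northwest")
```

In this toy universe, 9 of the 11 candidate controls lie in the northwest quad, so an independent random draw of five controls will usually land mostly in the northwest even though 8 of the 9 oiled sites are central: the same kind of mismatch that sank the first year’s data.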

As it turned out, the map quad that included the northwest section of Prince William Sound had some oiled sites in it. As a result, the entire section became part of the study. But because there wasn’t much oiled coastline in that area, a disproportionate number of control sites ended up in the northwest, some on the mainland, even though most of the oiled sites were on the islands in the more central areas of the Sound. The oil was concentrated on those central islands because the currents that carried it from Bligh Reef through the Sound run right up against and around them.

Well, it rains a lot in Prince William Sound, averaging 60 inches of rain a year according to NOAA, but near Whittier (in the NW) it can be closer to 200 inches a year! And all that freshwater falling from the sky has only one place to go: Prince William Sound. As a result, up in the coastal areas of the northwest Sound, it isn’t really a marine environment. The freshwater lens sitting on top of the seawater can be more than a foot deep, as we learned once we were out there sampling—you could drink the “seawater.” And you know, marine shoreline organisms like Fucus and barnacles really don’t like freshwater. Because too many of our control sites were in that brackish- or even freshwater-dominated area, our first year’s sample collection made it look like a massive oil spill was “good” for the populations of marine coastal organisms. Oops. The entire year’s effort was a complete bust. A waste.

All because of one major conceptual decision forced upon us by the biometrician, coupled with a few seemingly minor decisions made by different groups who were all under enormous pressure to get moving. There wasn’t time to consult widely on how to define the “sampling universe.” There may well be people who could have told us to stay away from those areas because they are not comparable, but when you have to move a complex operation quickly under crisis conditions, it’s not surprising that those experts weren’t where we needed them when we needed them. For example, I have no clue who even decided which map quads to include in the GIS that was used to select sites.

So are you surprised that I developed a healthy skepticism for statisticians and for the “perfect” design? The research teams didn’t know how different the regions of the Sound were, but our intuition was to ensure that control and oiled sites were well matched, even if that gave us a less perfect statistical design. That gave us a strong biological design—the one that was used in later years of the study, and that showed, as expected, that crude oil was rough on marine coastal species. Just not as rough, perhaps, as trying to live in freshwater. A contaminated habitat is still, after all, “habitat.”

My relatively short experience with the Exxon Valdez spill taught me valuable lessons: about the challenge of working across agencies and cultures under crisis conditions, about the joys of working off boats in stormy weather, and, most importantly, to never let some idealized version of the “perfect,” or even of just the “better,” design trump the common-sense practicality of a good and workable one. I also learned the importance of thinking carefully about how you define the sampling universe, and about how you scale from that limited area to the larger scale of whole systems.

I’ll happily consult with a statistician on how to deal with the data I collect, but never again will I allow one to determine how I set up a study, at least not if their advice goes against my biological intuition.



Leave a Comment
  1. Meredith Warshaw / Jun 19 2016 12:34 pm

    As a statistician, I’m sad to hear that story. Working with a statistician should be a collaborative partnership, with each side listening to the other. I have important input into study design, due to my expertise in that area, but I always listen to the clinical researchers when they say something isn’t feasible or will cause problems.

    Please don’t judge all statisticians by the one who wasted a year for you, just as I won’t judge all researchers by the one or two who have caused great problems for me!


    • jpschimel / Jun 19 2016 1:02 pm

      Meredith: As I said, I don’t really hate statisticians. That was a situation where, because of the crisis situation, there was huge pressure and little time to think through important decisions. Also, the management team that ultimately controlled the program was not made up of researchers. They were doing their best under horrible conditions–but the Army says “you fight like you train” and we had never trained for such a situation. Researchers operated in a researcher mode, managers in a manager mode, lawyers in a legal mode. They didn’t always mesh well. The management team seemed to trust the academic researchers the least.

      I value the insights of the statisticians and modelers I’ve worked with. They have enhanced my research. But I learned the hard way about how not to do the collaboration–and that in a battle between the perfect and the good, I’ll go for the good.

      But also, I’ve realized that my blog is an opportunity to discuss more than purely writing, and to reflect on some of my professional history and on what is now “ancient history.” Very little was ever published about the Exxon Valdez coastal habitat damage assessment, about how the program was put together, or about any lessons learned that might be useful.

  2. Meredith Warshaw / Jun 19 2016 3:16 pm

    I’m glad you shared your experience. I’m just appalled by it. I know you won’t judge all statisticians by that one, as you said, but I was responding to “never again will I allow one to determine how I set up a study, at least not if their advice goes against my biological intuition.” That’s the part that particularly saddens me, because most of the time, involving a good collaborative statistician in study design can mean the difference between usable and unusable data.

    The great statistician RA Fisher said “To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of.” Too many times, I’ve been in that situation. Sadly, you experienced its inverse. A good collaboration involves a lot of listening and willingness on the statistician’s part to trust the scientist’s expertise in the area being researched.

    And I definitely agree about not letting perfect be the enemy of good.
