Thursday, February 7, 2013

Scientific communication: blogging and publishing?


Fletcher Halliday, blogging at BioDiverse Perspectives, wrote:
"My original intent in writing this post was to compare the 5 most-cited papers on biodiversity to the 5 most blogged-about papers on biodiversity to address the differences between what we value as researchers versus what we value as general science communicators."

What struck me throughout this post was the, perhaps implicit, need to distinguish between blogging and publishing. I always approach a publication as a communication of ideas. Depending on the research and the journal, these ideas can be very specific or more general, and hopefully I tailor my writing to these different audiences. Blogging is just another communication channel, with its own specific type of writing associated with it.

However, I also think that the similarity between blogging and publishing goes deeper than intent. Jeremy Fox (I know, he just keeps putting ideas in my head) basically compares blogging to conference meetings. While I agree that this analogy is useful, I think blogging goes deeper than that. Isn't our age-old publication system a prototype of blogging, with references serving the double role of URL links and comments/replies? Our old system is surely slower, but both systems even use the same philosophy to measure "importance": PageRank for blog posts and citation counts/indices for papers (and more sophisticated methods such as this paper, mentioned in a different context here).
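As an aside on that analogy: a citation index basically counts incoming links, while PageRank additionally weights each incoming link by the importance of its source, computed by power iteration. A minimal sketch on a toy citation graph (all names and numbers here are mine, not from any real implementation):

```python
import numpy as np

def pagerank(links, n, damping=0.85, iters=100):
    """Rank n pages by power iteration over a list of (source, target) links."""
    # Column-stochastic matrix: column j spreads page j's rank evenly
    # over the pages that j links to.
    M = np.zeros((n, n))
    out_degree = np.zeros(n)
    for src, _ in links:
        out_degree[src] += 1
    for src, dst in links:
        M[dst, src] = 1.0 / out_degree[src]
    rank = np.full(n, 1.0 / n)
    for _ in range(iters):
        rank = (1 - damping) / n + damping * (M @ rank)
    return rank / rank.sum()

# Toy citation graph: paper 0 is cited by all three others,
# which also cite each other in a cycle.
links = [(1, 0), (2, 0), (3, 0), (1, 2), (2, 3), (3, 1)]
ranks = pagerank(links, 4)
```

A raw citation count would give paper 0 a score of 3; PageRank rewards it further because its citers are themselves cited.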

Jeremy Fox and commenters on Fletcher Halliday's post also point out that the biggest difference between blogging and publications is the peer-review system. But this is "just" a difference in quality control, not in intent. And this quality control system works reasonably well, I think. While I recently mentioned some glowing reviews, I will now balance the scales with a recent anonymous review of a manuscript we submitted:
"Beside these major problems, this MS suffers from a weak editing effort as suggested by the numerous errors found (e.g., several errors in the literature cited in text and the reference list, poor quality of figures without any explicit legends, redundancy in the method, result and discussion sections). I warmly recommend the authors to thoroughly revise their MS accordingly.
To summarize, I suggest the authors to deeply reconsider the general framework of their study by clarifying its novelty and scope, to increase their pedagogical effort, to remove the meta-analysis, and to be more careful while editing their MS."
Ouch. Ow. "Warmly". And we probably all get more of these critical reviews than glowing ones. But s/he made a lot of valid points that will greatly improve this manuscript. So a publication is just a blog post where the commenters have some real power. Interestingly, there are some academic publishing experiments that blur this difference between blogging and publishing even further (see for instance this interview with the president of Frontiers).

Tuesday, February 5, 2013

What are ecology and evolution? The value of interdisciplinary collaboration

What is ecology? What is evolution? These seem like really simple questions from an ecology or evolution perspective, until you bring a different field into the mix, e.g., transposon/genome biology. When we started the TE (formerly known as genome) ecology group, sometime in March 2010, I had no idea we would end up here, with a publication in a journal with a higher impact factor than Ecology.

Our starting point for this publication was the appeal of using ecological theory to explore the dynamics of transposable elements in the genome. But we quickly realized that what these genome biologists described as "ecology" did not correspond with my idea of ecology; it seemed more like an evolutionary pattern/process. Luckily, our collaborative group consisted of a combination of ecologists, evolutionary biologists, philosophers, and computational biologists, which provided the necessary background to study this interdisciplinary problem. We thus ended up defining, in very general terms, ecology and evolution:
"A strictly evolutionary approach investigates change (or the lack thereof) in some focal entity over successive generations. The focal entities can range from genes to traits or from populations to higher taxonomic units. 
A strictly ecological approach assumes no change in the focal entities themselves, but focuses instead on the relationships between these entities and their environment. Here we use ‘environment’ in a broad sense potentially to include any of the factors with which an entity interacts."
Not exactly rocket science, you would think, but writing this publication has been an eye-opener (in a positive way) for me on the importance of clear and exact definitions. I think we have literally discussed every term we wrote in this manuscript.

Moreover, because of our interdisciplinary group, we could actually provide a proof-of-principle analysis, with some actual, very promising, results:

The predictions arose directly from our philosophical definitions and the application of general ecological and evolutionary principles; the data came from available genome sequences, obtained and handled with bioinformatics; and the actual tests used standard ecological multivariate methods.

And this is also the only manuscript I have worked on that received completely positive (anonymous) reviews like this one:
"This is a brilliantly organized and beautifully written paper that provides a closely analysed conceptual framework and strong evidential support for several key claims made by the authors. The paper draws on, clarifies and expands existing literature, and provides suggestions for how findings about the system of focus (TEs in genomes) have much broader implications and potential applications. The authors set out clearly and thoroughly a case for why transposon ecology is interesting, and are very convincing in their treatment of potential confusions and conflations. The figures are illuminating, and the results of the empirical analysis strongly supportive of all the distinctions made with such care by the earlier part of the paper. The ideas the authors develop are novel and thought-provoking, and are likely to gain a lot of attention from the biology community and beyond. I can only recommend acceptance with no changes at all, apart from a few typos to do with spaces, italicization, redundant commas, and the occasional missing ‘that’ or ‘the’. I expect this paper to be read and cited abundantly."
Since this type of accolade is very rare, I decided to put it here, and to thank the anonymous reviewer again (I hope s/he doesn't mind me putting it out there). It would make the perfect blurb of the kind that often appears on the dust cover of a book, providing both a summary of and critical praise for our work.

Finally, to stress again the importance of the collaborative nature of our truly interdisciplinary group: during our discussions I came across this publication, http://www.plosone.org/article/info:doi/10.1371/journal.pone.0004803, which looks at how different types of sciences are linked to each other (by following the surfing behaviour of users of scientific websites):

There is a lot of fascinating information in this figure, but without access to the raw data (and given my limitations due to daltonism), I get the impression that 1) ecology/biodiversity seems to be a bridge between the social and natural sciences (with Ecology, a very specialized journal in ecology, identified as a "connector[s] between various domains"), and 2) philosophy is surprisingly far away from ecology.

I think that our publication explicitly and successfully bridges the gap between these disciplines, and it benefited greatly from collaborations with researchers from all of them.

Monday, January 21, 2013

Synchrony in metacommunities

A new year, a new publication, this time with Shubha and Jurek: "Population synchrony decreases with richness and increases with environmental fluctuations in an experimental metacommunity" in Oecologia. We continued our work on specialization in metacommunities, but this time looked at the implications for population synchrony.

Key figure:
We found, as predicted, that:

  • synchrony between populations of a specialist species within a metacommunity is more strongly influenced by environmental fluctuations than that of a generalist species
  • increasing species richness decreased individual population synchrony
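Population synchrony in this kind of study is commonly quantified as the mean pairwise correlation between population time series; I am not reproducing the exact metric from our paper here, just a generic sketch (the toy data and names are mine):

```python
import numpy as np

def synchrony(abundances):
    """Mean pairwise Pearson correlation between population time series.

    abundances: array of shape (n_populations, n_timepoints).
    """
    corr = np.corrcoef(abundances)
    # Average only the off-diagonal (pairwise) correlations.
    pairs = corr[np.triu_indices(corr.shape[0], k=1)]
    return pairs.mean()

rng = np.random.default_rng(1)
t = np.linspace(0, 4 * np.pi, 50)
# Three populations tracking the same environmental signal (specialist-like):
tracking = np.sin(t) + 0.1 * rng.normal(size=(3, 50))
# Three populations fluctuating independently:
independent = rng.normal(size=(3, 50))
```

Populations that all track the environment come out highly synchronous; independently fluctuating ones do not.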
While these results make perfect sense from an ecological point of view, getting this paper published was not so straightforward because of the unbalanced experimental design we had to use. This struggle is transcribed in boring scientific language at the end of the M&M section, which completely glosses over the blood, sweat, and tears of our back-and-forth with the reviewers:
"We performed an unbalanced factorial analysis using the 15 effective numbers of dimensions, since replication of the two-species treatment in our experiment resulted in enough error degrees of freedom to test for an interaction effect between environmental fluctuation (EF) and species richness. We decided to use this unbalanced but replicated design because of time constraints. We were unable to implement a completely balanced design due to sample processing demands, which, for the sake of consistency, meant that all of the samples had to be counted within a short time frame."
Because we were interested in synchrony in a metacommunity, we (i.e., Shubha) had to count densities of 3 species within all 4 communities of each metacommunity, or approximately 96 x 3 counts in total. All this within a limited time, and the individuals had to be put back into their original community so as not to affect population dynamics. This meant that we had only 2 replicated metacommunities (the experimental unit) for the 2-species treatment, and only 1 metacommunity for each of the other treatments.

We thus had to be really careful in how we worded the Figure 2 caption (see above, last line), since we could in principle only compute a measure of spread (e.g., standard error or deviation) for the 2-species points, and even that would be based on only 2 replicates. We therefore decided not to add these, because it would be even more confusing to have some points with, and some without, a measure of variation indicated in the figure.

This could, of course, raise the question of whether omitting this information from the main figure of our article not only removes potential confusion, but also removes visual cues to readers about the limitations of our study, i.e., the lack of replication for most of the treatments.

I do believe we are only guilty of the first (removing confusion). To test the interaction term EF x S, we had only 3 degrees of freedom (see Table 1). Despite this low power, we fought hard to get these results in, because looking at Figure 2, I think nobody would question that the lines connecting EF treatments with the same species richness are basically parallel to each other, providing visual confirmation of the non-significant interaction term associated with the low degrees of freedom from the lack of a full-factorial design.

This is thus one of my publications that hinges in large part on the visual interpretation of the results (Figure 2, above), with the statistics (and all the other figures and tables) playing only a confirmatory role. I was really glad I had read Analysis of Messy Data when it came to convincing the reviewers.

Thursday, January 10, 2013

Variation decomposition is a zombie idea?

Another post in response to something written by Jeremy Fox! I think it is time I meet him in person so that I can address him as “Jeremy” instead of “Dr. Fox”, “Fox”, or “the author”. I tried to remove all the salesmanship from my response, though, because I wanted to make sure that 1) I summarized his blog post correctly, and 2) I expressed my thoughts as clearly as possible. This will make it easier for others to point out my logical mistakes, and thus to correct my thinking and my understanding of ecology.

Original argument by Dr. Fox

  • correlation does not equal causation
  • e.g., abundances of species over time
    • data
      • 2 species, abundances not correlated
      • species abundances correlated with weather
    • naive conclusion
      • no density dependence
      • weather important
    • Dr. Fox’s conclusion: this is wrong
      • correlation is not a prediction of density dependence (example from economics)
  • implication
    • zombie idea: “misinterpreting correlations, and lack of correlations, among variables as evidence for or against causality when those variables are affected by density-dependent feedbacks”
    • in community ecology: “Given that there’s intra- and interspecific density dependence within sites, I doubt that you can reliably infer the causes of metacommunity structure just by looking for statistical associations between environmental and geographic variables, and species abundances.”

My initial response

  • after reading that last quote, I thought that Dr. Fox made this conclusion:
    • so “correlations, partial correlations, variance partitioning, multiple regression, structural equation modeling, or related statistical methods” become useless in ecology, since this is probably a density-dependent dynamical system
    • dismissing general statistical techniques as tools to “infer how causality works in a density-dependent dynamical system” because of this zombie idea seems a bridge too far. I think this could become a zombie idea in itself if lots of ecologists followed this advice.
  • but I doubt that this is the intent of the blog post. The key word is “reliably”; he probably added it to the sentence for people like me. Without it, it would seem that Dr. Fox is negating the scientific method; with it, you can talk about observational versus experimental approaches, weak and strong tests of certain hypotheses, etc.
  • I thus read the whole blog post again in more detail, and then came up with a second response.

My second response

  • I do think that Dr. Fox pulls a sleight of hand:
    • Dr. Fox’s conclusion and example only deals with pointing out that correlation is not a good prediction for density dependence
    • but the data do provide evidence for the hypothesis that weather influences population abundances through food availability etc.
  • so my conclusion for the data from Dr. Fox outlined above
    • there is evidence that weather is important
    • not a good test for whether density dependence is important or not, so not possible to make any conclusion on this aspect
  • implications of this for variation decomposition as a method to detect metacommunity processes
    • any statistical method is only as good as the prediction and associated hypothesis it is testing
      • not good to test for density dependence
      • could be for testing e.g. effects of environment on species abundances
    • statistical methods have their limitations (see indeed Gilbert and Bennett 2010, as pointed out by Dr. Fox)
      • thus observational data should be accompanied by experimental studies
      • see e.g. my PhD work (from a long time ago, observational study using variation decomposition, and experimental study)
    • my 2005 publication should thus only be seen as a starting point
      • there is lots of evidence for species sorting in response to environmental conditions in general in these potentially dynamic systems (in 73% of the data sets)
      • it is difficult, but not impossible, to include, for instance, competition, density dependence, time, or evolution in these analyses.
      • we should explore more interesting or detailed predictions, and some of these could use variation decomposition (e.g. generalist versus specialist species or body size)
  • The real problem would be if density dependence could somehow make the correlations between species abundances and weather invalid.
    • I may have missed this explanation in the blog post, but I could not find how density dependence could make non-significant external drivers appear significant (but if I am wrong, let me know)
    • maybe this is addressed in Ziebarth et al. 2010, and is implicitly suggested by Dr. Fox?
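For readers unfamiliar with it, the variation decomposition at issue (partitioning community variation into pure environmental, shared, pure spatial, and unexplained fractions, in the spirit of Borcard-style partitioning) boils down to comparing the explained variation of three regressions. A minimal sketch with ordinary least squares on toy data (all variable names are my own):

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an OLS regression of y on X (intercept included)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ coef
    return 1.0 - resid.var() / y.var()

def partition(E, S, y):
    """Fractions [a] pure environment, [b] shared, [c] pure space, [d] unexplained."""
    r_e, r_s = r_squared(E, y), r_squared(S, y)
    r_es = r_squared(np.column_stack([E, S]), y)
    a = r_es - r_s           # environment after controlling for space
    c = r_es - r_e           # space after controlling for environment
    b = r_e + r_s - r_es     # shared fraction (can be negative)
    d = 1.0 - r_es           # unexplained
    return a, b, c, d

# Toy data: abundance driven by an environment that is itself spatially structured.
rng = np.random.default_rng(0)
space = rng.normal(size=(100, 1))
env = 0.7 * space + 0.3 * rng.normal(size=(100, 1))
y = (2.0 * env + rng.normal(size=(100, 1))).ravel()
a, b, c, d = partition(env, space, y)
```

By construction the four fractions sum to 1, and in this toy setup most of the spatial signal ends up in the shared fraction [b], because space only affects abundance through the environment.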

My conclusions

  • Dr. Fox convincingly illustrates that correlation is not a good prediction for the hypothesis of density dependence, and that better predictions and tests exist.
  • My 2005 article is flawed, but I do not think that this zombie idea of density dependence is the most important flaw (we are working on a follow-up article, though).
  • Writing blog posts is fun, especially if you have a skilled writer like Dr. Fox giving you starting points. I just have to take care, when somebody writes to get a reaction (which is part of the appeal of Fox’s writing, and which he suggests in this and this post), that I read and react to the actual points made, and try to “read around” the salesmanship aspect of the writing.

Friday, December 7, 2012

Only a hammer in your toolbox

Talk about climate change, and be sure that your analyses are rock solid, because you will get some serious backlash. What interests me most is the danger of "if the only tool in your toolbox is a hammer, everything will look like a nail" thinking.

This blog post, http://www.statisticsblog.com/2012/12/the-surprisingly-weak-case-for-global-warming/, was written by a statistics graduate student, with this summary:
"TL;DR (scientific version): Based solely on year-over-year changes in surface temperatures, the net increase since 1881 is fully explainable as a non-independent random walk with no trend. 
TL;DR (simple version): Statistician does a test, fails to find evidence of global warming."
The self-correcting nature of the connected scientific community quickly came to the rescue (?) in the form of Dr. Fellows (http://blog.fellstat.com/?p=304):
"What we have shown is that the model proposed by Mr. Asher to "disprove" the theory of global warming is likely misspecified. It fails to to detect the highly significant trend that was present in our simulated data. Furthermore, if he is to call himself a statistician, he should have known exactly what was going on because regression to the mean is a fundamental 100 year old concept." 
I find it really brave that a grad student puts himself out there, blog-wise (see Commandment 6), especially with a subheading that reads:
"In Monte Carlo we trust"
Mr. Asher put a lot of effort into writing that post, given its length and the thought that went into it. And he clearly feels very strongly about this issue (see Commandments 1-3). Too bad he put too much trust in his hammer, as pointed out by Dr. Fellows. I do think that Dr. Fellows also plays the man in his final comment, which was not necessary given his rebuttal; but I can see where his frustration comes from.
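Dr. Fellows' regression-to-the-mean point is easy to reproduce in a few lines: simulate a temperature series with a genuine linear trend plus independent yearly noise, then regress the year-over-year changes on the previous year's value, mean-reversion style. This is only a sketch of the statistical trap, not a reconstruction of Mr. Asher's actual analysis:

```python
import numpy as np

def mean_reversion_slope(series):
    """OLS slope of year-over-year change regressed on the previous value."""
    change = np.diff(series)
    prev = series[:-1] - series[:-1].mean()
    return (prev @ (change - change.mean())) / (prev @ prev)

# Temperatures with a genuine linear warming trend plus independent noise:
rng = np.random.default_rng(42)
years = np.arange(132)                  # roughly 1881-2012
temps = 0.007 * years + 0.2 * rng.normal(size=132)
slope = mean_reversion_slope(temps)     # comes out clearly negative
```

The series only appears to mean-revert: years that happen to sit above the trend line are followed by smaller changes purely because of regression to the mean, so a test built on year-over-year changes "finds" mean reversion in data that are nothing but trend plus independent noise.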

I was going to write that "Hopefully, I will not make the same mistake as Mr. Asher when writing for this blog", but then there is always Commandment 8 ("Thou Shalt Not Simply Trot Out thy Usual Shtick."), so luckily I can then always fall back on Commandment 5 ("Thou Shalt Not Flaunt thine Ego. Be Thou Vulnerable. Speak of thy Failure as well as thy Success.").