Twitter record of #AGU17

I missed the 2017 Fall AGU meeting, but I did follow along on twitter. However, the coverage was spotty: some sessions were mentioned, some not at all. The experience left me wondering about the digital traces of the meeting on twitter. Lo and behold, I saw this tweet from Dr. Christina K. Pikas (@cpikas) at the beginning of this year:

So let’s look at this awesome dataset that Dr. Pikas collected and published on figshare. The data was collected using TAGS and contains tweets from Nov. 4th, 2017 to Jan. 4th, 2018 that used the hashtag #AGU17. There are a total of 31,909 tweets in this dataset. In this post I am subsetting the data to look only at the meeting (with a 1 day buffer, so Sunday Dec. 10, 2017 to Saturday Dec. 16, 2017), a total of 25,531 tweets during the 7 days:

Hourly.jpg

I noticed:

  • Twitter activity decays through the week (fatigue? do most people just tweet their arrival? Daily attendance variations?)
  • There is a noticeable lunch break on M, W, Th, and F
  • Each day twitter activity starts suddenly, but has a slow(er) decay at the end of the day (late night activities?)
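
A rough Python sketch of how an hourly series like the one plotted above could be built (the post's own code is not shown here, so this is only an illustration). The window bounds, the function name `hourly_counts`, and the sample timestamps are assumptions, and time zones are ignored.

```python
from collections import Counter
from datetime import datetime

# Meeting week plus a one-day buffer on each side.
# The exact cutoff times are an assumption made for this sketch.
START = datetime(2017, 12, 10)
END = datetime(2017, 12, 17)  # exclusive upper bound


def hourly_counts(timestamps, start=START, end=END):
    """Count tweets per hour for timestamps that fall inside [start, end)."""
    counts = Counter()
    for ts in timestamps:
        if start <= ts < end:
            # Truncate each timestamp to the start of its hour.
            counts[ts.replace(minute=0, second=0, microsecond=0)] += 1
    return counts


# Hypothetical timestamps, only to show the shape of the result.
sample = [
    datetime(2017, 12, 11, 9, 5),
    datetime(2017, 12, 11, 9, 40),
    datetime(2017, 12, 11, 12, 15),
    datetime(2017, 11, 20, 8, 0),  # outside the window, so it is dropped
]
counts = hourly_counts(sample)
```

Plotting `counts` sorted by key gives a time series of the same kind as the figure.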

Retweets account for 44% of the 25,531 tweets during the meeting. Removing RTs yields an almost identical plot, but a small peak appears at the end of each day (pre-bedtime tweets?):

HourlyNoRT.jpg
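
One plausible way to separate retweets from original tweets, sketched in Python. It assumes that a retweet's text begins with "RT @", which is a common convention in tweet archives but is not confirmed by the post; the helper names and sample texts are hypothetical.

```python
def is_retweet(text):
    """Treat a tweet as a retweet if its text starts with 'RT @' (an assumed convention)."""
    return text.lstrip().startswith("RT @")


def retweet_share(texts):
    """Return the fraction of tweets that are retweets (0.0 for an empty list)."""
    if not texts:
        return 0.0
    return sum(is_retweet(t) for t in texts) / len(texts)


# Hypothetical tweet texts, for illustration only.
sample = [
    "RT @someone: Great session on coastal dunes #AGU17",
    "Poster hall is packed this morning #AGU17",
    "RT @another: Lunch break! #AGU17",
    "Heading to the airport #AGU17",
]
originals = [t for t in sample if not is_retweet(t)]
share = retweet_share(sample)
```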

Lastly, the biggest #AGU17 twitter user is @theAGU (by far), which sent 1063 tweets during the week. Here is the timeseries with only @theAGU tweets:

HourlyAGU.jpg

I see the lunch break and not as many late nights for the organization.

Thanks @cpikas for collecting and publishing the data! It is available on figshare:

My code is on github here

Timing and authorship of scholarly mentions to GRL in Wikipedia

In March 2018 I had a guest post on Altmetric describing my work investigating Geophysical Research Letters articles that are mentioned in Wikipedia.

Tweet length summary: Only 1.4% of Wikipedia edits that mention a GRL article can unambiguously be assigned to the GRL article author.

Here is a link to the Altmetric blog post

 

Data Collection: getting Altmetric.com data for GRL articles

In previous posts I have looked at several aspects of Earth and Space Science citations in Wikipedia. As part of a project I am working on, I’m interested in expanding this work to look at some mechanics of citations in Wikipedia to articles in Geophysical Research Letters (e.g., when do they appear, who writes the edit, on what Wikipedia pages, etc.). In this post, I want to walk through my method for getting the data that I will analyze. All the code is available (note that I am not a good R programmer).

Data on Wikipedia mentions are aggregated by Altmetric. rOpenSci built a tool to get Altmetric data (rAltmetric) using the Altmetric API. rAltmetric works by retrieving data for each paper using the paper’s DOI, so I need the DOIs for all of the papers before I do anything. Fortunately, rOpenSci has a tool for this too, rcrossref, which queries the Crossref database for DOIs that match a given condition.

Since my project is focused on Geophysical Research Letters, I only need the DOIs for papers published in GRL. Using the ISSN for GRL, I downloaded 36,960 DOIs associated with GRL and then the associated Altmetric data (using rAltmetric).
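
For readers who do not use R, the same two lookups can be sketched in Python against the public Crossref and Altmetric REST endpoints. This is not the rcrossref/rAltmetric code used for the post. The ISSN value, the helper names, and the `cited_by_wikipedia_count` field name are assumptions to verify against the current API documentation before relying on them.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

GRL_ISSN = "0094-8276"  # assumed print ISSN for GRL; verify before use


def crossref_url(issn, cursor="*", rows=1000):
    """Build a Crossref request for one page of works published under an ISSN."""
    query = urlencode({"rows": rows, "cursor": cursor, "select": "DOI"})
    return f"https://api.crossref.org/journals/{issn}/works?{query}"


def parse_dois(payload):
    """Pull the DOIs and the next paging cursor out of a Crossref response."""
    message = payload["message"]
    dois = [item["DOI"] for item in message.get("items", [])]
    return dois, message.get("next-cursor")


def fetch_all_dois(issn):
    """Page through Crossref until an empty page comes back (needs network access)."""
    dois, cursor = [], "*"
    while cursor:
        with urlopen(crossref_url(issn, cursor)) as response:
            page, cursor = parse_dois(json.load(response))
        if not page:
            break
        dois.extend(page)
    return dois


def altmetric_url(doi):
    """URL of the Altmetric record for one DOI."""
    return f"https://api.altmetric.com/v1/doi/{doi}"


def wikipedia_count(record):
    """Wikipedia mention count from an Altmetric record (field name is an assumption)."""
    return record.get("cited_by_wikipedia_count", 0)
```

Calling `fetch_all_dois(GRL_ISSN)` and then requesting `altmetric_url(doi)` for each DOI mirrors the two-step workflow described above. Nothing here runs a network request until those functions are called.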

The data from rAltmetric returns the number of times a given article is cited in Wikipedia. But I want some extra detail:

  • The name of the Wikipedia article where the GRL citation appears
  • When the edit was made
  • Who made the edit

This information is returned through the Altmetric commercial API; you can email Stacy Konkiel at Altmetric to get a commercial API key through Altmetric’s ‘researcher data access program’ (free access for those doing research). I got the data another way, via webscraping. To keep everything in R, I used rvest to scrape the Altmetric page for each GRL article to get the Wikipedia information: the Wikipedia page that was edited, the author, and the edit date. Here is an example of an Altmetric.com page for a GRL article:

alt.jpeg

The Wikipedia page (‘Overwash’), the user (‘Ebgoldstein’; hey, that’s me!), and the edit date (‘10 Feb 2017’) are all listed. This is the data that I scraped.

Additionally, I scraped the GRL article page to get the date that the GRL article first appeared online (not when it was formally typeset and published). Here is an example of a GRL article landing page:

GRL.jpeg

Notice that the article was first published on 15 Dec 2016. However, if you click the ‘Full Publication History’ link, you find out that the article first appeared online 24 Nov 2016 — so potential Wikipedia editors could add a citation prior to the formal ‘publication date’ of the GRL article.
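
To see why the choice of date matters, here is a small Python sketch that measures the lag between an article date and a Wikipedia edit date. It reuses the example dates quoted in this post purely as illustration and does not assume they belong to the same article. The function names are hypothetical, and parsing month abbreviations with `%b` assumes an English locale.

```python
from datetime import datetime


def parse_day(text):
    """Parse dates written like '24 Nov 2016' (assumes English month abbreviations)."""
    return datetime.strptime(text, "%d %b %Y").date()


def lag_days(article_date, wiki_edit_date):
    """Days from an article date to a Wikipedia edit; negative if the edit came first."""
    return (parse_day(wiki_edit_date) - parse_day(article_date)).days


# Lag measured from the first online appearance versus the formal publication date.
lag_from_online = lag_days("24 Nov 2016", "10 Feb 2017")
lag_from_issue = lag_days("15 Dec 2016", "10 Feb 2017")
```

An edit made between the two article dates would have a negative lag when measured from the formal publication date, which is why the first-online date is the safer reference.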

So now that I have that data, what does it look like? Out of 36,960 GRL articles, 921 appear in Wikipedia, and some are cited multiple times. Below is a plot with the number of GRL articles (y-axis) that appear in Wikipedia, tallied by the number of times they are cited in Wikipedia (note the log y-axis).

Rplot.jpeg

GRL articles are spread over a range of Wikipedia pages, but some Wikipedia pages have many references to GRL articles (note the log scale of the y-axis):

Rplot01.jpeg

553 Wikipedia articles have a reference to only a single GRL article, while some articles contain many GRL references. Take for instance the ‘Cluster II (spacecraft)’ page, with 25 GRL citations, or ‘El Niño’ with 11 GRL references.
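
Both tallies above are counts of counts, which can be sketched in a few lines of Python. The (Wikipedia page, DOI) pairs below are hypothetical placeholders, not rows from the real dataset.

```python
from collections import Counter

# Hypothetical (Wikipedia page, GRL DOI) pairs, one per scraped mention.
mentions = [
    ("Page A", "10.1000/a"),
    ("Page A", "10.1000/b"),
    ("Page B", "10.1000/a"),
    ("Page C", "10.1000/c"),
]

# How many times each GRL article is cited across Wikipedia.
cites_per_article = Counter(doi for _, doi in mentions)
# How many GRL references each Wikipedia page contains.
grl_refs_per_page = Counter(page for page, _ in mentions)

# Counts of counts: the quantities shown in the two histograms.
articles_by_cite_count = Counter(cites_per_article.values())
pages_by_ref_count = Counter(grl_refs_per_page.values())
```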

I’ll dive into the data I collected over the next few weeks in a series of blog posts, but I want to leave you with some caveats about the code and the data so far. (Edited after the initial posting) Altmetric.com only shows the data for up to 5 Wikipedia mentions for a given journal article unless you have paid (institutional) access. Several GRL articles were cited in >5 Wikipedia articles, so I manually added the missing data. Hopefully I will make a programmatic work-around sometime. After I wrote this post, I was informed that the commercial Altmetric API gives all of the Wikipedia data (edit, editor, date). To get a commercial API key through Altmetric’s ‘researcher data access program’ (free access for those doing research), email Stacy Konkiel at Altmetric (thanks Stacy!).

Furthermore, many of the edit times that you see here could be re-edits, thereby ‘overprinting’ the date and editor for the first appearance of the Wikipedia citation. This will be the subject of a future post, though I haven’t yet found an easy way to get the original edit…

Nonlinear Dynamics and Geomorphology

This is a list of geomorphology papers that map onto the chapter headings of Strogatz’s ‘Nonlinear Dynamics and Chaos’. This is a work in progress: some headings are left blank because I can’t find concrete examples (e.g., strange attractors), while others remain blank because there are too many examples (e.g., fractals). I envision this list could be used when teaching or discussing nonlinear dynamics in a geomorphology setting.

  • Part I: One-Dimensional Flows
    • Ch. 2: Flows on a Line
    • Ch. 3: Bifurcations
      • Fagherazzi, S., Carniello, L., D’Alpaos, L., & Defina, A. (2006). Critical bifurcation of shallow microtidal landforms in tidal flats and salt marshes. Proceedings of the National Academy of Sciences, 103(22), 8337-8341. 10.1073/pnas.0508379103
      • Anderson, R. S. (2002). Modeling the tor-dotted crests, bedrock edges, and parabolic profiles of high alpine surfaces of the Wind River Range, Wyoming. Geomorphology, 46(1), 35-58. 10.1016/S0169-555X(02)00053-3
      • Pelak, N. F., Parolari, A. J., & Porporato, A. (2016). Bistable plant–soil dynamics and biogenic controls on the soil production function. Earth Surface Processes and Landforms, 41(8), 1011-1017. 10.1002/esp.3878
      • Yizhaq, H., Ashkenazy, Y., & Tsoar, H. (2007). Why do active and stabilized dunes coexist under the same climatic conditions?. Physical Review Letters, 98(18), 188001. 10.1103/PhysRevLett.98.188001
      • Yizhaq, H., Ashkenazy, Y., & Tsoar, H. (2009). Sand dune dynamics and climate change: A modeling approach. Journal of Geophysical Research: Earth Surface, 114(F1). 10.1029/2008JF001138
      • Bel, G., & Ashkenazy, Y. (2014). The effects of psammophilous plants on sand dune dynamics. Journal of Geophysical Research: Earth Surface, 119(7), 1636-1650. 10.1002/2014JF003170
      • Goldstein, E.B., and L.J. Moore, (2016) Stability and bistability in a one-dimensional model of coastal foredune height, J. Geophys. Res. Earth Surf., 121, 964-977, doi: 10.1002/2015JF003783
    • Ch. 4: Flows on a Circle
  • Part II: Two-Dimensional Flows
    • Ch. 5: Linear Systems
      • Plant, N. G., Todd Holland, K., & Holman, R. A. (2006). A dynamical attractor governs beach response to storms. Geophysical Research Letters, 33(17). 10.1029/2006GL027105
    • Ch. 6: Phase Plane
      • Marani, M., D’Alpaos, A., Lanzoni, S., Carniello, L., & Rinaldo, A. (2007). Biologically‐controlled multiple equilibria of tidal landforms and the fate of the Venice lagoon. Geophysical Research Letters, 34(11). 10.1029/2007GL030178
      • Marani, M., D’Alpaos, A., Lanzoni, S., Carniello, L., & Rinaldo, A. (2010). The importance of being coupled: Stable states and catastrophic shifts in tidal biomorphodynamics. Journal of Geophysical Research: Earth Surface, 115(F4). 10.1029/2009JF001600
      • Stark, C. P., & Passalacqua, P. (2014). A dynamical system model of eco‐geomorphic response to landslide disturbance. Water Resources Research, 50(10), 8216-8226. 10.1002/2013WR014810
      • Stark, C. P. (2006), A self-regulating model of bedrock river channel geometry, Geophys. Res. Lett., 33, L04402, doi:10.1029/2005GL023193.
      • Limber, P. W., A.B. Murray, P. N. Adams and E.B. Goldstein, (2014), Unraveling the dynamics that scale cross-shore headland amplitude on rocky coastlines, Part 1: Model Development, Journal of Geophysical Research: Earth Surface, 119, doi: 10.1002/2013JF002950
      • Limber, P. W., & Murray, A. B. (2014). Unraveling the dynamics that scale cross‐shore headland relief on rocky coastlines: 2. Model predictions and initial tests. Journal of Geophysical Research: Earth Surface, 119(4), 874-891. 10.1002/2013JF002978
      • Mariotti, G., & Fagherazzi, S. (2013). Critical width of tidal flats triggers marsh collapse in the absence of sea-level rise. Proceedings of the National Academy of Sciences, 110(14), 5353-5356. 10.1073/pnas.1219600110
    • Ch. 7: Limit Cycles
      • Stark, C. P. (2010). Oscillatory motion of drainage divides. Geophysical Research Letters, 37(4). 10.1029/2009GL040851
    • Ch. 8: Bifurcations revisited
      • Mariotti, G., & Fagherazzi, S. (2013). A two‐point dynamic model for the coupled evolution of channels and tidal flats. Journal of Geophysical Research: Earth Surface, 118(3), 1387-1399. 10.1002/jgrf.20070
  • Part III: Chaos
    • Ch. 9: Lorenz Equations
    • Ch. 10: One-Dimensional Maps
      • Goldstein, E.B., and L.J. Moore, (2016) Stability and bistability in a one-dimensional model of coastal foredune height, J. Geophys. Res. Earth Surf., 121, 964-977, doi: 10.1002/2015JF003783
    • Ch. 11: Fractals
      • There are too many papers/books/issues to discuss here…
    • Ch. 12: Strange Attractors

The AGU EOS ‘Editorial Practices’ discussion of 1984

On May 15, 1984, Russell and Reiff published a (jokey) flow chart of the AGU editorial and peer review process with several time delay terms and a ‘counting’ step for the multiple revisions. This set off 6 responses in EOS, similar to the episode in 2003-2004.

RusellReiff1984.jpeg

  1. On Oct 23, 1984, Baum wrote in to discuss how peer review tended to filter out controversial new ideas. Baum recommended that authors be allowed to publish controversial new ideas even if reviewers protested, but reviewers should also be allowed to publish their criticisms. In addition Baum offered some mathematical changes to the Russell and Reiff flow chart.
  2. Dessler also wrote in on Oct 23, 1984, with remarks that referees are often named and thanked by the editor or author. As a result, authors may be more wary of support for controversial ideas. Dessler also suggests that Comment—Reply pairs should be published more often (I have written about these in JGR-ES).  
  3. On Dec. 25, 1984, Sonnerup (the editor of JGR-Space Physics) wrote to EOS in support of the idea that peer review should permit new and unorthodox ideas. Sonnerup also provides additional details regarding the review process at JGR-Space Physics.
  4. On Feb 19, 1985 Walker and van der Voo wrote in to EOS to discuss the editorial process at GRL. Choice quote (bold type highlighted by me): “Because of the importance attached to prompt publication in GRL we will generally use only one reviewer for each paper, communicating with this reviewer, when necessary, by telephone or telemail. More reviewers are used only when a paper seems likely to be particularly controversial or is otherwise difficult to deal with.”
  5. Baker wrote in on April 25, 1985 to suggest that JGR collect the rejected papers and publish them. Baker stated, in jest, that there is likely a “large body of unpublished papers out there which have been rejected by Neanderthal referees. I say let’s do something about it! I suggest that all of these brilliant, creative, earthshaking papers be collected into a special JGR issue each year.”
  6. Murdoch wrote in on March 10, 1987 to suggest that abstracts of rejected papers be published. If a scientist wanted to see the rejected paper, then the author could provide the paper AND the critical reviews.

 

These papers just highlight the role of editors, something still missing from my peer review agent model (pointed out by a commenter/Jazz legend).

Retaliation in the Peer Review Model

(The full motivation, rule set, and previous results for this model are collected here)

Today I am adding a new rule into my toy peer review model. Suppose some fraction of the population is set to ‘retaliate’ when called upon to be a reviewer. Specifically, these agents assign review scores based on their feelings (positive or negative) toward the author. This is an example of the biases that might influence a reviewer’s decision (e.g., Lee et al., 2012).

So the new rule is:

  • If a ‘retaliator’ is assigned a review, and they feel positively or negatively toward the author, the review is positive or negative, respectively (overriding a random review).

(n.b.: A more gentle version of this rule could instead focus solely on ‘cliques’: if a reviewer feels positively toward the author, the review is positive. If the reviewer feels negatively, the review is random.)
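
The rule and its gentler ‘cliques’ variant can be sketched in Python as a single decision function. This is not the model's actual code: the function name, the signed `feeling` value, and the 50/50 random review are assumptions made for illustration.

```python
import random


def review_is_positive(retaliator, feeling, rng, cliques_only=False):
    """Return True for a positive review and False for a negative one.

    feeling is the reviewer's weight toward the author (assumed signed:
    above zero is positive, below zero is negative, zero is neutral).
    """
    if retaliator:
        if feeling > 0:
            return True  # friends always get a positive review
        if feeling < 0 and not cliques_only:
            return False  # enemies get a negative review under the full rule
    # Everyone else, and enemies under the 'cliques' variant, get a random review.
    # The 50/50 split is an assumption about the underlying model.
    return rng.random() < 0.5


rng = random.Random(0)
friend_review = review_is_positive(True, 1.0, rng)
enemy_review = review_is_positive(True, -1.0, rng)
```

Passing `cliques_only=True` switches to the gentler variant, where negative feelings no longer override the random review.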

The result is that there are now 4 types of people in the model:

  • Those who sign reviews, and do not retaliate
  • Those who sign reviews, and retaliate
  • Those who do not sign reviews, and do not retaliate
  • Those who do not sign reviews, and retaliate

Again I will use the range of incoming and outgoing node weights to visualize model results. As a reminder:

R_{i}^{in} is the maximum incoming weight minus the minimum incoming weight. This represents the range of feelings all other scientists have about scientist i.

R_{i}^{out} is the maximum outgoing weight minus the minimum outgoing weight. This represents the range of feelings scientist i has about all other scientists in the discipline.
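
As a concrete reading of these two definitions, here is a short Python sketch. It assumes the feelings are stored in a square matrix where `weights[a][b]` is how scientist a feels about scientist b and the diagonal is ignored; that layout and the function name are assumptions, not the model's actual data structure.

```python
def weight_ranges(weights, i):
    """Return (R_in, R_out) for scientist i from a square matrix of feelings."""
    n = len(weights)
    # Column i: what everyone else feels about scientist i.
    incoming = [weights[j][i] for j in range(n) if j != i]
    # Row i: what scientist i feels about everyone else.
    outgoing = [weights[i][j] for j in range(n) if j != i]
    return max(incoming) - min(incoming), max(outgoing) - min(outgoing)


# Hypothetical weights for three scientists.
weights = [
    [0, 2, -1],
    [3, 0, 0],
    [-2, 1, 0],
]
r_in, r_out = weight_ranges(weights, 0)
```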

So here are the results with 30% of all scientists being ‘retaliators’.

Figure_2.jpg

  • Compared to the previous results, the same trends hold: Rin is larger for signed reviewers (blue), and Rout is roughly the same for signed vs unsigned. (Ranges differ from the previous results because of a change in the number of model timesteps.)
  • Unsigned retaliators (empty orange markers) are similar to Unsigned non-retaliators. If you never sign reviews, no author will end up knowing that you are a retaliator (the editor is a different story).
  • Signed retaliators (empty blue markers) have a large Rin — they are polarizing figures. Authors are either on the good side of these people (they are friends) or on the bad side (they are enemies).

Signed and Unsigned Reviews in Earth Surface Dynamics

All of the reviews for Earth Surface Dynamics are open, published, and citable. Today I do a bit of webscraping to determine the mix of signed and blind reviews for the 198 papers reviewed in EsurfD. Also, since reviews occur in sequence (i.e., R1 submits their review before R2), we can examine how R1’s decision to sign a review influences the decision of R2.

The code to do the webscraping is here. Note that R is not my best language, but I am using it because of all the cool packages written for R to interface with Crossref (rcrossref, for obtaining publication DOIs), and the easy webscraping (rvest).

The code works by:

  1. Pulling (from Crossref) details for all ESurf Discussion publications using the ISSN number.
  2. Going to every EsurfD page (following the DOI link)
  3. Scraping the webpage for author, editor, and reviewer comment (see this helpful tutorial on using rvest).
  4. Checking for descriptive words, for instance “Anonymous Comment #1”, to determine if Reviewer 1 and/or Reviewer 2 were anonymous.
  5. Checking to see if a Reviewer 3 exists (to exclude the data… I only want to deal with papers with 2 reviewers for this initial study).
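
Steps 4 and 5 can be sketched in Python as a small classifier (the post's own code is in R). The label strings below are hypothetical; the real page text may be worded differently, so the "anonymous" keyword match is an assumption to check against the scraped pages.

```python
def classify_paper(referee_labels):
    """Return (r1_signed, r2_signed), or None unless there are exactly 2 referees."""
    if len(referee_labels) != 2:
        return None  # papers with a third reviewer (or a missing one) are excluded
    # A review counts as signed when its label does not contain 'anonymous'.
    return tuple("anonymous" not in label.lower() for label in referee_labels)


# Hypothetical referee labels as they might be scraped from a discussion page.
two_referees = classify_paper(["Anonymous Referee #1", "Jane Doe"])
three_referees = classify_paper(["Anonymous Referee #1", "Jane Doe", "Anonymous Referee #3"])
```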

I imagine some specific pathological cases in review comments may have slipped through this code, but a cursory check shows it captures the relevant information. After the code runs, I am left with 135 papers with 2 reviewers, for a total of 270 reviews. In total, 41% of reviews are signed, which is in line with previous reports such as the 40% reported by Okal (2003) and the 40% reported by PeerJ.

  • Reviewer 1 totals are 74 unsigned, 61 signed (55% unsigned, 45% signed)
  • For the 74 papers where Review 1 is unsigned,
    • Reviewer 2 data is 59 unsigned, 15 signed (80% unsigned, 20% signed)
  • For the 61 papers where Review 1 is signed,
    • Reviewer 2 data is 27 unsigned, 34 signed (44% unsigned, 56% signed)
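
The percentages in this breakdown follow directly from the four counts, and a short Python sketch makes the arithmetic easy to re-check. The dictionary layout and the `pct` helper are illustrative choices; only the four counts come from the post.

```python
# Counts keyed by (Reviewer 1 choice, Reviewer 2 choice), taken from the list above.
counts = {
    ("unsigned", "unsigned"): 59,
    ("unsigned", "signed"): 15,
    ("signed", "unsigned"): 27,
    ("signed", "signed"): 34,
}


def pct(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


papers = sum(counts.values())
r1_unsigned = counts[("unsigned", "unsigned")] + counts[("unsigned", "signed")]
r1_signed = papers - r1_unsigned

# Share of Reviewer 2 reports that are signed, conditioned on Reviewer 1's choice.
r2_signed_given_r1_unsigned = pct(counts[("unsigned", "signed")], r1_unsigned)
r2_signed_given_r1_signed = pct(counts[("signed", "signed")], r1_signed)

# Overall share of signed reviews across both reviewers.
signed_reviews = r1_signed + counts[("unsigned", "signed")] + counts[("signed", "signed")]
overall_signed = pct(signed_reviews, 2 * papers)
```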

There is one clear confounding factor here, which is how positive/negative reviews impact the likelihood of signing a review (both for R1 and R2). I imagine referee suggestions to the editor (e.g., minor revisions, major revisions, reject) and/or text mining could provide some details. (I can think of a few other confounds beyond this one.) Furthermore, I would assume that since many (all?) journals from Copernicus/EGU have open review, this analysis could be scaled up.