Query examples in paleoclimate data syntheses

Hi all,
many of you are familiar with data syntheses like PAGES 2k, Iso2k, SISAL, and others, often associated with PAGES working groups and often using the Linked Paleo Data (LiPD) format. These and other databases support an ever-growing array of scientific pursuits in paleoclimatology and paleoceanography.

To build better software around LiPD, the LinkedEarth team would like to know what sorts of questions you ask of these databases. When you do, do you query by time and space? Archive type? Proxy type? Resolution? All of the above?

We are looking for examples of questions you ask using these databases, and what (meta)data you need to run queries that will assemble a subset of records that can address these questions. We welcome any example, not just those using LiPD-formatted databases.

The more examples we have, and the more details they contain, the better and more usable our software will be. Don’t be shy!

The questions I like to ask these sorts of databases tend to fall under one main (very broad) category which is: What spatially/temporally coherent variability emerges when we examine paleoclimate records on a global scale, and what does it tell us about the history (and future) of the climate? More specifically I’m curious about things like:

  • When we examine paleoclimate “events”, do we observe spatial/temporal patterns? Can we see things like the propagation of the 8.2k event moving through space over time? Is event-like behavior coherent between paleoclimate records regionally? Globally? Can we learn anything about the history of atmospheric teleconnections from this kind of analysis?

  • Do we observe coherent changes in system dynamics reflected in paleoclimate records? Are there any examples of global alterations in dynamics? Regional alterations? How synchronous are they with one another if they are observed?

My workflows tend to consist of loading a bunch of LiPD records, doing an initial “pruning” based on the requirements of my analysis, applying some statistical tool (say event detection, or climate regime shift detection), and then slicing the results to see if there’s any agreement (or disagreement) between records as a function of time, space, archive type, etc.

The metadata I’m typically interested in during the first pruning step of my analysis is resolution, time bounds, record length, and either proxy interpretation or variable name (d18O, d2H, etc.). Sometimes I’m also interested in location during this initial step, but recently my focus has been on looking for globally resolved phenomena, at least at first. During the slicing results/interpreting step I’m normally interested in location, archive type, and sometimes proxy type. In total I’m interested in everything you listed, just at different stages of the process.
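To make the pruning and slicing steps concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the `Record` class is a hypothetical flattened view of a record (not the LiPD schema or any LinkedEarth API), the thresholds are arbitrary, and the example records are made up.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import median


@dataclass
class Record:
    # Hypothetical flattened record; field names are illustrative only.
    name: str
    archive_type: str
    variable: str   # e.g. "d18O", "d2H"
    ages: list      # years BP, sorted ascending


def median_resolution(rec):
    """Median spacing between consecutive samples, in years."""
    steps = [b - a for a, b in zip(rec.ages, rec.ages[1:])]
    return median(steps)


def prune(records, variables, max_resolution, window, min_length):
    """Keep records that match the variable, span the time window,
    are long enough, and are sampled finely enough."""
    start, end = window
    kept = []
    for rec in records:
        if rec.variable not in variables or len(rec.ages) < 2:
            continue
        covers = rec.ages[0] <= start and rec.ages[-1] >= end
        long_enough = rec.ages[-1] - rec.ages[0] >= min_length
        if covers and long_enough and median_resolution(rec) <= max_resolution:
            kept.append(rec)
    return kept


def by_archive(records):
    """Slice the surviving records by archive type."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec.archive_type].append(rec.name)
    return dict(groups)


# Made-up records, for illustration only.
records = [
    Record("cave_A", "speleothem", "d18O", list(range(7000, 9501, 10))),
    Record("lake_B", "lake sediment", "d2H", list(range(0, 12001, 200))),
    Record("ice_C", "glacier ice", "d18O", list(range(0, 11001, 20))),
    Record("coral_D", "coral", "d18O", list(range(0, 401, 1))),
]

# Example: records resolving the interval around 8.2 ka at 50 years or better.
kept = prune(records, {"d18O", "d2H"}, max_resolution=50,
             window=(7500, 9000), min_length=2000)
print(by_archive(kept))
```

Location is deliberately left out of `prune` here, since it only matters at the slicing stage in this kind of globally focused analysis; a latitude/longitude filter could be added to either step.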

For SISAL, the search queries I usually use are:

  • Look for records within lat-long limits or age-limits depending on the research question.
  • Then screen these records for speleothem type, mineralogy, number of U-Th dates within reasonable age limits, and author-generated age-depth models with uncertainties. If I need more age control, I also look for SISAL chronologies. If age control is important, I also look at the U-Th data for sources of age errors, e.g. detrital Th, open system 234/238U etc.
  • Then I look for further details like superseding records and composite records, and skim-read the publications.
  • I haven’t found a very good way to screen for resolution, so I usually plot figures to check this manually.
  • The other metadata that I find useful include geology, distance of cave entrance, and whether monitoring data are available for the cave.
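The first two screening steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not the SISAL schema: the dictionary keys (`lat`, `lon`, `mineralogy`, `uth_ages`) are hypothetical names, the entities are made up, and the bounding box is assumed not to cross the antimeridian.

```python
def in_box(entity, lat_range, lon_range):
    """True if the site falls inside a simple lat-long box."""
    return (lat_range[0] <= entity["lat"] <= lat_range[1]
            and lon_range[0] <= entity["lon"] <= lon_range[1])


def dates_in_window(entity, age_range):
    """U-Th ages (years BP) that fall inside the age limits."""
    return [a for a in entity["uth_ages"] if age_range[0] <= a <= age_range[1]]


def screen(entities, lat_range, lon_range, age_range, mineralogy, min_dates):
    """Keep entities in the box with the requested mineralogy and at
    least `min_dates` U-Th dates inside the age window."""
    kept = []
    for entity in entities:
        if not in_box(entity, lat_range, lon_range):
            continue
        if entity["mineralogy"] != mineralogy:
            continue
        if len(dates_in_window(entity, age_range)) >= min_dates:
            kept.append(entity["name"])
    return kept


# Made-up entities with hypothetical field names.
entities = [
    {"name": "stal_1", "lat": 44.0, "lon": 10.2, "mineralogy": "calcite",
     "uth_ages": [500, 2100, 4300, 7900, 8600]},
    {"name": "stal_2", "lat": 44.1, "lon": 10.3, "mineralogy": "aragonite",
     "uth_ages": [300, 1500, 3200, 6100, 9400]},
    {"name": "stal_3", "lat": 46.0, "lon": 12.0, "mineralogy": "calcite",
     "uth_ages": [1200, 15000, 22000]},
    {"name": "stal_4", "lat": -20.0, "lon": 30.0, "mineralogy": "calcite",
     "uth_ages": [100, 900, 2500, 4800, 7700]},
]

selected = screen(entities, lat_range=(35, 50), lon_range=(0, 20),
                  age_range=(0, 10000), mineralogy="calcite", min_dates=4)
print(selected)
```

Checks such as detrital Th, open-system behaviour, or superseded records are left out on purpose, since those still call for looking at the dating table and the publications by hand.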

Thank you. Out of curiosity, do you find most of the information in the database itself, or do you need to read the paper (for cave entrance information, for instance)?

Hi Deborah, so speleothem type, mineralogy, U-Th dating information including 234/238U etc., geology, distance of cave entrance, and cave monitored (a yes/no question) are all metadata fields in the database itself. I read the papers to double-check (i) for hiatuses, (ii) whether the authors flag something that the database doesn’t pick up very obviously, e.g. an open 234/238 dating system in the older part of the record, (iii) cases like the Corchia or Devils Hole records, where there are multiple overlapping records and it’s not immediately clear why one should be selected over the other, and (iv) of course the interpretations.