Rxivist combines preprints from bioRxiv with data from Twitter to help you find the papers being discussed in your field. Currently indexing 50,524 bioRxiv papers from 235,220 authors.

Rxivist API documentation

The Rxivist API offers free, programmatic access to all of our bioRxiv metadata in a JSON interface. It's open to all—no keys or authentication here, at least for now. We do ask that you go easy on the requests, as this is a small project with limited funding for server infrastructure.

If you are looking for data to use offline somewhere, there's also no need to send 200,000 API requests to get all of it: We generate weekly database dumps that contain all Rxivist information, and you're welcome to download them. Not only is that easier for our servers to handle, but it may be much easier for you to process. The PostgreSQL dumps are available for direct download.

If you are planning to use the Rxivist data within a web application, it would be much appreciated if you link to Rxivist on any page that displays a non-trivial amount of data pulled from this API. Also, let us know what you're up to! We'd love to find out how this data is being used.

While we're talking about third-party web applications, we should note here that we can't guarantee this API will be around forever. We plan to keep it running (and updated!) for the foreseeable future, but if you're going to build something using the Rxivist API that requires a strong uptime commitment or a concrete promise of long-term functionality, consider deploying your own version of the software. We'll provide as much guidance as we can.

Etiquette

We want to provide a free API for all, and we don't want to unnecessarily burden developers (or ourselves) with cumbersome API tokens or registration processes. For that to work, we ask that you be polite and try not to do anything that will take the API down or otherwise make it unusable for others. Specifically, we encourage the following polite behaviour:

  • Cache data so you don't request the same information over and over again.
  • Minimize the number of parallel requests being made. If you start noticing increased response times or start getting timeout errors, consider adding pauses between requests.
  • Specify a User-Agent header that properly identifies your script or tool and that provides a means of contacting you via email using "mailto:". For example: GroovyBib/1.1 (https://example.org/GroovyBib/; mailto:GroovyBib@example.org) BasedOnFunkyLib/1.4. This way we can contact you if we see a problem.
  • Report problems and/or ask questions on our issue tracker.

Alas, not all people are polite. And for this reason we reserve the right to impose rate limits and/or to block clients that are disrupting the public service.
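
As a rough illustration of the first three points, here is a minimal Python sketch of a client that identifies itself, caches responses in memory, and pauses between sequential requests. The User-Agent string, the one-second pause, and the names http_get_json and PoliteClient are all assumptions made for this example, not part of Rxivist.

```python
import json
import time
import urllib.request

# Hypothetical identification string; replace with your own tool name and contact address.
USER_AGENT = "MyRxivistScript/0.1 (https://example.org/MyRxivistScript/; mailto:me@example.org)"


def http_get_json(url):
    """Fetch a URL with an identifying User-Agent header and decode the JSON body."""
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


class PoliteClient:
    """Caches responses and pauses between uncached requests (sequential, no parallelism)."""

    def __init__(self, fetch=http_get_json, min_interval=1.0,
                 sleep=time.sleep, clock=time.monotonic):
        self._fetch = fetch
        self._min_interval = min_interval  # seconds between requests; an arbitrary choice
        self._sleep = sleep
        self._clock = clock
        self._cache = {}
        self._last_request = None

    def get(self, url):
        # Serve repeated requests from the cache instead of hitting the API again.
        if url in self._cache:
            return self._cache[url]
        # Wait until at least min_interval seconds have passed since the last request.
        if self._last_request is not None:
            wait = self._min_interval - (self._clock() - self._last_request)
            if wait > 0:
                self._sleep(wait)
        data = self._fetch(url)
        self._last_request = self._clock()
        self._cache[url] = data
        return data
```

Calling PoliteClient().get("https://api.rxivist.org/v1/data/categories") twice would send only one request. The fetch, sleep and clock arguments exist so the class can be exercised without touching the network.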

"Etiquette" section based on the Crossref API documentation, available via Creative Commons license.

How to cite Rxivist

If you use Rxivist data in your research, please cite our paper, which is now available at eLife:

Abdill RJ, Blekhman R. "Tracking the popularity and outcomes of all bioRxiv preprints." eLife (2019). doi: 10.7554/eLife.45133.

Table of contents

  • Preprints: Search https://api.rxivist.org/v1/papers
  • Preprints: Details https://api.rxivist.org/v1/papers/<id>
  • Preprints: Download data https://api.rxivist.org/v1/downloads/<id>
  • Authors: Rankings https://api.rxivist.org/v1/authors
  • Authors: Details https://api.rxivist.org/v1/authors/<id>
  • API details: Category list https://api.rxivist.org/v1/data/categories
  • API details: Total entities https://api.rxivist.org/v1/data/stats
  • API details: Site-wide metric distributions https://api.rxivist.org/v1/data/distributions/<entity>/<metric>

Preprints

Endpoint: Search

Retrieve a list of papers matching the given criteria.
https://api.rxivist.org/v1/papers

Arguments

  • query – A search string to filter results based on their titles, abstracts and authors. Default: "" (empty string)
  • metric – Which field to use when sorting results. Default: twitter
    • Acceptable values: downloads, twitter
  • timeframe – How far back to look for the cumulative results of the chosen metric. ("ytd" and "lastmonth" are only available for the "downloads" metric.) Default: "day" for Twitter metrics, "alltime" for downloads.
    • Acceptable values: alltime, ytd, lastmonth, day, week, month, year
  • category_filter – An array of categories to which the results should be limited. Default: []
    • Acceptable values: animal-behavior-and-cognition, biochemistry, bioengineering, bioinformatics, biophysics, cancer-biology, cell-biology, clinical-trials, developmental-biology, ecology, epidemiology, evolutionary-biology, genetics, genomics, immunology, microbiology, molecular-biology, neuroscience, paleontology, pathology, pharmacology-and-toxicology, physiology, plant-biology, scientific-communication-and-education, synthetic-biology, systems-biology, zoology
  • page – Number of the page of results to retrieve. Shorthand for an offset based on the specified page_size. Default: 0
  • page_size – How many results to return at one time. Default: 20
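
The arguments above can be assembled into a request URL with a few lines of Python. This is a minimal sketch: build_search_url is a hypothetical helper, and category_filter is left out because the way an array is encoded in the query string is not described here.

```python
from urllib.parse import urlencode

PAPERS_URL = "https://api.rxivist.org/v1/papers"
METRICS = {"downloads", "twitter"}
TIMEFRAMES = {"alltime", "ytd", "lastmonth", "day", "week", "month", "year"}
DOWNLOADS_ONLY_TIMEFRAMES = {"ytd", "lastmonth"}


def build_search_url(query=None, metric=None, timeframe=None, page=None, page_size=None):
    """Build a search URL; arguments left as None are omitted so the API defaults apply."""
    if metric is not None and metric not in METRICS:
        raise ValueError(f"unknown metric: {metric!r}")
    if timeframe is not None and timeframe not in TIMEFRAMES:
        raise ValueError(f"unknown timeframe: {timeframe!r}")
    # The default metric is "twitter", so these timeframes need metric="downloads" explicitly.
    if timeframe in DOWNLOADS_ONLY_TIMEFRAMES and metric != "downloads":
        raise ValueError(f"timeframe {timeframe!r} requires metric='downloads'")
    params = {
        "query": query,
        "metric": metric,
        "timeframe": timeframe,
        "page": page,
        "page_size": page_size,
    }
    # Sort by name so the same arguments always produce the same URL (handy for caching).
    cleaned = {name: value for name, value in sorted(params.items()) if value is not None}
    return PAPERS_URL + ("?" + urlencode(cleaned) if cleaned else "")
```

For instance, build_search_url(metric="downloads", page_size=3, timeframe="alltime") produces a URL that asks for three papers ordered by all-time downloads.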

Example

Top 3 downloaded papers, all time

Using the "downloads" metric, get 3 papers ordered by their overall download count.
https://api.rxivist.org/v1/papers?metric=downloads&page_size=3&timeframe=alltime

Response
{
  "query": {
    "text_search": "",
    "timeframe": "alltime",
    "categories": [],
    "metric": "downloads",
    "page_size": 3,
    "current_page": 0,
    "final_page": 11138,
    "total_results": 33416
  },
  "results": [
    {
      "id": 12345,
      "metric": 166288,
      "title": "Example Paper Here: A compelling placeholder",
      "url": "https://api.rxivist.org/v1/papers/12345",
      "biorxiv_url": "https://www.biorxiv.org/content/early/2018/fake_url",
      "doi": "10.1101/00000",
      "category": "cancer-biology",
      "first_posted": "19-09-18",
      "abstract": "This is where the abstract would go.",
      "authors": [
        {
          "id": 1,
          "name": "Richard Abdill"
        },
        {
          "id": 24802,
          "name": "Another Person"
        }
      ]
    },
    # (More responses go here...)
  ]
}
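
The "query" block in the response carries enough information to walk through every page. The sample above is consistent with zero-based page numbers: 33,416 results at 3 per page fill pages 0 through 11138. The following Python sketch relies on that reading; fetch_page is a hypothetical caller-supplied function that takes a page number and returns the decoded JSON for that page.

```python
def last_page(total_results, page_size):
    """Zero-based index of the final page, as inferred from the sample response."""
    return max(0, (total_results - 1) // page_size)


def iter_results(fetch_page):
    """Yield every result, walking pages from 0 up to the "final_page" the API reports."""
    page = 0
    while True:
        response = fetch_page(page)
        yield from response["results"]
        # Stop once the page just read is the last one reported by the API.
        if page >= response["query"]["final_page"]:
            break
        page += 1
```

Because iter_results is a generator, a caller can stop early (for example after the first 100 results) without requesting the remaining pages.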
    

Endpoint: Details

Retrieve data about a single paper and all of its authors. Note: Unlike the author rankings, paper rankings do NOT incorporate the concept of ties.
https://api.rxivist.org/v1/papers/<id>

Arguments

  • id – Rxivist paper ID associated with the paper you want. Default: True

Example

Paper detail request

https://api.rxivist.org/v1/papers/25770

Response
{
  "id": 25770,
  "doi": "10.1101/096727",
  "biorxiv_url": "https://www.biorxiv.org/content/early/2016/12/29/096727",
  "url": "https://api.rxivist.org/v1/papers/25770",
  "title": "Parallel adaptation to higher temperatures in divergent clades of the nematode Pristionchus pacificus",
  "category": "evolutionary-biology",
  "first_posted": "2016-12-29",
  "abstract": "Studying the effect of temperature on fertility is particularly important in the light of ongoing climate change. We need to know if organisms can adapt to higher temperatures and, if so, what are the evolutionary mechanisms behind such adaptation. Such studies have been hampered by the lack different populations of sufficient sizes with which to relate the phenotype of temperature tolerance to the underlying genotypes. Here, we examined temperature adaptation in populations of the nematode Pristionchus pacificus, in which individual strains are able to successfully reproduce at 30°C. Analysis of the frequency of heat tolerant strains in different temperature zones on La Reunion supports that this trait is subject to natural selection. Reconstruction of ancestral states along the phylogeny of highly differentiated P. pacificus clades suggests that heat tolerance evolved multiple times independently. This is further supported by genome wide association studies showing that heat tolerance is a polygenic trait and that different loci are used by individual P. pacificus clades to develop heat tolerance. More precisely, analysis of allele frequencies indicated that most genetic markers that are associated with heat tolerance are only polymorphic in individual clades. While in some P. pacificus clades, parallel evolution of heat tolerance can be explained by ancestral polymorphism or by gene flow across clades, we observe at least one clearly distinct and independent scenario where heat tolerance emerged by de novo mutation. Thus, temperature tolerance evolved at least two times independently in the evolutionary history of this species. Our data suggest that studies of wild populations of P. pacificus will reveal distinct cellular mechanisms driving temperature adaptation.",
  "authors": [
    {
      "id": 1221,
      "name": "Mark Leaver",
      "institution": "Max Planck Institute of Molecular Cell Biology and Genetics;",
      "orcid": "http://orcid.org/0000-0003-2796-4312"
    },
    {
      "id": 1222,
      "name": "Merve Kayhan",
      "institution": "Bilkent University;",
      "orcid": null
    },
    {
      "id": 1223,
      "name": "Angela McGaughran",
      "institution": "Australian National University;",
      "orcid": null
    },
    {
      "id": 1224,
      "name": "Christian Roedelsperger",
      "institution": "Max Planck Institute for Developmental Biology;",
      "orcid": null
    },
    {
      "id": 1225,
      "name": "Anthony A Hyman",
      "institution": "Max Planck Institute of Molecular Cell Biology and genetics",
      "orcid": "http://orcid.org/0000-0003-3664-154X"
    },
    {
      "id": 1226,
      "name": "Ralf Sommer",
      "institution": "Max Planck Institute for Developmental Biology;",
      "orcid": "http://orcid.org/0000-0003-1503-7749"
    }
  ],
  "ranks": {
    "alltime": {
      "rank": 15658,
      "tie": false,
      "downloads": 290
    },
    "ytd": {
      "rank": 22951,
      "tie": false,
      "downloads": 68
    },
    "lastmonth": {
      "rank": 28283,
      "tie": 4,
      "downloads": 65
    },
    "category": {
      "rank": 1500,
      "tie": 4,
      "downloads": 290
    }
  },
  "publication": {
    "journal": "Journal Name Here",
    "doi": "10.1038/1234567"
  }
}
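
When working with the "authors" list, note that "orcid" can be null and that several institution strings in the sample end with a semicolon. A small Python sketch that tidies both follows; summarize_authors is a hypothetical helper, and stripping the trailing semicolon is a choice based only on the sample shown here.

```python
def summarize_authors(paper):
    """Return (name, institution, orcid) tuples with tidied institution strings."""
    summary = []
    for author in paper["authors"]:
        # Some sample institutions end in ";": strip trailing semicolons and spaces.
        institution = (author.get("institution") or "").rstrip("; ").strip() or None
        # "orcid" is null in the JSON when unknown, which arrives here as None.
        summary.append((author["name"], institution, author.get("orcid")))
    return summary
```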
    

Endpoint: Download data

Retrieve monthly download statistics for a single paper.
https://api.rxivist.org/v1/downloads/<id>

Arguments

  • id – Rxivist paper ID associated with the download data you want. Default: True

Example

Paper download data request

https://api.rxivist.org/v1/downloads/12345

Response
{
  "query": {
    "id": 12345
  },
  "results": [
    {
      "month": 6,
      "year": 2018,
      "downloads": 205,
      "views": 259
    },
    {
      "month": 7,
      "year": 2018,
      "downloads": 153,
      "views": 199
    },
    {
      "month": 8,
      "year": 2018,
      "downloads": 88,
      "views": 98
    },
    {
      "month": 9,
      "year": 2018,
      "downloads": 118,
      "views": 159
    },
    {
      "month": 10,
      "year": 2018,
      "downloads": 10,
      "views": 18
    }
  ]
}
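
Since the results are reported per month, a running total has to be computed on the client side. A minimal Python sketch, where cumulative_downloads is a hypothetical helper name:

```python
from itertools import accumulate


def cumulative_downloads(response):
    """Return [((year, month), running_total), ...] in chronological order."""
    # Sort explicitly rather than relying on the order in which the API returns months.
    rows = sorted(response["results"], key=lambda row: (row["year"], row["month"]))
    totals = accumulate(row["downloads"] for row in rows)
    return [((row["year"], row["month"]), total) for row, total in zip(rows, totals)]
```

Applied to the sample response above, the running total goes 205, 358, 446, 564, 574.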
    

Authors

Endpoint: Rankings

Retrieve the top 200 authors by all-time downloads, either site-wide or within a single category.
https://api.rxivist.org/v1/authors

Arguments

  • category – The category to which results should be limited. Omitting one returns results for the entire site. Default: False
    • Acceptable values: animal-behavior-and-cognition, biochemistry, bioengineering, bioinformatics, biophysics, cancer-biology, cell-biology, clinical-trials, developmental-biology, ecology, epidemiology, evolutionary-biology, genetics, genomics, immunology, microbiology, molecular-biology, neuroscience, paleontology, pathology, pharmacology-and-toxicology, physiology, plant-biology, scientific-communication-and-education, synthetic-biology, systems-biology, zoology

Example

Author rankings request, limited to biophysics

https://api.rxivist.org/v1/authors?category=biophysics

Response
{
  "results": [
    {
      "id": 80168,
      "name": "Claudia Cattoglio",
      "rank": 1,
      "downloads": 2504,
      "tie": true
    },
    {
      "id": 47439,
      "name": "Xavier Darzacq",
      "rank": 1,
      "downloads": 2504,
      "tie": true
    },
    {
      "id": 47441,
      "name": "Robert Tjian",
      "rank": 1,
      "downloads": 2504,
      "tie": true
    },
    {
      "id": 19704,
      "name": "Patrick Cramer",
      "rank": 4,
      "downloads": 2389,
      "tie": false
    },
    {
      "id": 80823,
      "name": "Dimitry Tegunov",
      "rank": 5,
      "downloads": 2388,
      "tie": true
    },
    # ...and so on for 200 entries
  ]
}
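
The ranks in this sample look like standard competition ranking: the three authors tied on 2,504 downloads all get rank 1, and the next author gets rank 4 rather than 2. The Python sketch below reproduces that numbering. It is an inference from the example output, not a description of how Rxivist computes ranks, and competition_ranks is a hypothetical name.

```python
def competition_ranks(download_counts):
    """Rank counts already sorted from highest to lowest; equal counts share a rank."""
    ranks = []
    for position, count in enumerate(download_counts, start=1):
        if ranks and count == download_counts[position - 2]:
            ranks.append(ranks[-1])  # same count as the previous entry: share its rank
        else:
            ranks.append(position)   # otherwise the rank is the 1-based position
    return ranks
```

For the five download counts shown above (2504, 2504, 2504, 2389, 2388) this gives ranks 1, 1, 1, 4, 5.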

Endpoint: Details

Retrieve data about a single author.
https://api.rxivist.org/v1/authors/<id>

Arguments

  • id – Rxivist author ID associated with the author in question. Default: True

Example

Author detail request

https://api.rxivist.org/v1/authors/1222

Response
{
  "id": 1222,
  "name": "Merve Kayhan",
  "institution": "Bilkent University;",
  "orcid": null,
  "emails": [
    "merve.kayhan@bilkent.edu.tr"
  ],
  "articles": [
    {
      "id": 25770,
      "doi": "10.1101/096727",
      "biorxiv_url": "https://www.biorxiv.org/content/early/2016/12/29/096727",
      "url": "https://api.rxivist.org/v1/papers/25770",
      "title": "Parallel adaptation to higher temperatures in divergent clades of the nematode Pristionchus pacificus",
      "category": "evolutionary-biology",
      "ranks": {
        "alltime": {
          "rank": 15658,
          "tie": false,
          "downloads": 290
        },
        "ytd": {
          "rank": 22951,
          "tie": false,
          "downloads": 65
        },
        "lastmonth": {
          "rank": 28283,
          "tie": 4,
          "downloads": 65
        },
        "category": {
          "rank": 28283,
          "tie": 4,
          "downloads": 0
        }
      }
    }
  ],
  "ranks": [
    {
      "downloads": 134075,
      "rank": 1,
      "out_of": 104795,
      "tie": false,
      "category": "alltime"
    },
    {
      "downloads": 3126,
      "rank": 1120,
      "out_of": 14752,
      "tie": true,
      "category": "bioinformatics"
    },
    {
      "downloads": 130949,
      "rank": 1,
      "out_of": 21996,
      "tie": false,
      "category": "neuroscience"
    }
  ]
}
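
The top-level "ranks" field is a list with one entry per category, plus an "alltime" entry for the whole site. For lookups it can be convenient to index it by category name, as in this Python sketch. ranks_by_category is a hypothetical helper, and the "fraction" value (rank divided by out_of) is an addition for illustration, not something the API returns.

```python
def ranks_by_category(author):
    """Map each category name to its rank, the size of the field, and rank / out_of."""
    table = {}
    for entry in author["ranks"]:
        table[entry["category"]] = {
            "rank": entry["rank"],
            "out_of": entry["out_of"],
            # Smaller is better: 0.01 means the author is in the top 1% of the category.
            "fraction": entry["rank"] / entry["out_of"],
        }
    return table
```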
    

API details

Endpoint: Category list

A list of all bioRxiv "collections," or categories, currently available via Rxivist.
https://api.rxivist.org/v1/data/categories

Example

https://api.rxivist.org/v1/data/categories

Response
{
  "results": [
    "animal-behavior-and-cognition",
    "biochemistry",
    "bioengineering",
    "bioinformatics",
    "biophysics",
    "cancer-biology",
    "cell-biology",
    "clinical-trials",
    "developmental-biology",
    "ecology",
    "epidemiology",
    "evolutionary-biology",
    "genetics",
    "genomics",
    "immunology",
    "microbiology",
    "molecular-biology",
    "neuroscience",
    "paleontology",
    "pathology",
    "pharmacology-and-toxicology",
    "physiology",
    "plant-biology",
    "scientific-communication-and-education",
    "synthetic-biology",
    "systems-biology",
    "zoology"
  ]
}
    

Endpoint: Total entities

Basic information about how many papers and authors are indexed by Rxivist.
https://api.rxivist.org/v1/data/stats

Example

https://api.rxivist.org/v1/data/stats

Response
{
  "papers_indexed": 34409,
  "authors_indexed": 145540,
  "missing_abstract": 0,
  "missing_date": 3539,
  "outdated_count": {
    "biophysics": 66,
    "cell-biology": 134,
    "developmental-biology": 356,
    "ecology": 3,
    "epidemiology": 1,
    "evolutionary-biology": 87,
    "genetics": 1250,
    "genomics": 292,
    "immunology": 93,
    "microbiology": 407,
    "molecular-biology": 212,
    "neuroscience": 391,
    "pharmacology-and-toxicology": 57,
    "physiology": 56,
    "plant-biology": 68,
    "scientific-communication-and-education": 51
  }
}
    

Endpoint: Site-wide metric distributions

Histogram-style binned data summarizing how many papers or authors have a given metric within the range of each bin. For example, the paper downloads distribution might say that there are 15 papers with between 0 and 9 downloads, 35 papers with between 10 and 19 downloads, 70 papers with between 20 and 29 downloads, and so on. The response also provides averages for the specified metric.
https://api.rxivist.org/v1/data/distributions/<entity>/<metric>

Arguments

  • entity – Which object should be used to group the metric totals. Default: paper
    • Acceptable values: paper, author
  • metric – Which metric to evaluate. Not currently available for Twitter data. Default: downloads
    • Acceptable values: downloads

Example

Paper download distribution

https://api.rxivist.org/v1/data/distributions/paper/downloads

Response
{
  "averages": {
    "mean": 480,
    "median": 253
  },
  "histogram": [
    {
      "bucket_min": 0,
      "count": 4
    },
    {
      "bucket_min": 2,
      "count": 21
    },
    {
      "bucket_min": 4,
      "count": 92
    },
    {
      "bucket_min": 8,
      "count": 128
    },
    {
      "bucket_min": 16,
      "count": 272
    },
    {
      "bucket_min": 32,
      "count": 457
    }
  ]
}
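
Each histogram entry reports only the lower edge of its bucket. The Python sketch below turns the list into explicit ranges. It assumes that a bucket ends just below the next bucket's "bucket_min" and that the last bucket is open-ended; both are inferences from the sample above rather than documented behaviour, and bucket_ranges is a hypothetical name.

```python
def bucket_ranges(histogram):
    """Return (low, high, count) tuples; high is None for the last, open-ended bucket."""
    buckets = sorted(histogram, key=lambda bucket: bucket["bucket_min"])
    ranges = []
    for index, bucket in enumerate(buckets):
        is_last = index == len(buckets) - 1
        # Assumed upper edge: one less than the next bucket's lower edge.
        high = None if is_last else buckets[index + 1]["bucket_min"] - 1
        ranges.append((bucket["bucket_min"], high, bucket["count"]))
    return ranges
```

With the sample response, the first entries come out as (0, 1, 4), (2, 3, 21) and (4, 7, 92): four papers with 0 to 1 downloads, 21 with 2 to 3, and 92 with 4 to 7.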