Data files associated with: Leihy, R. I., Coetzee, B. W. T., Morgan, F., Raymond, B., Shaw, J. D., Terauds, A., Bastmeijer, K. & Chown, S. L. 2020. Antarctica’s wilderness fails to capture continent’s biodiversity. Nature, doi: 10.1038/s41586-020-2506-3.

Data citation: Leihy, R. I., Coetzee, B. W. T., Morgan, F., Raymond, B., Shaw, J. D., Terauds, A., Bastmeijer, K. & Chown, S. L. 2020. Figshare, doi: 10.26180/5c32bf1b041ea.

Overview:
Although the entirety of Antarctica is often considered well protected, increasing and diversifying human activity across the continent demonstrates that this is not the case. The extent of Antarctica’s wilderness remains unknown, even though the protection of wilderness values and of inviolate areas kept free from human interference is enshrined in Antarctic law (Protocol on Environmental Protection to the Antarctic Treaty, Art. 3, para. 1 and Annex V). We assembled a comprehensive set of ground-based human activity records (~2.7 million records; see metadata file), sourced from publications and scientific databases and spanning 1819 to 2018, and used it to identify wilderness areas and their representation of biodiversity.

We assessed the extent of wilderness using six global and regional definitions of wilderness:

Definition 1: Globally Significant Wilderness. Large (≥ 10,000 km^2), mostly intact (≥ 70% of historical habitat extent) areas with low human population densities (≤ 5 people per km^2).
Definition 2: Undeveloped Antarctic Wilderness. Antarctic land areas without human infrastructure.
Definition 3: Visibly Pristine Antarctic Wilderness. Antarctic land areas with no visible evidence of human activity (i.e. excluding areas within visible distance of human infrastructure).
Definition 4: Negligibly Impacted Antarctic Wilderness. Large, contiguous land areas (≥ 10,000 km^2), where the cumulative impacts of historical human visitation are likely to have been negligible or non-existent.
This definition weights the number of independent visitation records in each cell, on a grid of 25 km^2 cells covering Antarctica, by the proportion of ice-free area in the cell and the proportion of ice-free area in the eight adjacent cells (as measures of substrate sensitivity to human activity and of ice-free area connectedness). Large, contiguous areas (≥ 10,000 km^2) of cells where the impact of human visitation is likely to have been negligible (unvisited cells, or cells with a visitation impact value below a threshold) were defined as wilderness areas.
Definition 5: Inviolate Antarctic Wilderness. Large, contiguous land areas (≥ 10,000 km^2) with no historical human visitation records.
Definition 6: Biodiversity-relevant Antarctic Wilderness. Negligibly Impacted Antarctic Wilderness areas that intersect with the ice-free Antarctic Conservation Biogeographic Regions (ACBRs), which are relatively biodiversity-rich compared to permanently ice-covered areas.

Definitions 1–3 and 6 were calculated using a high-resolution Antarctic coastline shapefile, peak summer 2018 Antarctic facilities population estimates, a human infrastructure footprint layer and the ACBR spatial layer (data sources: SCAR (2018), COMNAP (2018), Brooks et al. (2019) and Terauds & Lee (2016)).

Here, we publish the complete historical human activity record (spatial points and metadata) and the R code used to identify Negligibly Impacted Antarctic Wilderness areas and Inviolate Antarctic Wilderness areas. We also publish the wilderness grids (rasters and spatial polygons) of the Negligibly Impacted Antarctic Wilderness and Inviolate Antarctic Wilderness areas.
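The logic of Definitions 4 and 5 (weight visitation by ice-free area, apply a threshold, keep only large contiguous patches) can be illustrated with a minimal Python sketch. This is not the published R code. The weighting formula in `impact_value`, the threshold value, and the use of 4-connectivity to decide which cells are contiguous are all assumptions made for illustration; the description above does not specify them, and the R code in this data record is the authoritative implementation.

```python
from collections import deque

CELL_AREA_KM2 = 25.0      # 5 km x 5 km cells, as in the Negligibly Impacted grid
MIN_AREA_KM2 = 10_000.0   # minimum wilderness area size (Definitions 4 and 5)


def impact_value(n_records, ice_free, neighbour_ice_free):
    """HYPOTHETICAL weighting, not the published formula.

    Scales the number of independent visitation records up as the cell's
    ice-free proportion and its neighbours' ice-free proportion increase.
    """
    return n_records * (1.0 + ice_free + neighbour_ice_free)


def wilderness_mask(impact, land, threshold,
                    min_area_km2=MIN_AREA_KM2, cell_area_km2=CELL_AREA_KM2):
    """Return a grid of 1 (wilderness) and 0 (not wilderness).

    impact: 2-D list of impact values (0 for unvisited cells).
    land:   2-D list of booleans (True for land cells).
    A land cell is a candidate if its impact is below `threshold`.
    Candidates are grouped into 4-connected patches (an assumption), and
    patches smaller than `min_area_km2` are discarded.
    """
    n_rows, n_cols = len(impact), len(impact[0])
    candidate = [[land[r][c] and impact[r][c] < threshold
                  for c in range(n_cols)] for r in range(n_rows)]
    seen = [[False] * n_cols for _ in range(n_rows)]
    out = [[0] * n_cols for _ in range(n_rows)]

    for r in range(n_rows):
        for c in range(n_cols):
            if not candidate[r][c] or seen[r][c]:
                continue
            # Breadth-first search collects one contiguous patch.
            patch = []
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                i, j = queue.popleft()
                patch.append((i, j))
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ni, nj = i + di, j + dj
                    if (0 <= ni < n_rows and 0 <= nj < n_cols
                            and candidate[ni][nj] and not seen[ni][nj]):
                        seen[ni][nj] = True
                        queue.append((ni, nj))
            # Keep the patch only if it is large enough.
            if len(patch) * cell_area_km2 >= min_area_km2:
                for i, j in patch:
                    out[i][j] = 1
    return out
```

With 25 km^2 cells, the 10,000 km^2 minimum corresponds to patches of at least 400 cells. Setting the impact of every visited cell above the threshold reduces the same procedure to Definition 5, where only unvisited cells qualify.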
We include a raster brick containing three grids of Antarctica with 25 km^2 cells, where cell values equal the proportion of each cell that overlaps with land, the proportion of each cell that overlaps with ice-free areas, or the proportion of the eight adjacent cells that overlaps with ice-free areas, rasterised from the high-resolution coastline and exposed rock outcrop shapefiles from the Scientific Committee on Antarctic Research's Antarctic Digital Database. The code also requires the high-resolution coastline of Antarctica, which is available from: Scientific Committee on Antarctic Research (SCAR), Antarctic Digital Database, version 7. www.add.scar.org

File descriptions:
The data records contain two wilderness grids, available in the native raster package format (.grd, R “raster” package, ver. 2.6-7; Hijmans 2017) and netCDF format (.nc), and the same data as spatial polygons outlining the extent of each wilderness area. These data have a projected coordinate system (South Pole Lambert Azimuthal Equal Area; ESRI:102020). The Negligibly Impacted Antarctic Wilderness grid has a 5 km (25 km^2) resolution. The Inviolate Antarctic Wilderness grid has a 50 km (2500 km^2) resolution. Cells with a value of one indicate wilderness cells. Cells with an NA or 0 value are either marine areas or non-wilderness terrestrial areas.

The complete human activity shapefile comprises high-resolution (< ~25 km^2) spatial points for ground-based human activity across Antarctica from 1819 to 2018. These records include scientific sampling sites, traverses, infrastructure and tourism records, but exclude marine data, aerial surveys, remote-sensed data and data from social networking and public image-hosting services (e.g. Twitter, Flickr, Facebook). These data have a projected coordinate system (South Pole Lambert Azimuthal Equal Area; ESRI:102020).
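As a small worked example of the wilderness grid encoding described above (1 = wilderness; NA or 0 = marine or non-wilderness), the following Python sketch converts a grid into a gross area. It assumes the grid has already been read into a nested list, with NA represented as `None` or NaN; the function name `wilderness_area_km2` is hypothetical. Because some cells partly overlap marine areas, this simple count will overestimate the true land area unless the grid is first cropped to a coastline.

```python
import math


def wilderness_area_km2(grid, resolution_km):
    """Gross area of wilderness cells in km^2.

    grid:          2-D list of cell values (1, 0, None or NaN).
    resolution_km: cell edge length, e.g. 5 for the Negligibly Impacted grid
                   or 50 for the Inviolate grid.
    """
    cell_area = resolution_km ** 2
    n_wilderness = 0
    for row in grid:
        for value in row:
            # Treat NA (None or NaN) as non-wilderness.
            if value is None or (isinstance(value, float) and math.isnan(value)):
                continue
            if value == 1:
                n_wilderness += 1
    return n_wilderness * cell_area
```

For instance, three wilderness cells give 75 km^2 at 5 km resolution and 7500 km^2 at 50 km resolution.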
The metadata file (.csv) contains the data source information for the ground-based human activity records, including the number of records per source and the number of unique locality records per source.

The Antarctic terrestrial layers (.grd) are a raster brick containing three grids of Antarctica with 25 km^2 cells, where cell values equal the proportion of each cell that overlaps with land (layer 1), the proportion of each cell that overlaps with ice-free areas (layer 2) or the proportion of the eight adjacent cells that overlaps with ice-free areas (layer 3), rasterised from the high-resolution coastline and exposed rock outcrop shapefiles from the Scientific Committee on Antarctic Research's Antarctic Digital Database (www.add.scar.org). These data are used to weight the number of independent visitation records per site by the ice-free area of each site, and to exclude marine areas from the wilderness grids.

Usage Notes:
Files are available in shapefile (.shp), native raster package (.grd) and netCDF (.nc) data formats. These can be viewed in standard GIS software, including:
ArcGIS: https://www.arcgis.com
QGIS: http://www.qgis.org/
R: https://www.r-project.org/
To view rasters and shapefiles in R, the ‘raster’ (v.2.6-7; Hijmans 2017) and ‘rgdal’ (v.1.3-4; Bivand et al. 2018) packages may be required.

These data have a few important usage caveats. The first is that the human activity data are a conservative record of activity, because some activities have gone unreported, un-digitised or unpublished, or were not captured in the data search. The Inviolate Antarctic Wilderness areas are therefore putatively unvisited based on the available activity records. These data will also need to be updated as human activity continues in the region and more records become publicly available.
Second, presence/absence activity records do not capture the number of people per visit, length of stay, infrastructure installed, activity or transport type, and some records may be duplicated across data sources. Likewise, reporting frequency varied across data sources and activities; for example, some recent traverses used vehicle-mounted GPS devices that automatically recorded localities at a high temporal resolution for presumably transient visits. To reduce bias caused by different sampling frequencies, we divided the data records into multiple-event sources, which describe large collections of human activity, and single-event sources, which describe one field trip or traverse. We also weighted tourism records as more likely to have impacted the wilderness value of sites, because tourism groups tend to be larger than non-tourism groups. We weighted ice-free areas as more likely to be impacted by human activity than glaciated areas, because ice-free areas are the most biodiverse areas in Antarctica and are slow to recover from disturbance.

Finally, some wilderness cells in the Negligibly Impacted Antarctic Wilderness areas grid may overlap in part with marine areas. To calculate the total area of Negligibly Impacted Antarctic Wilderness areas across Antarctica, we cropped the Negligibly Impacted Antarctic Wilderness raster layer by a high-resolution Antarctic coastline shapefile from the Antarctic Digital Database, to exclude the portions of cells that overlap with marine areas.

References
Bivand, R., Keitt, T. & Rowlingson, B. (2018). rgdal: Bindings for the 'Geospatial' Data Abstraction Library. R package version 1.3-4. https://CRAN.R-project.org/package=rgdal
Brooks, S. T. et al. (2019). Our footprint on Antarctica competes with nature for rare ice-free land. Nat. Sustain. 2, 185–190.
Council of Managers of National Antarctic Programs (COMNAP). (2018).
Antarctic facilities operated by National Antarctic Programs in the Antarctic Treaty Area (South of 60° latitude South), version 3.0.1. https://www.comnap.aq, downloaded on 8 August 2018.
Hijmans, R. J. (2017). raster: Geographic Data Analysis and Modeling. R package version 2.6-7. https://CRAN.R-project.org/package=raster
Scientific Committee on Antarctic Research (SCAR). (2018). Antarctic Digital Database, version 7. www.add.scar.org
Terauds, A. & Lee, J. R. (2016). Antarctic biogeography revisited: updating the Antarctic Conservation Biogeographic Regions. Divers. Distrib. 22, 836–840.