Snapshot API Documentation
Our Snapshot API provides the entirety of a Wikimedia project (for example, English Wikipedia or French Wiktionary) as a compressed file. These project files include all articles (pages) with all data fields in NDJSON format, are updated twice monthly, and are provided free with an account.
Use our metadata endpoints to find the project snapshot(s) you need, and use parallel downloading headers or chunking to break up large files. Refer to our Data Dictionary for information about the fields in the payload.
If you require snapshots updated daily and/or access to the new Structured Contents snapshots, please contact our sales team for access.
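The parallel downloading mentioned above relies on HTTP Range requests. As a minimal sketch, assuming you already know the total file size in bytes (for example from the snapshot metadata or the download headers), the hypothetical helper below splits a file into Range header values that can be requested in parallel. The function name and the way the size is obtained are illustrative assumptions, not part of the API.

```python
def split_byte_ranges(total_size: int, parts: int) -> list[str]:
    """Return HTTP Range header values covering total_size bytes in `parts` pieces."""
    if total_size <= 0 or parts <= 0:
        raise ValueError("total_size and parts must be positive")
    # Never create more pieces than there are bytes.
    parts = min(parts, total_size)
    base, extra = divmod(total_size, parts)
    ranges = []
    start = 0
    for i in range(parts):
        # The first `extra` pieces carry one additional byte each.
        length = base + (1 if i < extra else 0)
        end = start + length - 1  # Range end offsets are inclusive
        ranges.append(f"bytes={start}-{end}")
        start = end + 1
    return ranges
```

Each returned value can be sent as the Range header of a separate request, and the responses written back at the matching offsets of the output file.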
Available Snapshots
Returns a list of available project snapshots by namespace. Includes identifiers, file sizes and other relevant metadata.
fields
- array
- Optional
- Allows you to select which fields you receive in your response.
filters
- array
- Optional
- Allows you to filter the response payload.
-
application/json
[ { "identifier": "string", "name": "string", "version": "string", "in_language": { "identifier": "string", "name": "string", "alternate_name": "string", "direction": "string" }, "is_part_of": { "identifier": "string", "code": "string", "name": "string", "url": "string", "in_language": { "identifier": "string", "name": "string", "alternate_name": "string", "direction": "string" } }, "namespace": { "identifier": "number", "name": "string", "description": "string" }, "size": { "unit_text": "string", "value": "number" }, "chunks": "array" } ]
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
Returns a list of available project snapshots by namespace. Includes identifiers, file sizes and other relevant metadata.
fields
- array
filters
- array
application/json
{ "fields": "[\"name\",\"identifier\"]\n", "filters": "[{\"field\":\"namespace.identifier\",\"value\":0}]\n" }
-
application/json
[ { "identifier": "string", "name": "string", "version": "string", "in_language": { "identifier": "string", "name": "string", "alternate_name": "string", "direction": "string" }, "is_part_of": { "identifier": "string", "code": "string", "name": "string", "url": "string", "in_language": { "identifier": "string", "name": "string", "alternate_name": "string", "direction": "string" } }, "namespace": { "identifier": "number", "name": "string", "description": "string" }, "size": { "unit_text": "string", "value": "number" }, "chunks": "array" } ]
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
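A request body with fields and filters, like the example in this section, could be assembled as in the sketch below. It assumes that both parameters are sent as JSON arrays (the parameter type listed above) and that each filter entry is an object with "field" and "value" keys, as the example suggests. The helper name build_request_body is hypothetical.

```python
import json


def build_request_body(fields=None, filters=None) -> str:
    """Serialize optional `fields` and `filters` into a JSON request body.

    `fields` is an iterable of field names; `filters` maps a (possibly dotted)
    field path to the value it should match. Both shapes are assumptions
    based on the example request in this section.
    """
    body = {}
    if fields:
        body["fields"] = list(fields)
    if filters:
        body["filters"] = [
            {"field": field, "value": value} for field, value in filters.items()
        ]
    return json.dumps(body)
```

For instance, build_request_body(["name", "identifier"], {"namespace.identifier": 0}) mirrors the example above, which limits the response to two fields for namespace 0.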
Snapshot Bundle Info
Information on a specific Snapshot bundle. Includes identifiers, file sizes and other relevant metadata.
identifier
- string
- Required
- Snapshot identifier.
fields
- array
- Optional
- Allows you to select which fields you receive in your response.
-
application/json
{ "identifier": "string", "name": "string", "version": "string", "in_language": { "identifier": "string", "name": "string", "alternate_name": "string", "direction": "string" }, "is_part_of": { "identifier": "string", "code": "string", "name": "string", "url": "string", "in_language": { "identifier": "string", "name": "string", "alternate_name": "string", "direction": "string" } }, "namespace": { "identifier": "number", "name": "string", "description": "string" }, "size": { "unit_text": "string", "value": "number" }, "chunks": "array" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
Information on a specific Snapshot bundle. Includes identifiers, file sizes and other relevant metadata.
identifier
- string
- Required
- Snapshot identifier.
fields
- array
application/json
{ "fields": "[\"name\",\"identifier\"]\n" }
-
application/json
{ "identifier": "string", "name": "string", "version": "string", "in_language": { "identifier": "string", "name": "string", "alternate_name": "string", "direction": "string" }, "is_part_of": { "identifier": "string", "code": "string", "name": "string", "url": "string", "in_language": { "identifier": "string", "name": "string", "alternate_name": "string", "direction": "string" } }, "namespace": { "identifier": "number", "name": "string", "description": "string" }, "size": { "unit_text": "string", "value": "number" }, "chunks": "array" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
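To illustrate how the bundle metadata above might be consumed, the sketch below turns one metadata object into a one-line summary. It relies only on keys shown in the response schema (identifier, is_part_of, namespace, size) and tolerates missing keys, since the fields parameter can trim the response. The function name is hypothetical and no real sample values are implied.

```python
def summarize_snapshot(meta: dict) -> str:
    """Build a short human-readable summary of one snapshot metadata object."""
    size = meta.get("size", {})
    project = meta.get("is_part_of", {}).get("identifier", "?")
    namespace = meta.get("namespace", {}).get("identifier", "?")
    summary = (
        f"{meta.get('identifier', '?')}: project={project}, "
        f"namespace={namespace}, "
        f"size={size.get('value', '?')} {size.get('unit_text', '')}"
    )
    # Drop the trailing space left behind when no unit is present.
    return summary.strip()
```

Missing keys are reported as "?" rather than raising, so the same helper works on full and trimmed responses.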
Project Snapshot
Downloadable bundle of all current revisions in a specified project and namespace. Updated daily at 12:00 UTC.
identifier
- string
- Required
- Snapshot identifier.
Range
- string
- Optional
- The Range HTTP request header indicates the part of a document that the server should return.
-
application/gzip
{}
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
Set of headers that describe the snapshot download.
identifier
- string
- Required
- Snapshot identifier.
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
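The optional Range header described above can also be used to resume an interrupted download. As a minimal sketch, assuming the partially downloaded bytes were written sequentially to a local file, the hypothetical helper below builds the header that asks only for the remaining bytes. How the request itself is sent is left out on purpose.

```python
import os


def resume_headers(partial_path: str) -> dict:
    """Return request headers asking for the bytes not yet saved to partial_path."""
    try:
        downloaded = os.path.getsize(partial_path)
    except OSError:
        # The file does not exist yet, so nothing has been downloaded.
        downloaded = 0
    if downloaded == 0:
        return {}  # no Range header: request the whole file
    # An open-ended range means "from this offset to the end of the file".
    return {"Range": f"bytes={downloaded}-"}
```

When the header is present, append the response body to the existing file instead of overwriting it.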
Available Snapshot Chunks
Returns a list of available chunks for a specific snapshot. Includes chunk identifiers and other relevant metadata.
snapshot_identifier
- string
- Required
- Snapshot identifier.
fields
- array
- Optional
- Allows you to select which fields you receive in your response.
filters
- array
- Optional
- Allows you to filter the response payload.
-
application/json
{ "identifier": "string", "name": "string", "version": "string", "in_language": { "identifier": "string", "name": "string", "alternate_name": "string", "direction": "string" }, "is_part_of": { "identifier": "string", "code": "string", "name": "string", "url": "string", "in_language": { "identifier": "string", "name": "string", "alternate_name": "string", "direction": "string" } }, "namespace": { "identifier": "number", "name": "string", "description": "string" }, "size": { "unit_text": "string", "value": "number" }, "chunks": "array" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
Returns a list of available chunks for a specific snapshot. Includes chunk identifiers and other relevant metadata.
snapshot_identifier
- string
- Required
- Snapshot identifier.
fields
- array
filters
- array
application/json
{ "fields": "[\"name\",\"identifier\"]\n", "filters": "[{\"field\":\"identifier\",\"value\":\"hiwiki_namespace_0_chunk_0\"}]\n" }
-
application/json
{ "identifier": "string", "name": "string", "version": "string", "in_language": { "identifier": "string", "name": "string", "alternate_name": "string", "direction": "string" }, "is_part_of": { "identifier": "string", "code": "string", "name": "string", "url": "string", "in_language": { "identifier": "string", "name": "string", "alternate_name": "string", "direction": "string" } }, "namespace": { "identifier": "number", "name": "string", "description": "string" }, "size": { "unit_text": "string", "value": "number" }, "chunks": "array" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
Snapshot Chunk Info
Information on a specific chunk of a snapshot. Includes chunk identifier, size, and other relevant metadata.
snapshot_identifier
- string
- Required
- Snapshot identifier.
identifier
- string
- Required
- Chunk identifier or index.
fields
- array
- Optional
- Allows you to select which fields you receive in your response.
-
application/json
{ "identifier": "string", "name": "string", "version": "string", "in_language": { "identifier": "string", "name": "string", "alternate_name": "string", "direction": "string" }, "is_part_of": { "identifier": "string", "code": "string", "name": "string", "url": "string", "in_language": { "identifier": "string", "name": "string", "alternate_name": "string", "direction": "string" } }, "namespace": { "identifier": "number", "name": "string", "description": "string" }, "size": { "unit_text": "string", "value": "number" }, "chunks": "array" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
Information on a specific chunk of a snapshot. Includes chunk identifier, size, and other relevant metadata.
snapshot_identifier
- string
- Required
- Snapshot identifier.
identifier
- string
- Required
- Chunk identifier or index.
fields
- array
application/json
{ "fields": "[\"name\",\"identifier\"]\n" }
-
application/json
{ "identifier": "string", "name": "string", "version": "string", "in_language": { "identifier": "string", "name": "string", "alternate_name": "string", "direction": "string" }, "is_part_of": { "identifier": "string", "code": "string", "name": "string", "url": "string", "in_language": { "identifier": "string", "name": "string", "alternate_name": "string", "direction": "string" } }, "namespace": { "identifier": "number", "name": "string", "description": "string" }, "size": { "unit_text": "string", "value": "number" }, "chunks": "array" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
Download Snapshot Chunk
Downloadable bundle of a specific chunk from a snapshot. Returns a gzip-compressed tar file (tar.gz).
snapshot_identifier
- string
- Required
- Snapshot identifier.
identifier
- string
- Required
- Chunk identifier or index.
Range
- string
- Optional
- The Range HTTP request header indicates the part of a document that the server should return.
-
application/gzip
{}
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
Set of headers that describe the snapshot chunk download.
snapshot_identifier
- string
- Required
- Snapshot identifier.
identifier
- string
- Required
- Chunk identifier or index.
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
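Once a chunk has been downloaded, its contents can be read without unpacking the archive to disk. The sketch below assumes that the gzip-compressed tar file contains one or more NDJSON files (one JSON object per line), which matches the format described in the introduction but is not spelled out for chunks. The function name is hypothetical.

```python
import json
import tarfile


def iter_ndjson_records(fileobj):
    """Yield one parsed JSON object per non-empty line of every file in a tar.gz."""
    with tarfile.open(fileobj=fileobj, mode="r:gz") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories and other special entries
            extracted = archive.extractfile(member)
            for raw_line in extracted:
                line = raw_line.strip()
                if line:
                    yield json.loads(line)
```

Pass it a file opened in binary mode, for example open(path, "rb"), and iterate over the records one at a time so that large chunks never need to fit in memory.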
Available Structured Contents Snapshots (BETA)
Returns a list of available project structured contents snapshots.
fields
- array
- Optional
- Allows you to select which fields you receive in your response.
filters
- array
- Optional
- Allows you to filter the response payload.
-
application/json
[ { "identifier": "string", "name": "string", "version": "string", "in_language": { "identifier": "string", "name": "string", "alternate_name": "string", "direction": "string" }, "is_part_of": { "identifier": "string", "code": "string", "name": "string", "url": "string", "in_language": { "identifier": "string", "name": "string", "alternate_name": "string", "direction": "string" } }, "namespace": { "identifier": "number", "name": "string", "description": "string" }, "size": { "unit_text": "string", "value": "number" }, "chunks": "array" } ]
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
Returns a list of available project structured contents snapshots.
fields
- array
filters
- array
application/json
{ "fields": "[\"name\",\"identifier\"]\n", "filters": "[{\"field\":\"is_part_of.identifier\", \"value\":\"enwiki\"}]\n" }
-
application/json
[ { "identifier": "string", "name": "string", "version": "string", "in_language": { "identifier": "string", "name": "string", "alternate_name": "string", "direction": "string" }, "is_part_of": { "identifier": "string", "code": "string", "name": "string", "url": "string", "in_language": { "identifier": "string", "name": "string", "alternate_name": "string", "direction": "string" } }, "namespace": { "identifier": "number", "name": "string", "description": "string" }, "size": { "unit_text": "string", "value": "number" }, "chunks": "array" } ]
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
Structured Contents Snapshot Bundle Info (BETA)
Information on a specific structured contents snapshot bundle.
identifier
- string
- Required
- Structured Contents Snapshot identifier.
fields
- array
- Optional
- Allows you to select which fields you receive in your response.
-
application/json
{ "identifier": "string", "name": "string", "version": "string", "in_language": { "identifier": "string", "name": "string", "alternate_name": "string", "direction": "string" }, "is_part_of": { "identifier": "string", "code": "string", "name": "string", "url": "string", "in_language": { "identifier": "string", "name": "string", "alternate_name": "string", "direction": "string" } }, "namespace": { "identifier": "number", "name": "string", "description": "string" }, "size": { "unit_text": "string", "value": "number" }, "chunks": "array" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
Information on a specific structured contents snapshot bundle.
identifier
- string
- Required
- Structured Contents Snapshot identifier.
fields
- array
application/json
{ "fields": "[\"name\",\"identifier\"]\n" }
-
application/json
{ "identifier": "string", "name": "string", "version": "string", "in_language": { "identifier": "string", "name": "string", "alternate_name": "string", "direction": "string" }, "is_part_of": { "identifier": "string", "code": "string", "name": "string", "url": "string", "in_language": { "identifier": "string", "name": "string", "alternate_name": "string", "direction": "string" } }, "namespace": { "identifier": "number", "name": "string", "description": "string" }, "size": { "unit_text": "string", "value": "number" }, "chunks": "array" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
Project Structured Contents Snapshot (BETA)
Downloadable bundle of structured contents of all current revisions in a specified project and namespace. Updated daily at 12:00 UTC.
identifier
- string
- Required
- Structured Contents Snapshot identifier.
Range
- string
- Optional
- The Range HTTP request header indicates the part of a document that the server should return.
-
application/gzip
{}
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
Set of headers that describe the structured contents snapshot download.
identifier
- string
- Required
- Structured Contents Snapshot identifier.
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }
-
application/json
{ "message": "string", "status": "integer" }