Realtime API Documentation
Receive real-time event updates from all supported Wikimedia projects, either through the streaming (firehose) endpoint or through batch file endpoints, whose files are generated hourly with the changes made throughout the day.
Batch files return NDJSON in a tarball (.tar.gz). Streaming supports server-sent events (SSE) by default, or NDJSON when you pass the Accept: application/x-ndjson header.
Event types are update, delete, and visibility-change:
- An update event is sent when an article is created, its content is updated, or its name or namespace is changed.
- A delete event is sent when an article has been deleted.
- A visibility-change event is sent when the visibility of an article’s editor, comment, or content is changed by community volunteers.
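As an illustration of handling the three event types, here is a minimal sketch that decodes NDJSON lines and tallies events by type. It assumes each line is one JSON article object whose type sits under event.type, as in the streaming response schema; the function names and the sample lines are hypothetical, not part of the API.

```python
import json
from collections import Counter
from typing import Iterable, Iterator

KNOWN_EVENT_TYPES = {"update", "delete", "visibility-change"}


def iter_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded article object per non-empty NDJSON line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def count_event_types(lines: Iterable[str]) -> Counter:
    """Tally events by type; missing or unrecognised types go under 'unknown'."""
    counts: Counter = Counter()
    for article in iter_events(lines):
        event_type = article.get("event", {}).get("type")
        key = event_type if event_type in KNOWN_EVENT_TYPES else "unknown"
        counts[key] += 1
    return counts


# Hypothetical sample lines, trimmed to the fields used here.
sample = [
    '{"name": "Example A", "event": {"type": "update"}}',
    "",
    '{"name": "Example B", "event": {"type": "delete"}}',
    '{"name": "Example A", "event": {"type": "update"}}',
]
print(count_event_types(sample))
```

The same functions would work on any iterable of text lines, for example the lines of a streamed response body or of a file extracted from an hourly batch.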
For access to Realtime APIs, contact our sales team.
Blog: Realtime API Parallel Connections and Restart Support
Article Updates (Streaming)
Returns a stream of article updates, deletions, and visibility changes across all supported projects. The event type is found in the article.event.type field. Possible values are update, delete, and visibility-change.
since
- string
- Optional
- A timestamp in RFC 3339 format (e.g. '2006-01-02T15:04:05Z') that specifies the start time for the data you want to receive.
fields
- array
- Optional
- A list of fields to receive in your response (e.g. version.* will return all version object fields).
filters
- array
- Optional
- Specify how you want to filter your data.
parts
- array
- Optional
- Used when opening parallel connections to the Realtime API. With parts, each parallel connection can target a subset of partitions. At most 10 parallel connections are allowed, so parts can take the values 0 through 9. Each part represents one tenth of the partitions: part 0 corresponds to partitions 0 through 4, part 1 corresponds to partitions 5 through 9, and so on.
offsets
- object
- Optional
- Use the offsets parameter to resume from a specific point in a partition's event stream when reconnecting. Pass a map of partition:offset pairs. Each offset is the point from which the Realtime API should start sending events for that partition. If the offsets map includes a partition not represented in the parts parameter, that entry is ignored. If the offsets map does not include a partition that is represented in the parts parameter, events from that partition are delivered in live mode (as they happen).
since_per_partition
- object
- Optional
- Use the since_per_partition parameter when reconnecting to the Realtime API. Pass a map of partition:timestamp pairs, with each timestamp in RFC 3339 format. Each timestamp is the point from which the Realtime API should start sending events for that partition. If the since_per_partition map includes a partition not represented in the parts parameter, that entry is ignored. If the since_per_partition map does not include a partition that is represented in the parts parameter, events from that partition are delivered in live mode (as they happen).
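The relationship between parts and partitions described above can be sketched as follows. This assumes five partitions per part, extrapolated from the two examples given (part 0 covers partitions 0 through 4, part 1 covers 5 through 9); the function names and the round-robin split are illustrative choices, not something the API prescribes.

```python
PARTITIONS_PER_PART = 5  # assumption drawn from the examples for parts 0 and 1
MAX_PARTS = 10           # parts take the values 0 through 9


def partitions_for_part(part: int) -> range:
    """Return the partition numbers covered by one part."""
    if not 0 <= part < MAX_PARTS:
        raise ValueError(f"part must be between 0 and {MAX_PARTS - 1}")
    start = part * PARTITIONS_PER_PART
    return range(start, start + PARTITIONS_PER_PART)


def split_parts(connections: int) -> list[list[int]]:
    """Distribute parts 0-9 round-robin over the given number of connections."""
    if not 1 <= connections <= MAX_PARTS:
        raise ValueError(f"connections must be between 1 and {MAX_PARTS}")
    return [list(range(i, MAX_PARTS, connections)) for i in range(connections)]


print(list(partitions_for_part(1)))
print(split_parts(3))
```

Each inner list returned by split_parts could be sent as the parts value of one connection, so that the connections together cover every part exactly once.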
-
text/event-stream
{ "event": { "identifier": "string", "type": "string", "date_created": "string", "date_published": "string", "partition": "integer", "offset": "integer" }, "additional_entities": "array", "article_body": { "html": "string", "wikitext": "string" }, "categories": "array", "date_modified": "string", "identifier": "integer", "in_language": { "identifier": "string", "name": "string" }, "is_part_of": { "date_modified": "string", "identifier": "string", "in_language": { "identifier": "string", "name": "string" }, "name": "string", "size": { "unit_text": "string", "value": "number" }, "url": "string", "version": "string" }, "license": "array", "main_entity": { "aspects": "array", "identifier": "string", "url": "string" }, "name": "string", "abstract": "string", "namespace": { "identifier": "integer", "name": "string" }, "protection": "array", "redirects": "array", "templates": "array", "url": "string", "version": { "comment": "string", "editor": { "date_started": "string", "edit_count": "integer", "groups": "array", "identifier": "integer", "is_anonymous": "boolean", "is_bot": "boolean", "name": "string" }, "identifier": "integer", "is_flagged_stable": "boolean", "is_minor_edit": "boolean", "is_breaking_news": "boolean", "noindex": "boolean", "scores": { "revertrisk": { "prediction": "boolean", "probability": "object" }, "referencerisk": { "reference_risk_score": "number" }, "referenceneed": { "reference_need_score": "number" } }, "maintenance_tags": { "citation_needed_count": "integer", "pov_count": "integer", "clarification_needed_count": "integer", "update_count": "integer" }, "tags": "array" }, "visibility": { "comment": "boolean", "text": "boolean", "user": "boolean" } }
-
application/json
{ "message": "string", "status": "integer" }
Example request body (application/json):
{ "since": "2006-01-02T15:04:05Z", "fields": [ "name", "identifier" ], "filters": [ { "field": "in_language.identifier", "value": "en" } ], "parts": [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 ], "offsets": { "0": 3614782, "4": 3593806, "8": 3588693 }, "since_per_partition": { "1": "2023-06-05T12:00:00Z", "2": "2023-06-05T12:00:00Z" } }
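One way to produce the offsets map for a reconnect is to remember the event.partition and event.offset values of the events already received. The sketch below does that under stated assumptions: the OffsetTracker class is hypothetical, and it resends the last offset seen, so the most recent event on each partition may be delivered again after reconnecting (whether the stream resumes at or after the given offset is not specified here).

```python
import json
from typing import Iterable, Optional


class OffsetTracker:
    """Remember the most recent offset seen on each partition."""

    def __init__(self) -> None:
        self._latest: dict[int, int] = {}

    def record(self, article: dict) -> None:
        """Store event.offset under event.partition for one received article."""
        event = article["event"]
        self._latest[event["partition"]] = event["offset"]

    def reconnect_body(self, parts: Iterable[int], fields: Optional[list] = None) -> dict:
        """Build a request body that resumes every tracked partition."""
        body: dict = {
            "parts": list(parts),
            # JSON object keys are strings, as in the example body above.
            "offsets": {str(p): o for p, o in sorted(self._latest.items())},
        }
        if fields:
            body["fields"] = list(fields)
        return body


# Hypothetical events, trimmed to the fields used here.
tracker = OffsetTracker()
for received in (
    {"event": {"partition": 0, "offset": 3614782}},
    {"event": {"partition": 0, "offset": 3614783}},
    {"event": {"partition": 7, "offset": 12}},
):
    tracker.record(received)

print(json.dumps(tracker.reconnect_body(parts=[0, 1])))
```

Because a resent offset can repeat an event, a consumer built this way would likely want to deduplicate, for instance on event.identifier.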
Available Hourly Batches
Returns a list of available Realtime (Batch) bundles by date and hour (00, 01, …, 23). Includes identifiers, file sizes and other relevant metadata.
date
- string
- Required
hour
- string
- Required
fields
- array
- Optional
- Select which fields to receive in your response.
filters
- array
- Optional
- Select which projects and languages to receive in your response.
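The date and hour parameters together identify one hourly slot. A minimal sketch of deriving them from a timestamp follows. It assumes the hour is the zero-padded UTC hour (00 through 23) and that the date is written as YYYY-MM-DD; the date format is not stated in this section, so treat it as hypothetical, along with the function names.

```python
from datetime import datetime, timedelta, timezone


def batch_slot(moment: datetime) -> tuple[str, str]:
    """Return (date, hour) strings for the UTC hour containing `moment`.

    `moment` should be timezone-aware; naive values are read as local time.
    """
    utc = moment.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%d"), utc.strftime("%H")


def slots_between(start: datetime, end: datetime) -> list[tuple[str, str]]:
    """List the (date, hour) slots from the hour of `start` up to, not including, `end`."""
    current = start.astimezone(timezone.utc).replace(minute=0, second=0, microsecond=0)
    end_utc = end.astimezone(timezone.utc)
    slots = []
    while current < end_utc:
        slots.append(batch_slot(current))
        current += timedelta(hours=1)
    return slots


start = datetime(2023, 6, 5, 22, 15, tzinfo=timezone.utc)
end = datetime(2023, 6, 6, 1, 0, tzinfo=timezone.utc)
for date, hour in slots_between(start, end):
    print(date, hour)
```

Iterating over slots this way is one option for backfilling several hours of batches after downtime.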
-
application/json
[ { "identifier": "string", "name": "string", "version": "string", "in_language": { "identifier": "string", "name": "string", "alternate_name": "string", "direction": "string" }, "is_part_of": { "identifier": "string", "code": "string", "name": "string", "url": "string", "in_language": { "identifier": "string", "name": "string", "alternate_name": "string", "direction": "string" } }, "namespace": { "identifier": "number", "name": "string", "description": "string" }, "size": { "unit_text": "string", "value": "number" } } ]
-
application/json
{ "message": "string", "status": "integer" }
Example request body (application/json):
{ "fields": [ "name", "identifier" ], "filters": [ { "field": "namespace.identifier", "value": 0 } ] }
Single Hourly Metadata
Returns information on a specific hourly batch, including its identifier, file size, and other relevant metadata.
date
- string
- Required
hour
- string
- Required
identifier
- string
- Required
- Batch identifier.
fields
- array
- Optional
- Select which fields to receive in your response.
-
application/json
{ "identifier": "string", "name": "string", "version": "string", "in_language": { "identifier": "string", "name": "string", "alternate_name": "string", "direction": "string" }, "is_part_of": { "identifier": "string", "code": "string", "name": "string", "url": "string", "in_language": { "identifier": "string", "name": "string", "alternate_name": "string", "direction": "string" } }, "namespace": { "identifier": "number", "name": "string", "description": "string" }, "size": { "unit_text": "string", "value": "number" } }
-
application/json
{ "message": "string", "status": "integer" }
Example request body (application/json):
{ "fields": [ "name", "identifier" ] }
Project Updates (Batch)
Downloadable bundle of updated articles by project, namespace, date, and hour. Generated hourly starting at 00:00 UTC each day.
date
- string
- Required
hour
- string
- Required
identifier
- string
- Required
- Batch identifier.
Range
- string
- Optional
- The Range HTTP request header indicates the part of a document that the server should return.
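The Range header makes it possible to fetch a large bundle in pieces or to resume an interrupted download. The sketch below only builds header values in the standard HTTP bytes=start-end form; it assumes the total size in bytes is already known from elsewhere, and the function names are hypothetical.

```python
def byte_ranges(total_size: int, chunk_size: int) -> list[str]:
    """Split a file of `total_size` bytes into Range header values."""
    if total_size < 0 or chunk_size <= 0:
        raise ValueError("total_size must be >= 0 and chunk_size must be > 0")
    ranges = []
    for start in range(0, total_size, chunk_size):
        # Range end positions are inclusive, hence the -1.
        end = min(start + chunk_size, total_size) - 1
        ranges.append(f"bytes={start}-{end}")
    return ranges


def resume_range(bytes_on_disk: int) -> str:
    """Range value asking for everything after the bytes already saved."""
    if bytes_on_disk < 0:
        raise ValueError("bytes_on_disk must be >= 0")
    return f"bytes={bytes_on_disk}-"


print(byte_ranges(10, 4))
print(resume_range(1024))
```

Each returned string would be sent as the value of the Range request header on a separate download request, with the responses written back in order.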
-
application/gzip
{}
-
application/json
{ "message": "string", "status": "integer" }
Set of headers that describe the hourly download.
date
- string
- Required
hour
- string
- Required
identifier
- string
- Required
- Batch identifier.
-
application/json
{ "message": "string", "status": "integer" }