Sustainability of Digital Formats: Planning for Library of Congress Collections |


| Full name | YAML Ain't Markup Language (YAML) |
|---|---|
| Description |
YAML is a human-readable data serialization language that can be used with most programming languages. The specification calls it "both a text format and a method for presenting any native data structure in this format." YAML was developed in the early 2000s as a simpler alternative to XML. Common use cases for YAML mentioned in the specification include "configuration files, log files, interprocess messaging, cross-language data sharing, object persistence and debugging of complex data structures." YAML syntax borrows features from other programming and markup languages, including C, HTML, Perl, Python, and XML. Unlike markup languages that use symbols to structure a document, YAML minimizes structural characters by using Python-style indentation instead. A YAML document can begin with three hyphens (---) and end with three periods (...); these markers are optional for a single document, and a YAML stream (i.e., file) can contain multiple documents. YAML streams use printable Unicode characters. YAML represents any native data structure using nodes with one of three kinds of content: mappings (hashes/dictionaries), sequences (arrays/lists), and scalars (strings/numbers). For key/value pairs, the key is separated from its value with a colon and a space (: ). Entries in a sequence begin with a hyphen and a space (- ). YAML tags indicate data types, using a single exclamation mark (!) for local/user-defined data types and two exclamation marks (!!) for universal data types. Comments begin with a pound/hash sign (#) and can appear after a document value or on their own line. As of version 1.2, YAML is a strict superset of JSON, meaning that valid JSON is also valid YAML. Although YAML's design goals include being easily readable by humans, portable between programming languages, and easy to implement and use, Wikipedia lists some criticisms of YAML that run counter to these goals. Section 4 of RFC 9512 lists security considerations for the format.
Such complaints and concerns have led to the development of YAML alternatives, such as YAML parsers that only validate a restricted subset of the YAML specification. |
| Production phase | Can be used as initial, middle, or final-state format. |
| Relationship to other formats | |
| Has subtype | JSON, JSON (JavaScript Object Notation). As of YAML 1.2, YAML is a strict superset of JSON. |
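The syntax rules in the Description above (colon and space for key/value pairs, hyphen and space for sequence entries, indentation for nesting, and the optional --- and ... document markers) can be illustrated with a minimal sketch. This is not a YAML library: the function name `dump_document` and its helpers are hypothetical, only Python's standard library is used, and it assumes JSON-compatible input (dicts, lists, strings, numbers, booleans, None). Scalars are written with `json.dumps`, relying on the statement above that valid JSON is also valid YAML 1.2.

```python
import json
import re

# Keys that look like simple identifiers are written plain; anything else is quoted.
_PLAIN_KEY = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
# Words some YAML parsers read as booleans or null; quote them to keep them strings.
_RESERVED = {"true", "false", "null", "yes", "no", "on", "off", "y", "n"}


def _scalar(value):
    # JSON scalars (and empty {} / []) are used as-is, assuming JSON is valid YAML 1.2.
    return json.dumps(value, ensure_ascii=False)


def _key(key):
    # Assumption of this sketch: every key is emitted as a string.
    text = str(key)
    if _PLAIN_KEY.fullmatch(text) and text.lower() not in _RESERVED:
        return text
    return json.dumps(text, ensure_ascii=False)


def _lines(node, indent, level):
    pad = " " * (indent * level)
    if isinstance(node, dict) and node:
        for key, value in node.items():
            if isinstance(value, (dict, list)) and value:
                yield f"{pad}{_key(key)}:"            # nested node goes on the next lines
                yield from _lines(value, indent, level + 1)
            else:
                yield f"{pad}{_key(key)}: {_scalar(value)}"  # "key: value"
    elif isinstance(node, list) and node:
        for item in node:
            if isinstance(item, (dict, list)) and item:
                yield f"{pad}-"
                yield from _lines(item, indent, level + 1)
            else:
                yield f"{pad}- {_scalar(item)}"       # "- entry"
    else:
        yield f"{pad}{_scalar(node)}"


def dump_document(data, indent=2):
    """Return one YAML document wrapped in the --- and ... markers."""
    return "\n".join(["---", *_lines(data, indent, 0), "..."]) + "\n"


if __name__ == "__main__":
    print(dump_document({"title": "YAML", "formats": ["yaml", "yml"]}), end="")
```

The output always double-quotes string values, which is valid but more verbose than typical hand-written YAML. A production system would use an established YAML library instead.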

| LC experience or existing holdings |
YAML is one of the output formats provided by the loc.gov API, which provides structured data about Library of Congress collections. The following parameter can be added to an endpoint to view the YAML: ?fo=yaml (example: https://aj.sunback.homes/item/2012592226/?fo=yaml). Refer to the API documentation for more information: https://aj.sunback.homes/apis/json-and-yaml/. The Library of Congress has a small number of YAML files in its collections. |
|---|---|
| LC preference | See the Library of Congress Recommended Formats Statement for format preferences for textual works and datasets. |
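As a small illustration of the ?fo=yaml parameter described above, the following sketch adds or replaces the `fo` query parameter on an endpoint URL. The function name `with_yaml_format` is hypothetical, the URL is the example given in the text, and no request is actually sent.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_yaml_format(url):
    """Return the URL with its `fo` query parameter set to `yaml`."""
    parts = urlsplit(url)
    # Keep every existing parameter except a previous `fo` value.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True) if k != "fo"]
    query.append(("fo", "yaml"))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    print(with_yaml_format("https://aj.sunback.homes/item/2012592226/"))
```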

| Disclosure | Open standard. Since 2020, it has been maintained by the YAML Language Development Team. |
|---|---|
| Documentation | YAML specifications are available on https://yaml.org/. The most recent specification is YAML 1.2, Revision 1.2.2, released October 1, 2021. Development of the YAML language takes place in the public repository https://github.com/yaml/yaml-spec. |
| Adoption |
YAML has been widely adopted for configuration files and data exchange between systems. The YAML website has an extensive list of frameworks and tools that have been developed to integrate YAML into other languages, including Python, C/C++, Perl, Ruby, Java, and JavaScript. According to an IBM article from 2023, YAML is used in DevOps to define infrastructure as code (IaC), create deployment files, and define continuous integration and continuous delivery (CI/CD) pipelines. Popular DevOps tools that use YAML include Ansible, Kubernetes, GitHub, and Docker Compose. |
| Licensing and patents | None. |
| Transparency | YAML files are human readable in a text editor and use Unicode printable characters. |
| Self-documentation |
As with other programming or markup languages, YAML files can be documented through comments. YAML tags are used to indicate data type metadata. The specification notes that tags "may provide additional information such as the set of allowed content values for validation, a mechanism for tag resolution or any other data that is applicable to all of the tag's nodes." |
| Accessibility features | Other than being a structured text file, YAML has no specific attributes to support accessibility. Comments welcome. |
| External dependencies | None. |
| Technical protection considerations | None. |

| Text | |
|---|---|
| Normal rendering | YAML was developed to be both human- and machine-readable. YAML files can be viewed, edited, and printed using plain text editors. YAML streams are presented as a series of Unicode characters. |
| Integrity of document structure | YAML streams are structured using indentation and indicator characters, such as hyphens, colons, question marks, exclamation points, etc. |
| Integrity of layout and display |
Per the format's name, YAML is not a markup language and isn't designed to mark up the elements of a textual document. Plain text editors can render YAML streams on screen with data structure evident from the use of indentation and indicator characters. Because YAML uses indentation to indicate structure, it can be difficult for humans to navigate large YAML streams. According to the specification, when a YAML stream is created by an application (i.e., dumped), the application's YAML processor introduces presentation details, "such as the choice of node styles, how to format scalar content, the amount of indentation, which tag handles to use, the node tags to leave unspecified, the set of directives to provide and possibly even what comments to add." |
| Functionality beyond normal rendering | Supports comments. |
| Dataset | |
| Normal functionality |
From the specification: "YAML represents any native data structure using three node kinds: sequence - an ordered series of entries; mapping - an unordered association of unique keys to values; and scalar - any datum with opaque structure presentable as a series of Unicode characters." "Each YAML node requires, in addition to its kind and content, a tag specifying its data type. Type specifiers are either global URIs or are local in scope to a single application." |
| Support for software interfaces (APIs, etc.) | YAML is widely used to store and transmit data. The YAML website has an extensive list of frameworks and tools that have been developed to integrate YAML into other languages, including Python, C/C++, Perl, Ruby, Java, JavaScript, and more. |
| Data documentation (quality, provenance, etc.) | See Self-documentation above. |
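The three node kinds and the per-node tag quoted from the specification above can be sketched as a mapping from native Python values to a (kind, tag) pair. The function name `describe_node` is hypothetical. The tag URIs use the `tag:yaml.org,2002:` prefix that the !! shorthand stands for; treating every other value as a string is a simplifying assumption of this sketch, not a rule from the specification.

```python
CORE = "tag:yaml.org,2002:"  # prefix that the "!!" tag shorthand expands to


def describe_node(value):
    """Return (node kind, global tag URI) for a native Python value."""
    if isinstance(value, dict):
        return ("mapping", CORE + "map")
    if isinstance(value, (list, tuple)):
        return ("sequence", CORE + "seq")
    if value is None:
        return ("scalar", CORE + "null")
    if isinstance(value, bool):  # checked before int: bool is a subclass of int
        return ("scalar", CORE + "bool")
    if isinstance(value, int):
        return ("scalar", CORE + "int")
    if isinstance(value, float):
        return ("scalar", CORE + "float")
    # Assumption: anything else is presented as a string scalar.
    return ("scalar", CORE + "str")


if __name__ == "__main__":
    for sample in ({"a": 1}, [1, 2], "text", 3, 1.5, True, None):
        print(repr(sample), describe_node(sample))
```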

| Tag | Value | Note |
|---|---|---|
| Filename extension | yaml, yml | See registration at IANA. The extension .yaml is preferred, though .yml is still used. |
| Internet Media Type | application/yaml | See registration at IANA. The following alias names are deprecated: application/x-yaml, text/yaml, and text/x-yaml. These names are used but are not registered. |
| Magic numbers | See note. | RFC 9512 indicates that there is no magic number to identify YAML files. |
| Uniform Type Identifier (Mac OS) | public.yaml | See Apple's Uniform Type Identifiers. |
| Other | See note. | NARA File Format Preservation Plan ID has no corresponding entry for YAML as of May 2025. |
| Pronom PUID | fmt/818 | See https://www.nationalarchives.gov.uk/PRONOM/fmt/818. |
| Wikidata Title ID | Q281876 | See https://www.wikidata.org/wiki/Q281876. |
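Because there is no magic number for YAML, tools often fall back on the filename extension or the declared media type listed in the table above. The following is a minimal sketch of that fallback; the names `has_yaml_extension` and `normalize_media_type` are hypothetical, and matching on extension alone is a heuristic that says nothing about whether the content is well-formed YAML.

```python
from pathlib import PurePosixPath

YAML_EXTENSIONS = {".yaml", ".yml"}
REGISTERED_MEDIA_TYPE = "application/yaml"
# Alias names described above as deprecated and unregistered.
DEPRECATED_ALIASES = {"application/x-yaml", "text/yaml", "text/x-yaml"}


def has_yaml_extension(filename):
    """Return True if the filename ends in .yaml or .yml (case-insensitive)."""
    return PurePosixPath(filename).suffix.lower() in YAML_EXTENSIONS


def normalize_media_type(media_type):
    """Map a YAML media type or deprecated alias to application/yaml, else None."""
    name = media_type.split(";", 1)[0].strip().lower()  # drop parameters such as charset
    if name == REGISTERED_MEDIA_TYPE or name in DEPRECATED_ALIASES:
        return REGISTERED_MEDIA_TYPE
    return None


if __name__ == "__main__":
    print(has_yaml_extension("docker-compose.yml"))
    print(normalize_media_type("text/x-yaml; charset=utf-8"))
```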

| General |
YAML (rhymes with "camel") originally stood for "Yet Another Markup Language" (see YAML 1.0 working drafts from 2001), but by the time the YAML 1.0 Final Draft was published, the developers had changed it to the recursive acronym "YAML Ain't Markup Language." In a 2013 reply to a Stack Overflow post, developer Ingy döt Net explained that the name changed soon after he joined the development team: "After a few months of us working together, I pointed out that YAML (which most definitely stood for Yet Another Markup Language at that time) was not really a markup language (marking up various elements of a text document) but a serialization language (textual representation of typed/cyclical data graphs). We all liked the name YAML, so we backronymed it to mean YAML Ain't Markup Language." YAML's official website (https://yaml.org/) is formatted like a YAML document. |
|---|---|
| History | The history of YAML's development is documented in revision 1.2.2 of the format's specification (see section 1.2 YAML History) and on the YAML blog. XML was being used for data serialization, but it was not designed for that purpose, which "was the big problem to be solved," according to one of YAML's original developers, Clark Evans, in a Stack Overflow post. Three versions of the specification have been published by developers Clark Evans, Oren Ben-Kiki, and Ingy döt Net: 1.0 in early 2004, 1.1 in 2005, and 1.2 in 2009. Version 1.2 has undergone several revisions, the most recent being 1.2.2, published in October 2021. As of revision 1.2.2, the specification is developed by the YAML Language Development Team in order "to better meet the needs and expectations of its users and use cases," as stated in the YAML History section of the specification. |
