The Library of Congress >> Especially for Librarians and Archivists >> Standards

MARC Standards

HOME >> MARC Development >> Proposals List


MARC PROPOSAL NO. 2026-04

DATE: January 15, 2026
REVISED:

NAME: Modernization of Field 041 and Field 008/35-37 in the MARC 21 Bibliographic Format

SOURCE: OCLC

SUMMARY: This proposal revises field 041 (Language Code) and 008/35-37 (Language) to remove the requirement to use fill characters in 008/35-37 when the first 041 field uses non-MARC language codes.

KEYWORDS: Field 041 (BD); Language Code (BD); Subfield $a, in field 041 (BD); Language code of text/sound track or separate title (BD); Subfield $d, in field 041(BD); Language code of sung or spoken text (BD); Subfield $2, in field 041 (BD); Source of code, in field 041 (BD); Field 008/35-37 (BD); Language (BD)

RELATED: 2025-DP14; 2001-06; 2001-DP02

STATUS/COMMENTS:
01/15/26 – Made available to the MARC community for discussion.


Proposal No. 2026-04: Modernization of Field 041 and Field 008/35-37

1. BACKGROUND

This proposal builds upon the ideas and observations presented in MARC Discussion Paper 2025-DP14, which presented two options for revising fields 041 (Language Code) and 008/35-37 (Language) to remove the requirement to use fill characters in 008/35-37 when the first 041 field uses non-MARC language codes. MAC members supported bringing option 2 back as a proposal, with three small, related issues to return as discussion papers for further consideration. OCLC has addressed those three issues in two discussion papers submitted for this meeting.

Although it did not come up in the discussion of 2025-DP14, we were recently made aware of MARBI proposal 2001-06, which proposed changes to accommodate non-MARC language codes in field 041 (Language Code). That proposal was focused on two issues: 1) recording ISO 639-1 codes in field 041 (because Dublin Core recommended ISO 639-1 at the time) and 2) dealing with "stacked language codes," which were multiple MARC language codes strung together, e.g., $a engfrespa. This was a joint paper by LC and OCLC to support OCLC's Cooperative Online Resource Catalog (CORC), which included Dublin Core records for online resources.

2. DISCUSSION

OCLC has noticed an increase in the use of field 041 to record language codes from sources other than the MARC Code List for Languages. This increase may be attributed to libraries wanting to provide more specific information than can be done with the MARC Code List for Languages, which contains 484 codes, 55 of which are used for groups of languages. In comparison, ISO 639-3 contains over 7,900 codes.

In MARC field 041, subfields $a and $d are used for language codes for the language of the principal work in the resource. For a book consisting of text, subfield $a would be used to indicate the language of the main textual content, while subfield $b (Language code of summary or abstract) and subfield $f (Language code of table of contents) would provide language codes for other content. For a sound recording consisting of singing, subfield $d would be used to indicate the language of the singing, while subfield $g (Language code of accompanying material other than librettos and transcripts) would be used for the language of program notes on the container. There are also resources for which neither subfield $a nor $d is used because the principal work in the resource does not have language, e.g., an audio recording of birds chirping.

In the Field Definition and Scope for field 041, it is clear that the code recorded in field 008/35-37 and the code recorded in subfield $a or $d should be the same: "If there is a code in 008/35-37, it is recorded as the first code in subfields $a or $d (for sound recordings) of the first occurrence of field 041 in the bibliographic record. If 008/35-37 contains all blanks (No information provided) or the code "zxx" (No linguistic content) and field 041 is being used, e.g., to record the language code(s) of accompanying material, no subfield $a or $d is present." By extension, this currently means that field 041 must contain a MARC language code in $a or $d when the field is used, and this is explained further in the subfield $2 instructions.
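The current rule quoted above can be expressed as a small check. The following Python is only a rough sketch under stated assumptions: the `Field041` structure and the function names are hypothetical (they come from no MARC library), blanks are represented as three spaces, and fill characters in 008/35-37 are simply treated as outside the scope of this particular check.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Field041:
    """Hypothetical in-memory form of one 041 field (not a real library type)."""
    ind2: str                          # "#" = MARC language code, "7" = source in $2
    subfields: List[Tuple[str, str]]   # ordered (subfield code, value) pairs


def first_principal_code(field: Field041) -> Optional[str]:
    """Return the first $a or $d value in subfield order, or None if absent."""
    for code, value in field.subfields:
        if code in ("a", "d"):
            return value
    return None


def satisfies_current_rule(lang_008: str, fields_041: List[Field041]) -> bool:
    """Check the existing Field Definition and Scope wording for field 041."""
    if not fields_041:
        return True                    # field 041 is not used at all
    if lang_008 == "|||":
        return True                    # assumption: fill characters are not checked here
    if lang_008 in ("   ", "zxx"):
        # Blanks or "zxx": no $a or $d should be present in any 041 field.
        return all(first_principal_code(f) is None for f in fields_041)
    # Otherwise the 008 code must be the first $a/$d code of the first 041.
    return first_principal_code(fields_041[0]) == lang_008
```

Under this sketch, a record with 008/35-37 "eng" and 041 0# $a eng $a fre $a ger passes, while a record with 008/35-37 "sgn" and a single 041 07 $a ase $2 iso639-3 does not, which is the situation this proposal addresses.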

If non-MARC language codes are used exclusively in subfield $a or $d of field 041, the MARC format says to use fill characters in field 008. This means if agencies do not want to use fill characters in 008/35-37 but want to use other vocabularies for subfields $a and $d of field 041, they must also provide an instance of field 041 with the MARC language code from the 008/35-37. While it is understandable that the codes in 008/35-37 and 041 $a or $d should not contradict each other, in today's cataloging environment, the rationale is not clear for repeating the MARC language code in field 041 to avoid the use of fill characters in 008/35-37. This practice comes from MARBI proposal 2001-06, which only considered two scenarios for recording non-MARC language codes in field 041: 1) the stacked language codes and 2) ISO 639-1 codes that were less specific than MARC language codes. Neither 2001-06 nor 2001-DP02 provided an explanation of WHY 008/35-37 should be fill characters when field 041 contained only non-MARC language codes, and the Status/Comments section does not discuss field 008/35-37. The source of non-MARC language codes in field 041 is specified in subfield $2, so we think this is sufficient for systems to recognize the source. We appreciate that MARBI could not be expected to predict the many other sources of language codes that we have now nor how systems today handle fields 008/35-37 and 041. As previously noted in our discussion paper, the date character positions in field 008 must be ISO 8601, but field 046 may have dates formatted according to different specifications.

We believe the instructions about the use of the fill character in field 008/35-37 may deter catalogers from using non-MARC language codes in field 041 and force a duplication of data in fields 041 and 008/35-37 when a cataloger wants to use a non-MARC language code in field 041 and does not want to use fill characters in field 008/35-37. Unfortunately, the MARC Code List for Languages lacks codes for specific sign languages, several dead languages, and many Indigenous languages. We believe that removing the instructions about recording the fill character in field 008/35-37 will make applying non-MARC language codes in field 041 easier for libraries, leading to better descriptions of resources in the languages not well covered by MARC's codes.

Despite current MARC instructions about using fill characters in 008/35-37, we have noticed many records in which there is one 041 field with subfields $a and $2, and the 008/35-37 value is a MARC language code rather than fill characters. For example, 008/35-37=sgn and 041 07 $a ase $2 iso639-3. Thus, our proposed changes would reflect current cataloging practices while increasing record quality.
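The "comparable value" relationship proposed in section 3 could be checked along the following lines. This is a minimal sketch, not a definitive implementation: the function name and the crosswalk table are hypothetical, the table contains only the pairings that appear as examples in this proposal, and treating MARC-to-MARC comparability as simple equality is an assumption. A real system would need a full crosswalk from each source vocabulary to the MARC Code List for Languages.

```python
from typing import Optional

# Illustrative crosswalk only: (source code from $2, language code) -> comparable
# MARC language code. Entries are limited to the examples in this proposal.
COMPARABLE_MARC_CODE = {
    ("iso639-3", "ase"): "sgn",
    ("iso639-3", "arq"): "ara",
    ("glotto", "taus1251"): "phi",
    ("glotto", "atti1240"): "grc",
}

FILL = "|||"


def is_comparable(lang_008: str, first_code: str,
                  source: Optional[str] = None) -> Optional[bool]:
    """Compare 008/35-37 with the first $a/$d code of the first 041 field.

    source is the $2 value, or None when field 041 uses MARC language codes.
    Returns None when the illustrative crosswalk has no entry for the code.
    """
    if lang_008 == FILL:
        return True                      # fill characters remain allowed
    if source is None:
        return lang_008 == first_code    # assumption: MARC codes must match exactly
    expected = COMPARABLE_MARC_CODE.get((source, first_code))
    if expected is None:
        return None                      # unknown code: cannot decide from this table
    return expected == lang_008
```

With this sketch, the record pattern described above (008/35-37 "sgn" with 041 07 $a ase $2 iso639-3) is reported as comparable, and a record that keeps fill characters alongside AUSTLANG codes is also accepted.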

The National Library of Australia provided feedback that they did not support our suggested changes because they use fill characters in field 008/35-37 when using language codes from AUSTLANG, the database of Aboriginal Australian languages maintained by AIATSIS. We would like to clarify that our proposed changes do not eliminate the fill character in field 008/35-37 or prohibit its use with non-MARC language codes in field 041. Our proposed change to the fill character description only eliminates the wording about "if only non-MARC language coding is preferred (and coded in field 041 (Language code))." Thus, the National Library of Australia could use the fill characters, which would continue to mean "no attempt to code." We believe libraries should have the flexibility with the 008/35-37 field to provide a language code value when appropriate or use fill characters to best meet the needs of their users and provide information in ways appropriate for their systems.

There was one piece of feedback on our discussion paper that we were not able to address in our proposal. We noted that the 008/35-37 instructions currently contain this sentence: "When only one language is associated with an item, the code for that language is recorded." This seems to prohibit using "zxx" for scores, which usually have some language content like the title pages and brief words like "adagio." One MAC member suggested it be reworded to say, "When the item is entirely in a single known language, the code for that language is recorded." We are not convinced that this addresses the situation, as the wording is ambiguous. It could mean that the entire item consists of language content or that all the language content of the item is in a single known language. If interpreted as the latter, a score with an Italian title page and a few words would be coded as "ita" rather than "zxx." That is exactly the result we would like to avoid. Therefore, we have submitted the proposed change as written in our discussion paper.

3. PROPOSED CHANGES

(Note: changes indicated by strikethroughs and underlining)

3.1. Field 041, Field Definition and Scope

In Field 041 (Language Code), revise the first six paragraphs of the Field Definition and Scope as follows:

Revision:

Codes for languages associated with an item when the language code in field 008/35-37 of the record is insufficient to convey full information. Includes records for multilingual items, items that involve translation, and items where the medium of communication is a sign language. Sources of the codes are: MARC Code List for Languages or other code lists such as ISO 639-1 (Codes for the representation of names of languages - Part 1 : alpha-2 code).

 

Languages may also be recorded in textual form in field 546 (Language Note).

 

Used in conjunction with 008/35-37 (Language). [deleted: If there is a code in 008/35-37, it is recorded as the first code in subfields $a or $d (for sound recordings) of the first occurrence of field 041 in the bibliographic record.] If 008/35-37 contains all blanks (No information provided) or the code "zxx" (No linguistic content) and field 041 is being used, e.g., to record the language code(s) of accompanying material, no subfield $a or $d is present. [deleted: If only a non-MARC code is used to express the predominant language in an item, field 008/35-37 is coded with three fill characters (| | |).]

 

The first code recorded in subfield $a or $d (for sound recordings) of field 041 should be comparable to the value recorded in field 008/35-37. However, the MARC language code in field 008/35-37 may be less specific than the language code(s) in field 041 if only a collective language code is available in the MARC Code List for Languages and other language codes are used in field 041.

 

Used when one or more of the following conditions exist:

Field 041 may also be used to express the original language of the work.

Clean version:

Codes for languages associated with an item when the language code in field 008/35-37 of the record is insufficient to convey full information. Includes records for multilingual items, items that involve translation, and items where the medium of communication is a sign language. Sources of the codes are: MARC Code List for Languages or other code lists such as ISO 639-1 (Codes for the representation of names of languages - Part 1 : alpha-2 code).

 

Languages may also be recorded in textual form in field 546 (Language Note).

 

Used in conjunction with 008/35-37 (Language). If 008/35-37 contains all blanks (No information provided) or the code "zxx" (No linguistic content) and field 041 is being used, e.g., to record the language code(s) of accompanying material, no subfield $a or $d is present.

 

The first code recorded in subfield $a or $d (for sound recordings) of field 041 should be comparable to the value recorded in field 008/35-37. However, the MARC language code in field 008/35-37 may be less specific than the language code(s) in field 041 if only a collective language code is available in the MARC Code List for Languages and other language codes are used in field 041.

 

Field 041 may also be used to express the original language of the work.

3.2. Field 041, Second Indicator

In field 041, second indicator (Source of code), make the following change to value "7":

7 - Source specified in subfield $2

Source of the language code is indicated by a code in subfield $2.

 

            008/35-37 |||

            041           07 $a en $a fr $a it $2 iso639-1

3.3. Field 041, Subfield $a

In field 041, subfield $a (Language code of text/sound track or separate title), revise first paragraph and add new paragraph and examples. The remaining paragraphs and examples are unchanged.

Revision:

Language code in the first occurrence of subfield $a in the bibliographic record [added: may also be recorded in field 008/35-37 (Language) if it is a MARC language code.] [deleted: is also recorded in 008/35-37 (Language) unless 008/35-37 contains blanks (###) or the code "zxx" (No linguistic content).]

 

008/35-37 eng

041            0# $a eng $a fre $a ger

[Text is in English, French and German.]

 

008/35-37 jpn

041            1# $3 Gojira $a jpn $j eng $h jpn

041            0# $3 Godzilla $a eng $h eng

[DVD contains two films, one with a Japanese sound track, and one with an English sound track.]

 

If a non-MARC language code is used in the first occurrence of subfield $a in the bibliographic record, the code recorded in field 008/35-37 should have a comparable value, e.g., an equivalent code or a collective code.

 

008/35-37 ara

041           07 $a arq $2 iso639-3

546           ## $a Dialogue in Algerian Arabic.

 

008/35-37 phi

041           07 $a taus1251 $b taga1270 $2 glotto

546           ## $a Chiefly in Tausug with abstract in Tagalog.

[The collective code for Philippine (Other) is used in 008/35-37.]

Clean version:

Language code in the first occurrence of subfield $a in the bibliographic record may also be recorded in field 008/35-37 (Language) if it is a MARC language code.

 

008/35-37 eng

041            0# $a eng $a fre $a ger

[Text is in English, French and German.]

 

008/35-37 jpn

041            1# $3 Gojira $a jpn $j eng $h jpn

041            0# $3 Godzilla $a eng $h eng

[DVD contains two films, one with a Japanese sound track, and one with an English sound track.]

 

If a non-MARC language code is used in the first occurrence of subfield $a in the bibliographic record, the code recorded in field 008/35-37 should have a comparable value, e.g., an equivalent code or a collective code.

 

008/35-37 ara

041           07 $a arq $2 iso639-3

546           ## $a Dialogue in Algerian Arabic.

 

008/35-37 phi

041           07 $a taus1251 $b taga1270 $2 glotto

546           ## $a Chiefly in Tausug with abstract in Tagalog.

[The collective code for Philippine (Other) is used in 008/35-37.]

3.4. Field 041, Subfield $d

In field 041, subfield $d (Language code of sung or spoken text), revise first paragraph and add new third paragraph and an example. The second paragraph and current examples are unchanged.

Revision:

Language code(s) for the audible portion of an item, usually the sung or spoken content of a sound recording or computer file. If there is no subfield $a, the language code in the first occurrence of subfield $d in the bibliographic record may also be recorded in field 008/35-37 if it is a MARC language code.

 

Note: For materials where the audible portion of an item does not represent the primary content, the language code(s) for the textual portion of the primary content of the item is entered in subfield $a.

 

 

If a non-MARC language code is used in the first occurrence of subfield $d in the bibliographic record, the code recorded in field 008/35-37 should have a comparable value, e.g., an equivalent code or a collective code.

 

008/35-37 grc

041           07 $d atti1240 $2 glotto

546           ##  $a Songs in Attic Greek.

    [The collective code for Ancient Greek (to 1453) is used in 008/35-37.]

Clean version:

Language code(s) for the audible portion of an item, usually the sung or spoken content of a sound recording or computer file. If there is no subfield $a, the language code in the first occurrence of subfield $d in the bibliographic record may also be recorded in field 008/35-37 if it is a MARC language code.

 

Note: For materials where the audible portion of an item does not represent the primary content, the language code(s) for the textual portion of the primary content of the item is entered in subfield $a.

 

 

If a non-MARC language code is used in the first occurrence of subfield $d in the bibliographic record, the code recorded in field 008/35-37 should have a comparable value, e.g., an equivalent code or a collective code.

 

008/35-37 grc

041           07 $d atti1240 $2 glotto

546           ##  $a Songs in Attic Greek.

    [The collective code for Ancient Greek (to 1453) is used in 008/35-37.]

3.5. Field 041, Subfield $2

In field 041, subfield $2 (Source of code), revise as follows:

Revision:

 Source of the language code scheme used in the field. Code from: Language Code and Term Source Codes.

 

041 07 $d N57 $d N60 $2 austlang

546 ##  $a Songs in Jawoyn and Dalabon.

 

[deleted: If a non-MARC code is used to express the predominant language in an item, field 008/35-37 is coded with three fill characters (| | |).]

If more than one code scheme is used in a record, repeat the field.

 

008/35-37 |||

041 07$aen$afr$ait$2iso639-1

 

008/35-37 eng

041           0# $a eng $a fre

041           07 $a en $a fr $2 iso639-1
[Two language code schemes are used and field 041 is repeated.]

 

041 0# $a art

041 07 $a nov $2 iso639-3

[A resource in the artificial language Novial.]

 

041 17 $a izh $a ekk $b rus $h izh $2 iso639-3

041 17 $a ingr1248 $a esto1258 $b russ1263 $h ingr1248 $2 glotto

546 ## $a In Ingrian, written in phonetic alphabet, with Estonian translation; summary in Russian.

Clean version:

 Source of the language code scheme used in the field. Code from: Language Code and Term Source Codes.

 

041 07 $d N57 $d N60 $2 austlang

546 ## $a Songs in Jawoyn and Dalabon.

 

If more than one code scheme is used in a record, repeat the field.

 

041 0# $a eng $a fre

041 07 $a en $a fr $2 iso639-1

 

041 0# $a art

041 07 $a nov $2 iso639-3

[A resource in the artificial language Novial.]

 

041 17 $a izh $a ekk $b rus $h izh $2 iso639-3

041 17 $a ingr1248 $a esto1258 $b russ1263 $h ingr1248 $2 glotto

546 ## $a In Ingrian, written in phonetic alphabet, with Estonian translation; summary in Russian.

3.6. Field 008/35-37

In field 008/35-37 (Language), revise the instructions as follows:

Revision:

35-37 - Language

Three-character alphabetic code that indicates the language of the item. Code from: MARC Code List for Languages. Choice of a MARC code is based on the predominant language of the item.

 

For language material (i.e., books and continuing resources), the language code is based on the text of the item. The term text refers to the principal work(s) included within the publication, excluding the preface, introduction, foreword, appendices, etc.

For computer files, the language associated with the data and/or the user interface (e.g., textual displays, audible output in a language) determines the code used in 008/35-37, not the programming language. (Accompanying documentation in a language other than that of the data and/or user interface is coded in field 041.) For maps, the language of names and text associated with the map or globe determines the code used. For music, the predominant language of the sung or spoken text associated with the score or sound recording is recorded in 008/35-37. For visual materials, coding depends on the type of material. For moving image materials, the language content is defined as the sound track, the accompanying sound, or sign language. For moving image materials with no sound or sign language content or, if with sound, no narration, use zxx (no linguistic content). For filmstrips and slides, code for the text on the film, the accompanying sound or the accompanying printed script (for works with no sound or, if with sound, no narration). For all other still images, including original or historical graphic material and opaque and non-opaque graphic material, and three-dimensional materials, the language content is that associated with the material, i.e., captions or other text associated with the item or collection that are part of the chief source of information. For mixed materials the language code is based on the predominant language of an item or materials in a collection.

 

When only one language is associated with an item, the code for that language is recorded unless the item does not contain linguistic content, the language cannot be determined, there is no attempt to code the language, or there is no information provided.

 

008/35-37 spa

245            00 $a Rentabilidad bruta del inversionista en bolsa. $p Bonos del tesoro.

 

[deleted: If more than one language code is applicable, the code for the predominant language is recorded in 008/35-37, and the codes for all of the languages, including the predominant language, are recorded in one or more occurrences of field 041 (Language Code). The code recorded in 008/35-37 is always the same as the language code recorded in the first occurrence in the bibliographic record of subfields $a or $d (for sound recordings).] [added: The code recorded in 008/35-37 should have a comparable value (e.g., an equivalent code or a collective code) to the first code in subfields $a or $d (for sound recordings) of the first occurrence of field 041.]

 

008/35-37 rus

041            0# $a rus $a eng

[deleted: 500] [added: 546]     ## $a Chiefly in Russian; with some contributions in English.

 

 

If the first occurrence of field 041 does not contain MARC language codes, the code recorded in field 008/35-37 should have a comparable value (e.g., an equivalent code or a collective code) to the first code in $a or $d of the first field 041.

008/35-37 ara

041            07 $a arq $2 iso639-3

546            ## $a Dialogue in Algerian Arabic.

 

008/35-37 grc

041           07 $d atti1240 $2 glotto

546           ##  $a Songs in Attic Greek.

         [The collective code for Ancient Greek (to 1453) is used in 008/35-37.]

 

When formulating a bibliographic record for a translation, the code for the language of the translation, not the language of the original, is given in 008/35-37. (The code for the language of the original is recorded in subfield $h of field 041.)

 

 

||| - No attempt to code

Three fill characters (|||) may be used if no attempt is made to code the language [deleted: or if only non-MARC language coding is preferred (and coded in field 041 (Language code))].

Clean version:

35-37 - Language

Three-character alphabetic code that indicates the language of the item. Code from: MARC Code List for Languages. Choice of a MARC code is based on the predominant language of the item.

 

For language material (i.e., books and continuing resources), the language code is based on the text of the item. The term text refers to the principal work(s) included within the publication, excluding the preface, introduction, foreword, appendices, etc.

For computer files, the language associated with the data and/or the user interface (e.g., textual displays, audible output in a language) determines the code used in 008/35-37, not the programming language. (Accompanying documentation in a language other than that of the data and/or user interface is coded in field 041.) For maps, the language of names and text associated with the map or globe determines the code used. For music, the predominant language of the sung or spoken text associated with the score or sound recording is recorded in 008/35-37. For visual materials, coding depends on the type of material. For moving image materials, the language content is defined as the sound track, the accompanying sound, or sign language. For moving image materials with no sound or sign language content or, if with sound, no narration, use zxx (no linguistic content). For filmstrips and slides, code for the text on the film, the accompanying sound or the accompanying printed script (for works with no sound or, if with sound, no narration). For all other still images, including original or historical graphic material and opaque and non-opaque graphic material, and three-dimensional materials, the language content is that associated with the material, i.e., captions or other text associated with the item or collection that are part of the chief source of information. For mixed materials the language code is based on the predominant language of an item or materials in a collection.

 

When only one language is associated with an item, the code for that language is recorded unless the item does not contain linguistic content, the language cannot be determined, there is no attempt to code the language, or there is no information provided.

 

008/35-37 spa

245           00 $a Rentabilidad bruta del inversionista en bolsa. $p Bonos del tesoro.

 

The code recorded in 008/35-37 should have a comparable value (e.g., an equivalent code or a collective code) to the first code in subfields $a or $d (for sound recordings) of the first occurrence of field 041.

 

008/35-37 rus

041            0# $a rus $a eng

546            ## $a Chiefly in Russian; with some contributions in English.

 

 

If the first occurrence of field 041 does not contain MARC language codes, the code recorded in field 008/35-37 should have a comparable value (e.g., an equivalent code or a collective code) to the first code in $a or $d of the first field 041.

008/35-37 ara

041            07 $a arq $2 iso639-3

546            ## $a Dialogue in Algerian Arabic.

 

008/35-37 grc

041           07 $d atti1240 $2 glotto

546           ## $a Songs in Attic Greek.

          [The collective code for Ancient Greek (to 1453) is used in 008/35-37.]

 

When formulating a bibliographic record for a translation, the code for the language of the translation, not the language of the original, is given in 008/35-37. (The code for the language of the original is recorded in subfield $h of field 041.)

 

 

||| - No attempt to code

Three fill characters (|||) may be used if no attempt is made to code the language.

4. BIBFRAME DISCUSSION

BIBFRAME already supports these changes. BIBFRAME only includes a reference to a MARC Language resource if one was in the original source or chosen by a cataloger. The concept of ‘no attempt to code’ or ‘no information provided’ is understood from the absence of a MARC Language reference. BIBFRAME also supports references to non-MARC Language resources, such as ISO 639-3. When converting from BIBFRAME to MARC, if a reference to a MARC Language is detected (and it refers to the whole resource rather than a component, i.e., it is the functional equivalent of an 041 $a or $d), it will be used in the 008.

5. PROPOSED CHANGES

5.1. In field 041 (Language Code) of the MARC 21 Bibliographic Format, revise Field Definition and Scope (3.1.), second indicator (3.2.), subfield $a (3.3.), subfield $d (3.4.), and subfield $2 (3.5.).

5.2. In field 008/35-37 (Language) of the MARC 21 Bibliographic Format, revise instructions (3.6.) to correspond to proposed revisions to field 041.


(01/15/2026)