Formatting MARC data -- Formatting MARC data with File_MARC

Overview

The File_MARC_Record class enables you to write Machine Readable Cataloging (MARC) data in MARC 21 format, in a human-readable string format, and (with some restrictions) in MARCXML format.

Formatting MARC 21 data

To return a record in MARC 21 format, call the toRaw() method on the File_MARC_Record object.
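As a minimal sketch of this call, the following reads records from a file and writes each one back out in MARC 21 format. The file names example.mrc and copy.mrc are placeholders chosen for illustration, and the script assumes the File_MARC PEAR package is installed on the include path.

```php
<?php
require_once 'File/MARC.php';

// 'example.mrc' is a hypothetical input file of MARC 21 records.
$records = new File_MARC('example.mrc');

// next() returns a File_MARC_Record, or false when no records remain.
while ($record = $records->next()) {
    // toRaw() returns the record as a MARC 21 string; append it to
    // 'copy.mrc' (also a placeholder name).
    file_put_contents('copy.mrc', $record->toRaw(), FILE_APPEND);
}
```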

Creating human-readable output from MARC data

To return a human-readable version of a MARC 21 or MARCXML record, call the __toString() method on the File_MARC_Record object. Note that PHP calls the __toString() method implicitly when you pass a File_MARC_Record object to print or echo, or otherwise use it in a string context.
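A short sketch of both the implicit and the explicit form follows. The input file name example.mrc is a placeholder, and the script assumes the File_MARC PEAR package is installed on the include path.

```php
<?php
require_once 'File/MARC.php';

// 'example.mrc' is a hypothetical input file of MARC 21 records.
$records = new File_MARC('example.mrc');

while ($record = $records->next()) {
    // Implicit: print converts the object to a string via __toString().
    print $record;
    print "\n";

    // Explicit: a string cast invokes the same method, which is useful
    // when the text is needed in a variable rather than on output.
    $text = (string) $record;
}
```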

Formatting MARCXML data

To return a record in MARCXML format, call the toXML() method on the File_MARC_Record object.
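The sketch below converts the first record of a file to MARCXML and saves it. The file names are placeholders, the File_MARC PEAR package is assumed to be installed, and the source record is assumed to be encoded in UTF-8 already, given the encoding restriction described in this section.

```php
<?php
require_once 'File/MARC.php';

// 'example.mrc' is a hypothetical input file; its records are assumed
// to be UTF-8 encoded, because MARC8 cannot be converted here.
$records = new File_MARC('example.mrc');

// Convert only the first record, if the file contains one.
if ($record = $records->next()) {
    // toXML() returns a MARCXML document for this one record.
    $xml = $record->toXML();
    file_put_contents('record.xml', $xml); // placeholder output name
}
```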

Significant restrictions on the toXML() method

  • Most significantly, PHP offers no means of converting from MARC8, the encoding used by most legacy MARC records, to a valid XML encoding such as UTF-8. MARC libraries in other languages have worked around this gap by implementing their own character encoding conversion libraries. At this time, the author of File_MARC does not have the capacity to build equivalent support as a PEAR package, but would welcome any assistance. Better still would be the addition of ANSEL and MARC8 support to the iconv and ICU toolkits, which most open-source projects and languages rely on for encoding conversion.

  • The toXML() method currently produces a single, complete, valid MARCXML document for a single File_MARC_Record object. You cannot simply concatenate the results of calling toXML() on two File_MARC_Record objects, because that produces an invalid XML document. At this time, if you want a MARCXML document that contains more than one record, you must extract the record node from each MARCXML document and concatenate the nodes inside a collection root element yourself.
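One way to perform the merge described in the second restriction is with PHP's DOM extension. The helper below is a hypothetical sketch and is not part of File_MARC: the name marcxml_collection() is invented for illustration, and it assumes that each input string is a well-formed MARCXML document whose record elements are in the MARC21 slim namespace (http://www.loc.gov/MARC21/slim).

```php
<?php
// Hypothetical helper (not part of File_MARC): merges several
// single-record MARCXML documents into one <collection> document.
function marcxml_collection(array $xmlDocuments): string
{
    // Assumption: record elements use the MARC21 slim namespace.
    $ns = 'http://www.loc.gov/MARC21/slim';

    $out = new DOMDocument('1.0', 'UTF-8');
    $collection = $out->createElementNS($ns, 'collection');
    $out->appendChild($collection);

    foreach ($xmlDocuments as $xml) {
        $doc = new DOMDocument();
        if (!$doc->loadXML($xml)) {
            continue; // skip input that is not well-formed XML
        }
        // Deep-copy every record node into the output document.
        foreach ($doc->getElementsByTagNameNS($ns, 'record') as $record) {
            $collection->appendChild($out->importNode($record, true));
        }
    }

    return $out->saveXML();
}
```

With File_MARC, the input array could be built by collecting the result of toXML() for each record before passing the array to the helper.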