Applications Data & Preservation Inventories

Database Library Metadata

General Information

Description: This data set describes the list of databases and e-resources that are curated and made available to library patrons through library subscriptions.
Purpose: This data set contains the core metadata used in our discovery platform to support searching and browsing for e-resources.
Quick Facts: The data comprises approximately 1,300 records. Each record is stored in an individual file. The combined size of all records is approximately 7.2 MB.

Data Classifications

Campus

Public: This data describes library collections and is publicly available.

Library

Descriptive: This data set forms the core descriptive metadata that drives discovery of e-resources.

Data Contacts

Data Owner/Trustee: Lee Konrad, Associate University Librarian - Digital Strategy lee.konrad@wisc.edu
Data Steward: Coordinated Discovery Team forward-lib@lists.wisc.edu
Data Custodian: Aimee Glassel aimee.glassel@wisc.edu
Data Custodian: Steve Meyer, Data Strategist stephen.meyer@wisc.edu
Data Manager: Aimee Glassel aimee.glassel@wisc.edu
Data Consumer: Library Patrons
Internal Data Client: Shared Development Group
Data Subject Matter Expert: Aimee Glassel aimee.glassel@wisc.edu

Risk Assessment

Score	Risk Type	Details	Evaluation Date
2	Library Impact	If the representative record on the shared drive were lost, we could recover the data from tape backups. Recovery could be delayed if the latest backup was out of sync with the latest records, because some records or recent changes would need to be recreated, which would take a small amount of time.	February 28, 2019
3	Data	This data is backed up, but if all copies were lost, we would need to engage in a time-consuming process of recreating it.	February 28, 2019
1	Institutional Knowledge	This data is used operationally to support the library discovery process but does not document the history of the university.	February 28, 2019

Technical Details

Specifications

The data files are serialized as XML. They use an OAI-PMH schema to wrap the entire record. Within this container, the primary metadata is expressed within a MARC/XML record. There is also a custom metadata section containing information about a local subject classification, which is a critical part of the data.

Note that the MARC/XML is not fully valid: it contains both control fields and data fields with invalid alphabetic MARC tags.
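
As a rough illustration of the layering described above, the following Python sketch parses one record and separates MARC fields with three-digit numeric tags from those with any other tag. The OAI-PMH and MARC21 slim namespace URIs are the standard ones. The sample record, its values, and the alphabetic tags shown (`AAA`, `SUB`) are invented for illustration and are not taken from the actual data set, and the custom subject section is left out because its layout is not described here.

```python
import re
import xml.etree.ElementTree as ET

# Standard namespace URIs for OAI-PMH 2.0 and MARC21 slim (MARC/XML).
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "marc": "http://www.loc.gov/MARC21/slim",
}

# Hypothetical sample record; the real files may be laid out differently.
SAMPLE = """<record xmlns="http://www.openarchives.org/OAI/2.0/">
  <metadata>
    <record xmlns="http://www.loc.gov/MARC21/slim">
      <controlfield tag="001">000000001</controlfield>
      <controlfield tag="AAA">local value</controlfield>
      <datafield tag="245" ind1="0" ind2="0">
        <subfield code="a">Example Database</subfield>
      </datafield>
      <datafield tag="SUB" ind1=" " ind2=" ">
        <subfield code="a">local value</subfield>
      </datafield>
    </record>
  </metadata>
</record>"""


def split_marc_fields(xml_text):
    """Return (numeric, other) lists of MARC field elements, split by tag."""
    root = ET.fromstring(xml_text)
    marc = root.find(".//marc:record", NS)
    if marc is None:
        raise ValueError("no MARC/XML record found inside the wrapper")
    numeric, other = [], []
    for field in marc:
        tag = field.get("tag", "")
        # Three ASCII digits count as a usable tag; anything else is set aside.
        if re.fullmatch(r"[0-9]{3}", tag):
            numeric.append(field)
        else:
            other.append(field)
    return numeric, other


numeric_fields, other_fields = split_marc_fields(SAMPLE)
numeric_tags = [f.get("tag") for f in numeric_fields]
other_tags = [f.get("tag") for f in other_fields]
print(numeric_tags, other_tags)
```

Setting the non-numeric fields aside rather than deleting them keeps the local data available while still allowing the remaining fields to be handed to a MARC parser.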

Correctness

The data must be valid XML to be parsed correctly. The MARC data must be parseable using a standard MARC record library, though this requires first stripping out the invalid MARC/XML fields noted under Specifications. The subject categories must contain values from an approved list of subjects.

Basic validation steps are performed on data records when they are indexed for discovery.
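
Two of the rules above, well-formed XML and subjects drawn from an approved list, can be sketched in Python as follows. This is only an illustration and not the validation actually run at indexing time. The approved subjects, the `urn:example:local` namespace, and the `subject` element name are all hypothetical, and the MARC parsing check is omitted because it would depend on an external MARC library.

```python
import xml.etree.ElementTree as ET

# Hypothetical approved subjects and element name; the real list and the
# layout of the custom metadata section are not given in this inventory.
APPROVED_SUBJECTS = {"Chemistry", "History", "Music"}
SUBJECT_TAG = "{urn:example:local}subject"


def validate_record(xml_text):
    """Return a list of problems found in one record (empty if none)."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        # The record must be well-formed XML before anything else is checked.
        return [f"not well-formed XML: {err}"]
    problems = []
    for element in root.iter(SUBJECT_TAG):
        subject = (element.text or "").strip()
        # Every subject value must come from the approved list.
        if subject not in APPROVED_SUBJECTS:
            problems.append(f"unapproved subject: {subject!r}")
    return problems


good = '<r xmlns:l="urn:example:local"><l:subject>Music</l:subject></r>'
bad = '<r xmlns:l="urn:example:local"><l:subject>Alchemy</l:subject></r>'
broken = "<r><unclosed></r>"

good_result = validate_record(good)
bad_result = validate_record(bad)
broken_result = validate_record(broken)
print(good_result, bad_result, broken_result)
```

Returning a list of problems instead of raising on the first failure lets a batch run report every faulty record in one pass.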

Representative Record

The authoritative instance of the data is stored in a shared network drive location managed by the Library Technology Group (LTG). It is managed by two LTG staff who make changes to the data (create, update, archive and index).

Dependencies

The current data set originates from an export from MetaLib, which was the prior system used for both management and patron use of the data.

This data is required to support the indexing process that drives our patron discovery user interface.

Access & Use

Delivery Modalities

The data is delivered to users via the Databases "bucket" within our discovery platform. This web-based user interface lets users search and browse the list of databases purchased and/or subscribed to by the Libraries.

Lifecycle

New records typically originate through the e-resource ordering process within the NERO application. Within a NERO request, selectors and bibliographers provide the first version of the data required to create an XML record. This data is captured within the NERO database and transcribed into new XML files by LTG staff.

As selectors and catalogers request changes, LTG staff update the records, and the data is reindexed on an irregular schedule.

When an e-resource is no longer needed within the public interface, the XML file is archived by relocating it to a special folder/directory within the shared drive location.
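
The archiving step amounts to a file move. A minimal Python sketch follows, assuming a hypothetical folder named `archive` and a hypothetical file named `example.xml`; the actual directory layout on the shared drive is not documented here, and the demo runs in a temporary directory rather than on real data.

```python
import shutil
import tempfile
from pathlib import Path


def archive_record(record_path, archive_dir):
    """Move one XML record into the archive folder and return its new path."""
    record_path = Path(record_path)
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / record_path.name
    if target.exists():
        # Refuse to overwrite an already archived record with the same name.
        raise FileExistsError(f"{target} already exists")
    shutil.move(str(record_path), str(target))
    return target


# Demonstration in a throwaway directory (names are illustrative only).
with tempfile.TemporaryDirectory() as tmp:
    record = Path(tmp) / "example.xml"
    record.write_text("<record/>")
    moved = archive_record(record, Path(tmp) / "archive")
    still_in_place = record.exists()
    archived_ok = moved.exists()

print(moved.name, still_in_place, archived_ok)
```

Refusing to overwrite an existing file in the archive folder is a design choice in this sketch, not a documented rule of the actual workflow.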

Disposition

The XML files are stored on a network storage drive. This drive/filesystem is subject to backup.

Relevant Processes

The data is based on a prescribed template. Changes are made in support of the user experience within the discovery interface. These changes should be vetted by the appropriate web team, the Coordinated Discovery Team.

Constraints

This data is publicly available and open. There are no regulations that govern its use or retention.

UW-Madison Libraries Staff