Applications Data & Preservation Inventories

Search indexes

General Information

Description

Search indexes are derivative metadata structured for and stored in Apache Solr index stores.

Purpose

Provides search functionality.

Quick Facts

A distinct Solr index is created for each of the Catalog, Databases, and UWDC categories in the Coordinated Discovery Platform and for each citation database implemented in Blacklight.

These are transitory application data sets. The size of the datasets varies from thousands of records for a citation database to more than 12 million records for the catalog dataset.

Data Classifications

Campus

Internal: Used for internal discovery operations. Elements of the content are public.

Library

Operational: This is transitory operational data.

Data Contacts

Data Custodian: Library Platforms and Applications Group
Data Owner/Trustee: Library Platforms and Applications Group
Data Consumer: Indirectly, patrons who use library discovery platforms.

Risk Assessment

Score	Risk Type	Details	Evaluation Date
1	Institutional Knowledge	These are transitory derivative datasets; they have no historic value.	March 1, 2019
1	Data	These are transitory derivative datasets; there is no risk of permanent loss of data.	March 1, 2019
1	Library Impact	These are transitory derivative datasets; there is no risk of permanent loss of data.	March 1, 2019

Technical Details

Specifications: NA
Correctness: Mappings between source data records in MARC, MODS, or local schemas and the Solr record schema must preserve the semantic intent of the source data relative to the discovery and display requirements of the consuming application.
Representative Record: NA. These are derivative records. There is no authoritative instance of this data.
Dependencies: Source MARC, MODS, and local schema records.
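To illustrate the kind of mapping the Correctness entry describes, here is a minimal sketch in Python. It is not the Libraries' actual ETL code: the dictionary-based record layout, the Solr field names (title_display, author_display), and the helper to_solr_doc are all hypothetical, and real MARC records carry indicators and repeatable fields that are omitted here. Only the MARC tags themselves (001 control number, 245 title statement, 100 main entry personal name) are standard.

```python
# Hypothetical, simplified source record: MARC tag -> subfield dict.
# Indicators and repeatable fields are left out for brevity.
SOURCE_RECORD = {
    "001": "rec-0001",
    "245": {"a": "Introduction to cataloging :", "b": "a primer /"},
    "100": {"a": "Doe, Jane."},
}

# Hypothetical mapping: Solr field name -> (MARC tag, subfield codes to join).
FIELD_MAP = {
    "title_display": ("245", ["a", "b"]),
    "author_display": ("100", ["a"]),
}


def to_solr_doc(record):
    """Build a flat Solr-style document from the simplified source record."""
    doc = {"id": record["001"]}
    for solr_field, (tag, subfields) in FIELD_MAP.items():
        field = record.get(tag)
        if field is None:
            continue  # a missing source field produces no Solr field
        parts = [field[code] for code in subfields if code in field]
        # Drop trailing ISBD-style punctuation that matters only in the source.
        value = " ".join(parts).rstrip(" /:;,")
        if value:
            doc[solr_field] = value
    return doc


if __name__ == "__main__":
    print(to_solr_doc(SOURCE_RECORD))
```

Keeping the mapping in a single table such as FIELD_MAP makes it easier to review against the display and discovery requirements of each consuming application.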

Access & Use

Delivery Modalities: Via Solr APIs to consuming applications: Coordinated Discovery Platform and Blacklight instances.
Lifecycle: The indexes are periodically rebuilt and incrementally updated based on changes detected in the source data. There is no preservation or archiving.
Disposition: Stored on LCB machines hosting Solr, their associated storage, and backups.
Relevant Processes: The data are derived through ETL workflows from source data sets.
Constraints: No applicable constraints.
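The Lifecycle entry mentions incremental updates driven by changes detected in the source data, without saying how those changes are detected. One common approach, shown here purely as an assumption, is to compare a content fingerprint of each source record with the fingerprint stored at the last indexing run. The function names fingerprint and plan_update and the dictionary-based inputs are hypothetical.

```python
import hashlib
import json


def fingerprint(record):
    """Return a stable SHA-256 hex digest of a JSON-serializable record."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def plan_update(source_records, indexed_fingerprints):
    """Decide which record ids to (re)index and which to delete.

    source_records: dict of record id -> current source record
    indexed_fingerprints: dict of record id -> fingerprint at last indexing
    """
    # New or changed records: fingerprint is missing or differs.
    to_index = sorted(
        rec_id
        for rec_id, rec in source_records.items()
        if indexed_fingerprints.get(rec_id) != fingerprint(rec)
    )
    # Records no longer present in the source should leave the index.
    to_delete = sorted(set(indexed_fingerprints) - set(source_records))
    return to_index, to_delete
```

Because the indexes are transitory and can be rebuilt from source, a full rebuild remains the fallback whenever the stored fingerprints are lost or suspected to be stale.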

UW-Madison Libraries Staff
