Metadata Overview
Metadata is information about data. For a video, it can include the creator, the creation date, the location where it was shot, and the file type or extension.
Archivists use structured metadata formats, or schemas, to make sure that everyone is on the same page when referring to specific information about archived items. We created our own metadata schema based on the information we most wanted to gather, and later mapped it to PBCore, the standard schema used for video records, so that it is more accessible to people and computer systems that work with PBCore.
Below you can see our full metadata schema, with fields grouped by function: technical, descriptive, and logistical. Fields that appear under a different name in the website archive (Coverage Type, Subject Keywords) give that name in parentheses. Italics indicate that the field listed is for internal use only.
- Technical Metadata
- Creation Date – item creation date, formatted as YYYY-MM-DD
- Record Type – from the following options – Video RAW, Video Edited, Audio RAW, Audio Edited, Presentation, Spreadsheet, Document, Edit File
- File Size – number, conformed to GB
- Clip Length – HH:MM:SS
- File Type – file extension (mov, dv, fcp)
- Descriptive Metadata
- Coverage Type (Content Type) – keywords describing the type of coverage in the footage (interview, event, action, etc.)
- Location – location, recorded as specifically as possible and also generalized to the state level
- Subject Keywords (Keyword) – subject tags from a controlled vocabulary
- Description – 1-3 sentence description of the clip (frequently written for groups of clips)
- Individuals Depicted – individuals who are identified and appear in the footage
- Videographer – individuals involved in the production of the video clip, identified by voice/context or by the directory structure/file names
- Organization(s) – Organizations present in the material
- Hashtag – Hashtags for social media items
- Content – content of text-based items
- Associated Media – media that are associated with the item
- Logistical Metadata
- Record ID – assigned as items come into the collection: 2 digits for the year, then a 5-digit sequential number (YY_#####)
- File Name – the original file name
- Old File Location – directory path to the former location of the file
- New File Location – directory path to the current file location
- Duplicates – ID numbers or notes on potential duplicate videos
- Cataloger – the name of the cataloger
- Copyright Description – the type of copyright, which determines access
- Notes – internal notes on the clip, logistics, etc.
- Link – A live link to another available version of the item
- Legacy ID – Previous identification number for the record
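
Several of the fields above have a fixed format (Record ID, Creation Date, Clip Length) or a closed list of options (Record Type), so they can be checked automatically before a record is saved. The following is a minimal Python sketch of such a check, not the tool used for this collection. The dictionary keys (record_id, creation_date, record_type, clip_length) and the function name check_record are hypothetical names chosen for illustration, and treating Clip Length as optional for items without a duration is an assumption.

```python
import re
from datetime import date

# Allowed Record Type values, copied from the schema list above.
RECORD_TYPES = {
    "Video RAW", "Video Edited", "Audio RAW", "Audio Edited",
    "Presentation", "Spreadsheet", "Document", "Edit File",
}

RECORD_ID_RE = re.compile(r"\d{2}_\d{5}")              # YY_#####
DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")             # YYYY-MM-DD
CLIP_LENGTH_RE = re.compile(r"\d{2}:[0-5]\d:[0-5]\d")  # HH:MM:SS


def is_valid_date(value):
    """True if value is a real calendar date written as YYYY-MM-DD."""
    if not DATE_RE.fullmatch(value):
        return False
    try:
        date.fromisoformat(value)
    except ValueError:  # e.g. 2021-02-30
        return False
    return True


def check_record(record):
    """Return a list of problems found in the format-constrained fields.

    `record` is a dict with hypothetical keys; an empty list means
    no problems were found.
    """
    problems = []
    if not RECORD_ID_RE.fullmatch(str(record.get("record_id", ""))):
        problems.append("record_id must look like YY_#####")
    if not is_valid_date(str(record.get("creation_date", ""))):
        problems.append("creation_date must be a valid YYYY-MM-DD date")
    if record.get("record_type") not in RECORD_TYPES:
        problems.append("record_type is not one of the listed options")
    # Assumption: clip length is only checked when it is present,
    # since documents and spreadsheets have no duration.
    clip_length = record.get("clip_length")
    if clip_length and not CLIP_LENGTH_RE.fullmatch(str(clip_length)):
        problems.append("clip_length must look like HH:MM:SS")
    return problems
```

A cataloger's script might call check_record on each row of a spreadsheet export and list the rows that come back with a non-empty result. Free-text fields such as Description or Notes are left out of the check because the schema gives them no fixed format.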