
Data Management

The basics of data management and resources for researchers

Where Should I Put My Metadata?

♦  In a readme file
♦  In a text file
♦  In an XML file
♦  In a database, when you share the data
♦  In a spreadsheet

Metadata Basic Elements

Title, Creator, Identifier, Subject, Funders, Language, Dates, Location, List of File Names, File Formats, File Structure, Variable List, Code Lists, Versions, Methodology, Data Processing, Sources, Rights, Access Information, Checksums

What do these elements mean?  The Dublin Core documentation explains.

What is Metadata?

Metadata is data about data; it is a surrogate record for your dataset that allows you to document important information for finding, identifying, and sharing the data later.  It explains the who, what, where, when, why, and how of data creation.

Metadata falls into four categories:

Technical: Includes the format, size, location, checksum, creating software, and the equipment used.  It can often be captured automatically.  Technical metadata is critical for data preservation, including replication/verification, format migrations, and emulations.

Descriptive: Answers the question, 'What is the data about?'  Includes the title, author, keywords, and time.  It also describes how this data relates to other data objects.

Administrative: How is the data licensed? By whom? Who owns the data?

Provenance: A history of the changes made to the data and of its versions.  Who or what has modified it?  When?  How?  Provenance metadata allows others to use your data with confidence.
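
Because technical metadata can be automated, a short script is often enough to capture it.  The following is a minimal sketch in Python, not a prescribed tool: the function name technical_metadata and the dictionary keys are hypothetical, SHA-256 is just one reasonable checksum choice, and the format is only a guess based on the file extension.

```python
import hashlib
import mimetypes
from pathlib import Path


def technical_metadata(path, chunk_size=65536):
    """Collect basic technical metadata for one file.

    The function name and the returned keys are illustrative assumptions,
    not part of any metadata standard.
    """
    path = Path(path)
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so large data files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    # guess_type only looks at the file extension; it does not inspect content.
    guessed_type, _ = mimetypes.guess_type(path.name)
    return {
        "file_name": path.name,
        "location": str(path.resolve()),
        "size_bytes": path.stat().st_size,
        "format": guessed_type or "unknown",
        "checksum_sha256": digest.hexdigest(),
    }
```

Recomputing the checksum later and comparing it with the recorded value is one way to verify that a file has not changed since the metadata was written.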

Metadata example

Looking for an example of good metadata?  The University of Minnesota has created a sample ReadMe file.

By using this form, or one like it, you can gather all of the documentation you need as you create or record your data.  Then you can standardize your formatting according to your discipline.
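
If you prefer to start from a blank form of your own, the basic elements listed earlier on this page can be turned into a plain-text readme skeleton.  This is a rough sketch only: it is not the University of Minnesota template, and the names BASIC_ELEMENTS and readme_skeleton, as well as the example title, are hypothetical.

```python
# Basic elements as listed under "Metadata Basic Elements" on this page.
BASIC_ELEMENTS = [
    "Title", "Creator", "Identifier", "Subject", "Funders", "Language",
    "Dates", "Location", "List of File Names", "File Formats",
    "File Structure", "Variable List", "Code Lists", "Versions",
    "Methodology", "Data Processing", "Sources", "Rights",
    "Access Information", "Checksums",
]


def readme_skeleton(known=None):
    """Return readme text with one 'Element: value' line per basic element.

    Elements missing from `known` are left blank so they can be filled in
    as the data is created or recorded.
    """
    known = known or {}
    lines = []
    for element in BASIC_ELEMENTS:
        value = known.get(element, "")
        # rstrip removes the trailing space when no value is known yet.
        lines.append(f"{element}: {value}".rstrip())
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # "Example dataset" is a placeholder value, not real data.
    print(readme_skeleton({"Title": "Example dataset"}))
```

Keeping the element names in one list makes it easy to add discipline-specific fields before you standardize the formatting.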

Metadata Standards

The great thing about metadata standards is that there are so many to choose from!

Why use a standard?

♦ Standardized metadata is the basic means of discovery, and it facilitates the reuse of data. It is also the foundation for citing datasets.
♦ Your dataset can be organized with other datasets: standards are created to facilitate searching for similar items by using similar terms and constructs to describe them.
♦ You have a complete, standard set of information about each part of your data.
♦ Most subject data repositories have mandated metadata standards.

Some Metadata Standards to consider:
       ♦  FGDC (Federal Geographic Data Committee)
       ♦  DDI (Data Documentation Initiative)
       ♦  Dublin Core
       ♦  Darwin Core
       ♦  ABCD (Access to Biological Collections Data)
       ♦  AVMS (Astronomy Visualization Metadata Standard)


When all else fails, use Dublin Core.  A lot of repositories that store data use a type of Dublin Core.
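
To give a sense of what a simple Dublin Core record can look like, here is a minimal sketch that writes a few elements as XML using Python's standard library.  The namespace is the one used for the Dublin Core Metadata Element Set; the wrapper element name "metadata", the function name dublin_core_record, and the example values are assumptions made for illustration.  A real repository will have its own required wrapper and required fields.

```python
import xml.etree.ElementTree as ET

# Namespace of the 15-element Dublin Core Metadata Element Set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

DC_ELEMENTS = {
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
}


def dublin_core_record(fields):
    """Build an XML string from a dict of Dublin Core element names.

    Each value may be a single string or a list of strings, since Dublin
    Core elements are repeatable. The "metadata" wrapper is a hypothetical
    choice, not something a particular repository requires.
    """
    root = ET.Element("metadata")
    for name, values in fields.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {name}")
        if isinstance(values, str):
            values = [values]
        for value in values:
            child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
            child.text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(dublin_core_record({
        "title": "Example dataset",
        "creator": ["A. Researcher", "B. Researcher"],
        "language": "en",
    }))
```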