Metadata is data about data: a surrogate record for your dataset that documents the information needed to find, identify, and share the data later. It explains the who, what, where, when, why, and how of data creation.
Metadata falls into four categories:
Technical: Includes the format, size, location, checksum, creating software, and equipment used. Much of it can be captured automatically. Technical metadata is critical for data preservation, including replication and verification, format migration, and emulation.
Descriptive: Answers the question, 'What is the data about?' Includes the title, author, keywords, and time. It also describes how the data relates to other data objects.
Administrative: How is the data licensed? By whom? Who owns the data?
Provenance: A history of the changes made to the data and its versions. Who or what modified it, when, and how? Provenance allows others to use your data with confidence.
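Because technical metadata can be captured automatically, a short script is often enough to record it. The following is a minimal sketch in Python, assuming a single file on local disk; the function name technical_metadata and the dictionary keys are illustrative choices, not part of any standard, and the format is only guessed from the file extension.

```python
import hashlib
import mimetypes
import os
from datetime import datetime, timezone


def technical_metadata(path):
    """Collect basic technical metadata for one file (illustrative field names)."""
    # Hash the file in chunks so large datasets do not need to fit in memory.
    sha256 = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            sha256.update(chunk)

    stat = os.stat(path)
    # The format is guessed from the extension only; it may be None.
    guessed_type, _ = mimetypes.guess_type(path)

    return {
        "location": os.path.abspath(path),
        "size_bytes": stat.st_size,
        "format": guessed_type or "unknown",
        "checksum_sha256": sha256.hexdigest(),
        "last_modified": datetime.fromtimestamp(
            stat.st_mtime, tz=timezone.utc
        ).isoformat(),
    }
```

Recomputing the checksum later and comparing it with the stored value is one way to verify that a copy of the file is still intact.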
Looking for an example of good metadata? The University of Minnesota has created this Sample ReadMe file.
By using this form, or one like it, you can gather all of the documentation you need as you create or record your data. Then you can standardize your formatting according to your discipline.
The great thing about metadata standards is that there are so many to choose from!
Why use a standard?
Some Metadata Standards to consider:
When all else fails, use Dublin Core. Many data repositories use some form of Dublin Core.
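To give a sense of what a simple Dublin Core record looks like, here is a minimal sketch in Python that writes one as XML using the 15 elements of the Dublin Core Metadata Element Set. The function name dublin_core_record, the wrapper element "metadata", and the sample values are assumptions made for illustration; a given repository will have its own required fields and wrapper format.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# The 15 elements of simple Dublin Core; every one is optional and repeatable.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def dublin_core_record(fields):
    """Build an XML string from a dict mapping element names to lists of values."""
    # "metadata" is a hypothetical wrapper element chosen for this sketch.
    root = ET.Element("metadata")
    for name, values in fields.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {name}")
        for value in values:
            element = ET.SubElement(root, f"{{{DC_NS}}}{name}")
            element.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample values, for illustration only.
example = dublin_core_record({
    "title": ["Example Stream Temperature Readings"],
    "creator": ["Doe, Jane"],
    "subject": ["hydrology", "water temperature"],
    "date": ["2020-06-01"],
    "format": ["text/csv"],
    "rights": ["CC BY 4.0"],
})
print(example)
```

The four categories described earlier map onto these elements loosely: title, creator, and subject are descriptive, format is technical, and rights is administrative.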