Data Management

The basics of data management and resources for researchers

Library Research Guide

Tips for Organizing your Files

How you choose to organize your data depends on your project requirements and on your needs when it comes to accessing and sharing your data.  When you use consistent methods of naming, storing, and describing your files based on these parameters, you will be able to find and share these files easily.

Before you start collecting data, consider:

  • File naming conventions
  • Directory structures
  • File formats
  • Metadata

File Formats

Not all file formats are created equal, or for the long haul.  Think about long-term access when selecting your file format.  If possible, use open, standard formats:

  • ASCII text, not Excel
  • PDF/A + Word, not just Word
  • MPEG-4, not QuickTime
  • TIFF or JPEG2000, not JPG
  • XML or RDF, not RDBMS
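
As a rough illustration, the list above could be turned into a small lookup that flags files worth converting. This is only a sketch: the file extensions chosen for each format and the names PREFERRED_ALTERNATIVES and suggest_format are assumptions made for the example, not part of this guide's recommendations.

```python
from pathlib import Path
from typing import Optional

# Assumed mapping from common extensions to the alternatives listed above.
# The choice of extensions is illustrative; adjust it to your own project.
PREFERRED_ALTERNATIVES = {
    ".xls": "ASCII text",
    ".xlsx": "ASCII text",
    ".doc": "PDF/A + Word",
    ".docx": "PDF/A + Word",
    ".mov": "MPEG-4",
    ".jpg": "TIFF or JPEG2000",
    ".jpeg": "TIFF or JPEG2000",
}


def suggest_format(file_name: str) -> Optional[str]:
    """Return the suggested alternative for a file, or None if none is listed."""
    extension = Path(file_name).suffix.lower()
    return PREFERRED_ALTERNATIVES.get(extension)
```

A file such as Results.XLSX would be matched regardless of the case of its extension, while a file that is already plain text returns None.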

File Naming

Just like your paper files, your electronic files need to be well organized and correctly labeled to be easily identified and accessible.  Here are some tips to help you do just this:

  • Use file names that give users an idea of what's in the file and what version of the file it is.
  • Use short, but meaningful file names.
  • Include creation and modification dates in a standard format: year, month, day (YYYYMMDD).
    • Ex. 20120131_Agenda, not Agenda for January 31, 2012; OrgChart2012_v02, not Org Chart 2012, revision
  • Avoid using special characters, such as *%$.  Use the underscore to connect_elements.
  • Be consistent in your naming schema.
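
If you create many files, a small helper can apply these tips consistently. The following is a minimal sketch, assuming a date-first pattern like the 20120131_Agenda example above; the function name make_file_name and its parameters are hypothetical.

```python
import re
from datetime import date
from typing import Optional


def make_file_name(created: date, description: str,
                   version: Optional[int] = None, extension: str = "") -> str:
    """Build a name such as 20120131_Agenda or 20120131_OrgChart_v02.pdf."""
    # Keep only letters and digits, which drops spaces and special characters.
    words = re.findall(r"[A-Za-z0-9]+", description)
    # Join the words without spaces, capitalizing the first letter of each.
    label = "".join(word[0].upper() + word[1:] for word in words)
    # Date in year-month-day order so that names sort chronologically.
    parts = [created.strftime("%Y%m%d"), label]
    if version is not None:
        # Two-digit version number, e.g. v02.
        parts.append(f"v{version:02d}")
    # Connect the elements with underscores.
    return "_".join(parts) + extension
```

For instance, make_file_name(date(2012, 1, 31), "Agenda") gives 20120131_Agenda.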

Directory Structure

Have a File Plan, including a subject classification system.  This will help determine how to divide your files into the appropriate folders.

Set up an Index.  This will make it easier to locate both folders and files.

Primary Subjects - the main functions of your research project.  One folder for each function.

Secondary Subjects - the more specific activities of your project, including your research data.  These are each organized into the appropriate primary subject folder.

Tertiary Subjects - limited by date or equivalent and organized into the appropriate secondary subject folder.
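
The three levels described above can be set up in one step from a written-down plan. This is a minimal sketch under stated assumptions: the folder names in EXAMPLE_PLAN are invented for illustration, and create_file_plan is a hypothetical helper, not a required tool.

```python
from pathlib import Path
from typing import Dict, List

# Hypothetical file plan: primary subjects contain secondary subjects,
# which in turn contain tertiary subjects limited by date.
EXAMPLE_PLAN: Dict[str, dict] = {
    "Data_Collection": {"Survey_Data": {"2012": {}, "2013": {}}},
    "Administration": {"Meeting_Notes": {"2012": {}}},
}


def create_file_plan(root: Path, plan: Dict[str, dict]) -> List[Path]:
    """Create the nested folders described by plan under root.

    Returns every folder in the plan, in the order it was visited.
    """
    created: List[Path] = []
    for name, children in plan.items():
        folder = root / name
        # parents=True also creates root if it is missing;
        # exist_ok=True makes the function safe to run again.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
        # Recurse into the next level of the plan.
        created.extend(create_file_plan(folder, children))
    return created
```

Keeping the plan in one place like this also gives you a simple index of where each folder lives.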

Your top-level directory folder should include your project title and a unique identifier.

The sub-structure should have a clear naming convention, such as one based on each run of an experiment, version of a dataset, geographic location, or person in the group, whichever is appropriate.

For example: