What metadata will be provided to help others identify and discover the data?
Describe the content, formats, and internal relationships of the data in detail
Project Documentation
Dataset Documentation
Where data is stored and how it is kept secure throughout the research process are critical. Funding agencies may require that you retain data for a given period and will likely ask you to explain in your data management plan how you will store and back it up, and how you will manage the security of and access to your data. If you will be working with large data sets (with larger storage and backup needs), you should contact your departmental IT staff.
Be sure to include who will be responsible for ensuring that files are stored and backed up properly. Funding agencies are increasingly looking for details related to “roles and responsibilities.”
Storage is the act of keeping your data in a secure location that you can access readily. Files in storage should be the working copies of your files that you can access and change regularly.
Backup is the practice of keeping additional copies of your data in physical or cloud locations separate from your files in storage. Backup copies are the copies you would turn to after data loss or when you need to recover a previous version of your work.
Storage systems often provide mirroring, in which data is written simultaneously to two drives. Mirroring is not the same thing as backup: any alteration to the primary files, including accidental deletion or corruption, is immediately mirrored in the second copy.
A good rule for storing and backing up copies of your work is LOCKSS (Lots Of Copies Keep Stuff Safe): keep each copy as physically far from the other copies as possible to guard against a disaster, such as a fire or flood, in the lab where the research is being performed. It’s a good idea to follow the rule of three when thinking about this: keep three copies of your data, with two backup copies on different devices or storage media and one backup copy off-site. This might look like:
Courtesy UW-Madison
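The rule of three can also be partly automated. The following is a minimal Python sketch, not a recommended tool: it copies a working file into each backup location and confirms with a SHA-256 checksum that every copy matches the original. The function names `sha256_of` and `back_up` are hypothetical, and the backup locations are assumed to be ordinary directories (for example, a mounted external drive and a synced off-site folder).

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(working_file: Path, backup_dirs: list[Path]) -> list[Path]:
    """Copy working_file into each backup directory and verify each copy."""
    expected = sha256_of(working_file)
    copies = []
    for directory in backup_dirs:
        directory.mkdir(parents=True, exist_ok=True)
        copy_path = directory / working_file.name
        shutil.copy2(working_file, copy_path)  # copy2 also keeps timestamps
        if sha256_of(copy_path) != expected:
            raise IOError(f"Backup at {copy_path} does not match the working copy")
        copies.append(copy_path)
    return copies
```

A call such as `back_up(Path("results.csv"), [Path("/mnt/external/backups"), Path("/mnt/offsite/backups")])` (both paths are placeholders) would leave the working copy plus two verified backups. Because each run overwrites the previous backup of the same name, this sketch does not keep earlier versions; a real backup routine would usually add dated or versioned copies.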
Other Considerations:
If any of the following policies affect the management of your data, you will need to address them in a DMP, as they will affect how you can store and share your data.
Data becomes useful when it has meaning and context associated with it. The most common way to bring context to data is by applying metadata (description and documentation of your data) and through supplementary files, such as a data dictionary. Documenting your data is essential for sharing it, so that other researchers can understand how to access, view, and possibly re-use it.
Courtesy UW-Madison
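One lightweight way to attach metadata to a data file is to store a small description file beside it. The sketch below is only an illustration under stated assumptions: the function name `write_metadata_sidecar` and the `.metadata.json` suffix are hypothetical choices, and the field names in the usage note are borrowed from common Dublin Core element names rather than from any requirement in this guide.

```python
import json
from pathlib import Path


def write_metadata_sidecar(data_file: Path, metadata: dict) -> Path:
    """Write metadata as JSON next to data_file and return the sidecar path."""
    # Hypothetical naming convention: survey.csv -> survey.csv.metadata.json
    sidecar = data_file.with_name(data_file.name + ".metadata.json")
    sidecar.write_text(
        json.dumps(metadata, indent=2, sort_keys=True),
        encoding="utf-8",
    )
    return sidecar
```

For example, `write_metadata_sidecar(Path("survey.csv"), {"title": "Example survey", "creator": "A. Researcher", "description": "Placeholder description", "format": "text/csv"})` would create `survey.csv.metadata.json` in the same folder. Your funder or repository may require a specific metadata standard, so treat these fields as placeholders.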
“Data scientists spend 80% of their time cleaning and manipulating data and only 20% of their time actually analyzing it.” IBM Analytics
A data catalog is a structured collection of data used by an organization. It is a kind of data library where data is indexed, well-organized, and securely stored. Most data catalog tools contain information about the source of the data, its usage, and the relationships between entities, as well as data lineage. Data lineage describes the origin of the data and tracks the changes made to it on the way to its final form.
The catalog informs users about the available data sets and the metadata around a topic and helps users locate them quickly.
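To make the idea concrete, here is a small Python sketch of what a catalog record and a keyword search might look like. It is not how any particular catalog product works: `CatalogEntry`, its fields, and `find_datasets` are hypothetical names chosen to mirror the source, description, and lineage information described above.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """One data set as it might be described in a catalog (illustrative only)."""
    name: str
    source: str
    description: str
    keywords: list[str] = field(default_factory=list)
    lineage: list[str] = field(default_factory=list)  # ordered processing steps


def find_datasets(catalog: list[CatalogEntry], term: str) -> list[CatalogEntry]:
    """Return entries whose name, description, or keywords mention term."""
    needle = term.lower()
    return [
        entry
        for entry in catalog
        if needle in entry.name.lower()
        or needle in entry.description.lower()
        or any(needle in keyword.lower() for keyword in entry.keywords)
    ]
```

A record's `lineage` list could hold steps such as "raw logger export" followed by "outliers removed", so a user who finds the data set can also see how it reached its current form.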
You can contact your departmental IT staff for help finding a suitable catalog.
Below is a list of data catalogs.
“Data curation is the active and ongoing management of data through its lifecycle of interest and usefulness to scholarship, science, and education. Data curation enables data discovery and retrieval, maintains data quality, adds value, and provides for re-use over time through activities including authentication, archiving, management, preservation, and representation.”
-Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign
A data dictionary is a collection of metadata: key information about the data used within an organization. This information includes the names, definitions, and attributes of data elements, along with the owners and creators of data assets. Data dictionary tools provide insight into the meaning and purpose of data elements. They also record the scope and characteristics of data elements, as well as the rules for their usage and application. (UW Madison)
The OSF (Open Science Framework) provides a tutorial on how to make a data dictionary for tabular data.
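As a starting point for tabular data, the first draft of a data dictionary can be generated from the file itself and then completed by hand. The sketch below is one possible approach and is not taken from the OSF tutorial: the function names `infer_type` and `draft_data_dictionary`, the output columns, and the simple type guessing are all assumptions made for illustration.

```python
import csv
import io


def infer_type(values):
    """Guess 'integer', 'number', or 'text' from a column's non-empty values."""
    if not values:
        return "unknown"
    for label, caster in (("integer", int), ("number", float)):
        try:
            for value in values:
                caster(value)
        except ValueError:
            continue  # at least one value did not parse; try the next type
        return label
    return "text"


def draft_data_dictionary(csv_text):
    """Return one draft dictionary row per column of the CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    records = list(reader)
    dictionary = []
    for column in reader.fieldnames or []:
        cells = [(record.get(column) or "").strip() for record in records]
        present = [cell for cell in cells if cell]
        dictionary.append({
            "variable": column,
            "type": infer_type(present),
            "missing": len(cells) - len(present),
            "example": present[0] if present else "",
            "description": "",  # to be written by the researcher
        })
    return dictionary
```

The generated rows only capture what a script can detect. The definitions, units, and allowed values that make a data dictionary useful to other researchers still need to be filled in by someone who knows the data.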
Milner Library
Illinois State University
Campus Box 8900
201 North School Street
Normal, Il 61790-8900
(309) 438-3451