Research Commons

A space and place for those seeking help with research-related needs.

Overview: what to include in this section (Data Produced)

  • Give a summary of the data you will collect or create, noting the content, coverage and data type (e.g., tabular data, survey data, experimental measurements, models, software, audiovisual data or physical samples).
  • Consider how your data could complement and integrate with existing data, or whether there are any existing data or methods that you could reuse.
  • Indicate which data are of long-term value and should be shared and/or preserved.
  • If purchasing or reusing existing data, explain how issues such as copyright and IPR have been addressed. You should aim to minimize any restrictions on the reuse (and subsequent sharing) of third-party data.

Components Required for Types of Data Produced


  • Outline how the data will be collected and processed, covering relevant standards or methods, quality assurance and data organization.
  • Indicate how the data will be organized during the project, mentioning, e.g., naming conventions, version control and folder structures.
  • Explain how the consistency and quality of data collection will be controlled and documented.  This may include processes such as calibration, repeat samples or measurements, standardized data capture, data entry validation, peer review of data or representation with controlled vocabularies.
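
As an illustration of the naming-convention point above, here is a minimal Python sketch. The pattern `project_description_date_vNN.ext` and the function name `build_filename` are assumptions made for this example, not a prescribed standard; adapt them to whatever convention your project agrees on.

```python
import re
from datetime import date


def build_filename(project: str, description: str, collected: date,
                   version: int, ext: str) -> str:
    """Compose a file name such as 'soil-study_site-a-ph_2024-03-05_v02.csv'.

    The pattern is hypothetical: project, description, ISO date, version.
    """
    def slug(text: str) -> str:
        # Lowercase the text and collapse any run of characters that are
        # not letters or digits into a single hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    # Zero-padded versions (v01, v02, ...) sort correctly in file listings.
    extension = ext.lstrip(".").lower()
    return (f"{slug(project)}_{slug(description)}_"
            f"{collected.isoformat()}_v{version:02d}.{extension}")


if __name__ == "__main__":
    print(build_filename("Soil Study", "Site A pH", date(2024, 3, 5), 2, ".CSV"))
```

Putting the date in ISO form (year-month-day) and zero-padding the version number means that an alphabetical file listing is also a chronological one.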

Data may change over the life of the project, and possibly after it ends. Such changes affect how you organize the data and how much versioning you need. Consider:

  • Will the data grow?
  • Will previously recorded data be subject to correction?
  • Will you need to keep track of versions of the data?

Common categories of data sets:

  • Fixed datasets: never change after collection/generation
  • Growing datasets: new data may be added, but the old data is never changed or deleted
  • Revisable datasets: new data may be added, and old data may be changed or deleted
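
To make the three categories concrete, the following Python sketch compares two snapshots of a dataset and reports which category the observed change is consistent with. It assumes, purely for illustration, that each snapshot is a dictionary mapping a record identifier to its content; `classify_change` is a hypothetical helper, not part of any library.

```python
def classify_change(old: dict, new: dict) -> str:
    """Name the dataset category that the change from old to new fits.

    Both arguments map a record identifier to that record's content.
    """
    if new == old:
        # Nothing was added, changed or deleted between the snapshots.
        return "fixed"
    # Every old record must still be present with identical content.
    old_records_intact = all(
        key in new and new[key] == value for key, value in old.items()
    )
    # Intact old records plus a difference means records were only added.
    return "growing" if old_records_intact else "revisable"


if __name__ == "__main__":
    first = {"r1": 10, "r2": 12}
    print(classify_change(first, {"r1": 10, "r2": 12, "r3": 9}))
    print(classify_change(first, {"r1": 11, "r2": 12}))
```

A single comparison only shows what happened between two snapshots; a dataset that looks fixed today may still be revised later, so the category you plan for should come from the project design.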


The volume of data produced depends on the type(s) of data produced. For instance, image data requires a great deal of storage space, so you should determine whether retaining all images is necessary. Your plan should also set out:

  • Where the data is housed
  • The specifics of what is retained
  • A plan for discarding unwanted items, including when materials are discarded
  • Who has and/or retains control of decisions to retain or discard data
  • A clear statement of your understanding of the archiving organization's capacity for storage and backup

Estimate the growth rate of the data:

  • Are you manually collecting and recording data?
  • Are you using observational instruments and computers to collect data?
  • Is your data collection highly iterative?
  • How much data will you accumulate every month or every 90 days?
  • How much data do you anticipate collecting and generating by the end of the project?
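
The last two questions can be answered with simple arithmetic. The Python sketch below assumes a constant monthly growth rate, which is a simplification; the function name and the example figures are hypothetical.

```python
def projected_volume_gb(current_gb: float, gb_per_month: float,
                        months: int) -> float:
    """Estimate total volume after a number of months of linear growth."""
    return current_gb + gb_per_month * months


if __name__ == "__main__":
    # Hypothetical project: 5 GB collected so far, 12 GB added per month.
    per_quarter = projected_volume_gb(0.0, 12.0, 3)      # roughly 90 days
    end_of_project = projected_volume_gb(5.0, 12.0, 36)  # three-year project
    print(f"About {per_quarter:.0f} GB per quarter")
    print(f"About {end_of_project:.0f} GB by the end of the project")
```

If collection is highly iterative or instrument-driven, growth is unlikely to be constant, so treat the result as a rough estimate and revisit it as the project proceeds.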


File formats likely to be readable in the future:

  • Non-proprietary
  • Open, with documented standards
  • In common usage by the research community
  • Using standard character encoding (e.g., ASCII, UTF-8)
  • Uncompressed (space permitting)
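
One way to check the character-encoding criterion is to try decoding a file's bytes as UTF-8. The Python sketch below does this; `is_utf8` is a hypothetical helper name. Because ASCII is a subset of UTF-8, plain ASCII text passes the same check.

```python
def is_utf8(raw: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    print(is_utf8("café".encode("utf-8")))    # encoded as UTF-8
    print(is_utf8("café".encode("latin-1")))  # encoded as Latin-1
```

To test a file on disk, read it in binary mode, for example `is_utf8(open(path, "rb").read())`. A successful decode does not prove the file was written as UTF-8, but a failure shows that it was not.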

See the section below, organized by format types, for specific file format recommendations.


Data Forms / File Formats from UK Data Service Guidance