Include in the metadata record information about software and hardware for accessing proprietary formats. Learn more on the Metadata page.

The USGS must comply with National Archives and Records Administration (NARA) formats for records deemed to be permanent at the time of transfer to NARA (refer to https://www.archives.gov/records-mgmt/policy/transfer-guidance-tables.html).

It’s important to think about file formats before you acquire data because your decisions at this stage may have implications in other stages of the science data lifecycle. The best format for collecting and processing the data might not be the best for analyzing the data. The best format for analysis might not be the best format for distribution, which in turn might not be the best format for preservation. Understanding these differences and connections at the beginning of a project can be helpful. Keep in mind that every time data are converted to a different format, there is a risk of introducing errors or losing data.
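One way to reduce the conversion risk described above is to compare the parsed contents of a file before and after converting it. The following is a minimal sketch in Python, assuming a tab-delimited source and a comma-separated target; the function names and the sample rows are hypothetical and chosen only for illustration.

```python
import csv
import io

def convert_tsv_to_csv(tsv_text):
    """Convert tab-delimited text to comma-separated text."""
    reader = csv.reader(io.StringIO(tsv_text, newline=""), delimiter="\t")
    out = io.StringIO(newline="")
    # The csv writer quotes any cell that itself contains a comma.
    csv.writer(out, lineterminator="\n").writerows(reader)
    return out.getvalue()

def rows_match(tsv_text, csv_text):
    """Parse both formats and compare the rows cell by cell."""
    original = list(csv.reader(io.StringIO(tsv_text, newline=""), delimiter="\t"))
    converted = list(csv.reader(io.StringIO(csv_text, newline="")))
    return original == converted

# Hypothetical sample data: one note contains a comma.
tsv = "site\tnote\tcount\nA1\tnorth, upper\t12\nB2\tsouth\t7\n"
converted = convert_tsv_to_csv(tsv)

print(rows_match(tsv, converted))               # proper conversion
print(rows_match(tsv, tsv.replace("\t", ",")))  # naive find-and-replace
```

The naive find-and-replace splits the note "north, upper" into two cells, which is the kind of silent error a comparison like this is meant to catch.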

Avoid storing multiple data types in a single column (for example, remark codes such as > or < in the same field as numeric values). Instead, a remark field should be used to store any remark characters that qualify a numeric value.
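As an illustration of the point above, the sketch below splits a raw cell such as "<0.05" into a separate remark field and a numeric value. It assumes the only remark codes are <, >, <=, and >=, and the name split_remark is hypothetical; real datasets may use other codes.

```python
import re

# Assumed pattern: an optional remark code followed by a plain decimal number.
_VALUE_RE = re.compile(r"^\s*(<=|>=|<|>)?\s*(-?\d+(?:\.\d+)?)\s*$")

def split_remark(raw):
    """Return (remark, value) for a raw cell such as '<0.05'."""
    match = _VALUE_RE.match(raw)
    if match is None:
        raise ValueError(f"unrecognized value: {raw!r}")
    remark, number = match.groups()
    # An empty string means the value carries no remark.
    return (remark or "", float(number))

rows = [split_remark(cell) for cell in ["<0.05", "1.2", "> 100"]]
```

Keeping the remark in its own field lets the numeric column be read as numbers by any software without special handling.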

Check that any markup, such as highlighting or bolded text, is either removed or moved to the metadata so that important ancillary information is not lost in the conversion.

If sharing different formats of the same file, be sure to give each file the same base name (e.g., bison_data_v1.xlsx and bison_data_v1.txt).
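A quick check of this naming convention can be scripted. The sketch below is one possible approach, assuming each file has a single extension; the helper name same_base_name is hypothetical.

```python
from pathlib import PurePath

def same_base_name(*filenames):
    """True if all file names are identical apart from their extension."""
    stems = {PurePath(name).stem for name in filenames}
    return len(stems) == 1

print(same_base_name("bison_data_v1.xlsx", "bison_data_v1.txt"))
```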

Below are links to file format recommendations from the National Archives and Records Administration. You can also check out the Library of Congress Recommended Formats Statement, which is updated annually, for more information.

Examples of file formats include comma-separated values (.csv), ASCII text (.txt), Microsoft Excel (.xlsx), JPEG (.jpg), and Audio Video Interleave (.avi).

Within the files, avoid using formatting, such as highlighting or color, to serve as metadata, because it will likely be lost when the files are converted to different formats.

If you need to collect data in a proprietary format, ensure that it can easily be converted to a non-proprietary, open format.

Once the data analysis is complete, data in proprietary formats used for acquisition and analysis should be converted into standard, long-lasting formats by the researcher familiar with the data.

The May 9, 2013, Office of Management and Budget (OMB) memorandum “Open Data Policy—Managing Data as an Asset” also requires agencies to provide free public access to data collected or created by using Federal funds, and to collect or create data in a way that supports downstream processing and dissemination activities.  This includes using machine-readable or open formats, data standards, and common-core and extensible metadata for all data released to the public.

