Data Management

Resources for documenting, storing, and preserving research data

File Format Recommendations

The file format you choose for your data is a primary factor in someone else's ability to access it and re-use it. It's important to share data in widely adopted, nonproprietary formats. In other words, the format should not be owned or restricted by a specific company and should work across various software and operating systems. By using open formats and ensuring your data is compatible with different platforms, you can reduce the technical obstacles that might hinder its reuse.

Examples of proprietary formats include Microsoft Excel (.xlsx) and Word (.docx); comparable nonproprietary options include comma-separated values (.csv) and plain text files (.txt).
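As a rough illustration of the nonproprietary option for tabular data, the sketch below writes a small table to a UTF-8 encoded .csv file and reads it back, using only Python's standard library. The function names, column names, values, and file name are hypothetical and stand in for your own data.

```python
import csv


def write_utf8_csv(path, header, rows):
    """Write a header row and data rows to a comma-separated, UTF-8 encoded file."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", encoding="utf-8", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)


def read_utf8_csv(path):
    """Read a UTF-8 encoded .csv file back as a list of rows (lists of strings)."""
    with open(path, encoding="utf-8", newline="") as handle:
        return list(csv.reader(handle))


if __name__ == "__main__":
    # Hypothetical example data; every value is stored as text in a .csv file.
    header = ["site", "temperature_c"]
    rows = [["Åre", "4.5"], ["Quito", "13.9"]]
    write_utf8_csv("example_measurements.csv", header, rows)
    print(read_utf8_csv("example_measurements.csv"))
```

A .csv file stores values only, so formulas, formatting, and multiple sheets from a spreadsheet are not carried over; describe anything that matters for reuse in your accompanying documentation.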

Formats likely to be accessible by others or in the future are:

  • Nonproprietary
  • Open, with documented standards
  • In common usage by the research community
  • Using standard character encodings (e.g., ASCII, UTF-8)
  • Uncompressed (space permitting)
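
One way to check the character-encoding criterion is to test whether a file's bytes decode as UTF-8; because ASCII is a subset of UTF-8, a single check covers both. The following is a minimal sketch, and the helper names `is_utf8` and `to_utf8` are hypothetical. Converting from another encoding assumes you already know what that encoding is.

```python
def is_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8 (plain ASCII also passes)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Re-encode bytes from a known source encoding (e.g. "latin-1") to UTF-8."""
    return data.decode(source_encoding).encode("utf-8")


if __name__ == "__main__":
    legacy = "café".encode("latin-1")  # hypothetical content saved in Latin-1
    print(is_utf8(legacy))
    print(is_utf8(to_utf8(legacy, "latin-1")))
```

To check a file on disk, read it in binary mode (`open(path, "rb")`) and pass the bytes to `is_utf8`. A successful decode shows that the bytes are valid UTF-8, not that every character is the one the author intended, so spot-check accented or non-Latin text after converting.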

Material               Preferred File Format
Tabular                ASCII or UTF-8 encoded: .csv, .tsv
Geospatial             Formats compatible with widely adopted GIS (e.g., ArcGIS)*
Database               .sqlite, .db, .db3
Text                   ASCII or UTF-8 encoded: .txt, .html, .pdf, .xml
Archiving/Compression  .tar, .gz, .zip
Still Images           .tiff, .jpg, .jp2, .png, .gif, .bmp, .pdf, .svg
Moving Images          .mov, .mpeg
Audio                  .wav, .mp3
Websites               .warc

Source: Library of Congress Recommended Formats Statement 2024-2025