Tabular data format standards

Date adopted: June 26, 2026
Last update: June 26, 2026

This page provides guidance for government teams that publish tabular data (data structured in a table, such as Excel spreadsheets and CSV files).

This guidance should be used when publishing tabular data on open.yukon.ca. For guidance on geospatial data formats, including publishing to GeoYukon, contact [email protected].

Why publish data on open.yukon.ca?

Publishing data publicly on open.yukon.ca:

  • Makes Yukon government data more accessible, easier to find, and more useful to the public and to other Yukon government staff
  • Raises the Yukon government’s profile as a leader for transparency and data quality
  • Delivers positive public impact to citizens and businesses that use open data

Even if data is already public (in a PDF report, custom web application or database), it’s still worth publishing in a machine-readable format (like CSV). Learn about the Yukon’s open government program.

How to publish tabular data on open.yukon.ca

The open data section of open.yukon.ca stores public tabular data for Yukon government departments. Data in the open data section is provided to the public in machine-readable formats.

For information on onboarding to open.yukon.ca and publishing or updating data, contact the eServices team at [email protected].

If you’re not sure what kind of data you have, you can also contact the eServices team. If you have tabular data contained only within a PDF file or Word document, this should be published in the open information section of open.yukon.ca.

However, tabular data contained in PDF files and Word documents is much harder for people to analyze and use. To maximize its public value, we recommend also publishing the same data (or the underlying data used to produce it) in a machine-readable format in the open data section.

Formatting your tabular data in a machine-readable format allows anyone to use your data in any data analysis, visualization, or processing tool of their choice.

Technical guidelines for tabular data

Tabular data is typically produced by spreadsheet software (such as Excel) or exported from databases (such as SQL databases).

When preparing tabular data for publishing on open.yukon.ca, you should use the following specifications to provide the best possible experience to open data users. This helps ensure that the data can reliably be used by data analysis and data visualization tools (such as PowerBI, Tableau, Observable, R, and Python).

You can use this R conversion helper tool to convert existing Excel or CSV files to match the specifications below.

Specifications by category:

File type

  • CSV (comma-separated values) or TSV (tab-separated values)

Character encoding

  • UTF-8 (Unicode)
  • If possible, do not use a byte order mark (BOM)
  • Do not use other character encodings

Separators

  • Commas (CSV) or tabs (TSV)

Line breaks

  • LF or CRLF (“\n” or “\r\n”)

Column names

  • All unique
  • All lowercase
  • Only alphanumeric characters and underscores
  • Use underscores between words in column names (do not use spaces, hyphens, or other symbols)
  • Use the first row of the file for column names (a header row is required)
  • Do not have any empty column names (or cells without a matching column header)

Date formats

  • YYYY-MM-DD in all cases (ISO 8601 format)

Row and data structure

  • Organized as tidy data (each row is an observation)
  • Sort rows with date values in chronological order from oldest entries at the top to newest entries at the bottom
  • No blank rows

Empty cells

  • Either blank or NA

Summary or total rows

  • Do not include summary or total rows

Footnotes, endnotes and data notes

  • Do not include footnotes, endnotes or data notes directly in your data (unless they are brief and specific to a single row)
  • You can include these in the “Methodology” or description sections of the dataset metadata page

Single versus separate files

  • Use a single file for an entire set of related observations (do not create separate files for each year, region, etc., except in the case of very large files)

Very large files

  • Split data into multiple files when the CSV or TSV file size would exceed 100 MB (split on a predictable attribute, such as year or region)
  • In rare cases where the data cannot reasonably be split into separate files, you can compress the file into a ZIP file
  • If you have several very large files, compress each CSV or TSV file individually (so that each ZIP file only contains one CSV or TSV file)

Data updates

  • When you have new data, update an existing open data resource whenever possible instead of creating a new one
  • Do not remove old data that is still accurate (there is no maximum retention period for open.yukon.ca publications)

File names

  • Use lowercase filenames without spaces
  • Do not include dates in the filename
  • When updating an existing open data resource, use the same filename as before (this preserves any external links directly to the data URL)
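Some of the specifications above can be applied in a short script. The following is a minimal Python sketch, not an official tool: the function names are hypothetical, the column-name rule assumes ASCII letters and digits only, and the default source date format (day/month/year) is an assumption you would replace with whatever your data uses.

```python
import csv
import io
import re
from datetime import datetime


def normalize_column_name(name):
    """Lowercase a header and join its words with underscores."""
    # Replace each run of characters other than a-z and 0-9 with one underscore.
    cleaned = re.sub(r"[^0-9a-z]+", "_", name.strip().lower())
    return cleaned.strip("_")


def to_iso_date(value, source_format="%d/%m/%Y"):
    """Convert a date string in a known source format to YYYY-MM-DD."""
    return datetime.strptime(value, source_format).strftime("%Y-%m-%d")


def write_csv(header, rows):
    """Serialize rows as UTF-8 CSV bytes with LF line breaks and no BOM."""
    names = [normalize_column_name(h) for h in header]
    if "" in names or len(set(names)) != len(names):
        raise ValueError("column names must be non-empty and unique")
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(names)
    writer.writerows(rows)
    # The "utf-8" codec (unlike "utf-8-sig") does not prepend a BOM.
    return buffer.getvalue().encode("utf-8")
```

For example, `normalize_column_name("Population Estimate (Q4)")` gives `population_estimate_q4`, and `to_iso_date("26/06/2026")` gives `2026-06-26`.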

Alternative formats

You can optionally publish data in Excel format. It should meet as many of the specifications above as possible. When publishing in Excel format, you should also create a CSV or TSV version of the same data. eServices can help you do this with automated tools.

Tidy data structure

Structuring your data as tidy data makes it more useful for data analysis tools. In tidy data, each row is a separate observation. When more observations are added, the structure of the data (for example, column names) will generally not change.

An example of tidy data

This table is an example of tidy data, where each row is a separate observation. This can be easily analyzed in data analysis and data visualization tools, for example by using pivot table functions. (Example data is from Q4 population reports.)

Recommended structure:

year | community       | population_estimate
2024 | Dawson City     | 2,418
2024 | Haines Junction | 1,048
2024 | Watson Lake     | 1,480
2025 | Dawson City     | 2,409
2025 | Haines Junction | 1,069
2025 | Watson Lake     | 1,480

The table below is an example of data that is not tidy. Each row contains more than one observation. Before uploading data like this to open.yukon.ca, we recommend converting it into a tidy data structure like the table above.

Not recommended structure:

Community       | 2024 population | 2025 population
Dawson City     | 2,418           | 2,409
Haines Junction | 1,048           | 1,069
Watson Lake     | 1,480           | 1,480
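One way to convert the wide layout into tidy rows is sketched below in Python, using only the standard library. It is an illustration under stated assumptions rather than a required method: the function name is hypothetical, it assumes each wide column header starts with the four-digit year, and it strips thousands separators so the values parse as integers (a choice made for this example, not something the specifications state).

```python
import csv
import io

# The "not recommended" table, written out as CSV text.
WIDE_CSV = '''Community,2024 population,2025 population
Dawson City,"2,418","2,409"
Haines Junction,"1,048","1,069"
Watson Lake,"1,480","1,480"
'''


def wide_to_tidy(wide_text):
    """Turn one row per community into one row per community and year."""
    tidy = []
    for row in csv.DictReader(io.StringIO(wide_text)):
        community = row.pop("Community")
        for column, value in row.items():
            year = column.split()[0]  # "2024 population" -> "2024"
            tidy.append({
                "year": int(year),
                "community": community,
                "population_estimate": int(value.replace(",", "")),
            })
    # Oldest observations first, then alphabetical by community.
    tidy.sort(key=lambda r: (r["year"], r["community"]))
    return tidy


tidy_rows = wide_to_tidy(WIDE_CSV)
```

Each of the three wide rows becomes two tidy rows, one per year, so `tidy_rows` holds six observations in the same order as the recommended structure.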

Being consistent in how you structure data makes it easier for people to use and analyze. If you have footnotes, endnotes or data notes, put these in the “Methodology” or description sections of the dataset description instead of in your tabular data file.

Automated updates

Whenever possible, you should create an automated update pipeline to keep the tabular data up to date. For example, this might mean automatically refreshing a CSV file on open.yukon.ca from your internal database on a daily, weekly, or monthly schedule.

The update frequency (daily, weekly, monthly, etc.) may vary depending on the nature of your data. You should specify the intended update frequency in the metadata for the dataset entry on open.yukon.ca.

You can use integrations like the FME CKAN package, the ckanr R package, or the CKAN API directly in order to automate data updates on open.yukon.ca. The eServices team can help you with this.
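As a rough illustration of calling the CKAN API directly, the Python sketch below builds (but does not send) a request to CKAN's `resource_patch` action that replaces the file attached to an existing resource. The function name, base URL, resource ID and token are placeholders, and it assumes open.yukon.ca accepts file uploads through the standard CKAN `upload` form field with an API token in the `Authorization` header. Confirm the details with the eServices team before relying on it.

```python
import urllib.request
import uuid


def build_resource_patch_request(base_url, resource_id, api_token, filename, csv_bytes):
    """Build a multipart POST that uploads new CSV content to an existing resource."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        # Form field "id": the resource to update.
        f'--{boundary}\r\nContent-Disposition: form-data; name="id"\r\n\r\n'
        f'{resource_id}\r\n'.encode("utf-8"),
        # Form field "upload": the replacement file.
        f'--{boundary}\r\nContent-Disposition: form-data; name="upload"; '
        f'filename="{filename}"\r\nContent-Type: text/csv\r\n\r\n'.encode("utf-8"),
        csv_bytes,
        f'\r\n--{boundary}--\r\n'.encode("utf-8"),
    ])
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/3/action/resource_patch",
        data=body,
        method="POST",
        headers={
            "Authorization": api_token,
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


# Placeholder values for illustration only; nothing is sent here.
request = build_resource_patch_request(
    "https://ckan.example.org",
    "hypothetical-resource-id",
    "hypothetical-api-token",
    "population_estimates.csv",
    b"year,community,population_estimate\n2025,Dawson City,2409\n",
)
```

A scheduled job could pass the returned object to `urllib.request.urlopen` to perform the update. Reusing the same filename on each run keeps existing links to the data URL working, as recommended above.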

Questions or feedback

If you have questions or feedback on these specifications, contact [email protected].