Data File Specifications

File specifications for providing data to WindESCo

Overview

This document defines the CSV format for data files transferred to WindESCo.

Other formats are acceptable, but WindESCo may add a surcharge to transform the data into the required format.

Wide and Narrow Formats

Data files may use one of two layouts: wide or narrow.

In wide format, each row begins with a timestamp, and each subsequent column has a signal value that corresponds to that timestamp. This format corresponds well with a database table. For example:

ts,signal1,signal2,signal3
2020-02-01T00:00:00Z,1,2,3
2020-02-01T00:00:01Z,4,5,6
2020-02-01T00:00:02Z,7,,9
...


In narrow format, each row contains a timestamp, a signal identifier (e.g. tag name), and a value. This format allows each signal to be time-stamped independently. When exporting from OSIsoft PI systems, a narrow format is typically used. For example:

ts,signal_name,value
2020-02-01T00:00:00Z,signal1,1
2020-02-01T00:00:00Z,signal2,2
2020-02-01T00:00:00Z,signal3,3
2020-02-01T00:00:01Z,signal1,1
2020-02-01T00:00:01Z,signal2,1
2020-02-01T00:00:01Z,signal3,4
...
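To illustrate how the two layouts relate, the following is a minimal sketch that pivots narrow-format text into wide-format text using only the Python standard library. The function name narrow_to_wide is hypothetical, and the column names ts, signal_name and value are taken from the example above. It assumes all timestamps use the same ISO 8601 UTC form, so that sorting them as strings also sorts them chronologically. A signal with no value at a given timestamp is left as an empty field.

```python
import csv
import io


def narrow_to_wide(narrow_text):
    """Pivot narrow-format CSV text (ts,signal_name,value) into wide-format CSV text."""
    reader = csv.DictReader(io.StringIO(narrow_text))
    rows = {}      # timestamp -> {signal_name: value}
    signals = []   # signal names in first-seen order
    for rec in reader:
        rows.setdefault(rec["ts"], {})[rec["signal_name"]] = rec["value"]
        if rec["signal_name"] not in signals:
            signals.append(rec["signal_name"])

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["ts"] + signals)
    # Assumption: identically formatted UTC timestamps sort chronologically as strings.
    for ts in sorted(rows):
        writer.writerow([ts] + [rows[ts].get(name, "") for name in signals])
    return out.getvalue()
```

For large exports, a streaming or database-backed pivot would likely be more appropriate than holding every row in memory as this sketch does.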


Format Requirements

Comma delimiter

Data elements and headers shall be separated by commas.

Headers

There shall be exactly one row of headers, with each header describing the column below it. No additional metadata shall be included anywhere in the file. Headers may optionally be enclosed in quotes.

Timestamps

Date and time shall be combined into a single column in ISO 8601 format, e.g. 2021-02-09T14:32:35Z. Timestamps shall have a resolution of 1 second or finer: millisecond resolution is acceptable, but 1-minute resolution is not. Timestamps shall be in UTC. Timestamps may optionally be enclosed in quotes.
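As an illustration of the timestamp rules, here is a minimal sketch that converts a timezone-aware Python datetime to a UTC ISO 8601 string. The function name to_utc_iso8601 and its milliseconds parameter are hypothetical. The sketch rejects naive datetimes, on the assumption that the source time zone must be known before a value can be converted to UTC.

```python
from datetime import datetime, timedelta, timezone


def to_utc_iso8601(dt, milliseconds=False):
    """Format a timezone-aware datetime as a UTC ISO 8601 string ending in 'Z'."""
    if dt.tzinfo is None:
        raise ValueError("naive datetime: the source time zone must be known")
    utc = dt.astimezone(timezone.utc)
    text = utc.strftime("%Y-%m-%dT%H:%M:%S")
    if milliseconds:
        # Truncate (not round) microseconds to millisecond resolution.
        text += f".{utc.microsecond // 1000:03d}"
    return text + "Z"


# Usage: a local reading taken at 09:32:35 in a UTC-05:00 zone.
local = datetime(2021, 2, 9, 9, 32, 35, tzinfo=timezone(timedelta(hours=-5)))
stamp = to_utc_iso8601(local)
```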

Number format

Numbers shall be formatted with no thousands separator, and the decimal separator shall be the period character, not the comma. Numbers shall not be enclosed in quotes.

No strings in data

There shall be no strings in the data. For example, turbine state should be a number, not a name. Bad-quality data shall be omitted rather than replaced with a string such as “bad quality”.
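The number-format and no-strings rules can be checked field by field. The following is a minimal sketch, not an official validator: the name is_valid_value is hypothetical, and it is applied to the raw field text before any CSV quote handling. It treats an empty field as omitted data, and it accepts scientific notation such as 1.5e3, which is an assumption because this document does not address that notation.

```python
import re

# Optional sign, digits with an optional period as decimal separator,
# and an optional exponent (accepting the exponent is an assumption).
_NUMBER = re.compile(r"[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?")


def is_valid_value(field):
    """Return True if a raw data field is empty (omitted) or a plain unquoted number."""
    return field == "" or _NUMBER.fullmatch(field) is not None
```

Under these assumptions, values such as 1,234.5, 3,14, "5" (quoted) and Running are all rejected, while 3.14, -7 and an empty field are accepted.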

Data shape

Each row of the file shall have the same number of columns, including the header row.
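A quick way to check this rule before sending a file is to compare every row's column count with the header's. The sketch below uses the standard csv module; the function name find_ragged_rows is hypothetical. An empty field, as in 7,,9, still counts as a column.

```python
import csv
import io


def find_ragged_rows(csv_text):
    """Return (line_number, column_count) for each row whose width differs from the header."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header is None:
        return []  # empty input: nothing to compare
    return [
        (reader.line_num, len(row))
        for row in reader
        if len(row) != len(header)
    ]
```

An empty result means every row has the same number of columns as the header row.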

Data partitioning

Each file shall contain all signals for a single turbine. No file shall exceed 1 GB. If the data for the time range exceeds this limit, multiple files shall be provided.
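One possible way to stay under the size limit is to split a turbine's rows into several chunks, each written to its own file with the header row repeated. The sketch below is illustrative only: the name split_rows is hypothetical, and the default limit of 1,000,000,000 bytes is an assumption, since this document does not say whether 1 GB means 10^9 or 2^30 bytes.

```python
def split_rows(header, rows, max_bytes=1_000_000_000):
    """Yield lists of CSV lines (header first) whose UTF-8 size stays within max_bytes.

    Assumption: each line is written with a single trailing newline.
    """
    header_size = len((header + "\n").encode("utf-8"))
    chunk, size = [header], header_size
    for row in rows:
        row_size = len((row + "\n").encode("utf-8"))
        # Start a new chunk if this row would push the current one over the limit.
        if size + row_size > max_bytes and len(chunk) > 1:
            yield chunk
            chunk, size = [header], header_size
        chunk.append(row)
        size += row_size
    if len(chunk) > 1:
        yield chunk
```

Each yielded chunk can then be written to a separate file for the same turbine. Because rows is consumed lazily, it can be an open file object or generator rather than a list held in memory.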