Version: v2

Data rate estimation and cache sizing

OIBus sends values to a target application via North connectors (OIConnect, OIAnalytics...). There are two sending modes:

  • through a file with a files' endpoint
  • through JSON payloads with a values' endpoint.

The volume to plan for can be estimated from the data to be sent and the sending mode selected. These estimates can also be used to size the cache storage that store and forward needs in order to operate reliably.

This section gives some hints on how to estimate the cache size.

Sending files (CSV)

We will focus on data in the form of CSV files. In this case the volume will depend on several parameters:

  • The data sampling frequency
  • The file sending frequency
  • The timestamp format
  • The data format: number of characters used (precision)
  • The size of data references
  • The file format: in lines or in columns

In the following examples, we calculate how much space a CSV file generated by OIBus takes. We make the following assumptions:

  • The sampling frequency: one point per minute.
  • The frequency of sending the file: one file every 30 minutes.
  • The timestamp format: ISO 8601 format, 24 bytes in size.
  • Data format: 3 digits with a separator for the decimal places. Therefore, the data in the following examples have a size of 4 bytes.
  • The size of the point ID (data reference): DataXXX, where XXX represents three numeric characters. Therefore, the references in the following examples have a size of 7 bytes.

Column files

This format is particularly suitable for data that share the same timestamps. It saves space compared to the row format.

Timestamp                  Data001   Data002   Data003
2020-02-01T20:04:00.000Z   12.0      10.0      10.0
2020-02-01T20:05:00.000Z   10.0      19.0      10.0
2020-02-01T20:06:00.000Z   10.0      10.0      14.0
...

The size of the header is 10 + 1 + 7 + 1 + 7 + 1 + 7 + 1 = 35 bytes.

The size of one line is 24 + 1 + 4 + 1 + 4 + 1 + 4 + 1 = 40 bytes (column separators and newlines are taken into account).

The number of lines depends on the sampling frequency of the data, here one line every minute. A file sent every 30 minutes therefore has a size of 35 + 40 × 30 = 1235 bytes. Over a day, there will be 48 files, for a total of 59,280 bytes or 58 kB.
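The calculation above can be sketched as a small function. This is only an illustration under the assumptions of this section; the function and parameter names are hypothetical and not part of OIBus, and the header size is passed in as computed in the text.

```python
# Sizes in bytes, taken from the assumptions listed in this section.
TIMESTAMP_SIZE = 24  # ISO 8601 timestamp
VALUE_SIZE = 4       # e.g. "12.0"
SEPARATOR_SIZE = 1   # column separator or newline


def column_file_size(points: int, rows: int, header_size: int) -> int:
    """Estimate the size of one column-format CSV file.

    points: number of data references (one column each)
    rows: number of timestamps written to the file
    header_size: size of the header line, as computed in the text
    """
    line_size = TIMESTAMP_SIZE + SEPARATOR_SIZE + points * (VALUE_SIZE + SEPARATOR_SIZE)
    return header_size + line_size * rows


file_size = column_file_size(points=3, rows=30, header_size=35)
daily_size = file_size * 48  # one file every 30 minutes
print(file_size, daily_size, round(daily_size / 1024))  # 1235 59280 58
```

Changing `points` or `rows` shows how the volume scales with the number of references and the sending frequency.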

Row files

This format is particularly suitable when the different data transmitted do not have the same sampling frequency. In the example we assume that all data has the same sample rate.

Timestamp                  Reference   Value
2020-02-01T20:04:00.000Z   Data001     12.0
2020-02-01T20:04:00.000Z   Data002     10.0
2020-02-01T20:04:00.000Z   Data003     10.0
2020-02-01T20:05:00.000Z   Data001     10.0
2020-02-01T20:05:00.000Z   Data002     19.0
2020-02-01T20:05:00.000Z   Data003     10.0
2020-02-01T20:06:00.000Z   Data001     10.0
2020-02-01T20:06:00.000Z   Data002     10.0
2020-02-01T20:06:00.000Z   Data003     14.0
...

The size of the header is 10 + 1 + 9 + 1 + 6 + 1 = 28 bytes. The size of a line is 24 + 1 + 7 + 1 + 4 + 1 = 38 bytes (column separators and newlines are taken into account).

The number of lines depends on the sampling frequency of the data and the number of references, here one line every minute multiplied by 3 references (which makes 3 lines per minute). A file sent every 30 minutes therefore has a size of 28 + 38 × 30 × 3 = 3448 bytes. Over a day, there will be 48 files, for a total of 165,504 bytes or 162 kB.
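As a rough sketch of the same arithmetic, the function below takes the number of samples per file for each reference, which also covers the case where references do not share the same sampling frequency. The names are hypothetical and the default sizes are the ones assumed in this section.

```python
def row_file_size(
    samples_per_reference: list[int],
    header_size: int = 28,    # header size as computed in the text
    timestamp_size: int = 24,
    reference_size: int = 7,  # e.g. "Data001"
    value_size: int = 4,      # e.g. "12.0"
    separator_size: int = 1,  # column separator or newline
) -> int:
    """Estimate the size of one row-format CSV file.

    samples_per_reference holds, for each reference, the number of
    samples written to the file (one line per sample).
    """
    line_size = (
        timestamp_size + separator_size
        + reference_size + separator_size
        + value_size + separator_size
    )
    return header_size + line_size * sum(samples_per_reference)


# Three references, each sampled once per minute, one file every 30 minutes.
file_size = row_file_size([30, 30, 30])
daily_size = file_size * 48
print(file_size, daily_size, round(daily_size / 1024))  # 3448 165504 162
```

Passing different counts per reference, for example `row_file_size([30, 6, 1])`, gives the estimate for mixed sampling frequencies.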

Column row files

This format keeps the compactness of the column file while grouping the data identifiers (001, 002, 003) under a shared reference. Several references can coexist in the same file, which is not the case here since only Data is used. Combining the reference with each identifier yields the full references Data001, Data002 and Data003.

Timestamp                  Reference   001    002    003
2020-02-01T20:04:00.000Z   Data        12.0   10.0   10.0
2020-02-01T20:05:00.000Z   Data        10.0   19.0   10.0
2020-02-01T20:06:00.000Z   Data        10.0   10.0   14.0
...

The size of the header is 10 + 1 + 9 + 1 + 3 + 1 + 3 + 1 + 3 + 1 = 33 bytes.

The size of a line here is 24 + 1 + 4 + 1 + 4 + 1 + 4 + 1 + 4 + 1 = 45 bytes (column separators and newlines are taken into account).

The number of lines depends on the sampling frequency of the data and the number of references, here one line every minute multiplied by one reference (which makes one line per minute). A file sent every 30 minutes therefore has a size of 33 + 45 × 30 = 1383 bytes. Over a day, there will be 48 files, for a total of 66,384 bytes or 65 kB.
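The same estimate can be written as a short function. This is a sketch with hypothetical names, assuming a single shared reference of 4 bytes ("Data") and the sizes used in this section.

```python
def column_row_file_size(
    identifiers: int,
    rows: int,
    header_size: int = 33,    # header size as computed in the text
    timestamp_size: int = 24,
    reference_size: int = 4,  # shared reference, here "Data"
    value_size: int = 4,      # e.g. "12.0"
    separator_size: int = 1,  # column separator or newline
) -> int:
    """Estimate the size of one file mixing the row and column formats."""
    line_size = (
        timestamp_size + separator_size
        + reference_size + separator_size
        + identifiers * (value_size + separator_size)
    )
    return header_size + line_size * rows


file_size = column_row_file_size(identifiers=3, rows=30)
daily_size = file_size * 48  # one file every 30 minutes
print(file_size, daily_size, round(daily_size / 1024))  # 1383 66384 65
```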

Sending values (JSON payload)

When values are retrieved by the North connector and sent to a values' endpoint (OIConnect or OIAnalytics), they are formatted in an array like this:

[
{"timestamp": "2020-02-01T20:04:00.000Z", "pointId":"Data001", "data": {"value": "12.0", "quality": "192"}},
{"timestamp": "2020-02-01T20:04:00.000Z", "pointId":"Data002", "data": {"value": "10.0", "quality": "192"}},
{"timestamp": "2020-02-01T20:04:00.000Z", "pointId":"Data003", "data": {"value": "10.0", "quality": "192"}}
]

Each field has the following meaning:

  • timestamp: indicates the timestamp of the value in ISO 8601 format
  • pointId: reference of the value
  • data: JSON object containing the recorded value (value) and the quality (quality)

We will focus on data sent as JSON payloads. In this case the size depends on several parameters:

  • The data sampling frequency
  • The number of points grouped per send (defined by Group Count)
  • The sending frequency (defined by Send Interval)
  • The format of data and quality: number of characters used (precision)
  • The size of the data references

It is then possible to estimate the space occupied by a value.

  • The timestamp field takes 39 bytes ("timestamp": "2020-02-01T20:00:00.000Z")
  • The pointId field has the form "pointId": "DataXXX", i.e. 13 bytes plus the number of bytes of the reference (here the 7 bytes of DataXXX)
  • The data field takes 10 bytes ("data": {...}) plus its content:
    • The value field has the form "value": "10.0", i.e. 11 bytes plus the variable number of bytes used to encode the value (here 4 bytes)
    • The quality field has the form "quality": "192", i.e. 13 bytes plus the variable number of bytes used to encode the quality (here 3 bytes)

Hence, the size of the object representing a value can be broken down into:

  • The constant object size: 39 + 13 + 10 + 11 + 13 + 6 = 92 bytes (6 corresponding to the separators of the different elements: commas...)
  • The size of the reference: 7 bytes
  • The size of the value: 4 bytes
  • The size of the quality: 3 bytes

The size of a single object to be sent is therefore 92 + 7 + 4 + 3 = 106 bytes, for a single value.
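The breakdown above can be sketched as follows. The constant and function names are hypothetical, and the figures are the ones counted in this section rather than measured on a real payload, so the actual size may differ by a few bytes depending on the whitespace emitted by the serializer.

```python
# Fixed part of one JSON object, in bytes, as counted in the text.
TIMESTAMP_FIELD = 39     # "timestamp": "2020-02-01T20:00:00.000Z"
POINT_ID_OVERHEAD = 13   # "pointId": "" without the reference itself
DATA_OVERHEAD = 10       # "data": {}
VALUE_OVERHEAD = 11      # "value": "" without the value itself
QUALITY_OVERHEAD = 13    # "quality": "" without the quality itself
SEPARATORS = 6           # commas, braces...

CONSTANT_SIZE = (
    TIMESTAMP_FIELD + POINT_ID_OVERHEAD + DATA_OVERHEAD
    + VALUE_OVERHEAD + QUALITY_OVERHEAD + SEPARATORS
)


def json_object_size(reference: str, value: str, quality: str) -> int:
    """Estimate the size of the JSON object carrying a single value."""
    return CONSTANT_SIZE + len(reference) + len(value) + len(quality)


print(CONSTANT_SIZE)                               # 92
print(json_object_size("Data001", "12.0", "192"))  # 106
```

Longer references or higher-precision values increase the size of every object by the same number of bytes.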

With 3 data points sampled once per minute, Group Count set to 1000 and Send Interval set to 1000 ms, OIBus transmits one JSON payload every minute containing 3 values, i.e. 3 × 106 = 318 bytes.

Over one day, this will represent 318 x 24 x 60 = 457,920 bytes, or 447 kB.
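The daily volume can be sketched as a function of the number of points and the sampling rate. The names are hypothetical; the 106-byte object size is the estimate derived above.

```python
def daily_json_volume(points: int, samples_per_day: int, object_size: int = 106) -> int:
    """Estimate the bytes sent per day through a values endpoint.

    points: number of data references
    samples_per_day: samples taken per reference and per day
    object_size: estimated size of one JSON object, in bytes
    """
    return points * samples_per_day * object_size


# 3 points sampled once per minute.
volume = daily_json_volume(points=3, samples_per_day=24 * 60)
print(volume, round(volume / 1024))  # 457920 447
```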

Comparison

Under the conditions defined in the example, the transmission mode and the data format have a significant impact on the transmitted volume. The impact becomes even more critical when the number of data points and their sampling frequency are higher than in this example.

                CSV columns   CSV rows     CSV rows + columns   JSON payload
Sent per day    58 kB         162 kB       65 kB                447 kB
Size per value  13.7 bytes    38.3 bytes   15.4 bytes           106 bytes
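These daily volumes can serve as a starting point for cache sizing. The sketch below assumes that the cache must hold everything produced while the target application is unreachable; the outage duration and the safety factor are hypothetical inputs chosen for illustration, not OIBus settings.

```python
import math

# Daily volumes in bytes, taken from the examples in this section.
DAILY_BYTES = {
    "CSV columns": 59_280,
    "CSV rows": 165_504,
    "CSV rows + columns": 66_384,
    "JSON payload": 457_920,
}


def cache_size_bytes(daily_bytes: int, outage_days: float, safety_factor: float = 1.5) -> int:
    """Estimate the cache needed to cover an outage of `outage_days`.

    safety_factor is an arbitrary margin (assumption), not an OIBus parameter.
    """
    return math.ceil(daily_bytes * outage_days * safety_factor)


for mode, daily in DAILY_BYTES.items():
    size = cache_size_bytes(daily, outage_days=7)
    print(f"{mode}: {size / 1024:.0f} kB")
```

The estimate scales linearly with the outage duration, so the sending mode with the smallest daily volume also needs the smallest cache.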