How is data processed historically?
- Last Updated: Apr 10, 2025
Samples and streams
Any individual value that may be historized is stored in a sample.
Each sample contains:
- Value: The data value
- Time Stamp: Time at which the value applies
- Quality: A quality indication of the value
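As a rough illustration of the three components listed above, a sample could be modeled as a small immutable record. This is only a sketch: the class name, the field names, the numeric time stamp, and the "Good" quality label are assumptions made for the example, not the product's actual types.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)  # frozen: a sample is treated as read-only once created
class Sample:
    value: Any              # the data value
    timestamp: float        # time at which the value applies (assumed numeric)
    quality: str = "Good"   # quality indication (label is hypothetical)


# Example: a Boolean value that applies from time 10.0 onward.
s = Sample(value=True, timestamp=10.0, quality="Good")
```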
In the previous real-time versions of AVEVA™ Production Management, samples were used to store values; however, no real use was made of their time stamp. It is this component of the sample that is critical to the operation of historical processing.
If a value is required to be processed historically, then the sample that contains that value and its associated time stamp is inserted into a stream. A stream can contain many different values for a given item property, with each value stored in a sample, and each sample relating to a different time.
In the following diagram, the stream contains the values of a Boolean ScadaVariable called Variable. This stream is associated with the Samples property of the Variable item.
The diagram shows that the value of the variable changes several times over the period of the stream. Note that only the value component of each sample is shown.
Notes about streams
There are several things to note about streams.
- They have a definite start. As a rule, the samples in the stream occur on or after the start of the stream; however, there may be a single sample with a time stamp before the start of the stream. This sample is required so that the value can be determined between the start of the stream and the time of the next sample.
- They have a definite end. There is no sample on or after the end time of the stream.
- Samples that exist in a stream cannot be modified.
- New samples can only be written after the end of the stream. As new samples are written to the end of the stream, the stream end time is updated to the end of the update period.
- Generally, samples are only stored in the stream if their value or quality has changed. This means that a sample is expected to be current from the time of its time stamp until the time stamp of the next sample.
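The rules above can be sketched as a small append-only container. This is a minimal sketch under stated assumptions, not the product's implementation: the `Stream` class, its method names, and the `period_end` parameter are hypothetical, and "after the end of the stream" is read here as "at or after the current end time", because the end time itself never holds a sample.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass(frozen=True)
class Sample:
    value: Any
    timestamp: float
    quality: str = "Good"  # hypothetical quality label


class Stream:
    """Append-only stream covering the half-open period [start, end)."""

    def __init__(self, start: float, initial: Optional[Sample] = None):
        self.start = start
        self.end = start
        self._samples: List[Sample] = []
        if initial is not None:
            # At most one sample may precede the start, so the value is
            # known between the start and the first regular sample.
            if initial.timestamp >= start:
                raise ValueError("initial sample must precede the stream start")
            self._samples.append(initial)

    def append(self, sample: Sample, period_end: float) -> bool:
        """Write a sample and move the end to period_end.

        Returns True if the sample was stored, False if it was skipped
        because neither value nor quality changed.
        """
        if sample.timestamp < self.end:
            raise ValueError("samples can only be written after the stream end")
        if period_end <= sample.timestamp:
            raise ValueError("no sample may sit on or after the stream end")
        self.end = period_end
        last = self._samples[-1] if self._samples else None
        if last is not None and (last.value, last.quality) == (sample.value, sample.quality):
            return False  # unchanged: the previous sample is still current
        self._samples.append(sample)
        return True

    def sample_at(self, t: float) -> Optional[Sample]:
        """Return the sample that is current at time t, or None if unknown."""
        if not (self.start <= t < self.end):
            return None
        times = [s.timestamp for s in self._samples]
        i = bisect_right(times, t)
        return self._samples[i - 1] if i else None


stream = Stream(start=0.0, initial=Sample(False, -5.0))
stored_first = stream.append(Sample(True, 2.0), period_end=5.0)    # value changed
stored_second = stream.append(Sample(True, 6.0), period_end=10.0)  # unchanged
```

In this sketch the second write is not stored, yet the stream end still advances to 10.0, so the value written at time 2.0 remains current for the whole period up to the new end.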
Understand dependencies
The contents of historical streams can come from several different sources. An obvious case is connectors that source data from external systems such as SCADA. Streams within AVEVA™ Production Management can also have their contents derived from, or based on, the contents of other streams within AVEVA™ Production Management. The simplest example of stream contents being based on the contents of one or more other streams is an expression.
Consider a calculated variable whose resulting value is the addition of two variables that source data from an external system. This is shown in the following diagram.
In this example, there is one calculated result sample for each unique sample time across the two input streams.
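The addition can be sketched as follows, assuming each stream is a time-ordered list of (time stamp, value) pairs and that a value stays current until the next sample. The function names and the sample values are invented for illustration; they are not taken from the diagram.

```python
from bisect import bisect_right


def value_at(samples, t):
    """Return the value current at time t, holding the last sample's value."""
    times = [ts for ts, _ in samples]
    i = bisect_right(times, t)
    return samples[i - 1][1] if i else None


def add_streams(a, b):
    """Produce one result sample per unique sample time in either input."""
    unique_times = sorted({ts for ts, _ in a} | {ts for ts, _ in b})
    result = []
    for t in unique_times:
        va, vb = value_at(a, t), value_at(b, t)
        if va is None or vb is None:
            continue  # one input has no value yet at this time
        result.append((t, va + vb))
    return result


# Illustrative data only.
variable1 = [(0, 1), (4, 3), (9, 2)]
variable2 = [(0, 10), (6, 20)]
result = add_streams(variable1, variable2)
# result == [(0, 11), (4, 13), (6, 23), (9, 22)]
```

The inputs have three and two samples but four unique sample times, so the result has four samples.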
Since the resulting stream is calculated from the data in both input streams, data must be available in both streams for the period being processed before the resulting stream can be calculated. This makes the two variable streams "dependencies" of the resulting stream. This type of dependency is managed by the Dependency Manager, whose job is to ensure that data is only processed for a period when the necessary inputs are available for that period.
Perhaps this is better understood by stepping through a typical scenario. Consider the expression as defined previously. It has processed until time 10.
Subsequent polling of the underlying data source returns data for Variable2. The data returned is for the period from time 10 to 16. This is appended to the end of the Variable2.Samples stream.
Subsequent polling of the underlying data source returns data for Variable1. The data returned is for the period from time 10 to 20. This is appended to the end of the Variable1.Samples stream.
The Dependency Manager knows that the result stream requires data from both the Variable1.Samples and Variable2.Samples streams, so it examines both streams to find a period where they both have new and unprocessed data.
The result expression is then permitted to execute over the common period, generating result data that is then appended to the end of the Result.Samples stream.
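The common period in this scenario can be worked out with a one-line rule: processing may continue from the last processed time up to the earliest end time among the inputs. The sketch below assumes that rule; the function name and the dictionary layout are hypothetical.

```python
def common_period(processed_until, input_ends):
    """Return the (start, end) period all inputs can supply, or None."""
    end = min(input_ends.values())  # limited by the input that ends earliest
    if end <= processed_until:
        return None  # at least one input has no new data yet
    return (processed_until, end)


# Values from the scenario: processed until time 10,
# Variable2 now has data until 16 and Variable1 until 20.
period = common_period(
    10, {"Variable1.Samples": 20, "Variable2.Samples": 16}
)
# period == (10, 16)
```

The data for Variable1 between time 16 and time 20 stays unprocessed until Variable2 catches up.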
There are several things to note about dependencies and the Dependency Manager.
- Each set of dependencies is managed independently. This means that the period that each set of dependencies is permitted to process may be different.
- The Dependency Manager ensures that, for a set of dependencies, time only ever moves forward. That is, a new period that is processed follows on from the last period that was processed.
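Both notes can be sketched by giving each set of dependencies its own "processed until" marker that only ever advances. The `DependencySet` class and the `Variable3.Samples` stream are assumptions made for the example and do not reflect the Dependency Manager's actual design.

```python
class DependencySet:
    """Tracks how far one set of dependencies has been processed."""

    def __init__(self, inputs, processed_until):
        self.inputs = list(inputs)
        self.processed_until = processed_until

    def next_period(self, stream_ends):
        """Return the next period to process and advance, or None."""
        end = min(stream_ends[name] for name in self.inputs)
        if end <= self.processed_until:
            return None  # nothing new that all inputs share
        period = (self.processed_until, end)
        self.processed_until = end  # time only ever moves forward
        return period


# Current end times of the input streams (Variable3 is hypothetical).
ends = {"Variable1.Samples": 20, "Variable2.Samples": 16, "Variable3.Samples": 12}

result_set = DependencySet(["Variable1.Samples", "Variable2.Samples"], 10)
other_set = DependencySet(["Variable3.Samples"], 8)

first = result_set.next_period(ends)    # (10, 16)
other = other_set.next_period(ends)     # (8, 12), independent of result_set
repeat = result_set.next_period(ends)   # None until Variable2 has more data

ends["Variable2.Samples"] = 25
second = result_set.next_period(ends)   # (16, 20), follows on from the first
```

Each set processes a different period, and the second period for the result set starts exactly where the first one ended.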