UniInt failover configuration overview

Last updated: Apr 03, 2023

PI Universal Interface UniInt Framework 4.7.3

UniInt failover minimizes data loss by enabling a backup interface instance to collect data using a shared file if the primary interface instance fails. UniInt failover is designed to handle a single point of failure: either the interface fails to connect to the data source, or the interface fails to connect to Data Archive.

To configure failover, you create instances of the same interface on two different computers. Failover configuration modes are hot, warm, or cold.

Note: Synchronization through the data source (phase 1 failover) is now deprecated and is not recommended. To migrate to shared file (phase 2) failover, see Convert from phase 1 to phase 2 failover.

Shared file failover operation

Shared-file UniInt failover uses PI points and a shared file to coordinate failover operation. Status information from the points is maintained in a shared file, removing the requirement for a Data Archive connection after the instances are running. If the shared file cannot be accessed, the interface instances read status information from Data Archive if it is available. The first interface instance to start (IF-Node1) acts as the primary interface instance, and the second interface instance (IF-Node2) acts as the backup. Either interface can be the primary.

Failover seeks to keep a running instance of the interface connected to the data source using the following process:

If IF-Node1 fails, IF-Node2 becomes the primary and takes over transmitting data from the data source to Data Archive.
If IF-Node1 is restored and IF-Node2 then fails, IF-Node1 becomes the primary again.
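
The takeover sequence above can be sketched as a small role-selection function. This is only an illustration of the described behavior, not UniInt code: the function name `next_primary` and the `healthy` mapping are hypothetical, and "healthy" stands in for whatever checks the interface actually performs.

```python
def next_primary(current_primary: str, healthy: dict[str, bool]) -> str:
    """Return the node that should be primary after one evaluation.

    Hypothetical sketch: the role moves only when the current primary
    has failed and the other node is healthy. A restored node does not
    reclaim the primary role on its own.
    """
    other = "IF-Node2" if current_primary == "IF-Node1" else "IF-Node1"
    if not healthy[current_primary] and healthy[other]:
        return other
    return current_primary


# Walk through the sequence described above.
primary = "IF-Node1"
# IF-Node1 fails, so IF-Node2 takes over.
primary = next_primary(primary, {"IF-Node1": False, "IF-Node2": True})
after_first_failure = primary
# IF-Node1 is restored; IF-Node2 keeps the primary role.
primary = next_primary(primary, {"IF-Node1": True, "IF-Node2": True})
after_restore = primary
# IF-Node2 then fails, so IF-Node1 becomes the primary again.
primary = next_primary(primary, {"IF-Node1": True, "IF-Node2": False})
after_second_failure = primary
```

In this sketch the role changes only on a failure of the current primary, which matches the sequence above: IF-Node1 regains the primary role only after IF-Node2 fails.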

If neither IF-Node1 nor IF-Node2 can connect to Data Archive or the shared file, IF-Node1 remains the primary in the "Primary Error" state and IF-Node2 remains the backup in the "Backup Error" state. Data loss is possible if IF-Node1 loses its connection to the data source. The interface with the status "Backup Error" does not collect data and cannot become the primary unless it reconnects to either Data Archive or the shared file.

In a hot failover configuration, each interface instance queues three failover intervals' worth of data to prevent any data loss. When failover occurs, data for up to three intervals might overlap. The exact amount of overlap is determined by the timing and the cause of the failover. For example, if the update interval is five seconds, data can overlap between 0 and 15 seconds.
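
The overlap bound in the example above is simple arithmetic: three queued intervals multiplied by the update interval. A minimal sketch, assuming the interval is given in seconds (the function name `max_overlap_seconds` is hypothetical):

```python
def max_overlap_seconds(update_interval_s: float, queued_intervals: int = 3) -> float:
    """Upper bound on overlapping data after a hot failover.

    The default of three queued intervals comes from the text above;
    the actual overlap can be anywhere between 0 and this bound.
    """
    if update_interval_s <= 0:
        raise ValueError("update interval must be positive")
    return queued_intervals * update_interval_s


# Five-second update interval, as in the example above.
bound = max_overlap_seconds(5)
```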

For more information about UniInt failover scenarios, see UniInt failover scenarios.

Failover with disconnected startup

If the interface instances are configured to use disconnected startup, they can start and trigger failover to continue collecting data even if Data Archive is unavailable, as long as they both have access to the shared file. If the interface instances have access to neither Data Archive nor the shared file, they both enter the "Backup No PI" state until they reconnect and establish the primary and backup roles. If buffering has been enabled, the primary interface instance then sends any data that was collected.

Failover status and keywords

During normal operation, IF-Node1 collects data from the data source and sends it to Data Archive. The interface instances read from and write to the shared file to update status information. To identify failover attributes that are written to the shared file, the interface uses the heartbeat, active ID, and failover ID failover keywords, which are defined in the extended descriptor (exdesc) attribute.

Both IF-Node1 and IF-Node2 perform the following operations simultaneously:

Update their own heartbeat value at a configured interval
Monitor the heartbeat value and device status for the other instance
Check the active ID value in the shared file

Normal operation continues as long as the following conditions are met:

The heartbeat value for the primary interface indicates that it is running.
The active ID point has not been changed manually.
The device status on the primary interface is good.
The active ID keyword and its corresponding entry in the shared file are set to the failover ID of the primary interface instance.
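
The four conditions above can be read as a single predicate that must hold for normal operation to continue. The following is a sketch only: the `FailoverSnapshot` type and its field names are hypothetical and do not correspond to actual UniInt points or shared-file fields.

```python
from dataclasses import dataclass


@dataclass
class FailoverSnapshot:
    """Hypothetical view of the status values listed above."""
    primary_heartbeat_running: bool
    active_id_changed_manually: bool
    primary_device_status_good: bool
    active_id: int
    primary_failover_id: int


def normal_operation_continues(s: FailoverSnapshot) -> bool:
    """True only when all four listed conditions are met."""
    return (
        s.primary_heartbeat_running
        and not s.active_id_changed_manually
        and s.primary_device_status_good
        and s.active_id == s.primary_failover_id
    )


ok = FailoverSnapshot(True, False, True, active_id=1, primary_failover_id=1)
# Same values, but the active ID no longer matches the primary's failover ID.
mismatch = FailoverSnapshot(True, False, True, active_id=2, primary_failover_id=1)
```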

To indicate that it is running, each interface instance refreshes its heartbeat value by incrementing it at the rate specified by the failover update interval. The heartbeat value starts at 1 and increments until it reaches 15, at which point it resets to 1.

If the instance loses its connection to Data Archive, the value of the heartbeat cycles from 17 to 31. When the connection is restored, the heartbeat value returns to the range for a running interface. During a normal shutdown process, the heartbeat value is set to zero. For more information about failover status points, see Failover status points.
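
The heartbeat ranges described above can be illustrated with a small counter. This is a sketch under stated assumptions, not the interface's implementation. The function names are hypothetical. It also assumes that the disconnected range wraps from 31 back to 17 in the same way the connected range wraps from 15 to 1, and that a change in connection state restarts the counter at the low end of the new range. The text does not state either detail.

```python
SHUTDOWN = 0
CONNECTED_RANGE = (1, 15)      # connected to Data Archive
DISCONNECTED_RANGE = (17, 31)  # running without a Data Archive connection


def next_heartbeat(current: int, connected: bool) -> int:
    """Advance the heartbeat by one failover update interval."""
    low, high = CONNECTED_RANGE if connected else DISCONNECTED_RANGE
    if low <= current < high:
        return current + 1
    # At the top of the range, or arriving from another range: restart at low.
    return low


def describe_heartbeat(value: int) -> str:
    """Map a heartbeat value to the condition it indicates."""
    if value == SHUTDOWN:
        return "shutdown"
    if CONNECTED_RANGE[0] <= value <= CONNECTED_RANGE[1]:
        return "running"
    if DISCONNECTED_RANGE[0] <= value <= DISCONNECTED_RANGE[1]:
        return "running, no Data Archive connection"
    return "unknown"  # for example 16, which the text does not describe
```

A monitoring instance would compare successive values rather than a single reading, since the heartbeat indicates a running instance by changing at each interval.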

Failover with collectives

If your deployment has a PI collective and Data Archive sends outputs to the data source through the interface, you can use UniInt interface failover to ensure the availability of the Data Archive node that provides outputs. Each interface receives outputs from a specific Data Archive node in the collective. If that Data Archive node becomes unavailable, the interface no longer receives outputs. However, you can configure each interface to receive outputs from a different collective member. For more information on configuring interface outputs for UniInt failover in a PI collective, see the host parameter in the Standard parameters topic.

If the Data Archive node connected to the primary interface fails, the backup interface takes over, receives outputs from its collective member, and reports time-series data to the collective. The backup interface must remain connected to the data source to trigger this failover scenario.