AVEVA™ System Platform

Tuning Redundant Engine attributes

Last updated: Aug 14, 2025

Several variables (I/O points, number of objects, number of historized attributes, DIObject distribution) affect the detection and execution of a Redundant AppEngine failover. The following tables describe key Engine attribute values that can be modified to ensure proper failover performance.

AppEngine Object Settings

Parameter	Forced failover timeout
Editor Tab	Redundancy
Attribute	Redundancy.ForcedFailoverTimeout
Description	The maximum allowed time, in milliseconds, for a standby engine to become active after a forced failover has been initiated using the ForceFailoverCmd attribute. If the standby engine does not become active within this time period, the engine reverts to the active engine.
Default	90,000 ms (90 seconds)
Tuning	30,000 ms (less than 3,000 I/O); 45,000 to 240,000 ms (from 3,000 I/O to 40,000 I/O); 300,000 ms (more than 40,000 I/O)
Notes	I/O values represent the load on the individual AppEngine, not the Galaxy size. If the setting is too small, forced failover will not succeed. If the setting is too large, failure will not be detected in a timely manner. Tuning values represent a range that can be adjusted as required.

Parameter	Maximum checkpoint deltas buffered
Editor Tab	Not shown, edit Attribute value if necessary
Attribute	Redundancy.CheckpointDeltasBufferedMax
Description	The maximum number of checkpoint deltas that can be buffered before a full checkpoint synchronization is performed.
Default	0
Tuning	N/A
Notes	N/A

Parameter	Maximum alarm state changes buffered
Editor Tab	Parameter not shown, edit Attribute value if necessary
Attribute	Redundancy.AlarmStateChangesBufferedMax
Description	The maximum number of alarm state changes that can be buffered before a full snapshot of the alarm state changes for the engine is performed.
Default	0
Tuning	N/A
Notes	N/A

Parameter	Active engine heartbeat period
Editor Tab	Redundancy
Attribute	Redundancy.ActiveHeartbeatPeriod
Description	The time interval, in milliseconds, at which heartbeats are sent by the failover service on the active engine to the failover service on the standby engine via RMC.
Default	1000 ms (1 second)
Tuning	May be increased to avoid false failovers.
Notes	N/A

Parameter	Standby engine heartbeat period
Editor Tab	Redundancy
Attribute	Redundancy.StandbyHeartbeatPeriod
Description	The time interval, in milliseconds, at which heartbeats are sent by the failover service on the standby engine to the failover service on the active engine via RMC.
Default	1000 ms (1 second)
Tuning	May be increased to avoid false failovers.
Notes	N/A

Parameter	Maximum consecutive heartbeats missed from Active engine
Editor Tab	Redundancy
Attribute	Redundancy.ActiveHeartbeatsMissedConsecMax
Description	The maximum number of heartbeats from the active engine that can be missed before a bad connection is assumed by the standby engine via RMC. For example, if the maximum consecutive heartbeats missed from active engine is configured as 5, and the active engine heartbeat period is configured as 1000 milliseconds, then the standby engine will assume a bad connection from the active engine if no heartbeats are received within five seconds.
Default	5
Tuning	5 (less than 3,000 I/O); 10 to 30 (from 3,000 I/O to 40,000 I/O); ~60 (more than 40,000 I/O)
Notes	I/O values represent the load on the individual AppEngine, not the Galaxy size. Setting this value too low produces false failovers. Setting this value too high results in slow detection of a required failover.
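
The worked example in the description above reduces to a single multiplication. The following Python sketch makes that arithmetic explicit. The function name is hypothetical and is not part of any AVEVA API; it only illustrates how the heartbeat period and the missed-heartbeat limit combine into an approximate detection window.

```python
def detection_window_ms(heartbeat_period_ms: int, missed_max: int) -> int:
    """Approximate time without heartbeats before a bad connection is assumed.

    Hypothetical helper: multiplies the configured heartbeat period by the
    configured maximum number of consecutive missed heartbeats.
    """
    if heartbeat_period_ms <= 0 or missed_max <= 0:
        raise ValueError("period and missed-heartbeat limit must be positive")
    return heartbeat_period_ms * missed_max


# Defaults from the tables above: 1000 ms period, 5 missed heartbeats.
print(detection_window_ms(1000, 5))   # 5000 ms, i.e. five seconds

# Upper tuning value for 3,000 to 40,000 I/O, with the default period.
print(detection_window_ms(1000, 30))  # 30000 ms
```

Raising either the heartbeat period or the missed-heartbeat limit lengthens this window, which reduces false failovers but slows detection of a genuine failure.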

Parameter	Maximum consecutive heartbeats missed from Standby engine
Editor Tab	Redundancy
Attribute	Redundancy.StandbyHeartbeatsMissedConsecMax
Description	The maximum number of heartbeats from the standby engine that can be missed before a bad connection is assumed by the active engine. If a bad connection is detected, the active engine will switch to the "Active - Standby Not Available" state via RMC. For example, if the maximum consecutive heartbeats missed from the standby engine is configured as 5, and the standby engine heartbeat period is configured as 1000 milliseconds, then the active engine assumes a bad connection from the standby engine if no heartbeats are received within five seconds.
Default	5
Tuning	5 (less than 3,000 I/O); 10 to 30 (from 3,000 I/O to 40,000 I/O); ~60 (more than 40,000 I/O)
Notes	I/O values represent the load on the individual AppEngine, not the Galaxy size. Setting this value too low produces false failovers. Setting this value too high results in slow detection of a required failover.

Parameter	Maximum time to maintain good quality after failure
Editor Tab	Redundancy
Attribute	Redundancy.StandbyActivateTimeout
Description	The maximum time period, in milliseconds, after the active engine fails before subscribed references to it are set to "uncertain."
Default	15,000 ms (15 seconds)
Tuning	15,000 ms (less than 3,000 I/O); 120,000 ms (from 3,000 I/O to 40,000 I/O); 150,000 ms (more than 40,000 I/O)
Notes	I/O values represent the load on the individual AppEngine, not the Galaxy size. Assuming remote I/O, setting the value too low causes all I/O references to unsubscribe, then resubscribe on failover. The optimum setting ensures that remote I/O references are preserved for failover. This behavior also applies in the RDI Object context.
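
The tuning tiers for Redundancy.StandbyActivateTimeout can be expressed as a simple lookup on the I/O load of the individual AppEngine. The Python sketch below is illustrative only: the function name is hypothetical, and the returned values are the starting points listed in the table above, not values read from or written to a Galaxy.

```python
def suggested_standby_activate_timeout_ms(engine_io_count: int) -> int:
    """Return the tabulated starting value for Redundancy.StandbyActivateTimeout.

    engine_io_count is the I/O load on the individual AppEngine,
    not the size of the whole Galaxy.
    """
    if engine_io_count < 0:
        raise ValueError("I/O count cannot be negative")
    if engine_io_count < 3_000:
        return 15_000       # less than 3,000 I/O (same as the default)
    if engine_io_count <= 40_000:
        return 120_000      # from 3,000 I/O to 40,000 I/O
    return 150_000          # more than 40,000 I/O


for io_count in (1_500, 3_000, 25_000, 60_000):
    print(io_count, suggested_standby_activate_timeout_ms(io_count))
```

Treat the result as a first estimate to be adjusted for the actual system, since the optimum setting is the one that preserves remote I/O references across a failover.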

Parameter	Maximum time to discover partner
Editor Tab	Redundancy
Attribute	Redundancy.PartnerConnectTimeout
Description	The maximum time period, in milliseconds, allowed for the connection to the failover partner to be established before the failover partner state is set to "unknown."
Default	15,000 ms (15 seconds)
Tuning	N/A
Notes	N/A

Parameter	Restart engine when it fails
Editor Tab	Parameter not shown, can be viewed in Attribute tab
Attribute	Engine.RestartOnFailure
Description	The AppEngine object automatically attempts to restart if a failure occurs.
Default	True
Tuning	N/A
Notes	This behavior cannot be changed, even if the attribute is set to false.

Parameter	Checkpoint period
Editor Tab	General
Attribute	Scheduler.CheckpointPeriod
Description	Checkpointing saves run-time attribute values. The checkpoint period is the interval, in milliseconds, at which checkpointing is performed. The default checkpoint period is 10,000 ms. If set to 0, the checkpoint period defaults to the scan period, but checkpointing may occur at a slower rate because it runs as fast as possible as a background task. The minimum checkpoint interval for retentive attributes is 10,000 ms. Retentive attributes are defined as those attributes configured as calculated retentive, or object- or user-writeable. If the checkpoint period is set to less than 10,000 ms, retentive attributes will not be saved at every checkpoint. For example, if the checkpoint period is set to 4,000 ms, retentive attribute values will only be saved at every third checkpoint (4,000 x 3 = 12,000 ms). Retentive attributes retain the last value set during run time, and the run-time value is saved across redeployments. Non-retentive attributes revert to their configured values at redeployment.
Default	10,000 ms (10 seconds)
Tuning	10,000 ms (up to 3,000 I/O); 20,000 ms (up to 20,000 I/O); 60,000 ms (more than 20,000 I/O)
Notes	I/O values represent the load on the individual AppEngine, not the Galaxy size. Setting this value too low results in high resource usage. Setting this value too high means that if both partners fail, checkpointed data may not be current.
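
The retentive-attribute rule in the description above (a 4,000 ms checkpoint period means retentive values are saved at every third checkpoint) can be reproduced with a ceiling division. This Python sketch assumes that retentive attributes are saved at the first checkpoint that falls on or after the 10,000 ms minimum interval. That assumption matches the documented 4,000 ms example but is not otherwise confirmed here, and the function and constant names are hypothetical.

```python
import math

# Minimum checkpoint interval for retentive attributes, from the table above.
RETENTIVE_MIN_INTERVAL_MS = 10_000


def retentive_save_every_n_checkpoints(checkpoint_period_ms: int) -> int:
    """Number of checkpoints between saves of retentive attributes.

    A period of 0 (which defaults to the scan period) is not modeled here.
    """
    if checkpoint_period_ms <= 0:
        raise ValueError("checkpoint period must be positive for this sketch")
    return max(1, math.ceil(RETENTIVE_MIN_INTERVAL_MS / checkpoint_period_ms))


n = retentive_save_every_n_checkpoints(4_000)
print(n, n * 4_000)                                 # 3 12000, as in the example
print(retentive_save_every_n_checkpoints(10_000))   # 1: saved at every checkpoint
```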

WinPlatform Object Settings

Parameter	NMX heartbeat period
Editor Tab	General
Attribute	NetNMXHeartbeatPeriod
Description	The time interval, in milliseconds, at which heartbeats are sent to other platforms. Heartbeats will only be established between platforms if a publish/subscribe relationship exists between engines on the platforms. For example, if an engine on WinPlatformA is subscribed to data from an engine on WinPlatformB, then heartbeats will be sent between WinPlatformA and WinPlatformB. WinPlatformA will send heartbeats to WinPlatformB at the rate specified by the WinPlatformA NetNMXHeartbeatPeriod attribute. WinPlatformB will send heartbeats to WinPlatformA at the rate specified by the WinPlatformB NetNMXHeartbeatPeriod attribute.
Default	2,000 ms (2 seconds)
Tuning	Use the default value for a platform object with a low I/O count (up to 3,000).
Notes	I/O values represent the load on individual AppEngines, not the Galaxy size.

Parameter	Consecutive number of missed NMX heartbeats allowed
Editor Tab	General
Attribute	NetNMXHeartbeatsMissedConsecMax
Description	The maximum number of consecutive heartbeats that are allowed to be missed from a platform before a platform communication error is generated for that platform. For example, assume an engine on WinPlatformA is subscribed to data from an engine on WinPlatformB. If the NetNMXHeartbeatsMissedConsecMax attribute on WinPlatformB has a value of 5, then WinPlatformA will generate a platform communication error when it misses six consecutive heartbeats from WinPlatformB. If the NetNMXHeartbeatsMissedConsecMax attribute on WinPlatformA has a value of 2, then WinPlatformB will generate a platform communication error when it misses three consecutive heartbeats from WinPlatformA.
Default	3
Tuning	Small configurations (up to 10,000 I/O per engine): 3; larger configurations (more than 10,000 I/O per engine): 6
Notes	I/O values represent the load on individual AppEngines, not the Galaxy size. This value determines how many consecutive missed heartbeats trigger the redundant engine to act. Smaller values make the engines more sensitive to network failure. Larger values make the engines more tolerant of high CPU loads that can cause missed heartbeats. Specifying a value of 0 is not recommended, as this may trigger false communication errors that can degrade system performance.
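
In the examples in the description above, a limit of 5 produces an error on the sixth consecutive missed heartbeat, and a limit of 2 on the third. The Python sketch below captures that off-by-one relationship and estimates the resulting delay. The function names are hypothetical, and the time estimate assumes that heartbeats arrive exactly one period apart, which is a simplification.

```python
def misses_before_comm_error(missed_consec_max: int) -> int:
    """Consecutive missed heartbeats at which a communication error is generated."""
    if missed_consec_max < 0:
        raise ValueError("limit cannot be negative")
    # The configured number of misses is tolerated; the next one raises the error.
    return missed_consec_max + 1


def approx_time_to_comm_error_ms(heartbeat_period_ms: int, missed_consec_max: int) -> int:
    """Rough delay before the error, assuming evenly spaced heartbeats."""
    if heartbeat_period_ms <= 0:
        raise ValueError("heartbeat period must be positive")
    return heartbeat_period_ms * misses_before_comm_error(missed_consec_max)


print(misses_before_comm_error(5))             # 6, as in the WinPlatformB example
print(misses_before_comm_error(2))             # 3, as in the WinPlatformA example
print(approx_time_to_comm_error_ms(2_000, 3))  # 8000 ms with the default values
```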

Failover services communicate with each other over the RMC to determine the communication status between the two nodes. The status is determined by monitoring the heartbeat attributes.

Message Channel Heartbeat settings control the heartbeat intervals, that is, how often the redundant platforms send each heartbeat through the RMC.
