Advanced Configuration and Best Practice
- Last Updated: Jan 07, 2025
Key Processes
Information is managed by four key processes:
- Extract is responsible for extracting information, which may be either a delta or a full update, as appropriate for the interface.
- Capture manages the capturing of information and the tracking of changes at the interface level. The capture of information requires an ID for each record; if an ID is not established, the record is omitted. Capture is greedy, recording as much information as is provided, and if a record with the same ID is encountered multiple times, the last encounter wins.
- Interpret is responsible for interpreting (mapping) or reinterpreting information using the current (active) class library/standard from ISM, and for managing this state at the interface level. The Interpret process maps captured fields to attributes and translates records into classified objects aligned with the class library (in the case of a primary register). If the active class library is changed (for example, by adding new rules that were not known previously, or by changing class library mapping rules for interfaces), any captured information is reinterpreted according to the new Active Standard when the next update executes. This is why changes to the class library, even minor updates, can significantly increase processing times during the next update. The Interpret process may also generate attributes from ValueExpression rules defined on the register and define reference attributes (associations); a sketch of this expression syntax appears below.
- Consolidation is responsible for consolidating (combining) records from all registers within a category to form a holistic view of an object, and for tracking changes up to this state. This process combines all interpreted records with a matching ID in a single category into a consolidated record, using a simple rule: a higher-priority record (priority as defined by the information interface) wins over a lower-priority record, and a more recently updated record wins over an older record (when the priority is the same).
Additionally, the Registers Gateway supports withdrawing, which allows an interface to be removed from one or more of the capture, interpret, and consolidation processes.
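As a minimal sketch of the interpretation expression syntax (the field names Area and TagID, and this particular rule, are hypothetical and must be adapted to the register's actual configuration), a ValueExpression rule defined on a register might derive an attribute by combining captured fields:

ValueExpression="{{ concat(@[Area], '-', @[TagID]) }}"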
Data Formats
Register data may be updated from different formats. Multiple formats may be supplied in a single update transaction.
| Format | Description |
| --- | --- |
| CSV Strict (RFC4180) | CSV data is read using strict RFC4180 rules, optionally specifying alternate separator, new line, and quote strings (multi-character permitted). |
| TAB Strict (RFC4180) | Same as CSV Strict, with a TAB character separator. |
| EXCEL | XLS or XLSX files. |
| CSV Relaxed | CSV is read the way Excel reads it, which may in some cases read data incorrectly where quoting is incorrect. |
| TAB Relaxed | Same as CSV Relaxed, with a TAB character separator. |
| EIWM XML | AIM EIWM XML format, with root objects and datasets only. |
| XML Simple | XML is read with a simple XPath expression. |
| JSON Simple | JSON is read with a simple JPath expression. |
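As an illustration of the Simple formats (the element names, paths, and data below are hypothetical), an XPath such as /Plant/Tags/Tag would select each record element from an XML document like:

<Plant>
  <Tags>
    <Tag ID="P-101" Description="Feed pump"/>
    <Tag ID="P-102" Description="Spare pump"/>
  </Tags>
</Plant>

Similarly, for JSON Simple, a JPath such as $.tags[*] would select each record object from:

{ "tags": [ { "ID": "P-101" }, { "ID": "P-102" } ] }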
Implementation Best Practice
Processing Steps
The diagram below illustrates the processing steps of the Registers Gateway:

[Diagram: Registers Gateway processing steps]

1. The ISM model is processed and published to both the AIM staging area and the Reporting Database.
2. The data is captured and interpreted, then consolidated and published to the AIM staging area and the Reporting Database.
Asset Register Design
A register is used to manage a set of information. It may be thought of as an individual database of information with two states (captured raw data, and interpreted data that is mapped to the class library). Multiple registers can be consolidated into a master asset register.
A register is:
- A managed set of information with a clear purpose.
- An isolated, clear, and named data concern we wish to manage over a lifecycle.
A register is not:
- A file (in this context), though it is typically updated through one or more files being passed into the register.
Register Design Best Practice
- The register ID should be well named and readable.
- Employ a two-level node structure in ISM, where the first level describes a system (set) of registers with a common nomenclature and the second level contains the specific registers related to a data concept (Tag, Instrument, Functional Location, Equipment, and so on); see the sketch after this list.
- Use only alphabetic characters for a register ID, in title case, with no spaces, underscores, numbers, or other special characters.
- Avoid long names for a register ID. During data processing, the data is placed in a path constructed from the register node hierarchy, starting from the first node under InfoInterface:Data Sources. The total path length, including all folders and file names, should not exceed 256 characters.
- Be mindful of the number of columns, especially when many contain empty values. Combining data into a single large register with excessive columns is not recommended, as it can negatively impact processing performance. Remember, a register is a set of information with a specific purpose and a defined data concern to manage over a lifecycle; it is not intended as an interface for bulk data dumping.
- Adding capture filters to reject incorrect or incomplete data is recommended, as it reduces the volume of invalid data and improves processing efficiency.
- Avoid overly complex data expressions, such as multiple nested if...then statements. Consider whether the data being processed would be better fixed by pre-processing or by correcting it in an external system before supplying it to AIM. Fixing poor-quality or confusing data is not intended to be a function of AIM.
- Do not try to specify multiple registers with the same input file. In this scenario, a linked register should be used.
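A sketch of the two-level structure described above, using hypothetical system and register names:

InfoInterface:Data Sources
    MaintenanceSystem        (first level: system of registers)
        Equipment            (second level: register)
        FunctionalLocation   (second level: register)
        WorkOrder            (second level: register)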
Troubleshooting Issues and Addressing Slow Performance
The Registers Gateway provides a detailed log describing each step of the data processing operations. When issues arise, the first step is to review these logs. Log messages are categorized as information, warnings, and errors. To locate specific entries, search for [ERR] or [FTL] to find errors, and [WRN] for warnings. The entry provides clear details about the error and the interface/category or processing step where it occurred.
The log messages are timestamped, allowing the time taken for each operation to be measured. This provides valuable insights when diagnosing slow performance, helping to identify which interface or operation is the most time-consuming. These issues can be addressed by tuning the configuration, correcting the data, or adding filters to exclude unnecessary, incorrect or incomplete data.
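For example, on a Windows host the logs can be searched with PowerShell (the log file name below is hypothetical; substitute the actual Registers Gateway log path):

Select-String -Path .\RegistersGateway.log -Pattern '\[ERR\]|\[FTL\]|\[WRN\]'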
For more information on data loading performance, see Data Loading Performance.
Reducing the Number of Unknown Tags
The number of unknown tags can grow massively, resulting in long processing and importing times. This can result from unoptimized configuration. The main source of unknowns is scraping: if the scraping patterns are not configured properly, the scraping process can generate massive amounts of data, leading to the creation of a significant number of document-to-unknown-tag associations.
Optimizing Scraping Patterns
The default AIM-A deployment comes with a predefined set of scraping patterns, which can be customized to suit customer data. AVEVA keeps revising these patterns to find the best balance between maximizing the capture of correct tags and reducing the capture of incorrect ones. Scraping performance is influenced by the number and complexity of the patterns used: fewer and simpler patterns result in faster and more efficient scraping. The following list contains the latest optimized set of patterns.
<?xml version="1.0" encoding="utf-8"?>
<ConfigurationSettings>
  <Search>
    <!-- Process lines -->
    <Pattern Value="(?<![A-Za-z0-9\p{Pd}_])([0-9]{1,5}\s?[0-9.\/]{1,5}|[0-9.]{1,5}?[0-9.\/]{0,5})-?([A-Za-z|"]|'')[0-9A-Za-z\p{Pd}&?\/\\_]{1,50}(?![A-Za-z0-9\p{Pd}\/\\])"/>
    <ClassID Value="LINE"/>
    <Context Value="NA"/>
  </Search>
  <Search>
    <!-- Tags starting with an alpha -->
    <Pattern Value="(?<![A-Z0-9]|([A-Z0-9][\p{Pd}_]))[A-Za-z]{1,10}(?=[\w.\p{Pd}&?\/\\]*[0-9][\w.\p{Pd}&?\/\\]*)[0-9\p{Pd}_&?][A-Za-z0-9\p{Pd}_&?\/\\]{4,50}(\.[A-Za-z]{1,5})?(?![A-Za-z0-9\p{Pd}\/\\])"/>
    <ClassID Value="TAG"/>
    <Context Value="NA"/>
  </Search>
  <Search>
    <!-- Tags starting with a number -->
    <Pattern Value="(?<![A-Z0-9\p{Pd}_]|([A-Za-z0-9][\p{Pd}_\/\\]))[0-9]{1,10}[\p{Pd}_&?]?(?=[\w.\p{Pd}&?\/\\]*[A-Za-z][\w.\p{Pd}&?\/\\]*)[A-Za-z\p{Pd}_][A-Za-z0-9\p{Pd}_&?\/\\]{1,50}(\.[A-Za-z]{1,5})?(?![A-Za-z0-9\.\p{Pd}\/\\])"/>
    <ClassID Value="TAG"/>
    <Context Value="NA"/>
  </Search>
  <Search>
    <!-- Design lines and isometric line numbers -->
    <Pattern Value="(?<![A-Za-z0-9\p{Pd}_\/\\])[A-Za-z]{1,9}(?=[\w.\p{Pd}&?\/\\]*[0-9][\w.\p{Pd}&?\/\\]*)[0-9\p{Pd}][A-Za-z0-9\p{Pd}?&\/\\_]{1,50}(?![A-Za-z0-9\.\p{Pd}\/\\])"/>
    <ClassID Value="LINE"/>
    <Context Value="NA"/>
  </Search>
  <Search>
    <!-- Numbers - SAP IDs, mainly Equipment and Work Orders -->
    <Pattern Value="(?<![A-Za-z0-9\.\p{Pd}_\/\\])[0-9]{8,20}(?![A-Za-z0-9\.\p{Pd}\/\\])"/>
    <ClassID Value="NUMBER"/>
    <Context Value="NA"/>
  </Search>
  <Search>
    <!-- Instrument and Equipment special case WELL-01, XMAS-01 -->
    <Pattern Value="(?<![A-Za-z0-9\.\p{Pd}_])(?=[\w.\p{Pd}]*[A-Za-z][\w.\p{Pd}]*)(?=[\w.\p{Pd}]*[0-9][\w.\p{Pd}]*)[a-zA-Z0-9]{1,20}([.,\p{Pd}][a-zA-Z0-9]{1,20}){1,7}(?![A-Za-z0-9\.\p{Pd}\/\\])"/>
    <ClassID Value="OPC"/>
    <Context Value="NA"/>
  </Search>
</ConfigurationSettings>
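As an illustration, the first pattern above (process lines) would match a line designation such as 3"-P-10105-CS1 (a hypothetical tag); the negative lookbehind and lookahead at either end prevent matches that begin or end inside a longer identifier.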
Applying Capturing Filters
Capturing filters can help reject unwanted or invalid data. The rejected data is reported in the logs. An example of using capturing filters is a rule that allows only tags containing letters, digits, and hyphens, and requires at least one of each (this is just an example, and must be adjusted to the customer's data and naming templates):
CaptureRejectRecordRule="{{ if @[TagID] not matches /^(?=.*[A-Za-z])(?=.*\d)(?=.*-)[A-Za-z\d-]+$/ then concat('Invalid Tag: ', @[TagID]) }}"
Using Include Records
The IncludeRecords rule is another way to filter out unwanted data. While the CaptureRejectRecordRule is applied during the capture operation, the IncludeRecords rule is applied during interpretation. This means that all data is captured first, and the filter is applied at the interpretation phase. As a result, the overall processing time can be longer when using the IncludeRecords rule compared to the CaptureRejectRecordRule.
An example of using IncludeRecords:
IncludeRecords="{{ @[TagID] matches /^(?=.*[A-Za-z])(?=.*\d)(?=.*-)[A-Za-z\d-]+$/ }}"