ALEC Documentation

What is ALEC?

The Architecture for Learning Enabled Correlation (ALEC) is a framework for logically grouping related faults (alarms) into higher level objects (situations) with OpenNMS.

How does it work?

Inventory in OpenNMS is converted to a graph

Nodes, their components and their relations are used to build a graph.

Model example

The layout of the graph, and the different types of elements on the graph are what compose "the meta-model". In ALEC, the vertices on the graph are referred to as "inventory objects" and edges are used to represent "relationships".

Alarms are attached to the graph

Alarms are enriched to help identify which component in the model they relate to. The alarms are attached to the graph when they are triggered.

Model example with alarms

In OpenNMS, we can now populate the managed object (MO) type and MO instance of an alarm when it is created. These fields are used (and required) to relate the alarm to a specific IO on the graph.

We group alarms on the graph into clusters

We aim to group the alarms into clusters based on their level or similarity, and whether or not they share the same root cause. The engine will periodically group (cluster) the alarms.

Model example with cluster

Different clustering algorithms can be used:

  1. Deep Learning (AI with TensorFlow)

  2. DB-Scan (Unsupervised ML)

When we determine that a group of alarms is related, we send an event to OpenNMS. OpenNMS will then create a new situation for the set of related alarms. Situations are also managed like alarms.

Model example with situation

We use Drools rules for managing the state of all alarms including situations:

  • These are used to propagate the severity:

    • the severity of a situation is the maximum severity of all related alarms + 1

  • They are used to propagate acknowledgments:

    • if all of the related alarms on a situation are acknowledged, then the situation is also acknowledged