Dataflows for Apache NiFi

Introduction

Apache NiFi is a dataflow system for the stream and batch processing of data. Dataflows are configured in the NiFi web GUI to perform the following tasks:

  1. EDR parsing from text-based EDRs into JSON-based EDRs.
  2. Storage of the parsed EDRs in the n2reporting database.
  3. Service determination, to split call control and provisioning EDRs into the correct processing paths.
  4. Aggregation of EDR data into summarised forms for reporting.
  5. Database extracts (of the N2ACD database) for integrated EDR + service data reports.

N-Squared Reporting Dataflows Example

This N2ACD NiFi guide provides step-by-step instructions for configuring the N2ACD dataflows. For comprehensive details on how to use Apache NiFi, see the NiFi user guide.

A description of each NiFi process group is provided first. Installation and configuration instructions follow in the later sections.

The “N2SVCD EDR Parsing” Process Group

The N2SVCD EDR Parsing process group is responsible for parsing N2SVCD text-based EDR files into individual EDR records and then determining the service each EDR belongs to.

N2SVCD EDR Parsing Process Group

This process group will:

  1. Read EDR files from a configured EDR directory. Note that EDR files are deleted from this directory as they are read into NiFi. Back up EDR files before placing them in the NiFi input directory, unless the NiFi processing itself is altered to write the processed file out again.
  2. Convert each EDR in each file read. EDRs are converted into JSON format for subsequent processing, with key fields (including the session ID and EDR event timestamp) extracted.
  3. Determine the relevant logical service each EDR belongs to. For some EDRs this is determined directly from the EDR itself. Other EDRs must be correlated with an initial EDR (e.g. an INITIALDP or SIP INVITE).
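
Steps 2 and 3 above can be sketched in Python as follows. This is an illustration only: the real parsing is done by the supplied Groovy reader script, and the EDR text format shown here (pipe-separated KEY=VALUE pairs) and the field names SID, TS, EVENT and SERVICE are all hypothetical.

```python
import json

# Hypothetical EDR text format: one EDR per line, pipe-separated KEY=VALUE pairs.
SAMPLE_FILE = """\
SID=abc1|TS=2024-01-01T00:00:00Z|EVENT=INITIALDP|SERVICE=ACD
SID=abc1|TS=2024-01-01T00:00:05Z|EVENT=CONNECT
SID=zzz9|TS=2024-01-01T00:00:07Z|EVENT=RELEASE
"""

def parse_edr(line):
    """Convert one text EDR into a dict, lifting out the key fields."""
    fields = dict(pair.split("=", 1) for pair in line.strip().split("|"))
    return {
        "session_id": fields.get("SID"),
        "timestamp": fields.get("TS"),
        "fields": fields,
    }

def determine_services(edrs):
    """Tag each EDR with a service, correlating by session ID where needed."""
    service_by_session = {}
    for edr in edrs:
        service = edr["fields"].get("SERVICE")
        if service is not None:
            # The service is stated directly (assumed here for the initial EDR).
            service_by_session[edr["session_id"]] = service
        else:
            # Otherwise fall back to the service of an earlier EDR in the session.
            service = service_by_session.get(edr["session_id"])
        edr["service"] = service
    return edrs

edrs = determine_services(
    [parse_edr(line) for line in SAMPLE_FILE.splitlines() if line]
)
for edr in edrs:
    print(json.dumps(edr))
```

The sketch assumes the initial EDR of a session is seen before the EDRs that depend on it. An EDR with no matching initial EDR is left with a service of None.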

Later actions (such as storing EDRs, summarising EDRs, or generating errors or statistics) are performed by the process groups connected downstream.

The output ports on this process group may differ, depending on the specific N-Squared products installed and how the service determination is configured.

Configuration

The following configuration is required for the N2SVCD EDR Parsing process group:

Parameter | Default Value | Purpose
N2SVCD_EDR_ERROR_DIR | /opt/nifi/edr/error | If an error occurs while reading an EDR file as a whole (before individual EDRs can be processed), the EDR file is written back to this directory.
N2SVCD_EDR_INPUT_DIR | /opt/nifi/edr/input | The directory on the reporting server from which EDR files are read. EDR files should be moved into this directory (rather than written directly) so that each file arrives through an atomic filesystem operation.
N2SVCD_EDR_READER_SCRIPT | /usr/share/n2rep/etc/nifi/n2svcd_reader.groovy | The location of the Groovy script for parsing N2SVCD EDRs.
N2REPORTING_PG_DB_DRIVER | /opt/nsquared/ocs/lib/postgresql-42.3.1.jar | The location of the Java jar file for PostgreSQL JDBC connectivity.
N2REPORTING_PG_DB_URL | jdbc:postgresql://127.0.0.1/n2reporting | The full JDBC URL for the PostgreSQL n2reporting database.
N2REPORTING_PG_USERNAME | n2reporting_writer | The username used to connect to the reporting database.
N2REPORTING_PG_PASSWORD | n2reporting_writer | The password used to connect to the reporting database.
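
Because files in N2SVCD_EDR_INPUT_DIR are deleted once read and should arrive by an atomic move, a delivery step might look like the following Python sketch. The function name, the staging and backup directories, and the file name are hypothetical. The demonstration uses temporary directories rather than the real paths.

```python
import os
import shutil
import tempfile

def deliver_edr_file(src_path, backup_dir, input_dir):
    """Back up an EDR file, then move it into the NiFi input directory."""
    name = os.path.basename(src_path)
    # Take the backup first: NiFi deletes files from the input directory.
    shutil.copy2(src_path, os.path.join(backup_dir, name))
    dest = os.path.join(input_dir, name)
    # os.replace is an atomic rename only when the source and destination
    # are on the same filesystem.
    os.replace(src_path, dest)
    return dest

# Demonstration with temporary directories standing in for the real ones.
with tempfile.TemporaryDirectory() as root:
    staging, backup, input_dir = (
        os.path.join(root, d) for d in ("staging", "backup", "input")
    )
    for d in (staging, backup, input_dir):
        os.mkdir(d)
    src = os.path.join(staging, "example.edr")
    with open(src, "w") as f:
        f.write("example EDR content\n")
    delivered = deliver_edr_file(src, backup, input_dir)
    delivered_ok = os.path.exists(delivered) and not os.path.exists(src)
    backup_ok = os.path.exists(os.path.join(backup, "example.edr"))
    print(delivered_ok, backup_ok)
```

Keeping the staging directory on the same filesystem as the input directory is what makes the final move a single rename rather than a copy followed by a delete.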

The “N2ACD Service EDR Processing” Process Group

This process group stores raw EDRs identified as part of the ACD service in the n2acd.raw_json_edr database table. It also aggregates the EDRs received so that one row per voice call is stored in the n2acd.summarised_edr database table.

N2ACD Service EDR Processing Process Group

See the reporting db node installation instructions for creating the reporting database, and the reporting database data model for details on the reporting tables themselves.

This process group relies on previous service determination. EDRs not belonging to calls processed by the N2ACD service are not expected to be processed by this process group.

This process group will:

  1. Store EDRs to the N2ACD reporting database raw_json_edr database table.
  2. Process EDRs to determine the purpose of the EDR and update the CDR record for the ACD call in the summarised_edr table.
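
The aggregation in step 2 can be pictured with the following Python sketch, which folds EDRs into one summary row per session. It is not the logic used by the supplied dataflow: the field names and the summary columns (first_seen, last_seen, edr_count) are hypothetical and do not describe the actual summarised_edr table.

```python
# Hypothetical parsed EDRs; the field names are illustrative only.
edrs = [
    {"session_id": "abc1", "timestamp": "2024-01-01T00:00:00Z", "event": "INITIALDP"},
    {"session_id": "abc1", "timestamp": "2024-01-01T00:00:42Z", "event": "RELEASE"},
    {"session_id": "def2", "timestamp": "2024-01-01T00:01:00Z", "event": "INITIALDP"},
]

def summarise(edrs):
    """Fold EDRs into one summary row per session (that is, per call)."""
    rows = {}
    for edr in edrs:
        row = rows.setdefault(edr["session_id"], {
            "session_id": edr["session_id"],
            "first_seen": edr["timestamp"],
            "last_seen": edr["timestamp"],
            "edr_count": 0,
        })
        # ISO 8601 UTC timestamps in one fixed format sort correctly as strings.
        row["first_seen"] = min(row["first_seen"], edr["timestamp"])
        row["last_seen"] = max(row["last_seen"], edr["timestamp"])
        row["edr_count"] += 1
    return list(rows.values())

summary = summarise(edrs)
for row in summary:
    print(row)
```

In a database-backed flow the same effect is usually achieved by updating the existing row for the session as each EDR arrives, rather than by holding all EDRs in memory.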

Configuration

The following configuration is required for the N2ACD Service EDR Processing process group:

Parameter | Default Value | Purpose
N2ACD_EDR_READER_SCRIPT | /usr/share/n2acd/etc/nifi/n2acd_reader.groovy | The location of the Groovy script for additional N2ACD-specific parsing of EDR data.
N2REPORTING_PG_DB_DRIVER | /opt/nsquared/ocs/lib/postgresql-42.3.1.jar | The location of the Java jar file for PostgreSQL JDBC connectivity.
N2REPORTING_PG_DB_URL | jdbc:postgresql://127.0.0.1/n2reporting | The full JDBC URL for the PostgreSQL n2reporting database.
N2REPORTING_PG_USERNAME | n2reporting_writer | The username used to connect to the reporting database.
N2REPORTING_PG_PASSWORD | n2reporting_writer | The password used to connect to the reporting database.

The “N2ACD DB Extract” Process Group

This process group extracts source data from the N2ACD service database into database tables stored in the n2acd schema. Each extract is timestamped, and multiple extracts are retained (subject to storage capacity and partitioning configuration).

N2ACD DB Extract process group

See the reporting db node installation instructions for creating the reporting database, and the reporting database data model for details on the reporting tables themselves.

This process group will:

  1. Copy customer, service and flow data from the service database to the reporting database on a regular basis.
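
The idea of keeping several timestamped extracts side by side can be sketched as below. In-memory SQLite databases stand in for the two PostgreSQL databases so that the example is self-contained, and the table and column names (customer, customer_extract, extract_time) are hypothetical rather than taken from the N2ACD data model.

```python
import sqlite3

# In-memory SQLite databases stand in for the service and reporting databases.
service_db = sqlite3.connect(":memory:")
reporting_db = sqlite3.connect(":memory:")

# Hypothetical source and target tables.
service_db.execute("CREATE TABLE customer (id INTEGER, name TEXT)")
service_db.executemany(
    "INSERT INTO customer VALUES (?, ?)", [(1, "Alpha"), (2, "Beta")]
)
reporting_db.execute(
    "CREATE TABLE customer_extract (extract_time TEXT, id INTEGER, name TEXT)"
)

def run_extract(source, target, extract_time):
    """Copy every source row into the target, stamped with the extract time."""
    rows = source.execute("SELECT id, name FROM customer").fetchall()
    target.executemany(
        "INSERT INTO customer_extract VALUES (?, ?, ?)",
        [(extract_time, cid, name) for cid, name in rows],
    )
    target.commit()
    return len(rows)

# Two scheduled runs produce two complete, separately timestamped extracts.
copied = run_extract(service_db, reporting_db, "2024-01-01T00:00:00Z")
run_extract(service_db, reporting_db, "2024-01-02T00:00:00Z")

total_rows = reporting_db.execute(
    "SELECT COUNT(*) FROM customer_extract"
).fetchone()[0]
extract_count = reporting_db.execute(
    "SELECT COUNT(DISTINCT extract_time) FROM customer_extract"
).fetchone()[0]
print(copied, total_rows, extract_count)
```

Stamping each row with the extract time lets a report join EDRs against the service data as it stood at a chosen extract, and lets old extracts be dropped as a unit.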

Configuration

The following configuration is required for the N2ACD DB Extract process group:

Parameter | Default Value | Purpose
N2REPORTING_PG_DB_DRIVER | /opt/nsquared/ocs/lib/postgresql-42.3.1.jar | The location of the Java jar file for PostgreSQL JDBC connectivity.
N2REPORTING_PG_DB_URL | jdbc:postgresql://127.0.0.1/n2reporting | The full JDBC URL for the PostgreSQL n2reporting database.
N2REPORTING_PG_USERNAME | n2reporting_writer | The username used to connect to the reporting database.
N2REPORTING_PG_PASSWORD | n2reporting_writer | The password used to connect to the reporting database.
N2ACD_SERVICE_DB_URL | jdbc:postgresql://n2-p-acd-sms-01/n2in | The full JDBC URL for the PostgreSQL n2in database containing the n2acd service database schema.
N2ACD_SERVICE_DB_PG_USERNAME | n2acd_owner | The username used to connect to the N2ACD SMS service database.
N2ACD_SERVICE_DB_PG_PASSWORD | n2acd_owner | The password used to connect to the N2ACD SMS service database.

NiFi Dataflow Installation

NiFi dataflow installation and configuration requires two steps:

  1. The import of process groups as templates, and then the creation of process groups from those templates.
  2. The installation-specific configuration of those process groups.

Importing Process Group Templates

Each process group template file is installed with the N2ACD SMS API package and can be found in the /usr/share/n2rep/etc/nifi and /usr/share/n2acd/etc/nifi directories on the reporting server. Import these files as NiFi templates, then create process groups from them.

File | Description
N2SVCD_EDR_Parsing.xml | A template for the N2SVCD EDR Parsing process group.
N2ACD_Service_EDR_Processing.xml | A template for the N2ACD Service EDR Processing process group.
N2ACD_DB_Extract.xml | A template for the N2ACD DB Extract process group.

To upload a template:

  1. Right click on the canvas in the NiFi GUI and select Upload template.
  2. Select the template to upload from the local drive. This will be an xml file.
  3. Upload the template by accepting the selected file. If the template is uniquely named, the template will appear in the templates list.

Then to use a template:

  1. Drag the Template icon from the NiFi header onto the canvas, then select the template to create a process group from.
  2. Edit the new process group (right click the process group and select Configure) and, in the General tab, set the Process Group Parameter Context to the parameter context created for this installation (see below).
  3. If required, configure the password for any controller services that access the database.
  4. Enable each service listed under “Controller Services” in the configuration of each process group. Each controller service needs to be enabled even if it requires no site-specific configuration.
  5. Enable the process group (right click and select Start).

Configuring Installation Specific Parameters

Configuration for the dataflow templates provided by N-Squared for NiFi is done through the NiFi “Parameter Contexts” feature. To access the parameter contexts, use the hamburger menu in the top-right of the NiFi header:

NiFi Parameter Contexts Menu

In this menu, create or edit a parameter context and add the parameters listed for each process group in this configuration manual:

NiFi Parameter Context Parameters

The parameter context must be given a name. Any name that suits your installation can be used.

Once the parameter context is created, configure each of the process groups by right clicking on the process group and configuring the parameter context for the process group:

NiFi Parameter Context group as used

Note that this must be done for each process group individually. Process groups do not inherit their parent parameter context group.
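
Before assigning a parameter context to each process group, it can help to check that every required parameter has been defined. The following Python sketch is an offline checklist only and does not talk to NiFi. The parameter names come from the configuration tables in this manual, while the function name and the example context are hypothetical.

```python
# Parameters required by each process group, as listed in this manual.
REPORTING_DB = [
    "N2REPORTING_PG_DB_DRIVER",
    "N2REPORTING_PG_DB_URL",
    "N2REPORTING_PG_USERNAME",
    "N2REPORTING_PG_PASSWORD",
]
REQUIRED_PARAMETERS = {
    "N2SVCD EDR Parsing": [
        "N2SVCD_EDR_ERROR_DIR",
        "N2SVCD_EDR_INPUT_DIR",
        "N2SVCD_EDR_READER_SCRIPT",
    ] + REPORTING_DB,
    "N2ACD Service EDR Processing": ["N2ACD_EDR_READER_SCRIPT"] + REPORTING_DB,
    "N2ACD DB Extract": REPORTING_DB + [
        "N2ACD_SERVICE_DB_URL",
        "N2ACD_SERVICE_DB_PG_USERNAME",
        "N2ACD_SERVICE_DB_PG_PASSWORD",
    ],
}

def missing_parameters(context, group):
    """Return the required parameters for a group that the context lacks."""
    return [name for name in REQUIRED_PARAMETERS[group] if name not in context]

# Hypothetical, partially filled parameter context using the documented defaults.
example_context = {
    "N2ACD_EDR_READER_SCRIPT": "/usr/share/n2acd/etc/nifi/n2acd_reader.groovy",
    "N2REPORTING_PG_DB_DRIVER": "/opt/nsquared/ocs/lib/postgresql-42.3.1.jar",
    "N2REPORTING_PG_DB_URL": "jdbc:postgresql://127.0.0.1/n2reporting",
    "N2REPORTING_PG_USERNAME": "n2reporting_writer",
    "N2REPORTING_PG_PASSWORD": "n2reporting_writer",
}

for group in REQUIRED_PARAMETERS:
    print(group, "missing:", missing_parameters(example_context, group))
```

A single parameter context holding the union of all three lists can be shared by every process group, since the reporting database parameters are common to all of them.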

Configuring Passwords

Passwords are considered sensitive information in NiFi and are not stored in templates, even when parameters are being used. Edit the controller services listed below in the imported NiFi templates and set the password for each database connection.

Note that the password can be set to the parameter by using the parameter formatted value (#{N2ACD_SERVICE_DB_PG_PASSWORD} or #{N2REPORTING_PG_PASSWORD}), or can be set directly to the password.
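
To make the parameter formatted value concrete, the Python sketch below shows how a `#{NAME}` reference maps to a value held in a parameter context. It only illustrates the reference syntax and is not how NiFi resolves parameters internally. The function name, the regular expression, and the example context are hypothetical.

```python
import re

# Matches a parameter reference of the form #{NAME}.
PARAM_REF = re.compile(r"#\{([A-Za-z0-9_.\- ]+)\}")

def resolve(value, context):
    """Replace each #{NAME} reference in value with its value from context."""
    def lookup(match):
        name = match.group(1)
        if name not in context:
            raise KeyError(f"parameter {name!r} is not defined in the context")
        return context[name]
    return PARAM_REF.sub(lookup, value)

# Hypothetical parameter context using the documented default values.
context = {
    "N2REPORTING_PG_PASSWORD": "n2reporting_writer",
    "N2ACD_SERVICE_DB_PG_PASSWORD": "n2acd_owner",
}

reporting_password = resolve("#{N2REPORTING_PG_PASSWORD}", context)
service_password = resolve("#{N2ACD_SERVICE_DB_PG_PASSWORD}", context)
```

Using the reference form keeps the password in one place, so changing it in the parameter context updates every controller service that refers to it.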

Passwords are required for:

  1. The reporting database. In the N2SVCD EDR Parsing Process Group, in the n2reporting controller service, in the field Password.
  2. The reporting database. In the N2ACD Service EDR Processing Process Group, in the n2reporting controller service, in the field Password.
  3. The reporting database. In the N2ACD DB Extract Process Group, in the n2reporting controller service, in the field Password.
  4. The ACD SMS service database. In the N2ACD DB Extract Process Group, in the N2ACD Service Database controller service, in the field Password.