Data integration overview – Kepion Support Center

Introduction

This overview explains the data integration (a.k.a. extract-transform-load (ETL)) process from a data source to a Kepion SQL database and covers required data integration tools.

Concepts

Each Kepion Application has a SQL database that can be used for data integration. There are three concepts you need to know:

Data Sources
Staging Tables
Target Tables

Data sources

Data sources are where you extract the data you wish to transfer and load it to Kepion databases. Some potential data sources are:

Excel Workbooks
SQL databases

Staging tables

Staging tables are an intermediate workspace for data processing before loading your data to your target tables. Typically, staging tables are truncated when the data integration process runs (i.e., data is cleared) before they are populated with the loading data.

Dimension staging tables

Includes tables starting with DS_ (Dimension Staging) and DSE_ (Dimension Staging Error). These tables are used in conjunction with the Dimension data loading Stored Procedure (SP):

[dbo].[sp_Load_Dimension_From_Staging]

Attention: You must go through the dimension loading process to update your dimension data. This process ensures Dimension MemberIds remain consistent with your solution's assets (e.g., forms, rules, etc.).

Model staging tables

Includes tables starting with Staging_F. These tables are pre-created for convenience but optional in data integration. You can also use the target F_ (Model/Fact) tables directly. Be aware that changes made to an F_ table will impact the Application.

Target tables

The target tables hold the solution's data. There are three types of target tables: Dimension, Hierarchy, and Model.

Note: Model data are structured in a star schema. The F_ table contains hierarchy keys referencing H_ tables, which have dimension keys referencing D_ tables.

Dimension tables

Includes any table that begins with D_ or DO_. The D_ tables store the list of dimension members; the DO_ tables store the ordering information of the dimension members.

Attention: Avoid truncating records in the D_ table as you may lose MemberIds. However, if there are no fact records or assets (e.g., forms, rules, etc.) tied to the MemberIds, there is no concern.

Hierarchy tables (a.k.a. member lists)

Includes any table that begins with H_ or HO_. The H_ tables store a subset of a D_ table's dimension members and any Parent-Child (PC) relationships. The HO_ tables store the hierarchy members' ordering information.

Note: Each PC hierarchy (i.e., member List in Hierarchy mode) has a V_H_ view and SysCache_V_H table to optimize application performance. When updating H_ tables, truncate SysCache_V_H_ and re-insert the records from V_H_.

Model tables (a.k.a. Fact tables)

Includes any table that begins with F_. The F_ table stores all of a model's fact records. There are two types of F_ tables that can belong to a model:

Writeback – Stores data that will change based on user input. For example, user-entered forecast records are stored in the Writeback table.
Non-Writeback – Stores all non-user-entered data. For example, actuals are stored in the Actuals table, not the Writeback table.

Tip: The FH_ table is a Model table that logs all user-entered data when the Audit option is enabled. Refer to our audit trail article for more information.

Process

The data integration process occurs in two steps:

Extracting data from a source system and loading it to staging tables.
Loading data from the staging tables to target tables.

Tip: Before starting data integration, ensure you have a backup of your application database in case of accidental data updates.

As a rule of thumb, execute data integration in the following order:

1. Update dimension

2. Update hierarchy

3. Update model/fact

Click the links above to view how-to articles on the three data integration steps.

Tools

For data integration tools, we have three recommendations:

SQL Server Management Studio (SSMS)

SSMS is the essential tool for data integration with Kepion databases. You must have it in order to conduct advanced data integration processes.

Azure Data Factory

ADF, released in 2015, is a serverless data integration service based on Microsoft Azure. ADF has a collection of 'pipelines,' the equivalent to SSIS's project and package structures.

ADF will suit your business if the following applies:

You have Kepion hosted on the cloud
Your Kepion cloud is not integrated with your on-premise infrastructure

Pros	Cons
Cloud-integrated and works with most cloud data sources	Pricing scheme can be complex depending on the services your company needs
No installation, machine, or SQL server required

Integration Services

Launched in 2005, SSIS has long been the top data integration tool for businesses using Microsoft SQL Server. SSIS combines its functional units of data integration work into 'packages,' which can be run individually or in groups.

SSIS will suit your business if you have Kepion installed on-premises.

Pros	Cons
Easily connects with Microsoft tools	Limited integration with cloud data sources and non-Microsoft tools
Long-established product	May require additional components and/or licensing
One-time payment. No subscription or service charges.

Third-party data integration

Third-party data Integration allows data transfer between Kepion and other applications:

Kepion enables CData and Skyvia users to configure processes that extract, transform, and load data from an external system into Kepion.