Defining Derived Columns

Any subscription that replicates tables (or objects that are treated as tables) can contain derived columns. Each derived column must be based on a SQL expression that is valid for the source DBMS. A valid SQL expression can be a constant, function, or any combination of source column names, constants, and functions connected by operators.

If you are performing incremental replication, the SQL expression must also be valid for the Db2 for LUW staging database; if the expression for the Db2 for LUW staging database differs from that of the DBMS host, enter it in the Staging Expression field. For example, when replicating from an Oracle source database, you might wish to use Oracle-specific derived column expressions that are not standard "SQL92" scalar functions. Using the Staging Expression field to supply the equivalent function for Db2 LUW avoids the need to create User Defined Functions (UDF's) as described in Example 6a below.

For successful replication, a derived column must return the same number of rows as the other columns in the subscription. In practice, this means that a derived column that is based on an aggregate function cannot be used in the same subscription as columns based on individual source rows.

See the tech note Hints for Derived Columns for additional examples and other hints.

The following examples illustrate some of the ways that derived columns can be used. Adding a derived column to the destination data provides flexibility for manipulating the data, such as to:

add a column that contains a constant value
perform an arithmetic operation on individual rows
perform an arithmetic operation on the entire source table
perform arithmetic operations on groups of source table rows
concatenate text fields
convert data to use a different data type (Using User Defined Functions, Db2 Character Data to SQL Server Datetime and Oracle Date to Db2 Date)
add columns containing capture metadata for auditing and change detection purposes.

Derived column syntax must comply with SQL syntax rules for the source, as well as the Db2 for LUW staging database (if you are performing incremental replication). Different source platforms and software versions may require slight variations in syntax. For more information about specific SQL requirements, consult the documentation for your source DBMS.

The first four examples are based on a source table that contains sales data in four columns: ITEM, PRICE, TAXRATE, and SALESPERSON. The fifth example is based on a source table that contains personnel data including the columns FIRSTNAME and LASTNAME. The sixth example shows how to convert a Db2 character date to a SQL Server datetime data type and back to a character string.

Derived column definitions can be imported from a JSON script, using the Import Mapping Script button on the Destination panel after selecting a destination; see Importing Mapping Scripts. Some of the following examples include the JSON script that implements the described derived column.

Example #1: Constants

Goal: Add a column called Event that contains a constant value.

Expression: 'Spring Sale'
Column Name: EVENT
Data Type: char
Precision: 15

{ "action": "column", "schema": "%","table": "%", "dest_column":"EVENT", "source_expression":"\"Spring Sale\"", "stage_expression":"","dest_type":"char", "precision":7, "scale":0, "nullable":false, "comment": "column" }

Example #2: Arithmetic operation on individual rows

Goal: Add a column called SalesTax where PRICE is multiplied by TAXRATE for each source row.

If an arithmetic expression is applied to a source column containing a null value, the corresponding destination table value may also be null. In the example listed here, a null value in either the PRICE or TAXRATE column would produce a null value in the SalesTax column for the corresponding destination table row.

Expression: PRICE * TAXRATE
Column Name: SalesTax
Data Type: decimal
Precision: 8
Scale: 2

{ "action": "column", "schema": "%","table": "%", "dest_column":"SalesTax", "source_expression":"PRICE * TAXRATE", "stage_expression":"","dest_type":"decimal", "precision":8, "scale":2, "nullable":false, "comment": "column" }

Example #3: Arithmetic operation on entire source table

Goal: Create a table that shows the total sum of all values in the PRICE column of the source table.

To achieve the goal of this example, you must delete all original source columns from the list to be replicated. The values in these columns cannot be replicated to the same table as a sum, because the sum returns only one row.

Expression: SUM(PRICE)
Column Name: TOTAL
Data Type: decimal
Precision: 8
Scale: 2

{ "action": "drop", "schema": "%", "table": "%", "source_column": "PRICE" }
{ "action": "drop", "schema": "%", "table": "%", "source_column": "TAXRATE" }
{ "action": "column", "schema": "%","table": "%", "dest_column":"TOTAL", "source_expression":"SUM(PRICE)", "stage_expression":"","dest_type":"decimal", "precision":8, "scale":2, "nullable":false, "comment": "column" }

Example #4: Arithmetic operation on groups of source table rows

Goal: Create a table that shows subtotals of values in the PRICE column, grouped by SALESPERSON.

To achieve the goal of this example, you must delete all original source columns from the list to be replicated, except the SALESPERSON column by which records will be grouped. The values in the other columns cannot be replicated to the same table as a sum grouped by salesperson, because the sum returns only as many rows as there are salespeople listed in the source table.

Expression: SUM(PRICE)
Column Name: TOTAL
Data Type: decimal
Precision: 8
Scale: 2
Criteria (enter on the Criteria tab): PRICE > 0 GROUP BY SALESPERSON

{ "action": "drop", "schema": "%", "table": "%", "source_column": "PRICE" }
{ "action": "drop", "schema": "%", "table": "%", "source_column": "SALESPERSON" }
{ "action": "column", "schema": "%","table": "%", "dest_column":"TOTAL", "source_expression":"SUM(PRICE)", "stage_expression":"","dest_type":"decimal", "precision":8, "scale":2, "nullable":false, "comment": "column" }

Criteria: PRICE > 0 GROUP BY SALESPERSON

Example #5: Concatenating text fields

Goal: Add a column in which LASTNAME and FIRSTNAME values appear together, separated by a comma and a space.

Expression (for SQL Server source): LASTNAME + ', ' + FIRSTNAME
Expression (for Db2 for i or Oracle source): LASTNAME || ', ' || FIRSTNAME
Column Name: FULLNAME
Data Type: char
Precision: 35

{ "action": "column", "schema": "%","table": "%", "dest_column":"FULLNAME", "source_expression":"LASTNAME || ', ' || FIRSTNAME", "stage_expression":"","dest_type":"char", "precision":35, "scale":0, "nullable":false, "comment": "column" }

Example #6a: Using User Defined Functions

You can use User Defined Functions (UDF's) to convert data between different DBMS types. If you are using incremental replication, the UDF must be defined in both the source database and the local Db2 for LUW staging database. More information about creating and using UDF's will appear in a future technical document; contact StarQuest Support for assistance.

Several UDF's are provided for several common Db2 for i date conversion requirements where the source column is defined as decimal (numeric), zoned, or packed, and contains various forms of dates.

The following functions are provided:

MDY(MMDDCCYY) - where MMDDCCYY is the name of a DECIMAL(8,0) column containing a date expressed as MMDDCCYY, where MM is month, DD is day of month, CC is century and YY is year within century.
YMD(CCYYMMDD) - where CCYYMMDD is the name of a DECIMAL(8,0) column containing a date expressed as CCYYMMDD, where CC is century; YY is year within century, MM is month and DD is day of month.
CYMD(CC,YY,MM,DD) - where the date is stored in 4 separate DECIMAL(2,0) columns on the Db2 for i source.

These functions return a result of data type SQL_DATE. NULL input results in a result of NULL. A value of zero is mapped to '1901-01-01'.

Example: MDYZ is the name of a DECIMAL(8,0) column containing a date value defined as a zoned field (a DDS definition of 8S); the resulting destination will contain a derived column named MDYZ_1 of type date.

Expression: MDY(MDYZ)
Column Name: MDYZ_1
Data Type: date
Precision: 10
Scale: 0
Nullable: TRUE

If the SQDR Plus Staging Agent for the Db2 for i source was created prior to installing SQDR Plus 4.50, contact StarQuest Support for assistance in creating the UDF's in the local Db2 for LUW staging database. StarQuest Support can also provide assistance for creating the UDF's in the Db2 for i source database if you are not using incremental replication.

Example #6b: Converting Db2 Character Data to SQL Server Datetime

Goal: Convert a Db2 character data type date field to the SQL Server datetime data format, and back to the character data type on Db2. The source Db2 character string does not conform to the SQL Server datetime data format, but can be converted by using expressions to create a derived column on the target with the data formatted according to the desired data type.

In the SQDR subscription that replicates a char(8) data type from Db2 for i to a SQL Server datetime format, add a derived column that consists of the following expression. The Db2 timestamp function converts the char field to a timestamp and appends a time value so the column can be replicated to a SQL Server datetime field.

Expression: TIMESTAMP (<column_name> || '000000')
Data Type: datetime

To replicate the SQL Server datetime column back to the Db2 char(8) formatted column, use the SQL Server CONVERT function to add the data as a derived column to the Db2 target.

Expression: CONVERT (char(8), <column_name>, <date_style_value>)

The <date_style_value> determines the date format to use when converting the datetime data to character data. For example, specifying 112 converts it to the ISO standard format of yymmdd. Refer to the SQL Server documentation for a complete list of the style values and information about the CONVERT function.

Example #6c: Converting Oracle Date to Db2 Date

Goal: Replace an existing Date column with a derived column that uses an Oracle TO_CHAR function to convert the Date field to a Char field. The properly formatted Char field can be then replicated to a Db2 Date field.

Select the existing Date column listed in the Columns pane of the subscription properties, highlight the Date column and click Delete Column. Click the Insert Derived Column button and set the column properties as follows:

Expression: TO_CHAR (<original_date_column_name>, 'YYYY-MM-DD')
Column Name: use the same name as the original column name
Data Type: DATE

If your Db2 system uses a different DATE format than what is shown in the example, you can change the TO_CHAR function to conform to the desired format. Refer to the Oracle documentation for more date formatting options.

Capture Metadata

The following macros can be used to insert SQDR Plus capture metadata into a derived column in the destination table when the @ character occupies the first position of the derived column string. A space in front of the @ character acts as an escape character and disables value substitution.

Macro	Data Type	Notes
@TX	BINARY(10)	The SQDR Plus-generated hex-binary transaction id for the change record. Note that this is not the transaction id from the source system.
@TS	TIMESTAMP	The time on the source system when the change was made (or NULL if no changes have occurred to the row). Use DATETIME or DATETIME2 for SQL Server Destination.
@ts	TIMESTAMP	The time on the source system when the table was baselined OR when the row was last updated.
@LT	TIMESTAMP	Load Time - the time on the source system when the subscription was run i.e. when the table was baselined
@lt	TIMESTAMP	The time when a row was first inserted into the destination table, either as a result of a baseline OR an INSERT during incremental Apply.
@tz	DECIMAL(6,0)	Source system timezone offset (HHMMSS)
@TYPE	CHAR(1)	Also known as Operation. I/D/U indicates Insert/Delete/Update source table operations. S indicates that the subscription needs to be rebaselined. This can be an indication of a REORG at the source or that there is a disruption of the SQDR Plus Capture Agent’s log reading position. X indicates that the source table has been dropped. A indicates that the source table has been altered. B may be generated for subscriptions with “criteria” (i.e. a where clause). This additional change data row sent for each update contains the values of all the columns prior to the update. This option is not available for some source databases, such as Db2 for i for with only *AFTER journaling or SQL Server sources. C indicates that all change records collected during a baseline copy have been processed. This record can refer to baselines for other groups or even other agents subscribing to the same table.
@SEQ	BIGINT	An internal sequence number that is used to order staged change data when applied to the destination. It represents the relative order in which the update occurs in the source log within a given transaction. For a given source log, it is also the sequential order in which the operation was processed within the log.
@RRN	BIGINT (Db2 for i source) CHAR(18) (Oracle source)	Relative Record Number (Db2 for i) ROWID (Oracle) See ROWID_RRN_settings in Advanced Settings for information about properties that allow you to customize the default definitions of derived columns based on ROWID and RRN.
@JED @JED('element')	VARCHAR(254)	Journal Entry Detail: configure collectLogDetail=true on the SQDR Plus staging agent. See below for details. Db2 for i source only.
@SCHEMA	VARCHAR(128) Nullable	Source Schema
@OBJECT	VARCHAR(128) Nullable	Source Object
@MBR	CHAR(20)	Db2 for i member name. Typically null, or the same name as the table.
@DSN	VARCHAR(128) Nullable	Source name of the subscription
@null		Defines the value of NULL as the derived column expression. May be used for either the source and/or the staging derived column expression.
@skip		Provide means of defining a column on a destination table (the column will appear in the CREATE TABLE statement), without replicating a value during Apply. The initial value will be NULL. This is different than the @null expression which supplies the value NULL for the data element during incremental replication. @skip leaves the value unchanged for purposes of Apply. This allows the end user to manage the contents of the column on the destination independently of the replication process.

Note that the data types are guidelines and may vary depending on the destination DBMS.

Be sure to set the Nulls checkbox (allow the destination column to be NULLABLE) in the Column Properties dialog, as the initial value when the baseline is run will be NULL. The data will be updated after an Insert or a Delete.

@JED - Journal Entry Detail (Db2 for i source only)

The Journal Entry Detail is an XML structure containing multiple elements. @JED returns the entire XML structure. Use the syntax @JED('element') to obtain the value of a single element. The destination type should be large enough to accept the maximum expected value and should be nullable, since the value is undefined for a baseline. For example, @JED('journal') can be defined as VARCHAR(20) NULLABLE. Some elements such as @JED('rrn') or @JED('seqno') may be mapped to bigint.

@JED('enttype') returns a default value of "RR" after the table has been baselined.

IBM CDC columns	equivalent SQDR macro	source type	precision
Commit Cycle ID (&CCID)	@JED('ccid')	VARCHAR	20
Entry Type (&ENTTYP)	@JED('enttyp')	CHAR	2
Entry Type Code (&CODE)	@JED('code')	VARCHAR	2
Journal Name (&JOURNAL)	@JED('journal')	VARCHAR	20
Journal Sequence Number (&SEQNO)	@JED('seqno')	VARCHAR	20
Record Modification User (&USER)	@JED('user')	VARCHAR	10
Source Job Name (&JOB)	@JED('job')	VARCHAR	10
Source Job Number (&JOBNO)	@JED('jobno')	VARCHAR	6
Source Job User (&JOBUSR)	@JED('jobuser')	VARCHAR	10
Source Program Name (&PROGRAM)	@JED('program')	VARCHAR	10
Source RRN (&CNTRRN)	@JED('rrn')	VARCHAR	20