automate data fixes how to

Manual Data Fixes?

Do you have a DevOps solution in place for your applications, integrations, and databases, but find yourself managing “data fixes” for your ERP manually? This is often the case because data is the life-blood for companies and must be protected from corruption. I have seen several different variations, but the process generally looks the same:

  1. A developer creates a SQL script to perform a “data fix” and passes it off to a DBA, along with expected row counts or other queries to validate the data.
  2. The DBA backs up the data which will be impacted by the data fix.
  3. The DBA executes the SQL script, and validates the expected results. If the row counts and validation queries return the expected results, they commit the transaction, otherwise they perform a rollback and work back with the developer to reconcile any issues.

Manual No More!

The manual process is cumbersome and inefficient, but is generally accepted as a necessary evil in order to protect such a valuable asset. Data validation cannot be avoided, and that is accepted as fact. However, that does not mean that automation cannot assist in the execution, allowing data fixes to be managed as part of your pipeline.

FlexDeploy has native data fix support in two of its plugins.

  • Oracle E-Business Suite plugin – manage data fixes along with other customizations, such as forms, report, and PL/SQL objects.
  • JDBC plugin – manage data fixes for any application/system which is backed by one of the many supported databases.

The general concept is that the developer creates the SQL update script(s), along with other metadata which is used to backup the data and validate that the fix provides the intended result. The backup is automated, and then the SQL script(s) are executed. The validations are executed. If they all return the expected results, the transaction is committed, otherwise a rollback is issued. So, in a sense, the process is the same as what is generally done manually with collaboration between the developer and the DBA.

This information is modeled as an XML file. The fix and instruction (which is normally communicated to a DBA) is documented in a standard format.

<?xml version="1.0" encoding="UTF-8"?>
<DataFix xmlns="http://flexagon.com/database/datafix">
  <Description>Fix line items for product IDs 47809 and 45203</Description>
  <FixSQL>update XXHR.xxhr_order_items set unit_price = 100 where product_id = 47809; update XXHR.xxhr_order_items set unit_price = 75 where product_id = 45203; update XXHR.xxhr_product_list set
          list_price = 100 where product_id = 47809; update XXHR.xxhr_product_list set list_price = 75 where product_id = 45203;</FixSQL>
  <BackupSQL>create table xxhr.xxhr_order_items_bkp as (select * from xxhr.xxhr_order_items); create table xxhr.xxhr_product_list_bkp as (select * from xxhr.xxhr_product_list);</BackupSQL>
  <Validations>
    <Validation>
      <Description>Verify number of rows modified for product with id 47809</Description>
      <ValidationSQL>select count(*) from xxhr.xxhr_order_items where list_price = 100 and product_id = 47809</ValidationSQL>
      <Environment>
        <EnvironmentCode>DEV</EnvironmentCode>
        <UpdateCount>
          <GreaterThan>80</GreaterThan>
          <LessThan>100</LessThan>
        </UpdateCount>
      </Environment>
      <Environment>
        <EnvironmentCode>default</EnvironmentCode>
        <UpdateCount>
          <GreaterThan>87</GreaterThan>
          <LessThan>93</LessThan>
        </UpdateCount>
      </Environment>
    </Validation>
    <Validation>
      <Description>Verify number of rows modified for product with id 45203</Description>
      <ValidationSQL>select count(*) from xxhr.xxhr_order_items where list_price = 75 and product_id = 45203</ValidationSQL>
      <Environment>
        <EnvironmentCode>default</EnvironmentCode>
        <After>
          <GreaterThan>120</GreaterThan>
          <LessThan>125</LessThan>
        </After>
      </Environment>
    </Validation>
  </Validations>
</DataFix>


Let’s review some of the features in the XML schema:

<FixSQL></FixSQL>

Defines the SQL statements to perform the actual data fix.

<BackupSQL></BackupSQL>

Defines the SQL statement(s) to perform the backup of the data.  This may backup an entire table, or a subset of rows.

<Validations></Validations>

Defines one or more validation SQL statements and the expected results. The SQL statement must be scalar – returning one row and one column, which is an integer value. The results can be by FlexDeploy environment, and use comparison operators such as equal, less than, or greater than equal. When used in aggregate, you can effectively provide ranges. The “default” environment provides the expected results when another named environment is not explicitly listed.

The full definition of the XML schema, and the documentation can be found in the plugin guides (links provided above).

Summary

There is no getting around the rigor which must go into validating the quality of your data. By automating the execution you can streamline your pipeline using the same process for all types of changes.

Automating data fixes in a pipeline

Erick Jones

I have been working with Java EE technologies for 20 years, and have applied industry experience implementing IBM and Oracle solutions. Through my architecture and development experience I have implemented several homegrown and third party automation solutions to manage the software delivery lifecycle. At Flexagon I am a DevOps evangelist, with a passion for helping customers deliver software to their customers faster, cheaper, and with higher quality. My primary focus is on FlexDeploy product strategy, and pre-sales and post-sales implementations.

More posts by Erick Jones
    

Leave a Reply

Your email address will not be published. Required fields are marked *