Determining the impact and data lineage for database objects such as stored procedure, table, view, column can help you troubleshoot stored procedure; capture and represent column level data lineage between applications; find table/view and column references and more.
The following are some examples:
* A stored procedure is running slowly. You examine the data lineage for the stored procedure.
Looking at the data lineage helps you determine that the stored procedure queries a table that lacks useful indexes.
* A stored procedure does not run. You examine which objects have an impact on that stored procedure.
Looking at the object impact helps you determine that the stored procedure exports data to a table column
that has been deleted.
Understanding the impact and data lineage for objects can also help you understand the possible ramifications of making changes to objects. For example, you plan to modify a table. You examine the impact of that table on your stored procedure and discover that the stored procedure uses the table to extract sales data. If you modify the table columns, the stored procedure might fail.
Typically, there are two types of impact and data lineage.
Object B has an impact on object A.
Changes to object B can cause the behavior of object A to change or cause
object A to fail.
we call this impact lineage, such kind of lineage can be found but not limited
in where clause/join condition/group by/order by clause in a SQL query.
SELECT a.deptno "Department",
a.num_emp / b.total_count "Employees",
a.sal_sum / b.total_sal "Salary"
FROM (SELECT deptno,
COUNT(*) num_emp,
SUM(sal) sal_sum
FROM scott.emp
GROUP BY deptno) a,
(SELECT COUNT(*) total_count,
SUM(sal) total_sal
FROM scott.emp) b
in above SQL,
"Employees" depends on: scott.emp, scott.emp.deptno.
If we change data in scott.emp.deptno, or change definition of scott.emp.deptno,
selected result of "Employees" will be changed.
Object A provides data to object C. The lineage of data is from object A to
object C.
we call this data/persistence lineage. such kind of lineage can be found but
not limited in create view, set clause in update statement and insert statement.
UPDATE sds_tst_prdct
SET base_cow_id = b.bsecowid
FROM sds_tst_prdct a,
bog_pr_jil b
WHERE a.std_tst_id = b.tstid
AND b.bsecowid > 1
in above SQL,
bog_pr_jil.bsecowid provides data to sds_tst_prdct.base_cow_id, so there is a
data lineage from bog_pr_jil to sds_tst_prdct.
There in no builtin tool in major databases such as Oracle, SQL Server, Sybase, Teradata and DB2 can trace impact and data lineage for you. So we created some tools based on our powerful SQL Parser which can provides an in-depth and detailed analysis of SQL scripts for various databases.
our tool read some SQL scripts and generate an impact and data lineage map automatically.
check this demo for detailed information.