基于alibaba druid的血缘解析
1、前言
仅仅对mysql数据库的select查询语句进行了血缘解析,该血缘解析包含了原始表字段、临时表字段和目标表字段的关联关系。
2、涉及到技术
主要使用了druid的如下接口对语法树进行解析:
(1)selement
- SQLSelectStatement
(2)sqlselectquery
- SQLUnionQuery
- SQLSelectQueryBlock
- MySqlSelectQueryBlock
(3)tablesource
- SQLExprTableSource
- SQLJoinTableSource
- SQLSubqueryTableSource
- SQLUnionQueryTableSource
(4)sqlexpr
- SQLAggregateExpr
- SQLAllColumnExpr
- SQLIdentifierExpr
- SQLPropertyExpr
- SQLQueryExpr
3、成果展示
主要和datablau血缘解析工具进行对比
(1)select语句中包含关联查询和字段中包含子查询
SELECT
e.employee_id,
e.name,
e.gender,
e.date_of_birth,
e.date_of_hire,
d.department_name,
ts.total_salary,
b.bonus,
(
SELECT
B.id,
B.CGFS,
B.WLZ,
B.CPMC,
B.CGSL,
B.ZXCJ,
B.ZXJE,
B.HTGLXTHTH,
B.CGPZH,
B.WZBM,
B.HSZJ,
B.BEDNR JHBH,
B.WCXTCGDDID,
B.ZHTDAT,
D.WERKS_NAME XQGCMC,
D.WERKS XQGCBM
FROM
(
SELECT
A.id,
A.ZCGFS CGFS,
A.MATKL WLZ,
A.TXZ01 CPMC,
A.MENGE CGSL,
A.NAME1 ZXCJ,
A.ZHSDJ ZXJE,
A.ZHTNUM HTGLXTHTH,
A.EBELN CGPZH,
A.MATNR WZBM,
A.ZHSZJ HSZJ,
A.ZHTDAT,
C.EMATN,
C.BEDNR,
C.ZCGDDID WCXTCGDDID
FROM
TB_ERP_ZP11MMRP008 A,
TB_ERP_ME2M C
WHERE
A.EBELN = C.EBELN
and A.MATNR = C.EMATN
) B,
TB_ERP_ZP10MMDG030 D
WHERE
B.BEDNR = D.BEDNR
AND B.EMATN = D.MATNR
and B.ZHTDAT > to_date('2023/05/01 00:00:00', 'YYYY/MM/DD HH24:MI:SS')
) as sfs,
(
select
s.d
from
student s
where
student_id = e.employee_id
) as saf12
FROM
employee e
INNER JOIN department d ON e.department_id = d.department_id
INNER JOIN tmp_salary ts ON e.employee_id = ts.employee_id
LEFT OUTER JOIN bonus b ON e.employee_id = b.employee_id;
本血缘解析工具分析结果:
-- TB_ERP_ZP11MMRP008.NAME1 -> B.ZXCJ -> SELECT_STATEMENT_2.ZXCJ -> SELECT_STATEMENT_0.sfs
-- employee.date_of_birth -> SELECT_STATEMENT_0.date_of_birth
-- TB_ERP_ZP11MMRP008.ZHTDAT -> B.ZHTDAT -> SELECT_STATEMENT_2.ZHTDAT -> SELECT_STATEMENT_0.sfs
-- department.department_name -> SELECT_STATEMENT_0.department_name
-- bonus.bonus -> SELECT_STATEMENT_0.bonus
-- student.d -> SELECT_STATEMENT_5.d -> SELECT_STATEMENT_0.saf12
-- TB_ERP_ZP11MMRP008.MATNR -> B.WZBM -> SELECT_STATEMENT_2.WZBM -> SELECT_STATEMENT_0.sfs
-- TB_ERP_ZP11MMRP008.EBELN -> B.CGPZH -> SELECT_STATEMENT_2.CGPZH -> SELECT_STATEMENT_0.sfs
-- employee.name -> SELECT_STATEMENT_0.name
-- TB_ERP_ZP11MMRP008.ZHTNUM -> B.HTGLXTHTH -> SELECT_STATEMENT_2.HTGLXTHTH -> SELECT_STATEMENT_0.sfs
-- TB_ERP_ZP11MMRP008.MATKL -> B.WLZ -> SELECT_STATEMENT_2.WLZ -> SELECT_STATEMENT_0.sfs
-- TB_ERP_ZP10MMDG030.WERKS -> SELECT_STATEMENT_2.XQGCBM -> SELECT_STATEMENT_0.sfs
-- TB_ERP_ZP11MMRP008.ZCGFS -> B.CGFS -> SELECT_STATEMENT_2.CGFS -> SELECT_STATEMENT_0.sfs
-- TB_ERP_ZP11MMRP008.MENGE -> B.CGSL -> SELECT_STATEMENT_2.CGSL -> SELECT_STATEMENT_0.sfs
-- employee.gender -> SELECT_STATEMENT_0.gender
-- TB_ERP_ZP11MMRP008.TXZ01 -> B.CPMC -> SELECT_STATEMENT_2.CPMC -> SELECT_STATEMENT_0.sfs
-- TB_ERP_ZP11MMRP008.id -> B.id -> SELECT_STATEMENT_2.id -> SELECT_STATEMENT_0.sfs
-- tmp_salary.total_salary -> SELECT_STATEMENT_0.total_salary
-- TB_ERP_ZP11MMRP008.ZHSZJ -> B.HSZJ -> SELECT_STATEMENT_2.HSZJ -> SELECT_STATEMENT_0.sfs
-- TB_ERP_ME2M.ZCGDDID -> B.WCXTCGDDID -> SELECT_STATEMENT_2.WCXTCGDDID -> SELECT_STATEMENT_0.sfs
-- TB_ERP_ZP11MMRP008.ZHSDJ -> B.ZXJE -> SELECT_STATEMENT_2.ZXJE -> SELECT_STATEMENT_0.sfs
-- TB_ERP_ME2M.EMATN -> B.EMATN
-- TB_ERP_ME2M.BEDNR -> B.BEDNR -> SELECT_STATEMENT_2.JHBH -> SELECT_STATEMENT_0.sfs
-- employee.date_of_hire -> SELECT_STATEMENT_0.date_of_hire
-- employee.employee_id -> SELECT_STATEMENT_0.employee_id
-- TB_ERP_ZP10MMDG030.WERKS_NAME -> SELECT_STATEMENT_2.XQGCMC -> SELECT_STATEMENT_0.sfs
datablua血缘解析工具结果:
(2)select联合查询
select s.id from student s union select e.id from employees e
本血缘解析工具分析结果:
-- student.id -> SELECT_STATEMENT_0.id
-- employees.id -> SELECT_STATEMENT_1.id
datablua血缘解析工具结果:
4 待优化点
- 不支持没有别名的select查询
- 联合查询没有结果汇总,有待优化