TY - GEN
T1 - ANMAT
T2 - 2019 International Conference on Management of Data, SIGMOD 2019
AU - Qahtan, Abdulhakim
AU - Tang, Nan
AU - Ouzzani, Mourad
AU - Cao, Yang
AU - Stonebraker, Michael
N1 - Publisher Copyright: © 2019 Association for Computing Machinery.
PY - 2019/6/25
Y1 - 2019/6/25
AB - Knowledge discovery is critical to successful data analytics. We propose a new type of meta-knowledge, namely pattern functional dependencies (PFDs), that combine patterns (or regex-like rules) and integrity constraints (ICs) to model the dependencies (or meta-knowledge) between partial values (or patterns) across different attributes in a table. PFDs go beyond the classical functional dependencies and their extensions. For instance, in an employee table with the ID “F-9-107”, the prefix “F” determines the finance department. Moreover, a key application of PFDs is to use them to identify erroneous data, that is, tuples that violate some PFDs. In this demonstration, attendees will experience the following features: (i) PFD discovery: automatically discover PFDs from (dirty) data in different domains; and (ii) Error detection with PFDs: we will show errors that are detected by PFDs but cannot be captured by existing approaches.
KW - Constrained Patterns
KW - Error Detection
KW - Knowledge Discovery
KW - Pattern Functional Dependencies
UR - https://www.scopus.com/pages/publications/85069516123
U2 - 10.1145/3299869.3320209
DO - 10.1145/3299869.3320209
M3 - Conference contribution
AN - SCOPUS:85069516123
T3 - Proceedings of the ACM SIGMOD International Conference on Management of Data
SP - 1977
EP - 1980
BT - SIGMOD 2019 - Proceedings of the 2019 International Conference on Management of Data
PB - Association for Computing Machinery
Y2 - 30 June 2019 through 5 July 2019
ER -