TY - GEN
T1 - Explaining entity resolution predictions
T2 - 2019 Workshop on Human-In-the-Loop Data Analytics, HILDA 2019, co-located with SIGMOD 2019
AU - Thirumuruganathan, Saravanan
AU - Ouzzani, Mourad
AU - Tang, Nan
N1 - Publisher Copyright: © 2019 Association for Computing Machinery.
PY - 2019/7/5
Y1 - 2019/7/5
AB - Entity resolution (ER) seeks to identify the set of tuples in a dataset that refer to the same real-world entity. It is one of the fundamental and well studied problems in data integration with applications in diverse domains such as banking, insurance, e-commerce, and so on. Machine Learning and Deep Learning based methods provide the state-of-the-art results. For practitioners, it is often challenging to understand why the classifier made a particular prediction. While there has been extensive work in the ML community on explaining classifier predictions, we found that a direct application of those techniques is not appropriate for ER. There is a huge gap between the needs of lay ER practitioners and the explanation community. In this paper, we provide a comprehensive taxonomy of these challenges, discuss research opportunities and propose preliminary solutions.
UR - https://www.scopus.com/pages/publications/85072805005
U2 - 10.1145/3328519.3329130
DO - 10.1145/3328519.3329130
M3 - Conference contribution
AN - SCOPUS:85072805005
T3 - Proceedings of the ACM SIGMOD International Conference on Management of Data
BT - Proceedings of the Workshop on Human-In-the-Loop Data Analytics, HILDA 2019
PB - Association for Computing Machinery
Y2 - 5 July 2019
ER -