Mining High-Utility Patterns in Uncertain Tensors

Abstract

Transactional datasets are 0/1 matrices, which generically stand for objects having Boolean properties. If every cell of the matrix is additionally associated with a real number called utility, a high-utility itemset relates to a all-ones sub-matrix with utilities that sum to a high-enough value. This article shows that “having a total utility exceeding a threshold” is a piecewise (anti-)monotone constraint, even in presence of both positive and negative utilities. For that reason, a generic algorithm, multidupehack, can prune the search of the high-utility patterns defined in a broader context than the 0/1 matrix: the uncertain tensor. Moreover, the utilities may relate to only some of the dimensions, the patterns can be forced (or not) to be closed and to satisfy additional constraints. A real-world application, which exploits all those possibilities, is presented. Despite its versatility, the proposal is also shown competitive when it comes to mining 0/1 matrices, a special case treated by dozens of specific algorithms.

Publication
KES’18: Proceedings of the 22nd International Conference on Knowledge-Based and Intelligent Information & Engineering Systems