by Joachim Staib, Marcel Spehr, Stefan Gumhold
Abstract:
The non-negative matrix factorization provides a valuable tool for the analysis of positive data by representing it as an additive linear superposition of a small number of non-negative base elements. This property allows the base elements to be interpreted in the same domain as the input data. The problem, though, is ambiguity: many factorizations are equally valid, but only one is obtained, and which one depends on the initialization of the applied factorization algorithm or on further constraints. We propose a new approach that is based on sampling the set of valid factorizations, given one initial solution. First, we derive a parameterization of the ambiguity. A parameter tuple can be probed for membership through an oracle function that returns either true or false. Then, we present an algorithm that explores and samples parts of the non-convex solution set. To assist the otherwise automatic process and to alleviate the drawbacks of sampling a non-convex space, we provide a graphical user interface that puts the human in the loop. From an initial set of samples, the user can select elements that serve as the starting point for subsequent samplings. With this browser-like tool, the sampling of the NMF can be steered without further knowledge of the underlying algorithm and without the need to express possibly hard-to-formulate constraints. An evaluation of the sampling procedure reveals promising results for factorizations with a rank of up to 4.
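As a rough illustration of the ambiguity and the membership oracle mentioned in the abstract, the following Python sketch relies on the standard observation that, for an exact factorization V = WH, any invertible matrix A yields another exact factorization (WA)(A^-1 H), which is valid only if both factors remain non-negative. The names `is_valid` and `sample_valid`, the random-walk proposal, and the example matrices are hypothetical and not taken from the paper; the authors' actual parameterization and sampling algorithm may differ.

```python
import numpy as np


def is_valid(W, H, A, tol=1e-12):
    """Hypothetical oracle: True iff (W A, A^-1 H) is a non-negative pair.

    A is an r x r matrix acting on the rank-r factors W (m x r), H (r x n).
    """
    try:
        A_inv = np.linalg.inv(A)
    except np.linalg.LinAlgError:
        return False  # singular A cannot describe an equivalent factorization
    return bool(np.all(W @ A >= -tol) and np.all(A_inv @ H >= -tol))


def sample_valid(W, H, n_samples=200, step=0.1, seed=0):
    """Assumed sampler: a plain random walk that starts at the identity
    (the given solution) and keeps only proposals accepted by the oracle."""
    rng = np.random.default_rng(seed)
    r = W.shape[1]
    current = np.eye(r)
    accepted = []
    for _ in range(n_samples):
        candidate = current + step * rng.standard_normal((r, r))
        if is_valid(W, H, candidate):
            accepted.append(candidate)
            current = candidate  # continue exploring from the accepted point
    return accepted


# Illustrative rank-2 factorization with strictly positive factors.
W = np.array([[1.0, 0.5],
              [0.5, 1.0],
              [1.0, 1.0]])
H = np.array([[1.0, 0.2, 0.5],
              [0.3, 1.0, 0.5]])

samples = sample_valid(W, H)
```

Every accepted matrix A reproduces the same product W H while changing the base elements themselves, which is the ambiguity the abstract refers to. A random walk like this can only reach the part of the solution set connected to its starting point, so in a non-convex set it may miss valid regions entirely; restarting the walk from a chosen sample is one way to read the user-steered sampling the abstract describes.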
Reference:
User Assisted Exploration and Sampling of the Solution Set of Non-Negative Matrix Factorizations (Joachim Staib, Marcel Spehr, Stefan Gumhold), In Proceedings of the IADIS Multi Conference on Computer Science and Information Systems (MCCSIS 2014) on Data Mining, IADIS Press, 2014. (Awarded best paper)
Bibtex Entry:
@inproceedings{Staib:2014,
title = {User Assisted Exploration and Sampling of the Solution Set of Non-Negative Matrix Factorizations},
author = {Joachim Staib and Marcel Spehr and Stefan Gumhold},
keywords = {Data Mining, Interactive Sampling, Non-Negative Matrix Factorization},
affiliations = {CGV},
areas = {areava},
pages = {29--38},
booktitle = {Proceedings of the IADIS Multi Conference on Computer Science and Information Systems (MCCSIS 2014) on Data Mining},
year = {2014},
month = {July},
publisher = {IADIS Press},
address={Lisbon, Portugal},
note={Awarded best paper},
abstract={
The non-negative matrix factorization provides a valuable tool for the analysis of positive data by representing
it as an additive linear superposition of a small number of non-negative base elements. This property allows the
base elements to be interpreted in the same domain as the input data. The problem, though, is ambiguity: many
factorizations are equally valid, but only one is obtained, and which one depends on the initialization of the applied
factorization algorithm or on further constraints. We propose a new approach that is based on sampling the set of
valid factorizations, given one initial solution. First we derive a parameterization of the ambiguity. A parameter
tuple can be probed for membership through an oracle function that either returns true or false. Then we present
an algorithm that explores and samples parts of the non-convex solution set. To assist the otherwise automatic
process and to alleviate the drawbacks of sampling a non-convex space, we provide a graphical user interface that
puts the human in the loop. From an initial set of samples the user is allowed to select elements that serve as the
starting point for subsequent samplings. With this browser-like tool, the sampling of the NMF can be steered
without further knowledge of the underlying algorithm and without the need to express possibly hard-to-formulate
constraints. An evaluation of the sampling procedure reveals promising results for factorizations with a rank of up to 4.
}
}