
The nmfbin R package provides a simple Non-Negative Matrix Factorization (NMF) implementation tailored for binary data matrices. It offers a choice of initialization methods, loss functions and updating algorithms.

NMF is typically used to reduce a high-dimensional matrix to a lower-rank approximation of rank k, where k is chosen by the user. Given a non-negative matrix X of size $m \times n$, NMF looks for two non-negative matrices W ($m \times k$) and H ($k \times n$) such that:

$$X \approx W \times H$$
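
To make the shapes in the factorization concrete, here is a minimal base-R sketch. It is an illustration only and is not how nmfbin works internally: it assumes a squared-error (Frobenius) loss and the classic multiplicative update rules, and the variable names, the toy dimensions and the small constant `eps` are choices made for this example.

```r
# Illustrative only: generic NMF via multiplicative updates on a
# squared-error loss. This is NOT the nmfbin implementation.
set.seed(42)

m <- 6  # rows of X
n <- 5  # columns of X
k <- 2  # target rank (chosen by the user)

# Toy binary input matrix (m x n)
X <- matrix(sample(c(0, 1), m * n, replace = TRUE), nrow = m)

# Random non-negative starting factors: W is m x k, H is k x n
W <- matrix(runif(m * k), nrow = m)
H <- matrix(runif(k * n), nrow = k)

eps <- 1e-9  # guards against division by zero
err_start <- norm(X - W %*% H, type = "F")

for (i in seq_len(200)) {
  # Element-wise updates keep W and H non-negative
  H <- H * (t(W) %*% X) / (t(W) %*% W %*% H + eps)
  W <- W * (X %*% t(H)) / (W %*% H %*% t(H) + eps)
}

err_end <- norm(X - W %*% H, type = "F")

dim(W %*% H)  # same dimensions as X
```

The product `W %*% H` has the same dimensions as `X`, and comparing `err_start` with `err_end` shows how far the updates reduced the reconstruction error.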

In topic modelling, if X is a word-document matrix then W can be interpreted as the word-topic matrix and H as the topic-document matrix.
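
As a sketch of that interpretation, the snippet below lists the highest-weighted words per topic from a word-topic matrix. The matrix `W`, its words and its weights are made up for illustration and are not output from nmfbin.

```r
# Hypothetical word-topic matrix: rows are words, columns are topics.
# The values are invented for this example.
W <- matrix(
  c(0.9, 0.1,
    0.8, 0.0,
    0.1, 0.7,
    0.0, 0.9,
    0.5, 0.4),
  ncol = 2, byrow = TRUE,
  dimnames = list(
    c("court", "judge", "market", "price", "law"),
    c("topic1", "topic2")
  )
)

# For each topic, take the two words with the largest weights
top_words <- sapply(colnames(W), function(topic) {
  rownames(W)[order(W[, topic], decreasing = TRUE)[seq_len(2)]]
})

top_words
```

Each column of `top_words` holds the words that load most heavily on that topic, which is usually the first thing to inspect when labelling topics.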

Unlike most other NMF packages, nmfbin is focused on binary (Boolean) data, while keeping the number of dependencies to a minimum. For more information see the website.

Installation

You can install the development version of nmfbin from GitHub with:

# install.packages("remotes")
remotes::install_github("michalovadek/nmfbin")

Usage

The input matrix can only contain 0s and 1s.
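
One way to check this requirement before fitting, and to binarize a count matrix that does not meet it, is sketched below in base R. The helper `is_binary_matrix()` is hypothetical and not part of nmfbin, and thresholding at zero is only one possible binarization rule.

```r
# Hypothetical helper (not part of nmfbin): TRUE only for a numeric
# matrix with no missing values whose entries are all 0 or 1.
is_binary_matrix <- function(X) {
  is.matrix(X) && is.numeric(X) && !anyNA(X) && all(X %in% c(0, 1))
}

# Example count matrix that is not binary
counts <- matrix(c(0, 2, 1, 0, 5, 0), nrow = 2)

# Assumed binarization rule: any positive count becomes 1
X_bin <- (counts > 0) * 1

is_binary_matrix(counts)
is_binary_matrix(X_bin)
```

Multiplying the logical comparison by 1 keeps the matrix dimensions and yields numeric 0/1 entries, which this helper requires; a logical matrix would be rejected by the `is.numeric()` check.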

# load
library(nmfbin)

# Create a random 10 x 10 binary matrix for demonstration
set.seed(123)  # fix the seed so the example is reproducible
X <- matrix(sample(c(0, 1), 100, replace = TRUE), ncol = 10)

# Perform Logistic NMF
results <- nmfbin(X, k = 3, optimizer = "mur", init = "nndsvd", max_iter = 1000)

Citation

@Manual{,
  title = {nmfbin: Non-Negative Matrix Factorization for Binary Data},
  author = {Michal Ovadek},
  year = {2023},
  note = {R package version 0.2.1},
  url = {https://michalovadek.github.io/nmfbin/},
}

Contributions

Contributions to the nmfbin package are more than welcome. Please submit pull requests or open an issue for discussion.