An implementation of the AdaptiveImpute algorithm for matrix completion
for sparse matrices.
Usage
adaptive_impute(
X,
rank,
...,
initialization = c("svd", "adaptive-initialize", "approximate"),
max_iter = 200L,
check_interval = 1L,
epsilon = 1e-07,
additional = NULL
)
# S3 method for class 'sparseMatrix'
adaptive_impute(
X,
rank,
...,
initialization = c("svd", "adaptive-initialize", "approximate"),
additional = NULL
)
# S3 method for class 'LRMF'
adaptive_impute(
X,
rank,
...,
epsilon = 1e-07,
max_iter = 200L,
check_interval = 1L
)Arguments
- X
A sparse matrix of
Matrix::sparseMatrix()class.- rank
Desired rank (integer) to use in the low rank approximation. Must be at least
2Land at most the rank ofX. Note that the rank ofXis typically unobserved and computations may be unstable or even fail whenrankis near or exceeds this threshold.- ...
Unused additional arguments.
- initialization
How to initialize the low rank approximation. Options are:
"svd"(default). In the initialization step, this treats unobserved values as zeroes."adaptive-initialize". In the initialization step, this treats unobserved values as actually unobserved. However, the currentAdaptiveInitializeimplementation relies on dense matrix computations that are only suitable for relatively small matrices."approximate". An approximate variant ofAdaptiveInitializethat is less computationally expensive. Seeadaptive_initializefor details.
Note that initialization matters as
AdaptiveImputeoptimizes a non-convex objective. The current theory shows that initializing withAdaptiveInitializeleads to a consistent estimator, but it isn't know if this is the case for SVD initialization. Empirically we have found that SVD initialization works well nonetheless.- max_iter
Maximum number of iterations to perform (integer). Defaults to
200L. In practice 10 or so iterations will get you a decent approximation to use in exploratory analysis, and and 50-100 will get you most of the way to convergence. Must be at least1L.- check_interval
Integer specifying how often to perform convergence checks. Defaults to
1L. In practice, check for convergence requires a norm calculation that is expensive for large matrices and decreasing the frequency of convergence checks will reduce computation time. Can also be set toNULL, which casemax_iteriterations of the algorithm will occur with no possibility of stopping due to small relative change in the imputed matrix. In this casedeltawill be reported asInf.- epsilon
Convergence criteria, measured in terms of relative change in Frobenius norm of the full imputed matrix. Defaults to
1e-7.- additional
Ignored except when
alpha_method = "approximate"in which case it controls the precise of the approximation toalpha. The approximate computation ofalphawill always understandalpha, but the approximation will be better for larger values ofadditional. We recommend makingadditionalas large as computationally tolerable.
Value
A low rank matrix factorization represented by an
adaptive_imputation() object.
References
Cho, Juhee, Donggyu Kim, and Karl Rohe. “Asymptotic Theory for Estimating the Singular Vectors and Values of a Partially-Observed Low Rank Matrix with Noise.” Statistica Sinica, 2018. https://doi.org/10.5705/ss.202016.0205.
———. “Intelligent Initialization and Adaptive Thresholding for Iterative Matrix Completion: Some Statistical and Algorithmic Theory for Adaptive-Impute.” Journal of Computational and Graphical Statistics 28, no. 2 (April 3, 2019): 323–33. https://doi.org/10.1080/10618600.2018.1518238.
Examples
mf <- adaptive_impute(ml100k, rank = 3L, max_iter = 5L, check_interval = NULL)
#> INFO [2025-05-02 19:14:25] Using svd initialization.
#> INFO [2025-05-02 19:14:25] Done initializing.
#> INFO [2025-05-02 19:14:25] Beginning AdaptiveImpute (max 5 iterations).
#> INFO [2025-05-02 19:14:25] Iter 1 complete. delta = Inf, alpha = 315.498594765074
#> INFO [2025-05-02 19:14:25] Iter 2 complete. delta = Inf, alpha = 236.478430784209
#> INFO [2025-05-02 19:14:25] Iter 3 complete. delta = Inf, alpha = 192.407973190366
#> INFO [2025-05-02 19:14:25] Iter 4 complete. delta = Inf, alpha = 164.221353309034
#> INFO [2025-05-02 19:14:25] Iter 5 complete. delta = Inf, alpha = 144.66332718209
#> Warning:
#> Reached maximum allowed iterations. Returning early.
mf
#>
#> Adaptively Imputed Low Rank Matrix Factorization
#> ------------------------------------------------
#>
#> Rank: 3
#>
#> Rows: 943
#> Cols: 1682
#>
#> d[rank]: 467.486
#> alpha: 144.663
#>
#> Components
#>
#> u: 943 x 3 [matrix]
#> d: 3 [numeric]
#> v: 1682 x 3 [matrix]