Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[<- method removing names #67

Closed
Yunuuuu opened this issue Jan 28, 2024 · 5 comments
Closed

[<- method removing names #67

Yunuuuu opened this issue Jan 28, 2024 · 5 comments

Comments

@Yunuuuu
Copy link
Contributor

Yunuuuu commented Jan 28, 2024

mock_matrix <- function(ngenes, ncells) {
    cell.means <- 2^stats::runif(ngenes, 2, 10)
    cell.disp <- 100 / cell.means + 0.5
    cell.data <- matrix(stats::rnbinom(ngenes * ncells,
        mu = cell.means,
        size = 1 / cell.disp
    ), ncol = ncells)
    rownames(cell.data) <- sprintf("Gene_%s", formatC(seq_len(ngenes),
        width = 4, flag = 0
    ))
    colnames(cell.data) <- sprintf("Cell_%s", formatC(seq_len(ncells),
        width = 3, flag = 0
    ))
    cell.data
}
mat <- mock_matrix(2000, 200)
path <- normalizePath(tempfile(tmpdir = tempdir()), mustWork = FALSE)
obj <- BPCells::write_matrix_dir(mat = as(mat, "dgCMatrix"), dir = path)
#> Warning: Matrix compression performs poorly with non-integers.
#> • Consider calling convert_matrix_type if a compressed integer matrix is intended.
#> This message is displayed once every 8 hours.
value <- matrix(sample(mat, length(mat)), nrow = nrow(mat))
obj[1:10, ] <- value[1:10, ]
mat[1:10, ] <- value[1:10, ]
testthat::expect_equal(as.matrix(obj), mat)
#> Warning: Converting to a dense matrix may use excessive memory
#> This message is displayed once every 8 hours.
obj[, 1:10] <- value[, 1:10]
mat[, 1:10] <- value[, 1:10]
testthat::expect_equal(as.matrix(obj), mat)
obj[1:10, 1:10] <- value[1:10, 1:10]
#> Warning: cbind: column names presenent on some but not all matrices. Setting
#> missing names to ""
#> Warning: rbind: column names are mismatched. Setting names to match first
#> matrix
mat[1:10, 1:10] <- value[1:10, 1:10]
testthat::expect_equal(as.matrix(obj), mat)
#> Error: as.matrix(obj) not equal to `mat`.
#> Attributes: < Component "dimnames": Component 2: 190 string mismatches >

Created on 2024-01-28 with reprex v2.0.2
~

@Yunuuuu
Copy link
Contributor Author

Yunuuuu commented Jan 28, 2024

I have tried to fix this by following codes, all classes work well except ConvertMatrixType, any opinions for these?

methods::setMethod(
    "[<-", c("IterableMatrix", "ANY", "ANY", "ANY"),
    function(x, i, j, ..., value) {
        value <- as(value, "dgCMatrix")
        methods::callGeneric()
    }
)

methods::setMethod(
    "[<-", c("IterableMatrix", "ANY", "ANY", "dgCMatrix"),
    function(x, i, j, ..., value) {
        if (x@transpose) {
            value <- t(methods::as(t(value), "IterableMatrix"))
        } else {
            value <- methods::as(value, "IterableMatrix")
        }
        methods::callGeneric()
    }
)

methods::setMethod(
    "[<-", c("IterableMatrix", "ANY", "ANY", "IterableMatrix"),
    function(x, i, j, ..., value) {
        i <- selection_index(i, nrow(x), rownames(x))
        ni <- if (length(i) > 0) seq_len(nrow(x))[-i] else seq_len(nrow(x))
        x_i <- x[i, ]
        x_ni <- x[ni, ]
        # dispatch the "IterableMatrix", "missing", "ANY", "IterableMatrix" method
        x_i <- methods::callGeneric(x = x_i, j = j, value = value)
        rbind(x_i, x_ni)[order(c(i, ni)), ]
    }
)

methods::setMethod(
    "[<-", c("IterableMatrix", "ANY", "missing", "IterableMatrix"),
    function(x, i, j, ..., value) {
        i <- selection_index(i, nrow(x), rownames(x))
        ni <- if (length(i) > 0) seq_len(nrow(x))[-i] else seq_len(nrow(x))

        x_i <- x[i, ]
        x_ni <- x[ni, ]
        if (any(dim(x_i) != dim(value))) {
            stop("Mismatched dimensions in assignment to subset")
        }
        rownames(value) <- rownames(x_i)
        colnames(value) <- colnames(x_i)
        rbind(value, x_ni)[order(c(i, ni)), ]
    }
)

methods::setMethod(
    "[<-", c("IterableMatrix", "missing", "ANY", "IterableMatrix"),
    function(x, i, j, ..., value) {
        x <- t(x)
        x[j, , ...] <- t(value)
        t(x)
    }
)

methods::setMethod(
    "[<-", c("IterableMatrix", "missing", "missing", "IterableMatrix"),
    function(x, i, j, ..., value) {
        if (any(dim(x) != dim(value))) {
            stop("Mismatched dimensions in assignment to subset")
        }
        rownames(value) <- rownames(x)
        colnames(value) <- colnames(x)
        value
    }
)

@Yunuuuu
Copy link
Contributor Author

Yunuuuu commented Jan 28, 2024

The original [<- method also gave wrong values for ConvertMatrixType object, see #68 for small examples.

@bnprks
Copy link
Owner

bnprks commented Feb 1, 2024

This should be fixed by my most recent commit 450c2b9. Please comment/reopen if you spot remaining issues.

Thanks again for finding this problem and providing such clear examples to reproduce. Sorry to not go with your suggested solution due to my code-style preferences, but I appreciate all the bugs you've been identifying

@bnprks bnprks closed this as completed Feb 1, 2024
@Yunuuuu
Copy link
Contributor Author

Yunuuuu commented Feb 4, 2024

Hi, @bnprks , thanks for the fast fix, but it remains not perfect.
For some situations, [<- will also removing names, here are some examples:

library(BPCells)
mock_matrix <- function(ngenes, ncells) {
    cell.means <- 2^stats::runif(ngenes, 2, 10)
    cell.disp <- 100 / cell.means + 0.5
    cell.data <- matrix(stats::rnbinom(ngenes * ncells,
        mu = cell.means,
        size = 1 / cell.disp
    ), ncol = ncells)
    rownames(cell.data) <- sprintf("Gene_%s", formatC(seq_len(ngenes),
        width = 4, flag = 0
    ))
    colnames(cell.data) <- sprintf("Cell_%s", formatC(seq_len(ncells),
        width = 3, flag = 0
    ))
    cell.data
}
mat <- mock_matrix(200, 200)
path <- normalizePath(tempfile(tmpdir = tempdir()), mustWork = FALSE)
obj <- BPCells::write_matrix_dir(mat = as(mat, "dgCMatrix"), dir = path)
#> Warning: Matrix compression performs poorly with non-integers.
#> • Consider calling convert_matrix_type if a compressed integer matrix is intended.
#> This message is displayed once every 8 hours.
obj <- BPCells:::rank_transform(obj, "col")
obj <- BPCells::convert_matrix_type(obj, "double")
mat <- as.matrix(obj)
#> Warning: Converting to a dense matrix may use excessive memory
#> This message is displayed once every 8 hours.
value <- matrix(sample(mat, length(mat)), nrow = nrow(mat))
obj[1:10, 1:10] <- value[1:10, 1:10]
#> Warning: cbind: column names present on some but not all matrices. Setting
#> missing names to ""
#> Warning: rbind: column names are mismatched. Setting names to match first
#> matrix
mat[1:10, 1:10] <- value[1:10, 1:10]
testthat::expect_equal(as.matrix(obj), mat)
#> Error: as.matrix(obj) not equal to `mat`.
#> Attributes: < Component "dimnames": Component 2: 190 string mismatches >

Created on 2024-02-04 with reprex v2.0.2

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.3.1 (2023-06-16)
#>  os       Ubuntu 22.04.3 LTS
#>  system   x86_64, linux-gnu
#>  ui       X11
#>  language en
#>  collate  C.UTF-8
#>  ctype    C.UTF-8
#>  tz       Asia/Shanghai
#>  date     2024-02-04
#>  pandoc   2.9.2.1 @ /usr/bin/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package          * version    date (UTC) lib source
#>  BiocGenerics       0.46.0     2023-04-25 [1] Bioconductor
#>  bitops             1.0-7      2021-04-24 [1] CRAN (R 4.3.1)
#>  BPCells          * 0.1.0      2024-02-02 [1] Github (bnprks/BPCells@ff0c56f)
#>  brio               1.1.4      2023-12-10 [1] CRAN (R 4.3.1)
#>  cli                3.6.2      2023-12-11 [1] CRAN (R 4.3.1)
#>  desc               1.4.3      2023-12-10 [1] CRAN (R 4.3.1)
#>  digest             0.6.33     2023-07-07 [1] CRAN (R 4.3.1)
#>  evaluate           0.23       2023-11-01 [1] CRAN (R 4.3.1)
#>  fansi              1.0.6      2023-12-08 [1] CRAN (R 4.3.1)
#>  fastmap            1.1.1      2023-02-24 [1] CRAN (R 4.3.1)
#>  fs                 1.6.3      2023-07-20 [1] CRAN (R 4.3.1)
#>  GenomeInfoDb       1.36.1     2023-06-21 [1] Bioconductor
#>  GenomeInfoDbData   1.2.10     2023-08-08 [1] Bioconductor
#>  GenomicRanges      1.52.0     2023-04-25 [1] Bioconductor
#>  glue               1.6.2      2022-02-24 [1] CRAN (R 4.3.1)
#>  htmltools          0.5.7      2023-11-03 [1] CRAN (R 4.3.1)
#>  IRanges            2.34.1     2023-06-22 [1] Bioconductor
#>  knitr              1.45       2023-10-30 [1] CRAN (R 4.3.1)
#>  lattice            0.22-5     2023-10-24 [1] CRAN (R 4.3.1)
#>  lifecycle          1.0.4      2023-11-07 [1] CRAN (R 4.3.1)
#>  magrittr           2.0.3      2022-03-30 [1] CRAN (R 4.3.1)
#>  Matrix             1.6-4      2023-11-30 [1] CRAN (R 4.3.1)
#>  pillar             1.9.0      2023-03-22 [1] CRAN (R 4.3.1)
#>  pkgload            1.3.3      2023-09-22 [1] CRAN (R 4.3.1)
#>  purrr              1.0.2      2023-08-10 [1] CRAN (R 4.3.1)
#>  R.cache            0.16.0     2022-07-21 [1] CRAN (R 4.3.1)
#>  R.methodsS3        1.8.2      2022-06-13 [1] CRAN (R 4.3.1)
#>  R.oo               1.25.0     2022-06-12 [1] CRAN (R 4.3.1)
#>  R.utils            2.12.2     2022-11-11 [1] CRAN (R 4.3.1)
#>  R6                 2.5.1      2021-08-19 [1] CRAN (R 4.3.1)
#>  Rcpp               1.0.11     2023-07-06 [1] CRAN (R 4.3.1)
#>  RCurl              1.98-1.12  2023-03-27 [1] CRAN (R 4.3.1)
#>  reprex             2.0.2      2022-08-17 [1] CRAN (R 4.3.1)
#>  rlang              1.1.2      2023-11-04 [1] CRAN (R 4.3.1)
#>  rmarkdown          2.25       2023-09-18 [1] CRAN (R 4.3.1)
#>  rprojroot          2.0.4      2023-11-05 [1] CRAN (R 4.3.1)
#>  RSpectra           0.16-1     2022-04-24 [1] CRAN (R 4.3.1)
#>  S4Vectors          0.38.1     2023-05-02 [1] Bioconductor
#>  sessioninfo        1.2.2      2021-12-06 [1] CRAN (R 4.3.1)
#>  styler             1.10.2     2024-01-13 [1] Github (r-lib/styler@17f91fc)
#>  testthat           3.2.1.9000 2024-01-19 [1] Github (r-lib/testthat@fe50a22)
#>  utf8               1.2.4      2023-10-22 [1] CRAN (R 4.3.1)
#>  vctrs              0.6.5      2023-12-01 [1] CRAN (R 4.3.1)
#>  withr              2.5.2      2023-10-30 [1] CRAN (R 4.3.1)
#>  xfun               0.41       2023-11-01 [1] CRAN (R 4.3.1)
#>  XVector            0.40.0     2023-04-25 [1] Bioconductor
#>  yaml               2.3.8      2023-12-11 [1] CRAN (R 4.3.1)
#>  zlibbioc           1.46.0     2023-04-25 [1] Bioconductor
#> 
#>  [1] /home/yun/Rlibrary/4.3
#>  [2] /usr/local/lib/R/site-library
#>  [3] /usr/lib/R/site-library
#>  [4] /usr/lib/R/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────

@bnprks
Copy link
Owner

bnprks commented Feb 5, 2024

Thanks for following up on this. Should be fixed in commit 9e9dd2e

Interestingly this bug originated in some relatively new changes in [, not in [<-.

Hopefully this won't continue to be an unending source of bugs -- at the very least we get a new correctness test each time you find a new issue. I'll try to at least be prompt with fixes if you keep finding more problems 😅

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants