sparse setindex and stored zeros #9906

mauro3 · 2015-01-24T13:58:27Z

During construction of sparse matrices and when using setindex!, entries with values 0 are not included/dropped from the underlying storage. This is no good for certain applications. In particular when preallocating sparse matrices.

For example we currently got:

julia> sparse([1],[1], [0.], 5 ,5)
5x5 sparse matrix with 0 Float64 entries:

Instead

julia> sparse([1],[1], [1.], 5 ,5)
5x5 sparse matrix with 1 Float64 entries:
        [1, 1]  =  0.0

or

julia> sparse([1],[1], [1.], 5 ,5, keepzeros=true)
5x5 sparse matrix with 1 Float64 entries:
        [1, 1]  =  0.0

My use case are preallocated Jacobian matrices when solving PDEs. Usually the pattern of potential non-zeros in the Jacobian is known, so it can be preallocated. However, when repeatedly filling the Jacobian during the calculation there is no guarantee that its underlaying storage will be kept fixed.

Maybe there could be a sparse type for fixed size sparse matrix, say SparseMatrixCSCfixed. Constructed with sparse([1],[1], [1.], 5 ,5, keepzeros=true). Then setindex! would not drop zeros and error when trying to assign to a not-allocated entry.

The text was updated successfully, but these errors were encountered:

mlubin · 2015-01-24T15:45:03Z

I agree that a keepzeros flag would be useful. But anyway if you're preallocating why not just use dummy nonzero values?

mauro3 · 2015-01-24T19:15:35Z

Yes, I use ones to preallocate, which is fine. But the other problem does not have a workaround: if I update the Jacobian J[i,j]=someval and someval happens to be zero at that iteration then the whole sparse matrix is redone, only to be restored to the same state once someval is non-zero again.

JeffBezanson · 2015-01-24T19:31:50Z

dup of JuliaLang/LinearAlgebra.jl#60

Looks like it is possible to have explicit zeros by calling the SparseMatrixCSC constructor directly.

mauro3 · 2015-01-24T20:12:43Z

Thanks for pointing out that issue. I am aware that zeros can be stored in a sparse matrix, but not easily as setindex! will just purge that entry from the matrix. I think there should be a way to keep them when using standard indexing. The Julian way to do this would be with a new type.

mlubin · 2015-01-24T21:51:57Z

I don't want to rehash JuliaLang/LinearAlgebra.jl#60, but I believe the conclusion is that explicit zeros are allowed in SparseMatrixCSC because they're generally harmless, but that high-level interfaces like getindex are free to remove them. J[i,j]=someval in a loop is potentially a very inefficient way to update a sparse matrix anyway. If you're aiming for efficiency you have to work at the level of CSC format.

JeffBezanson · 2015-01-24T22:15:26Z

That's true; if you want (1) fast changes to individual elements, (2) keeping zeros in place, it seems the best thing to do is directly manipulate the representation.

tkelman · 2015-01-25T00:47:26Z

The Julian way to do this would be with a new type.

I don't think a new type would be a good idea for dealing with this, at least not with the current design. You'd have to reimplement a lot of methods for any new type, or introduce an AbstractCSC level to the hierarchy. Maybe a boolean type parameter could work, but that could also get pretty messy.

I think the best solution at the moment would be to add a specific setexplicitzero! method for this. I think there's already a non-exported helper function that can be used to do this, but I don't remember exactly which one.

ViralBShah · 2015-01-25T02:51:53Z

New type is definitely not a good idea. The best thing for now is perhaps to have new methods for this. We could even have a module with various methods that store their zeros.

mauro3 · 2015-01-25T12:56:32Z

Thanks for all the input. So I suggest two things:

add a keyword to the constructors keepzeros to force keeping zeros
mention in docs that zeros are dropped in constructors and setindex!

Let me know if that is good and I can make a pull request for those two points

Also, I just discovered this:

julia> sparse([1,1], [2,2], [1,-1])
1x2 sparse matrix with 1 Int64 entries:
        [1, 2]  =  0

Is this a bug?

ViralBShah · 2015-01-26T04:50:47Z

I guess the only constructor where this makes sense is sparse(). We can certainly do that.

Yes - that is a bug. Could you file a separate issue for that?

mauro3 · 2015-01-26T08:16:28Z

keepzeros also applies to spdiagm.

Filed issue #9928

ViralBShah · 2015-02-21T04:06:12Z

I feel that a better approach may be to have AbstractSparseMatrixCSC, and have all the methods implemented on the abstract type. The concrete types can be SparseMatrixCSC (default without stored zeros) and SparseMatrixCSCStoredZeros. They will share largely the same implementation, except that there will be special methods where stored zeros have to be treated differently.

I prefer the tags to be in the type system, rather than in the function arguments or flags in the data structure, and multiple dispatch will do the rest for us. The code will have to be refactored to use accessor methods, but I feel that overall, this is the better approach.

tkelman · 2015-02-21T05:26:12Z

I thought the majority opinion in #9928 was that no one actually thinks removing structural nonzeros with values numerically equal to zero is all that important.

ViralBShah · 2015-02-21T05:39:23Z

IMO, that will only be true until scores of users send in bug reports.

tkelman · 2015-02-21T05:46:04Z

It's good enough for SciPy?

ViralBShah · 2015-02-21T06:14:14Z

Matlab religiously squeezes out zeroes in all sparse matrix operations. This is SciPy's behaviour:

What should sparse() do, when there are zeros in the input? SciPy keeps them.
What should constructors like sparse() do, when arithmetic on repeated indices produces zeros? SciPy keeps them.
What should find() do? Even if there are stored zeros, SciPy does not return them - I don't like the check for zeros in find.
What should setindex!() do if assigning 0 or arrays containing 0 values in them? SciPy will explicitly store them.
What happens when stored zeros participate in arithmetic or matrix multiply. In SciPy, they get squeezed out in the result - you cannot retain the structure in the result of the arithmetic operation, which I would have liked in a stored zeros case.
What happens when stored zeros arise as a result of the arithmetic? In SciPy, they get squeezed out in the result.

jiahao · 2015-03-05T18:08:40Z

In #6769 (comment), I also looked up UMFPACK and MKL's documentation, which both state that they allow explicitly stored zeros.

tkelman · 2015-03-05T21:32:30Z

Pardiso in fact requires them on the diagonal.

ViralBShah · 2015-03-06T14:30:40Z

Here's the behaviour I would like to have for 0.4, which will not be a ton of work.

Built-in operations always remove the zeros that may arise due to arithmetic (eg. in sparse() or binary operators etc.). setindex! etc. will ignore zeros in the input. This would differ from scipy. If as a user you haven't introduced explicit zeros, Julia won't add them.
If there are stored zeros, operations such as find will not weed them out. They are treated just like other nonzeros.

nalimilan · 2015-03-06T15:12:48Z

Point 2. Sounds both dangerous and a bit odd to me, as it's inconsistent with dense arrays, and implementation-wise, it would be very easy for find to get rid of stored zeros if it finds any.

ViralBShah · 2015-03-06T16:25:15Z

Ah I remember that now. That is why we have the current behaviour. Thanks.

tkelman · 2015-03-06T18:59:14Z

Having an API to easily set an explicit zero without having to dig into the internal representation would be exceedingly helpful. That API does not have to be setindex! however, as long as things are documented well.

I'd be okay with just trying to change the behavior in the #9928 case and deferring other changes to post-0.4. But any proposed solution to #9928 should be carefully vetted that it does not cause significant performance regressions in other cases.

ArjunNarayanan · 2015-08-09T20:56:27Z

Hello everyone,

I was trying to follow the discussion on this thread to figure out how to explicitly store zeros in a sparse matrix. I haven't figured it out yet, so it would be helpful if someone could point out a reference.

I need this in order to implement knot interval vectors in an unstructured t-spline mesh. Such a mesh can have empty entries which store no information (which I would like to treat as the usual omitted zeros in a sparse matrix). Zero and positive real numbers are considered valid entries in the knot interval vector.

I am not anticipating performing any linear algebra directly on these matrices. I'd like to store it in a sparse format to allow for efficient storage of very large meshes. I also think that the algorithms to extract information from these meshes can be more easily implemented if the data is in a sparse format.

If you have any alternative suggestions, I'd be very glad to hear them.

Cheers,

tkelman · 2015-08-09T21:35:45Z

You need to directly modify the colptr, rowval, and nzval fields of a sparse matrix to insert a stored nonzero at this time. Hopefully we'll come up with a real API for doing this in the future.

mlubin · 2015-08-09T21:42:41Z

Or in this case, @ArjunNarayanan could build the matrix with sparse, replacing zeros with -1, and then go through nzval to replace -1 entries with zeros.
Anyway, my recommendation is to first think about the right sparse data structure for your problem. If CSC or CSR are convenient, then it's worth figuring out how to interact with SparseMatrixCSC. Otherwise, just code up the data structure yourself.

ArjunNarayanan · 2015-08-10T02:22:45Z

Thanks @tkelman and @mlubin. I can enter zero values by setting the nzval field directly, so @mlubin's first suggestion could work pretty well for me.

I think a CSC or CSR could work equally well in my case, as long as I am sensible about ordering my data so that it is accessed correctly (the only difference between CSC and CSR is that one would access CSC row-wise and CSR column-wise, correct?). If I could work more elegantly by dealing directly with SparseMatrixCSC, that would be great. Could you point out a reference from where I can figure out how to do this?

Thanks for the help!

tkelman · 2015-08-10T21:29:58Z

Other way around, it's better to iterate through CSC by columns and CSR by rows. There's a pretty good selection of resources online if you search on google/duckduckgo for "compressed sparse column." Tim Davis' book Direct Methods for Sparse Linear Systems is also good, there's even a series of lectures on youtube - https://www.youtube.com/playlist?list=PL5EvFKC69QIyRLFuxWRnH6hIw6e1-bBXB.

The main thing to keep in mind with these references and standard data structures is whether the indexing is 1-based or 0-based. Julia and Fortran are 1-based, C and Python are 0-based.

ArjunNarayanan · 2015-08-11T00:29:04Z

@tkelman thanks for your reply. I was actually interested in a reference for how to directly use SparseMatrixCSC in Julia. But your references will surely come in handy in the long run, thank you.

tkelman · 2015-08-11T00:33:07Z

Depends what you want to do. The source is at https://github.com/JuliaLang/julia/blob/master/base/sparse/sparsematrix.jl and in related files in that same directory. There are several packages that make use of sparse matrices in various ways, but most sparse matrix code tends to be pretty application-specific. You can also look through @ViralBShah's JuliaCon presentation notebook at https://github.com/JuliaSparse/JuliaCon2015/blob/master/Sparse%20Matrices%20in%20Julia.ipynb, hopefully the videos will get posted too one of these days.

ArjunNarayanan · 2015-08-11T01:43:03Z

@tkelman like I mentioned eariler, I'm only using these matrices for storage, and my only special need is the ability to easily store explicit zeros in my sparse matrix. Doing this via nzval works for me, and I was wondering how I would do it (as @mlubin suggested) by directly dealing with SparseMatrixCSC.

I'm not sure if I am at the stage where I can figure out how to do it from the source (beginner woes!) but I'll give it a shot. Thanks again!

tkelman · 2015-08-11T01:46:42Z

Ah, sorry, if all you're asking for is how to translate @mlubin's suggestion into code, it would look something like this:

A = sparse(rows, cols, vals, m, n) # where vals[k] = -1 for entries you want to turn into stored zeros
A.nzval[A.nzval .== -1] = 0

ArjunNarayanan · 2015-08-11T01:52:24Z

Ah, I think I was completely confused!

I figured out how to do it through nzval from @mlubin's comment. I think I misunderstood the additional statement -- "then it's worth figuring out how to interact with SparseMatrixCSC" -- as meaning that there was another way of doing the same thing. My bad!

Anyway, it looks like I'm good to go for now. Thanks for all the help!

mauro3 · 2015-08-11T07:15:24Z

Slightly off-topic: If you're on 0.4 nzrange and related can (and should) be used to do most of the direct manipulation:

help?> nzrange
search: nzrange

Base.nzrange(A, col)

   Return the range of indices to the structural nonzero values of a
   sparse matrix column. In conjunction with "nonzeros(A)" and
   "rowvals(A)", this allows for convenient iterating over a sparse
   matrix

      A = sparse(I,J,V)
      rows = rowvals(A)
      vals = nonzeros(A)
      m, n = size(A)
      for i = 1:n
         for j in nzrange(A, i)
            row = rows[j]
            val = vals[j]
            # perform sparse wizardry...
         end
      end

If on 0.3, just copy the definition of nzrange and rowvals into your code:

julia/base/sparse/sparsematrix.jl

Line 33 in cceac50

nzrange(S::SparseMatrixCSC, col::Integer) = S.colptr[col]:(S.colptr[col+1]-1)

fqiang · 2016-03-04T17:24:13Z

Is there a SparseMatrixCSC that can keep zeros in it yet?

@mauro3 thanks for pointing out the nzrange. Use this idea I come out with a version of sparse matrix that can keep zero values.
It is slow, but any suggestion to speed up will great.

function SparseMatrix.sparse(I,J,V, M, N;keepzeros=false)
    if(!keepzeros)
        return sparse(I,J,V,M,N)
    else
        full = sparse(I,J,ones(Float64,length(I)),M,N)
        actual = sparse(I,J,V,M,N)
        fill!(full.nzval,0.0)

        for c = 1:N
            crange = nzrange(actual,c)
            full.nzval[crange] = actual.nzval[crange] 
        end        
        return full
    end
end

tkelman · 2016-03-05T02:25:47Z

As of #15242, sparse keeps explicit zeros and you can use dropzeros! to remove them. setindex! has not been modified though.

Sacha0 · 2016-10-16T16:14:55Z

Did #17404 resolve this issue? Do any points from this issue still stand? Best!

mauro3 · 2016-10-16T18:49:08Z

Yep, resolved. Thanks for all hard work in this area!

IainNZ added the sparse Sparse arrays label Jan 24, 2015

ViralBShah changed the title ~~sparse matrices and zeros~~ sparse matrices and stored zeros Feb 19, 2015

mauro3 mentioned this issue May 21, 2015

Improve performance of issym and ishermitian for sparse matrices #11371

Merged

KristofferC mentioned this issue Aug 13, 2015

A keepzeros option for sparse(I, J, V) #12605

Closed

ViralBShah changed the title ~~sparse matrices and stored zeros~~ sparse setindex and stored zeros Oct 1, 2015

Sacha0 mentioned this issue Dec 19, 2015

methods for left-division ops involving triangular matrices and sparse vectors #14447

Merged

Sacha0 mentioned this issue Jan 26, 2016

MIT-licensed sparse() parent method and expert driver #14798

Closed

KristofferC mentioned this issue Mar 20, 2016

RFC/WIP: make setindex! not remove zeros from sparsity pattern #15568

Closed

Sacha0 mentioned this issue Jul 13, 2016

Make setindex! for sparse matrices and vectors not purge stored entries on zero assignment #17404

Merged

mauro3 closed this as completed Oct 16, 2016

JaredCrean2 mentioned this issue Dec 8, 2017

[WIP] Introduce AbstractSparseMatrixCSC into Type Hierarchy #24973

Closed

sparse setindex and stored zeros #9906

sparse setindex and stored zeros #9906

Comments

mauro3 commented Jan 24, 2015

mlubin commented Jan 24, 2015

mauro3 commented Jan 24, 2015

JeffBezanson commented Jan 24, 2015

mauro3 commented Jan 24, 2015

mlubin commented Jan 24, 2015

JeffBezanson commented Jan 24, 2015

tkelman commented Jan 25, 2015

ViralBShah commented Jan 25, 2015

mauro3 commented Jan 25, 2015

ViralBShah commented Jan 26, 2015

mauro3 commented Jan 26, 2015

ViralBShah commented Feb 21, 2015

tkelman commented Feb 21, 2015

ViralBShah commented Feb 21, 2015

tkelman commented Feb 21, 2015

ViralBShah commented Feb 21, 2015

jiahao commented Mar 5, 2015

tkelman commented Mar 5, 2015

ViralBShah commented Mar 6, 2015

nalimilan commented Mar 6, 2015

ViralBShah commented Mar 6, 2015

tkelman commented Mar 6, 2015

ArjunNarayanan commented Aug 9, 2015

tkelman commented Aug 9, 2015

mlubin commented Aug 9, 2015

ArjunNarayanan commented Aug 10, 2015

tkelman commented Aug 10, 2015

ArjunNarayanan commented Aug 11, 2015

tkelman commented Aug 11, 2015

ArjunNarayanan commented Aug 11, 2015

tkelman commented Aug 11, 2015

ArjunNarayanan commented Aug 11, 2015

mauro3 commented Aug 11, 2015

fqiang commented Mar 4, 2016

tkelman commented Mar 5, 2016

Sacha0 commented Oct 16, 2016

mauro3 commented Oct 16, 2016