normalization added #176

Merged
merged 16 commits into from
Nov 18, 2023

mhajij (Member) commented Jun 25, 2023

Deep learning (DL) computations often require matrices to be normalized. This file collects popular normalization methods typically used in a DL setting.

codecov bot commented Jun 26, 2023

Codecov Report

Attention: 19 lines in your changes are missing coverage. Please review.

Comparison is base (a2808f2) 76.09% compared to head (396f70f) 96.87%.
Report is 410 commits behind head on main.

Files Patch % Lines
toponetx/utils/normalization.py 83.03% 19 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##             main     #176       +/-   ##
===========================================
+ Coverage   76.09%   96.87%   +20.77%     
===========================================
  Files          22       32       +10     
  Lines        2460     3518     +1058     
===========================================
+ Hits         1872     3408     +1536     
+ Misses        588      110      -478     


@mhajij mhajij requested a review from michaelschaub June 26, 2023 05:52
return out


def kipf_adjacency_matrix_normalization(
Collaborator

I think we should use descriptive names for these functions, so that each name indicates what the function does.

Something like:
normalized_symmetric_adjacency_matrix

is what is computed here. It is true (and perhaps interesting) that this is used in Kipf's work, but it is not an invention of that paper.
Where this normalization is used could be stated in the function description with a reference. Examples: Kipf... [REF]

Second, it would be good to have all names consistent. At the moment we have things like

normalize_x_laplacian (verb first), and ..._adjacency_matrix_normalization.

We should decide on one style and apply it consistently, perhaps something like:

compute_xy_normalized_zmatrix ??

Collaborator

I agree.

Member Author

@michaelschaub do you know a better reference for the above normalization?

Also, do you have a reference for the D^{-1}A normalization?

Please let me know if you like the current naming convention; I can change it again if you have something better in mind.

_compute_D1,
_compute_D2,
_compute_D3,
_compute_D5,
Member

These are internal functions (starting with an underscore) and should not be imported into the toponetx namespace.

_compute_D1,
_compute_D2,
_compute_D3,
_compute_D5,
Member

As far as I can see, these internal functions are not called directly here and should not be imported.


L = csr_matrix(L).asfptype()
normalized_L = compute_laplacian_normalized_matrix(L)
expected_result = csr_matrix(
Member

No need to construct a csr_matrix here when you convert it to dense arrays below again. Just use np.array.

L = csr_matrix([[2.0, -1.0, 0.0], [-1.0, 3.0, -1.0], [0.0, -1.0, 2.0]])
Lx = csr_matrix([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
normalized_Lx = compute_x_laplacian_normalized_matrix(L, Lx)
expected_result = csr_matrix([[0.25, 0.0, 0.0], [0.0, 0.0, 0.25], [0.0, 0.25, 0.0]])
Member

See above.

L = csr_matrix([[4.0, 0], [0.0, 4.0]])
Lx = csr_matrix([[0.0, 1.0], [1.0, 0.0]])
normalized_Lx = compute_x_laplacian_normalized_matrix(L, Lx)
expected_result = csr_matrix([[0.0, 0.25], [0.25, 0.0]])
Member

See above.
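
A minimal sketch of what the three test comments above ask for: build the expected result with np.array and compare it against the densified output. The function `x_laplacian_normalization_stand_in` is a hypothetical stand-in, not the code under review; it assumes the normalization divides Lx by the largest eigenvalue of a symmetric L, which is merely consistent with the expected values quoted in these tests.

```python
import numpy as np
from scipy.sparse import csr_matrix


def x_laplacian_normalization_stand_in(L, Lx):
    """Hypothetical stand-in: scale Lx by 1 / (largest eigenvalue of symmetric L)."""
    # Assumption: L is small and symmetric, so a dense eigenvalue solve is fine here.
    top_eigenvalue = np.linalg.eigvalsh(L.toarray()).max()
    return Lx * (1.0 / top_eigenvalue)


def test_x_laplacian_normalization():
    L = csr_matrix([[4.0, 0.0], [0.0, 4.0]])
    Lx = csr_matrix([[0.0, 1.0], [1.0, 0.0]])
    normalized_Lx = x_laplacian_normalization_stand_in(L, Lx)
    # The expected value is a plain dense array; no csr_matrix round trip is needed.
    expected_result = np.array([[0.0, 0.25], [0.25, 0.0]])
    assert np.allclose(normalized_Lx.toarray(), expected_result)
```

Only the function output is converted with `.toarray()`; the expected side never goes through a sparse type.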

----------
A_opt : numpy.ndarray
The adjacency matrix.
add_identity : bool, optional
Member

Suggested change
- add_identity : bool, optional
+ add_identity : bool, default=False

In general optional boolean flags feel weird, there is nothing optional about them. They default to true or to false, that's it.

Notes
-----
This normalization is based on Kipf's formulation, which computes the row-sums,
constructs a diagonal matrix D^(-0.5), and applies the normalization as D^(-0.5) * A_opt * D^(-0.5).
Member

Maybe format this as proper LaTeX formulas? This would render better on the documentation website. See https://numpydoc.readthedocs.io/en/latest/format.html#notes and their rendered example.
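
For illustration, a sketch of how the Notes section could look with numpydoc math markup, attached to a small dense implementation of the normalization it describes. The function name `symmetric_normalized_adjacency` is hypothetical, and the meaning of `add_identity` (adding the identity matrix before normalizing) is an assumption, since its description is not shown in the diff.

```python
import numpy as np


def symmetric_normalized_adjacency(A_opt, add_identity=False):
    r"""Symmetrically normalize an adjacency matrix (illustrative sketch).

    Parameters
    ----------
    A_opt : numpy.ndarray
        The adjacency matrix.
    add_identity : bool, default=False
        Assumed meaning: add the identity matrix before normalizing.

    Returns
    -------
    numpy.ndarray
        The normalized adjacency matrix.

    Notes
    -----
    With :math:`D` the diagonal matrix of row sums of :math:`A`, the result is

    .. math:: \hat{A} = D^{-1/2} A D^{-1/2}.
    """
    A = np.asarray(A_opt, dtype=float)
    if add_identity:
        A = A + np.eye(A.shape[0])
    row_sums = A.sum(axis=1)
    # Rows that sum to zero keep a zero scaling factor instead of dividing by zero.
    inv_sqrt = np.zeros_like(row_sums)
    positive = row_sums > 0
    inv_sqrt[positive] = row_sums[positive] ** -0.5
    # Scaling rows and columns is equivalent to D^(-1/2) @ A @ D^(-1/2).
    return inv_sqrt[:, None] * A * inv_sqrt[None, :]
```

The raw string prefix (`r"""`) keeps the backslashes in the formula intact.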

B : numpy.ndarray or scipy.sparse.csr_matrix
The asymmetric matrix.
is_sparse : bool, optional
If True, treat B as a sparse matrix.
ffl096 (Member), Jun 27, 2023

I wonder whether that parameter is actually useful? Just testing whether B is sparse (scipy.sparse.issparse) should be sufficient, right? Or is there ever a case where B is of sparse type but you want to set is_sparse=False?
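
A minimal sketch of the flag-free dispatch suggested here. The helper `row_normalize` is hypothetical and does not reproduce the function under review; it only shows how scipy.sparse.issparse can replace an is_sparse argument.

```python
import numpy as np
from scipy.sparse import csr_matrix, diags, issparse


def row_normalize(B):
    """Hypothetical helper: scale each row of B to sum to 1, dense or sparse."""
    # B.sum(axis=1) is an ndarray for dense input and a matrix for sparse input;
    # flatten both to a 1-D float array.
    row_sums = np.asarray(B.sum(axis=1)).ravel().astype(float)
    # All-zero rows get a zero scaling factor instead of a division by zero.
    inv = np.divide(1.0, row_sums, out=np.zeros_like(row_sums), where=row_sums != 0)
    if issparse(B):
        # Stay sparse: left-multiply by a sparse diagonal matrix.
        return diags(inv) @ B
    return inv[:, None] * np.asarray(B)
```

The caller passes either type and gets the same kind back, so no separate flag has to be kept in sync with the argument.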


Parameters
----------
B1 : numpy array or scipy csr_matrix
Member

numpy array and scipy csr_matrix should be replaced by np.ndarray and scipy.sparse.csr_matrix, respectively, throughout the document


Returns
-------
_ : numpy array or scipy csr_matrix
Member

Please remove the _ : here and below. If a function returns just one value (as opposed to a tuple of values), just state the type.

@ffl096 ffl096 added the enhancement New feature or request label Jun 29, 2023
ninamiolane (Collaborator)

@mhajij @ffl096 update on this?

@USFCA-MSDS USFCA-MSDS requested a review from ffl096 November 17, 2023 16:28
devendragovil (Contributor) left a comment

Requesting some changes.

devendragovil (Contributor) left a comment

References need to be reformatted.

Comment on lines 187 to 192
[1] Schaub, M. T., Benson, A. R., Horn, P., Lippner, G., & Jadbabaie, A.
"Random walks on simplicial complexes and the normalized hodge 1-laplacian."

[2] Bunch, E., You, Q., Fung, G., & Singh, V.
"Simplicial 2-Complex Convolutional Neural Networks."
"""
@mhajij mhajij requested a review from devendragovil November 17, 2023 21:59
devendragovil (Contributor) left a comment

LGTM.

@mhajij mhajij closed this Nov 18, 2023
@mhajij mhajij reopened this Nov 18, 2023
@mhajij mhajij merged commit 3c7290f into main Nov 18, 2023
12 of 14 checks passed
@ffl096 ffl096 deleted the normalization branch February 2, 2024 13:09
Labels
enhancement New feature or request
5 participants