Persistence diagram for image data / embedding data #58
Comments
@dgm2 Why not use the cubical complex?
from sklearn.datasets import fetch_openml
import matplotlib.pyplot as plt
import gudhi as gd
X, y = fetch_openml("mnist_784", version=1, return_X_y=True, as_frame=False)
# X[17] is an '8'
cc = gd.CubicalComplex(top_dimensional_cells=X[17], dimensions=[28, 28])
diag = cc.persistence()
gd.plot_persistence_diagram(diag, legend=True)
plt.show()
# X[1] is a '0'
cc = gd.CubicalComplex(top_dimensional_cells=X[1], dimensions=[28, 28])
diag = cc.persistence()
gd.plot_persistence_diagram(diag, legend=True)
plt.show()
Another example is available in this tutorial on cubical complexes.
|
Sounds good, thanks! What would be the best way to replicate something like the following Dionysus code with GUDHI?
references many thanks
output
|
🤔 Your second point seems strange to me...
import itertools
import numpy as np
from torchvision import datasets
from gudhi.cubical_complex import CubicalComplex
from gudhi.wasserstein import wasserstein_distance
from gudhi import bottleneck_distance
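def pers_diag(pts):
    # Persistence of the 28x28 image, as a list of (dimension, (birth, death)) pairs
    pers = CubicalComplex(top_dimensional_cells=pts, dimensions=[28, 28]).persistence()
    # Keep only the (birth, death) pairs, for all dimensions together
    res = np.array([list(b) for (_, b) in pers])
    return res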
dataset2 = datasets.MNIST('data', train=False, download=True)
diagrams = []
labels = []
n = 10
# `targets` is the current name of the attribute formerly called `train_labels`
for dat, lab in zip(dataset2.data[:n], dataset2.targets[:n]):
    pts = dat.data.numpy().reshape(-1)
    diagrams.append(pers_diag(pts))
    labels.append(lab.item())
def print_wd(i, j):
    print("labels ", labels[i], labels[j], " | was ", wasserstein_distance(diagrams[i], diagrams[j]), " | bot ", bottleneck_distance(diagrams[i], diagrams[j]))
for i, j in itertools.combinations(range(n), 2):
    print_wd(i, j)
outputs:
|
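The `pers_diag` helper above drops the homology dimension and keeps infinite death values in the same array. If per-dimension diagrams are wanted before comparing them, one possible approach is sketched below. It relies only on the `(dimension, (birth, death))` pair format that `persistence()` returns; the helper name `split_by_dimension` and the sample values are hypothetical and not part of GUDHI.

```python
import math
from collections import defaultdict


def split_by_dimension(pers, drop_infinite=True):
    """Group (dim, (birth, death)) pairs by homology dimension.

    Hypothetical helper, not part of GUDHI. `pers` is assumed to have the
    same shape as the list returned by CubicalComplex.persistence().
    """
    by_dim = defaultdict(list)
    for dim, (birth, death) in pers:
        if drop_infinite and math.isinf(death):
            continue  # skip essential classes, whose death value is infinite
        by_dim[dim].append([birth, death])
    return dict(by_dim)


# Made-up sample in the same format, for illustration only.
sample = [(1, (173.0, 252.0)), (0, (0.0, float("inf"))), (0, (84.0, 96.0))]
finite = split_by_dimension(sample)
print(finite[0])  # finite 0-dimensional intervals
print(finite[1])  # finite 1-dimensional intervals
```

Each list can then be wrapped in `np.array(...)` and compared dimension by dimension, rather than mixing all intervals in a single diagram.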
@dgm2 What is your GUDHI version?
|
Here is an example of how to run the same computation with Dionysus and GUDHI:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_openml
import dionysus as d
import gudhi as gd
X, y = fetch_openml("mnist_784", version=1, return_X_y=True, as_frame=False)
# a zero
a = X[1].reshape((28,28))
#a = np.random.random((10,10))
plt.matshow(a)
plt.colorbar()
plt.show()
f_lower_star = d.fill_freudenthal(a)
p = d.homology_persistence(f_lower_star)
dgms = d.init_diagrams(p, f_lower_star)
for i, dgm in enumerate(dgms):
    print(i)
    for pt in dgm:
        print(pt)
# 0
# (0,inf)
# (0,165)
# (84,96)
# 1
# (0,255)
# (173,252)
# (223,253)
# (225,252)
# (225,253)
# (238,253)
# (240,253)
# (246,253)
# (252,253)
# (252,253)
# (252,253)
# (252,253)
# (253,255)
cc = gd.CubicalComplex(top_dimensional_cells=a)
cc.compute_persistence()
cc.persistence_intervals_in_dimension(0)
# array([[ 84., 96.],
# [ 0., 165.],
# [ 0., inf]])
cc.persistence_intervals_in_dimension(1)
#array([[173., 252.],
# [225., 252.],
# [252., 253.],
# [246., 253.],
# [240., 253.],
# [238., 253.],
# [252., 253.],
# [252., 253.],
# [225., 253.],
# [252., 253.],
# [237., 253.],
# [223., 253.],
# [253., 255.],
# [ 0., 255.]])
|
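The two libraries print the intervals in a different order, so the outputs are easier to compare as multisets. The sketch below does this with the standard library only, using the values copied from the listings above; the helper name `diagram_diff` is hypothetical.

```python
from collections import Counter

INF = float("inf")

# Intervals copied from the Dionysus and GUDHI outputs shown above.
dionysus_dim0 = [(0, INF), (0, 165), (84, 96)]
gudhi_dim0 = [(84, 96), (0, 165), (0, INF)]

dionysus_dim1 = [
    (0, 255), (173, 252), (223, 253), (225, 252), (225, 253),
    (238, 253), (240, 253), (246, 253), (252, 253), (252, 253),
    (252, 253), (252, 253), (253, 255),
]
gudhi_dim1 = [
    (173, 252), (225, 252), (252, 253), (246, 253), (240, 253),
    (238, 253), (252, 253), (252, 253), (225, 253), (252, 253),
    (237, 253), (223, 253), (253, 255), (0, 255),
]


def diagram_diff(a, b):
    """Return (intervals only in a, intervals only in b), with multiplicity."""
    ca, cb = Counter(a), Counter(b)
    return sorted((ca - cb).elements()), sorted((cb - ca).elements())


print(diagram_diff(dionysus_dim0, gudhi_dim0))  # ([], [])
print(diagram_diff(dionysus_dim1, gudhi_dim1))  # ([], [(237, 253)])
```

With the values as listed, dimension 0 matches exactly, while the GUDHI output in dimension 1 contains one extra interval, (237, 253). The two libraries build different complexes from the image (a Freudenthal triangulation versus a cubical complex), which may account for that difference.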
Hello,
Thanks for maintaining this repo.
Two questions on processing image datasets (e.g. torchvision MNIST).
Example
I used RipsComplex, but any object that computes persistence would be OK.
Many thanks!