
losses: add likelihood loss and test, improve base loss #38

Merged 2 commits into main from likelihood_loss on Mar 7, 2023

Conversation

@AldoGl (Contributor) commented Mar 6, 2023

Proposed changes

I add a loss based on the computation of the likelihood function, improve the modularity of the base loss class, and add a Jupyter notebook where the new loss is used to compute the likelihood and the posterior for the BH4 model.
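As a rough illustration of the idea (a sketch, not the actual black-it implementation; the function name and signature are assumptions), a Gaussian-kernel likelihood loss can be computed by evaluating a kernel density estimate of the simulated data at the real data points and returning the negative log-likelihood:

```python
import numpy as np

def gaussian_kernel_likelihood(real_data, sim_data, h=None):
    """Negative log-likelihood of real_data under a Gaussian KDE of sim_data.

    Hypothetical helper, not the black-it API: real_data is (M, D),
    sim_data is (N, D).
    """
    sim_data = np.atleast_2d(sim_data)
    real_data = np.atleast_2d(real_data)
    N, D = sim_data.shape
    if h is None:
        h = N ** (-1.0 / (D + 4))  # Scott's rule-of-thumb bandwidth
    # pairwise squared distances between real and simulated points
    diffs = real_data[:, None, :] - sim_data[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)  # shape (M, N)
    # Gaussian KDE evaluated at each real data point
    norm = N * (2.0 * np.pi * h ** 2) ** (D / 2.0)
    density = np.exp(-sq_dists / (2.0 * h ** 2)).sum(axis=1) / norm
    # small floor to guard against log(0) for very distant points
    return -np.sum(np.log(density + 1e-300))
```

A lower value means the simulated ensemble places higher probability mass on the observed data, which is what the calibration loop minimizes.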

Types of changes

What types of changes does your code introduce?
Put an x in the boxes that apply

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Checklist

Put an x in the boxes that apply.

  • I have read the CONTRIBUTING doc
  • I am making a pull request against the develop branch (left side), and my branch was started off develop
  • Lint and unit tests pass locally with my changes
  • I have added tests that prove my fix is effective or that my feature works

@AldoGl AldoGl requested a review from marcofavoritobi March 6, 2023 11:05


class LikelihoodLoss(BaseLoss):
"""Class for a likelihood loss computation using a Gaussian kernel function."""
@marcofavoritobi (Contributor) commented Mar 6, 2023

OK for the first version using a Gaussian kernel function, but one might want to use a different density estimation approach in the future.

(I would convert this into an issue after merging the PR, if considered relevant.)

@AldoGl (Author) replied:

Yes, of course this can be useful; we can expand the class as necessary in the future.



class LikelihoodLoss(BaseLoss):
"""Class for a likelihood loss computation using a Gaussian kernel function."""
@marcofavoritobi (Contributor) commented Mar 6, 2023

we might want to add some references here if considered relevant (perhaps Grazzini et al. 2017? although we might not be interested in the posterior distribution)

@codecov-commenter commented Mar 6, 2023

Codecov Report

Merging #38 (447e8de) into main (8aff1e5) will decrease coverage by 0.10%.
The diff coverage is 96.87%.


Additional details and impacted files


@@            Coverage Diff             @@
##             main      #38      +/-   ##
==========================================
- Coverage   97.17%   97.08%   -0.10%     
==========================================
  Files          29       30       +1     
  Lines        1452     1508      +56     
==========================================
+ Hits         1411     1464      +53     
- Misses         41       44       +3     
Impacted Files                           Coverage Δ
black_it/loss_functions/likelihood.py     95.34% <95.34%> (ø)
black_it/loss_functions/base.py          100.00% <100.00%> (ø)
black_it/samplers/base.py                 98.03% <0.00%> (-1.97%) ⬇️

@AldoGl AldoGl force-pushed the likelihood_loss branch 4 times, most recently from d9cb33d to 759041f Compare March 7, 2023 16:10
@AldoGl AldoGl force-pushed the likelihood_loss branch from 759041f to 66ea8d8 Compare March 7, 2023 16:17
return h

@staticmethod
def _get_bandwidth_scott(N: int, D: int) -> float:
@marcofavoritobi (Contributor) commented:
this method is not used; maybe we can add a configuration option to the loss (either Scott's or Silverman's rule)

@AldoGl (Author) replied:

Sure. Will do
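For reference, the two rule-of-thumb bandwidths under discussion could be selected with a small helper like the following (a sketch; the `rule` parameter is a hypothetical configuration knob, not the actual black-it API):

```python
import numpy as np

def bandwidth(N: int, D: int, rule: str = "silverman") -> float:
    """Rule-of-thumb KDE bandwidth for N samples in D dimensions.

    Hypothetical helper: `rule` selects between Scott's and
    Silverman's classic estimators.
    """
    if rule == "scott":
        # Scott (1992): h = N^(-1/(D+4))
        return N ** (-1.0 / (D + 4))
    if rule == "silverman":
        # Silverman (1986): h = (N * (D + 2) / 4)^(-1/(D+4))
        return (N * (D + 2) / 4.0) ** (-1.0 / (D + 4))
    raise ValueError(f"unknown bandwidth rule: {rule}")
```

In one dimension the two rules differ only by a constant factor, so exposing the choice is mostly useful for higher-dimensional data.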


from black_it.loss_functions.likelihood import LikelihoodLoss

np.random.seed(3)
@marcofavoritobi (Contributor) commented:
this is a bit dangerous, as the line is executed at module level, and it is not guaranteed that the seeds are set in the same order across tests, because the order in which the test modules are imported can change. I suggest using a different approach; how is it done in other tests?

@marcofavoritobi added:

ok, I see: each test calls np.random.seed in its body. I argue against this practice, as np.random.seed has a global side effect and might cause dependencies among tests. This should be refactored to use a local random number generator (np.random.default_rng(seed)) wherever possible.

For this PR, I suggest moving this call within the body of each test.

@AldoGl (Author) replied Mar 7, 2023:

Ok, I set the seed within the body of each test. In any case, the tests should work across different seeds.
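The pattern suggested above (sketched here with a hypothetical test name) keeps the random state local to each test instead of mutating NumPy's global state:

```python
import numpy as np

def test_seed_is_local_to_each_test() -> None:
    """Each test builds its own generator, so no global side effect
    can leak into other tests regardless of import order."""
    rng = np.random.default_rng(42)  # local generator, not np.random.seed
    data_a = rng.normal(size=10)
    rng = np.random.default_rng(42)  # re-seeding reproduces the same stream
    data_b = rng.normal(size=10)
    assert np.allclose(data_a, data_b)
```

Because each `default_rng` instance carries its own state, tests stay reproducible even when the test runner reorders or parallelizes them.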

Comment on lines 85 to 86
for i in range(num_coords):
filter_ = filters[i]
@marcofavoritobi suggested:

Suggested change
for i in range(num_coords):
filter_ = filters[i]
for i, filter_ in enumerate(filters):

@AldoGl (Author) replied:

Sure, much better!


@staticmethod
def _filter_data(
num_coords: int,
@marcofavoritobi commented:

If the suggestion below is accepted, num_coords is not needed here, nor above at line 68:

Suggested change
num_coords: int,

weights = self._check_coordinate_weights(num_coords)
filters = self._check_coordinate_filters(num_coords)

filtered_data = self._filter_data(num_coords, filters, sim_data_ensemble)
@marcofavoritobi commented:
See below

Suggested change
filtered_data = self._filter_data(num_coords, filters, sim_data_ensemble)
filtered_data = self._filter_data(filters, sim_data_ensemble)

@@ -0,0 +1,74 @@
# Black-box ABM Calibration Kit (Black-it)
# Copyright (C) 2021-2022 Banca d'Italia
@marcofavoritobi commented:

we are in 2023, so we should change the year range of the licenses accordingly in all the copyright headers; but that's a task for another day...

@AldoGl (Author) replied:

Right, I will change it for this module only in the meantime.

@marcofavoritobi replied:

the copyright check will fail :(; however, you could add the year 2023 among the options in the regex of the script scripts/check_copyright.py.

In particular, at line 36, from:

# Copyright \(C\) 2021-2022 Banca d'Italia

To:

# Copyright \(C\) 2021-(2022|2023) Banca d'Italia
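A quick sanity check (not part of the PR) confirms that the updated pattern accepts both year ranges while still rejecting anything else:

```python
import re

# The alternation (2022|2023) accepts both the old and the new
# copyright year; any other year still fails the check.
PATTERN = re.compile(r"# Copyright \(C\) 2021-(2022|2023) Banca d'Italia")

assert PATTERN.match("# Copyright (C) 2021-2022 Banca d'Italia")
assert PATTERN.match("# Copyright (C) 2021-2023 Banca d'Italia")
assert not PATTERN.match("# Copyright (C) 2021-2024 Banca d'Italia")
```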

@AldoGl (Author) replied:

Thanks 🙏🏼

@AldoGl AldoGl force-pushed the likelihood_loss branch from cde5c27 to 97065b4 Compare March 7, 2023 16:55
@AldoGl AldoGl force-pushed the likelihood_loss branch from 97065b4 to 4d15f2d Compare March 7, 2023 16:59
@marcofavoritobi marcofavoritobi self-requested a review March 7, 2023 17:16
@marcofavoritobi (Contributor) left a review:

Great work 🚀

@AldoGl (Author) commented Mar 7, 2023

Thank you @marcofavoritobi 🙏🏼

@AldoGl AldoGl merged commit 2b760e1 into main Mar 7, 2023
@AldoGl AldoGl deleted the likelihood_loss branch March 7, 2023 17:17