
add filters before loss function computation #24

Merged: 1 commit merged into main on Nov 21, 2022

Conversation

AldoGl (Contributor) commented Nov 18, 2022

Proposed changes

I add the possibility of passing a list of callables that act on each dimension of the simulated time series before the loss is evaluated. I also add a specific HP (Hodrick-Prescott) filter in the utils/time_series.py module. Finally, I add a test for the new feature.
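As a rough sketch of the idea (the function and argument names here are illustrative, not black-it's actual API), per-coordinate filtering of a simulated ensemble might look like:

```python
import numpy as np

def apply_coordinate_filters(sim_data_ensemble, filters):
    """Apply one optional filter per coordinate of an (ensemble, time, coords) array.

    A None entry leaves that coordinate untouched, mirroring the behaviour
    described in this PR. Hypothetical helper, not code from the PR itself.
    """
    ensemble_size, _, num_coords = sim_data_ensemble.shape
    assert len(filters) == num_coords, "one filter (or None) per coordinate"
    filtered = sim_data_ensemble.copy()
    for i, f in enumerate(filters):
        if f is not None:
            for j in range(ensemble_size):
                # each filter acts on a single 1-D time series
                filtered[j, :, i] = f(sim_data_ensemble[j, :, i])
    return filtered

# example: demean the first coordinate, leave the second as-is
data = np.arange(12, dtype=float).reshape(2, 3, 2)
out = apply_coordinate_filters(data, [lambda x: x - x.mean(), None])
```

The loss would then be computed on `out` instead of the raw ensemble.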

Types of changes

What types of changes does your code introduce?
Put an x in the boxes that apply

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Checklist

Put an x in the boxes that apply.

  • I have read the CONTRIBUTING doc
  • I am making a pull request against the develop branch (left side), and my branch was started from develop
  • Lint and unit tests pass locally with my changes
  • I have added tests that prove my fix is effective or that my feature works

Further comments

If this is a relatively large or complex change, kick off the discussion by explaining why you chose the solution you did and what alternatives you considered, etc...

@AldoGl AldoGl force-pushed the add_filters_in_loss branch from ff1871e to e6f12db Compare November 21, 2022 09:16
codecov-commenter commented Nov 21, 2022

Codecov Report

Merging #24 (16db99d) into main (e26ebd7) will decrease coverage by 1.21%.
The diff coverage is 58.33%.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main      #24      +/-   ##
==========================================
- Coverage   98.89%   97.68%   -1.22%     
==========================================
  Files          23       23              
  Lines        1176     1209      +33     
==========================================
+ Hits         1163     1181      +18     
- Misses         13       28      +15     
Impacted Files Coverage Δ
black_it/utils/time_series.py 73.21% <28.57%> (-26.79%) ⬇️
black_it/loss_functions/base.py 100.00% <100.00%> (ø)

@AldoGl AldoGl force-pushed the add_filters_in_loss branch 2 times, most recently from 0d39d7e to ca1509d Compare November 21, 2022 09:29
AldoGl (Contributor, Author) commented Nov 21, 2022

Hi @marcofavoritobi, following your suggestion I added the possibility of specifying a mixed list of the kind [function, None, function, function] as the list of coordinate filters in the loss computation. If None is given, no filter is applied to that coordinate.

Comment on lines +79 to +92
if self.coordinate_filters is None:
    # no filters provided: one None per coordinate (i.e. the identity)
    filters = [None] * num_coords
else:
nb_coordinate_filters = len(self.coordinate_filters)
assert_(
nb_coordinate_filters == num_coords,
(
"the length of coordinate_filters should be equal "
f"to the number of coordinates, got {nb_coordinate_filters} and {num_coords}"
),
exc_cls=ValueError,
)
Contributor:

I guess this validation could be moved into the constructor; otherwise an error would occur only when compute_loss is called, which could surprise the user a bit

Contributor Author:

We discussed this already. Unfortunately it is not possible since the BaseLoss constructor does not know the number of coordinates when it's called

Contributor:

Right, but a sanity check against coordinate_weights could be useful

Comment on lines 97 to 102
filtered_data = np.array(
[
sim_data_ensemble[j, :, i]
for j in range(sim_data_ensemble.shape[0])
]
)
Contributor @marcofavoritobi commented Nov 21, 2022:

Is this the same as the following?

Suggested change
filtered_data = np.array(
[
sim_data_ensemble[j, :, i]
for j in range(sim_data_ensemble.shape[0])
]
)
filtered_data = sim_data_ensemble[:, :, i]

Just replicating the old behaviour in case filter is None
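The equivalence of the two forms is easy to verify with a quick sketch:

```python
import numpy as np

# build a small (ensemble, time, coords) array and compare the two expressions
sim_data_ensemble = np.random.default_rng(0).normal(size=(4, 50, 3))
i = 1

# list-comprehension form from the diff under discussion
via_loop = np.array(
    [sim_data_ensemble[j, :, i] for j in range(sim_data_ensemble.shape[0])]
)

# plain slice suggested by the reviewer
via_slice = sim_data_ensemble[:, :, i]
```

Both yield the (ensemble, time) matrix for coordinate `i`; the slice avoids the Python-level loop and the extra copy into a new array.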

Contributor Author:

Yes it is, thanks, I will fix it

self.compute_loss_1d(sim_data_ensemble[:, :, i], real_data[:, i])
* weights[i]
)
if filters[i] is None:
Contributor @marcofavoritobi commented Nov 21, 2022:

An idea to simplify the code is to move this check into the constructor and pre-process the coordinate_filters argument so as to have either user-defined filters or identity functions. I am fine with what we have now; this is just a suggestion.

Contributor Author:

Once again, this is not possible since the constructor does not know the number of coordinates unfortunately

Contributor:

I think this part is possible because in case filters[i] is None, we can assign a lambda function like filters[i] = lambda x: x, and then just use it rather than checking if filters[i] is None
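A minimal sketch of that suggestion (hypothetical helper, not code from this PR): replace every missing filter with the identity up front, so later code never branches on None.

```python
from typing import Callable, List, Optional

def materialize_filters(
    coordinate_filters: Optional[List[Optional[Callable]]], num_coords: int
) -> List[Callable]:
    """Return one callable per coordinate, substituting the identity for None.

    Illustrative name and signature; not black-it's actual API.
    """
    if coordinate_filters is None:
        coordinate_filters = [None] * num_coords
    # every None becomes lambda x: x, so callers can always just call filters[i]
    return [f if f is not None else (lambda x: x) for f in coordinate_filters]

filters = materialize_filters([abs, None], num_coords=2)
```

With this, the loop body becomes unconditionally `filters[i](series)`.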

time_series: NDArray[np.float64], lamb: float = 1600
) -> Tuple[NDArray[np.float64], NDArray[np.float64]]:
"""
Apply the HP filter to a time series.
Contributor:

Might be a good idea to put a bibliographic reference here?

Contributor Author:

Yes, thanks
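For reference, a self-contained dense-matrix sketch of the Hodrick-Prescott filter (not the implementation added in this PR): the trend tau minimizes ||y - tau||^2 + lamb * ||D2 tau||^2, where D2 is the second-difference operator, giving the linear system (I + lamb * D2'D2) tau = y.

```python
from typing import Tuple
import numpy as np

def hp_filter(
    time_series: np.ndarray, lamb: float = 1600.0
) -> Tuple[np.ndarray, np.ndarray]:
    """Return (cycle, trend) of a 1-D series via the Hodrick-Prescott filter.

    Dense solve, fine for short series; a sparse/banded solver is preferable
    for long ones. Sketch only, not black-it's implementation.
    """
    y = np.asarray(time_series, dtype=float)
    n = len(y)
    # second-difference operator of shape (n - 2, n)
    d2 = np.zeros((n - 2, n))
    for i in range(n - 2):
        d2[i, i : i + 3] = [1.0, -2.0, 1.0]
    # (I + lamb * D2'D2) trend = y
    trend = np.linalg.solve(np.eye(n) + lamb * d2.T @ d2, y)
    return y - trend, trend

# a perfectly linear series has zero second differences, so trend == series
cycle, trend = hp_filter(np.linspace(0.0, 9.0, 10), lamb=1600.0)
```

statsmodels also ships this as `statsmodels.tsa.filters.hp_filter.hpfilter` if a tested implementation is preferred.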

def __init__(
self,
coordinate_weights: Optional[NDArray] = None,
coordinate_filters: Optional[List[Optional[Callable]]] = None,
Contributor @marcofavoritobi commented Nov 21, 2022:

I guess we could have three behaviours of this parameter:

  1. None -> the identity function
  2. callable -> a filter that will be applied to every dimension
  3. list of callables/None -> a filter specified for each dimension

1 and 3 are in place; if useful, we could add 2 (even after the PR is merged)
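Behaviour 2 could be layered on top of 1 and 3 with a small normalizer (hypothetical helper, not implemented in the PR):

```python
from typing import Callable, List, Optional, Union

FilterSpec = Optional[Union[Callable, List[Optional[Callable]]]]

def expand_filter_spec(spec: FilterSpec, num_coords: int) -> List[Optional[Callable]]:
    """Expand the three proposed input forms into one Optional[Callable] per coordinate.

    Illustrative sketch of behaviours 1-3 above; names are not black-it's API.
    """
    if spec is None:
        # 1. no filtering anywhere
        return [None] * num_coords
    if callable(spec):
        # 2. one filter broadcast to every dimension
        return [spec] * num_coords
    # 3. per-dimension list, validated against the number of coordinates
    if len(spec) != num_coords:
        raise ValueError("need one filter (or None) per coordinate")
    return list(spec)

per_dim = expand_filter_spec(abs, num_coords=3)
```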

Contributor Author:

Thanks this might be a good idea for a future PR if we find it to be potentially useful. For the moment I would not implement it though

Contributor:

Another idea we talked about is to use dictionaries rather than lists.

For example, say we have 10 dimensions, but we want to apply a filter for the 3rd, the 5th and the 7th dimensions. With the current interface we had to do something like:

[None] * 2 + [filter_for_3rd] + [None] + [filter_for_5th] + [None] + [filter_for_7th]  + [None] * 3

Instead, if we used dictionaries with dimension indexes as keys, we could do:

{
  2: filter_for_3rd,  # (or 3?)
  4: filter_for_5th,  # (or 5?)
  6: filter_for_7th,  # (or 7?)
}
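The dictionary form would be easy to convert to the current list interface (sketch; 0-based dimension keys assumed, `abs` stands in for the hypothetical filter_for_* callables):

```python
from typing import Callable, Dict, List, Optional

def filters_dict_to_list(
    filters_by_dim: Dict[int, Callable], num_coords: int
) -> List[Optional[Callable]]:
    """Turn a sparse {dimension_index: filter} mapping into the dense list form.

    Dimensions absent from the dict get None (no filter).
    """
    return [filters_by_dim.get(i) for i in range(num_coords)]

# filters only on the 3rd, 5th and 7th of 10 dimensions
demo = filters_dict_to_list({2: abs, 4: abs, 6: abs}, num_coords=10)
```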

@AldoGl AldoGl force-pushed the add_filters_in_loss branch from 2ff997d to 5e7b924 Compare November 21, 2022 09:46
AldoGl (Contributor, Author) commented Nov 21, 2022

Thank you @muxator for your comment on the correction of the hp_filter docstring. I implemented that too


References:
Hodrick, R.J, and E. C. Prescott. 1980. "Postwar U.S. Business Cycles: An Empirical Investigation."
Carnegie Mellon University discussion paper no. 451`.
Contributor @marcofavoritobi commented Nov 21, 2022:

Suggested change
Carnegie Mellon University discussion paper no. 451`.
Carnegie Mellon University discussion paper no. 451.

🙂 (I am unbearably picky, I know.)

Contributor Author:

Well spotted!!

@AldoGl AldoGl force-pushed the add_filters_in_loss branch 5 times, most recently from 63cb92b to fe2d416 Compare November 21, 2022 13:30
Commits:

  • I add the possibility of adding a list of callables to act on each dimension of the simulated time series before the loss is evaluated. I also add a specific HP filter in the utils/time_series.py module. Finally I add a test for the new feature
  • fix (×3)
  • add filters before loss computation (option2: None instead of lambda function)
  • fix (×7)
@AldoGl AldoGl force-pushed the add_filters_in_loss branch from fe2d416 to 16db99d Compare November 21, 2022 13:37
AldoGl (Contributor, Author) commented Nov 21, 2022

Thank you @marcofavoritobi, I am closing this one

@AldoGl AldoGl merged commit ac464f5 into main Nov 21, 2022
@AldoGl AldoGl deleted the add_filters_in_loss branch December 5, 2022 10:07
3 participants