Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

jams.Annotation.trim() doesn't include events with 0 duration that's on the trim boundary #203

Closed
tomxi opened this issue Aug 22, 2019 · 4 comments · Fixed by #204
Closed
Labels

Comments

@tomxi
Copy link
Member

tomxi commented Aug 22, 2019

This issue is about jams.Annotation.trim()

Reproduciable code is below:

import jams
print(jams.__version__)

ann = jams.Annotation(namespace='beat', duration=4)
ann.append(time=0, duration=0, value=1)
ann.append(time=1, duration=0, value=2)
ann.append(time=2, duration=0, value=3)

print('----------')
for obs in ann.trim(0, 2.1, strict=False):
    print(obs)
    
print('----------')
for obs in ann.trim(0, 2, strict=False):
    print(obs)
    
print('----------')
for obs in ann.trim(-1, 1.9, strict=False):
    print(obs)

Current Output:

0.3.4
----------
Observation(time=1.0, duration=0.0, value=2, confidence=None)
Observation(time=2.0, duration=0.0, value=3, confidence=None)
----------
Observation(time=1.0, duration=0.0, value=2, confidence=None)
----------
Observation(time=1.0, duration=0.0, value=2, confidence=None)

Desired Output:

0.3.4
----------
Observation(time=0.0, duration=0.0, value=1, confidence=None)
Observation(time=1.0, duration=0.0, value=2, confidence=None)
Observation(time=2.0, duration=0.0, value=3, confidence=None)
----------
Observation(time=0.0, duration=0.0, value=1, confidence=None)
Observation(time=1.0, duration=0.0, value=2, confidence=None)
Observation(time=2.0, duration=0.0, value=3, confidence=None)
----------
Observation(time=0.0, duration=0.0, value=1, confidence=None)
Observation(time=1.0, duration=0.0, value=2, confidence=None)

I took a look at the source code, and I think it's down to maybe adding some equal signs in this line:

if obs_start < trim_end and obs_end > trim_start:

But I'm not sure if it will break other things.

Is this intended behavior? Right now it's impossible to slice or trim out the first beat of this annotation.

@bmcfee
Copy link
Contributor

bmcfee commented Aug 22, 2019

Oy this is a delicate one. It ultimately comes down to the semantics around time intervals. I've long advocated for a half-open interpretation [t, t+d), but I don't recall if we ever made that explicit. (Or maybe @justinsalamon had arguments for closed intervals. 🤷‍♂️ )

I think you've highlighted the right part of the code for fixing this, but I'm not 100% clear on what the correct logic should be here. Maybe making both sides of the comparison include the boundary conditions?

@tomxi
Copy link
Member Author

tomxi commented Aug 22, 2019

I would lean slightly toward a [t, t+d) interpretation.

In any case, I think annotations that happen exactly at start_time of the trim should be definitely be included.
The current behavior is surprising to me (and potentially other users), in that my first beat of the dummy_annotation is always missing no matter how I slice it.

@bmcfee
Copy link
Contributor

bmcfee commented Aug 26, 2019

I would lean slightly toward a [t, t+d) interpretation.

If we take that interpretation, then the semantics of trim should be that any observation starting at or after end_time would be discarded. I think this is reasonable, as any such observation would have duration 0 after trimming anyway.

The only wrinkle that I can see (and I think this is where @justinsalamon was coming from previously) is in the interpretation of densely sampled annotations, eg, f0 contours. These annotations typically have duration 0 for each observation already, so trimming to duration 0 doesn't necessarily introduce any new errors here.

To reheat an old argument, the problem with duration-0 observations is that they're inconsistent with the half-open interval property: [t, t+0) is the empty set, and by definition doesn't actually cover any time instant. Closed intervals [t, t+0] would cover exactly time t, but then it becomes ambiguous how to interpret an annotation like [t, t+d], [t+d, t+2d] because time t+d belongs to both intervals. This isn't always wrong, since intervals can overlap in general, but it provides no way to encode disjoint partitions of time. The argument for 0-duration being special is that it should correspond to uniformly sampled signals (like f0 contours) where we don't want to imply piecewise constancy between samples. This is basically the compromise that we struck when standardizing the observation schema.

One way that we could reconcile this is to say that non-0 intervals are always interpreted as half-open, and that includes annotation start/duration. So when we trim an annotation, it's slicing down to a given interval [start, end), which implies discarding observations from the ranges [end, +inf) or [-inf, start).

@bmcfee
Copy link
Contributor

bmcfee commented Aug 26, 2019

Going back to @tomxi 's original example:

ann = jams.Annotation(namespace='beat', duration=4)
ann.append(time=0, duration=0, value=1)
ann.append(time=1, duration=0, value=2)
ann.append(time=2, duration=0, value=3)

print('----------')
for obs in ann.trim(0, 2.1, strict=False):
print(obs)

All observations have start time 0 <= time < 2.1, so should be included. The fact that the first one fails is a mistake.

print('----------')
for obs in ann.trim(0, 2, strict=False):
print(obs)

Only the first two observations are in the trim interval, the last one starts at time 2 >= trim_end and should not be included.

print('----------')
for obs in ann.trim(-1, 1.9, strict=False):
print(obs)

Again, this should only include the first two observations.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants