Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add probs methods to discrete distributions #305

Merged
merged 7 commits into from
Nov 8, 2014
Merged

Add probs methods to discrete distributions #305

merged 7 commits into from
Nov 8, 2014

Conversation

lindahua
Copy link
Contributor

@lindahua lindahua commented Nov 8, 2014

Computing pmf over a contiguous range can take advantage of the recursive relations in the pmf formula, and thus can be much more efficient than computing them individually using pdf.

Hence, I introduce the probs function for discrete distribution, with the following semantics:

probs(d, a:b)   # return a vector of probabilities corresponding to each value in the range a:b
probs(d)   # only for bounded distributions, equivalent to probs(minimum(d):maximum(d))

This function can be used to get an entire probability vector efficiently.

@lindahua lindahua added this to the v0.6 milestone Nov 8, 2014
@coveralls
Copy link

Coverage Status

Coverage decreased (-0.04%) when pulling 1efccbe on dh/probs into 67d5eae on master.

lindahua added a commit that referenced this pull request Nov 8, 2014
Add probs methods to discrete distributions
@lindahua lindahua merged commit 0792898 into master Nov 8, 2014
@coveralls
Copy link

Coverage Status

Coverage decreased (-0.04%) when pulling 1efccbe on dh/probs into 67d5eae on master.

@StefanKarpinski
Copy link

I'm wondering if this couldn't reasonably be additional methods of the pdf function.

@kmsquire
Copy link
Contributor

kmsquire commented Nov 8, 2014

I was thinking the same thing as Stefan.

@lindahua
Copy link
Contributor Author

lindahua commented Nov 8, 2014

This idea of using pdf did came to my mind but somehow I feel that pdf(d) producing a probability vector does not look very natural.

If the consensus is that it is ok to have pdf(d) yield a probability vector (for finite distributions), I can change it to pdf. I am pretty open to this.

@kmsquire
Copy link
Contributor

kmsquire commented Nov 8, 2014

To me this just seems like a vectorized version of the pdf function, which
seems fine to me.

On Saturday, November 8, 2014, Dahua Lin [email protected] wrote:

This idea of using pdf did came to my mind but somehow I feel that pdf(d)
producing a probability vector does not look very natural.

If the consensus is that it is ok to have pdf(d) yield a probability
vector (for finite distributions), I can change it to pdf. I am pretty
open to this.


Reply to this email directly or view it on GitHub
#305 (comment)
.

@StefanKarpinski
Copy link

The pdf(d) form does seem a little odd – and there's the issue of what, if anything, it should do for continuous distributions – but I think once you see it, it makes sense, and you'd have to look up probs anyway, so there seems to be little point in having an additional generic function for this.

@johnmyleswhite
Copy link
Member

Since this special case would only make sense for discrete distributions, perhaps we should use pmf, which I believe we already support for evaluation at a point.

@lindahua
Copy link
Contributor Author

lindahua commented Nov 9, 2014

Based on the feedback, I think I can take the following steps at this point:

  1. Use pdf(d, a:b) instead of probs(d, a:b) (for this case, their semantics are essentially the same), but using an efficient implementation that takes advantage of the relation between pdf(x) and pdf(x + 1).
  2. I will keep probs(d) as it is for now. This method was originally only for Categorical, Multinomial and MixtureModel to retrieve the associated probability vector. However, other bounded discrete distributions can be considered as a special case of Categorical distribution, and as I extend the probs method for them.

Whereas for Categorical, probs is the same as pdf. However, this is no longer the case for Multinomial and MixtureModel.

@lindahua
Copy link
Contributor Author

lindahua commented Nov 9, 2014

Think about it all, I lean more towards just using pdf.

Whereas it might be too cute to have pdf(d), however, it doesn't seem that it will cause ambiguity or confusion.

And probs(d) is only reserved for the case where there is a probability vector as a parameter and one can use probs(d) to retrieve that parameter. This applies to distributions like Categorical, Multinomial, and MixtureModel. In such cases probs are not necessarily the same as pdf.

@lindahua
Copy link
Contributor Author

As of v0.6.1, we are now using pdf in all these cases.

@ararslan ararslan deleted the dh/probs branch September 29, 2016 18:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants