Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add more place classification #49

Merged
merged 2 commits into from
Jun 18, 2019
Merged

Add more place classification #49

merged 2 commits into from
Jun 18, 2019

Conversation

Joxit
Copy link
Member

@Joxit Joxit commented Jun 11, 2019

Background

When I tried a acceptance test on the API with the pelias_parser branch, I found some regressions. One of them was University of Hawaii at Hilo. I found that this was not classified as place, so I tried to added it.

In my first try, I added the correct classification in classifier/scheme/place.js but nothing happend. So I investigate a bit more and I found an exception on span containing a StopWordClassification in the WhosOnFirstClassifier.

What to do next ?

In order to limit the side effects, I preferred to add a penalty instead of completely removing this exception. With this, no test fails.

The parsing of University of Hawaii at Hilo is not at 100% correct, but it's better :)

If I completely remove the exception, 2 tests are failing, Rue de Paris and Avenue de Sainte Rose de Lima which should be streets.

I put this in standby at the moment in case it is necessary to keep the exception.
I am open to discussion / review 😄

@Joxit Joxit requested a review from missinglink June 11, 2019 13:59
@missinglink
Copy link
Member

hmmm yea, I would have preferred not to have that logic in the WhosOnFirstClassifier.
I think it makes the code harder to debug and reason about why it is behaving the way it is.

I'm also not sure how I feel about confidence = confidence / 2, I was hoping that we could avoid modifying existing tokens and existing classifications but maybe that's 'wishful thinking'.

I'd like to try and refactor some of this 'smart' code to be clearer and more predictable, and try to be less clever if there is a more transparent way of doing the same thing.

I also had a look at the University of Hawaii at Hilo test case and gave up ;)
I don't think we should merge the test case in this PR because it's incorrect, I'd prefer to leave it commented out than commit incorrect tests.

So yea 🤷‍♂ I don't know how to get that venue to parse correctly yet, maybe we just think about it a bit more and a simpler solution will present itself?

@Joxit Joxit force-pushed the joxit/place-classification branch from f09f9c2 to 20fa3c0 Compare June 12, 2019 14:58
@Joxit
Copy link
Member Author

Joxit commented Jun 12, 2019

confidence = confidence / 2 is totally random, I thought maybe we could decrease confidence progressively. It's bit like what is done in Street*Classifier, but instead of static values we have dynamic ones (which can be in fact replaced by 0.5).

Okay, I commented the test in the hope that it will be solved one day 🤞.

The addressParts.map piece of code is awesome ! 😆

Copy link
Member

@missinglink missinglink left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@Joxit Joxit marked this pull request as ready for review June 18, 2019 09:00
@Joxit Joxit merged commit 3dfd16b into master Jun 18, 2019
@Joxit Joxit deleted the joxit/place-classification branch June 18, 2019 09:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants