Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implemented Remote, Search within State or Province, and Search within Country #113

Closed
wants to merge 4 commits into from

Conversation

LilySu
Copy link
Contributor

@LilySu LilySu commented Oct 3, 2020

Added the options of searching for remote within locale / country. Also added options to search within a state or province and search within a country supposing one is open to relocation.

Description

  • Allowed 'city' and 'province_or_state' to be Null - this allows searching country-wide, or if city is not Null but province_or_state is Null, search will be for within the indicated province or state.
  • Added 'remote_within_country' in settings.yaml to specifically distinguish between searching by remote vs searching by region when province_or_state and/or city are Null.

Example cases in my_settings.yaml:
To search entire country if one is willing to relocate anywhere:
province_or_state: Null
city: Null
remote_within_country: False
To search within entire province or state (radius will not be accounted):
province_or_state: (ie. Scotland)
city: Null
remote_within_country: False
To search entire country for remote roles:
province_or_state: Null
city: Null
remote_within_country: True

Please note: CSV entries organically log the location of the user who posted the job listing, so even when a job is remote, a CSV entry for location still exists, thus '' for entry is unnecessary.

  • To implement remote for Indeed, location is set to remote.
  • No such option exists for Monster, but many jobs have the keyword remote in its listing, so setting remote involves adding the keyword remote in search queries.
  • Monster also has in the past prompted people to search by the keywords 'Work From Home.' However, 'Work From Home' in addition to other keyword queries has returned little results compared to adding the keyword 'remote.'

Please note: Even though 'remote_within_country: ' is set to True, it is not guaranteed within each platform that all jobs will be remote due to promoted listings.

Context of change

Please add options that are relevant and mark any boxes that apply.

  • Software (software that runs on the PC)
  • Library (library that runs on the PC)
  • Tool (tool that assists coding development)
  • Other

Type of change

Please mark any boxes that apply.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

I tried variations for each locale where remote is enable and Null is indicated to city or province_or_state

Checklist:

Please mark any boxes that have been completed.

  • I have performed a self-review of my own code.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have made corresponding changes to the documentation.
  • My changes generate no new warnings.
  • I have added tests that prove my fix is effective or that my feature works.
  • New and existing unit tests pass locally with my changes.
  • Any dependent changes have been merged and published in downstream modules.

Copy link
Collaborator

@markkvdb markkvdb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for providing the groundwork of adding more advanced remote support. The current job landscape clearly sees an increasing demand for this 😃.

However, I do think some changes are required to make the new remoteness functionality in the production version. Please have a look at my comments.

Maybe it would also be easier to make changes after discussing what exactly the philosophy should be for the remoteness function without making it overly complicated. For example, we could change the input for the location to something like location = "Boston, MA" instead of city = "Boston", province_or_state="MA". The latter way is rather inflexible and makes things cumbersome if we want to add remoteness support. With a simple location we could for example just say location = "USA" or location="France". By doing it like this we could simply add remote support with a simple flag and we could leave out remote_in_country, since location would already imply the scope of the search.

@@ -151,18 +151,28 @@ def parse_cli(args: List[str]) -> Dict[str, Any]:
search_group.add_argument(
'-ps',
dest='search.province_or_state',
type=str,
type=None,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to make this an Optional[str] since we want to make sure this is a string if the user provides a value.

)

search_group.add_argument(
'-remote_in_ctry',
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe use --remote-in-country.

Comment on lines +263 to +288
search_url = 'https://www.monster.{}/jobs/search/?{}q={}'.format(
self.config.search_config.domain,
f'page={page}&' if page > 1 else '',
self.query,
self.config.search_config.city.replace(' ', '-'),
)
# search countrywide supposing one is open to relocating within
# the country (includes remote)
if self.config.search_config.city is None and \
self.config.search_config.province_or_state is None and \
self.config.search_config.remote_within_country is False:
return search_url
# search remote within the country
elif self.config.search_config.remote_within_country is True:
return search_url + '-remote'
# search by state or province, does not factor in city or radius
elif self.config.search_config.city is None and \
self.config.search_config.province_or_state and \
self.config.search_config.remote_within_country is False:
return search_url + '&where={}'.format(
self.config.search_config.province_or_state,
self._convert_radius(self.config.search_config.radius)
)
return (
search_url + '&where={}__2C-{}&rad={}'.format(
self.config.search_config.city.replace(' ', '-'),
self.config.search_config.province_or_state,
self._convert_radius(self.config.search_config.radius)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this might overcomplicate things at this stage. The config builder should process all user provided input from the CLI and YAML and create a valid configuration. This configuration should then be directly processed by the scrapers.

The potential issue I see with complicating the logic for setting up the URL is that it is a lot harder to extend the scrapers with different LOCALES. Right now, we try to optimise the scraper classes such that it is easy to support more LOCALES.

Comment on lines +384 to +404
)
# search countrywide supposing one is open to relocating within
# the country (includes remote)
if self.config.search_config.city is None and \
self.config.search_config.province_or_state is None and \
self.config.search_config.remote_within_country is False:
return search_url
# search remote within the country
elif self.config.search_config.remote_within_country is True:
return search_url + '-remote'
# search by state or province, does not factor in city or radius
elif self.config.search_config.city is None and \
self.config.search_config.province_or_state and \
self.config.search_config.remote_within_country is False:
return search_url + '&where={}'.format(
self.config.search_config.province_or_state,
)
return (
search_url + '&where={}&rad={}'.format(
self.config.search_config.city.replace(' ', '-'),
self._convert_radius(self.config.search_config.radius)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See previous comment

@PaulMcInnis PaulMcInnis added this to the 3.0.1 milestone Oct 6, 2020
@PaulMcInnis PaulMcInnis linked an issue Oct 6, 2020 that may be closed by this pull request
Copy link
Owner

@PaulMcInnis PaulMcInnis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Getting there, I think however that we can simplify the cli and yaml for this feature, in addition to the URL construction logic.

required=False,
)

search_group.add_argument(
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a bit confused by the remote-in-city and remote-in-country arguments, why can't we simply provide a --remote argument?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just tried to make this part correspond 1:1 with the config.yaml.
If a city is indicated and remote is set, then remote will override the search.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it doesn't make sense to me why we need this, since if a user provides a country but no city with --remote it would have the intended effect no?

self.config.search_config.province_or_state.upper(),
self.max_results_per_page,
int(self.config.search_config.return_similar_results),
REMOTENESS_TO_QUERY[self.config.search_config.remoteness]
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would make more sense for us to construct the URL in an additive way vs this if/else and return method. This way we can easily add or remove portions and update to using a URL-construction library (Which helps with constructing the URL string from tags).

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i.e.

if search_config.province_or_state: 
    search_url += "...."

province_or_state: "Texas"
# NOTE: this is the full city / town name, but can be 'Null', with
# province_or_state set to 'Null' to search country-wide.
city: "Richardson"
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way the schema has been implemented, we can simply not provide province_or_state / city entirely and they will be set to the default value in the schema, which should be None.

i.e. user shouldn't be setting this to a Null string

@PaulMcInnis PaulMcInnis modified the milestones: 3.0.1, 3.0.2 Oct 11, 2020
@PaulMcInnis
Copy link
Owner

Closing this for now due to inactivity, 100% open to revisiting this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improved remote job search support
3 participants