Skip to content

This repository contains a test set designed for intent classification within the Human Resources domain.

License

Notifications You must be signed in to change notification settings

Avature/intent-classification

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 

Repository files navigation

Repository for Intent Classification

This repository contains a test set designed for intent classification within the Human Resources domain.

The shared test set encompasses 34 different intents, meticulously identified by experts through iterative analysis of sample conversations in English. The data is stored in JSON format, with a dictionary mapping each intent to its corresponding samples. Company names are anonymized and replaced by the placeholder [company_name].

The procedure involved the collaboration of two annotators, achieving an inter-annotator agreement of 92% between them. In instances of disagreement, an adjudicator was consulted to render the final decision.

We also provide a course-grained test set, which is the result collapsing the 34 intents to just 8 labels by following a set of custom transformation rules.

General Statistics

This section provides statistics on the test set, including the distribution of samples per intent.

For the fine-grained test set:

{
  "global_metrics": {
    "total_intents": 34,
    "total_samples": 1199,
    "total_intents_with_entities": 0,
    "total_entities": 0,
    "intent_to_samples_dist": {
      "faq_recruiting_process": 62,
      "faq_job_requirements": 58,
      "faq_company_culture": 57,
      "faq_remote_work_policy": 56,
      "faq_salary_conversation": 56,
      "talk_to_human": 47,
      "faq_about_company": 46,
      "faq_pto_policy": 46,
      "user_wants_job_recommendations": 45,
      "appreciation": 45,
      "faq_company_teams": 43,
      "user_frustration": 40,
      "faq_company_size": 39,
      "faq_company_benefits": 37,
      "bot_capabilities": 36,
      "faq_diversity_inclusion_policy": 36,
      "faq_company_leadership": 36,
      "faq_company_location": 33,
      "faq_about_company_vision": 32,
      "faq_maternity_paternity_policy": 30,
      "faq_career_development": 29,
      "deny": 28,
      "faq_office_amenities": 28,
      "bot_challenge": 27,
      "faq_company_application_reasons": 26,
      "faq_company_values": 26,
      "user_wants_to_escape_form": 26,
      "faq_solutions_offered": 25,
      "bot_languages": 24,
      "bot_mood": 20,
      "affirm": 17,
      "ask_name": 15,
      "goodbye": 15,
      "greet": 13
    }
  },
  "sample_to_dup_intents": {},
  "intent_to_dup_samples": {}
}

For the course-grained test set:

{
  "global_metrics": {
    "total_intents": 8,
    "total_samples": 1199,
    "total_intents_with_entities": 0,
    "total_entities": 0,
    "intent_to_samples_dist": {
      "faq": 743,
      "small_talk": 235,
      "job_question": 58,
      "talk_to_human": 47,
      "job_search": 45,
      "negative_ack": 28,
      "exit_form": 26,
      "positive_ack": 17
    }
  },
  "sample_to_dup_intents": {},
  "intent_to_dup_samples": {}
}

Reference

This data correspond to the paper:

Intent Classification Methods for Human Resources Chatbots

Lucas Alvarez Lacasa, Martín Dévora-Pajares, Rabih Zbib and Hermenegildo Fabregat.

To be presented at the 40th International Conference of the Spanish Society for Natural Language Processing and to appear at the journal Procesamiento del Lenguaje Natural, September 2023.

If you use these data, please include the following reference:

@article{lacasa2024intentclassification,
  title={Intent Classification Methods for Human Resources Chatbots},
  author={Lucas Alvarez Lacasa, Martín D{\'\e}vora-Pajares, Rabih Zbib, Hermenegildo Fabregat},
  journal={Procesamiento del Lenguaje Natural},
  number={73},
  year={2024},
  publisher={Sociedad Espa{\~n}ola para el Procesamiento del Lenguaje Natural}
}

About

This repository contains a test set designed for intent classification within the Human Resources domain.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published