[Roadmap] Mutahunter Roadmap #5

Open
14 tasks
jungs1 opened this issue Jul 18, 2024 · 0 comments
Labels
help wanted Extra attention is needed misc

Comments

@jungs1
Contributor

jungs1 commented Jul 18, 2024

Anything to discuss about improving Mutation Testing

This document outlines the features in Mutahunter's roadmap for Q3 2024. We welcome discussions and contributions, as this roadmap is shaped by the Mutahunter community. Several papers, including the ones cited in the sections below, have influenced our direction and highlight the potential of LLM-based mutation testing.

Vision

As an engineer, I find the current state of mutation testing to be frustratingly limited. Traditional methods rely heavily on brute force and human intervention to identify semantic gaps. I believe that leveraging Large Language Models (LLMs) can significantly automate and improve this process, moving beyond mutation scores to provide deeper, automated insights into code quality.

Themes

We categorized our roadmap into 5 broad themes:

  1. Broad Model Support 🧠
  2. Cost Optimization 💰
  3. Performance Optimization 🚀
  4. Production-Level Features 🏭
  5. Strong OSS Community 🌍

Broad Model Support

  • Support Pretrained Language Models:
    • Implement support for CodeBERT and other state-of-the-art language models.

Cost Optimization

  • Token Cost Optimization:
    • Develop strategies to reduce token costs when using GPT-based models.
    • Explore the use of BERT for generating cost-effective mutants. The paper μBERT: Mutation Testing using Pre-Trained Language Models shows promising results compared to traditional mutation testing. While BERT may not be as powerful as GPT, it is still context-aware and language-agnostic. According to the paper, μBERT produces higher-quality mutants than rule-based mutation testing, with higher coupling and better cost-effectiveness.

Performance Optimization

  • Speed Optimization: Batch multiple function blocks into a single GPT call when generating mutants, instead of making one call per block.
  • Test Case Prioritization: Develop methods to prioritize which tests to run against generated mutants.
  • Benchmarking Tool: Create a benchmarking tool to compare LLM-based and non-LLM-based mutation testing tools, evaluating metrics like mutation score, mutant count, cost, real bug detectability, and coupling rate. See An Exploratory Study on Using Large Language Models for Mutation Testing.
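To make the batching idea above concrete, here is a minimal sketch of greedily packing function blocks into batches that fit a token budget, so that each batch could be sent in one LLM call. Everything in it is an assumption for illustration: `estimate_tokens`, `batch_function_blocks`, and the four-characters-per-token heuristic are hypothetical and are not part of Mutahunter.

```python
from typing import List


def estimate_tokens(text: str) -> int:
    """Very rough token estimate (assumption: about 4 characters per token)."""
    return max(1, len(text) // 4)


def batch_function_blocks(blocks: List[str], max_tokens: int) -> List[List[str]]:
    """Greedily pack function blocks into batches that fit a token budget.

    A block larger than the budget still gets a batch of its own.
    """
    batches: List[List[str]] = []
    current: List[str] = []
    used = 0
    for block in blocks:
        cost = estimate_tokens(block)
        # Start a new batch when adding this block would exceed the budget.
        if current and used + cost > max_tokens:
            batches.append(current)
            current, used = [], 0
        current.append(block)
        used += cost
    if current:
        batches.append(current)
    return batches
```

Each returned batch would then be rendered into a single prompt. A real implementation would likely use the model's own tokenizer rather than a character-count heuristic.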

Test Generation

  • Line Coverage: Generate unit tests based on line coverage.
  • Mutation Coverage: Validate and improve unit tests based on mutation coverage.
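As a rough illustration of what "mutation coverage" measures, the sketch below computes the fraction of mutants killed by the test suite and lists the survivors that new tests would need to target. The `MutantResult` type and both function names are hypothetical and do not reflect Mutahunter's actual data model.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MutantResult:
    mutant_id: str
    killed: bool  # True if at least one test failed against this mutant


def mutation_coverage(results: List[MutantResult]) -> float:
    """Return killed mutants divided by total mutants (0.0 if there are none)."""
    if not results:
        return 0.0
    killed = sum(1 for r in results if r.killed)
    return killed / len(results)


def surviving_mutants(results: List[MutantResult]) -> List[str]:
    """Return the IDs of mutants that no test detected."""
    return [r.mutant_id for r in results if not r.killed]
```

Surviving mutants are the interesting output here: each one marks a behavior change that the current tests do not notice.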

Mutant Analysis

  • Advanced Analysis: Move beyond traditional mutation coverage and test strength metrics. While current efforts involve using LLMs to analyze surviving mutants for potential semantic gaps and direct feedback, we aim to develop more robust and comprehensive methods.
  • LLM-Driven Insights:
    • Utilize LLMs to analyze surviving mutants, identify semantic gaps, and detect potential bugs.
    • Map mutants to known classes of bugs, such as CVE entries.
    • Develop a better way to tell users what to fix, and why, based on the surviving mutants, improving on the current markdown format.
    • Provide a visual report on generated mutants, unified diffs, and failure points, enhancing the current JSON file format.
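For the unified diffs mentioned above, one possible starting point is Python's standard `difflib` module. This is only a sketch: the `mutant_diff` helper and the `example.py` path are made up for illustration, and nothing here describes how Mutahunter currently produces its reports.

```python
import difflib


def mutant_diff(original: str, mutated: str, path: str = "example.py") -> str:
    """Render the change introduced by a mutant as a unified diff."""
    diff = difflib.unified_diff(
        original.splitlines(keepends=True),
        mutated.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    return "".join(diff)


if __name__ == "__main__":
    original = "def is_adult(age):\n    return age >= 18\n"
    mutated = "def is_adult(age):\n    return age > 18\n"
    print(mutant_diff(original, mutated))
```

A visual report could render this text with syntax highlighting next to the names of the tests that did or did not fail.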

Production-Level Features

  • CI/CD Integration: Ensure seamless integration with CI/CD pipelines.

OSS Community

  • Documentation Enhancements: Improve documentation, explainers, tutorials, and examples.
  • Community Engagement:
    • Increase the number of examples and encourage bug reports.
    • Enhance the release process to minimize breaking changes.
  • Prompt Engineering:
    • Improve prompts for generating and analyzing mutants.
    • Seek contributions from prompt engineers to refine these prompts.
  • Active Discussions: I would love to see more and more people talking about this, bringing new ideas to the table, and engaging in the discussion. Your thoughts and discussions will shape how this project turns out.

What I Am Not Interested In

My primary focus is on enabling developers to use this tool easily during development. I want Mutahunter to be a tool that developers can use to find bugs or improve tests and then quickly move on with their development work. While running mutation testing overnight in CI/CD pipelines for QA teams, or manually analyzing surviving mutants, can be effective for certain companies, this is not the direction I am prioritizing.


If you have any suggestions or if there's a feature you'd like to see on the roadmap, please feel free to comment in this thread or open a feature request. This would be incredibly helpful!


@jungs1 jungs1 added help wanted Extra attention is needed misc labels Jul 18, 2024
@jungs1 jungs1 changed the title [Roadmap] Mutahunter Roadmap Q3 2024 [Roadmap] Mutahunter Roadmap Jul 18, 2024
@jungs1 jungs1 pinned this issue Jul 18, 2024