GitHub - manishshettym/codescholar: codescholar: growing programs graphs idiomatically for API usage examples

While APIs have become a pervasive component of software, a core challenge for developers is to identify and use existing APIs. This warrants either a deep understanding of the API landscape or access to high-quality documentation and usage examples. While the former is infeasible, the latter is often limited in practice.

CodeScholar (📝 Paper: Preprint) is a tool that generates idiomatic code examples for query APIs (single and multiple). It finds idiomatic examples for APIs by searching a large corpus of code and growing program graphs idiomatically, guided by a neural model.

python search.py --dataset <dataset_name> --seed json.load

Key Aspects of CodeScholar

🔥 Fast neural-guided search over graphs.
🧠 Idiomatic code generation by graph growing for representative examples.
🪢 Single and Multi-API support, and easily extensible to new APIs.
🚀 Streamlit app for interactive search.

How to install CodeScholar:

# clone the repository
git clone git@github.com:tart-proj/codescholar.git

# cd into the codescholar directory
cd codescholar

# install basic requirements
pip install -r requirements-dev.txt

# install pytorch-geometric requirements; replace <variant> with pyg (GPU) or torch (CPU)
pip install -r requirements-<variant>.txt

# install codescholar
pip install -e .

How to use CodeScholar:

Starting services

./services.sh start

what does this do?

# start an elasticsearch server (hosts programs) in a tmux session
docker run --rm -p 9200:9200 -p 9300:9300 -e "xpack.security.enabled=false" -e "discovery.type=single-node" docker.elastic.co/elasticsearch/elasticsearch:8.7.0

# start a redis server (hosts embeddings)
docker run --rm -p 6379:6379 redis
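Before indexing, it can help to confirm that both containers are accepting connections. The following is a minimal sketch, not part of CodeScholar: it only assumes the two host ports published by the docker commands above (9200 for Elasticsearch, 6379 for Redis) and that the services run on localhost. The function names are hypothetical.

```python
import socket

# Ports taken from the docker commands above; the names are labels only.
SERVICE_PORTS = {"elasticsearch": 9200, "redis": 6379}


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_services(host: str = "localhost") -> dict:
    """Map each service name to whether its port accepts connections."""
    return {name: is_port_open(host, port) for name, port in SERVICE_PORTS.items()}


if __name__ == "__main__":
    for name, up in check_services().items():
        print(f"{name}: {'reachable' if up else 'not reachable'}")
```

An open port only shows that something is listening; it does not prove that the service has finished starting up.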

Indexing

./services.sh index <dataset_name>

what does this do?

# index the dataset using /search/elastic_search.py
cd codescholar/search
python elastic_search.py --dataset <dataset_name>

TODO: index all embeddings into redis; currently, indexing happens before each search

Searching

# run the codescholar query (say np.mean) using /search/search.py
python search.py --dataset <dataset_name> --seed np.mean

You can also use some arguments with the search query:

--min_idiom_size <int> # minimum size of idioms to be saved
--max_idiom_size <int> # maximum size of idioms to be saved
--max_init_beams <int> # maximum beams to initialize search
--stop_at_equilibrium  # stop search when diversity = reusability of idioms

note: see more configurations in /search/search_config.py
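When running several searches with different settings, the command line can be assembled programmatically. The sketch below is illustrative only: it uses just the flags listed above, the helper name `build_search_command` is hypothetical, and the dataset name and numeric values in the example are placeholders rather than recommended settings.

```python
import shlex


def build_search_command(dataset, seed, min_idiom_size=None, max_idiom_size=None,
                         max_init_beams=None, stop_at_equilibrium=False):
    """Build the argv list for search.py from the options documented above."""
    cmd = ["python", "search.py", "--dataset", dataset, "--seed", seed]
    # Only pass the optional integer flags that were actually set.
    for flag, value in (("--min_idiom_size", min_idiom_size),
                        ("--max_idiom_size", max_idiom_size),
                        ("--max_init_beams", max_init_beams)):
        if value is not None:
            cmd += [flag, str(value)]
    # Boolean switch: present or absent, it takes no value.
    if stop_at_equilibrium:
        cmd.append("--stop_at_equilibrium")
    return cmd


if __name__ == "__main__":
    # "my_dataset" and the sizes below are placeholder values.
    cmd = build_search_command("my_dataset", "np.mean",
                               min_idiom_size=2, max_idiom_size=10,
                               stop_at_equilibrium=True)
    print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the directory that contains search.py.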

How to run CodeScholar App:

Setup services

./services.sh start
./services.sh index <dataset_name>

Start server and application

cd codescholar/apps

./app.sh start

what does this do?

# start a celery backend to handle tasks asynchronously
celery -A app_decl.celery worker --pool=solo --loglevel=info

# start a flask server to handle http API requests
# note: runs flask on port 3003
python flask_app.py

You can now make API requests to the flask server. For example, to search for idioms of size 10 for pd.merge, run:

curl -X POST -H "Content-Type: application/json" -d '{"api": "pd.merge", "size": 10}' http://localhost:3003/search
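The same request can be made from Python using only the standard library. This is a sketch under assumptions: the URL, payload keys, and header come from the curl command above, the function names are hypothetical, and because the response format is not documented here the body is returned as raw text.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
SEARCH_URL = "http://localhost:3003/search"


def build_search_request(api: str, size: int) -> urllib.request.Request:
    """Build (but do not send) the POST request for an idiom search."""
    payload = json.dumps({"api": api, "size": size}).encode("utf-8")
    return urllib.request.Request(
        SEARCH_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_search_request(api: str, size: int, timeout: float = 30.0) -> str:
    """Send the request and return the raw response body as text."""
    request = build_search_request(api, size)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Building the request needs no running server; sending it does.
    request = build_search_request("pd.merge", 10)
    print(request.get_method(), request.full_url, request.data.decode("utf-8"))
```

Calling `send_search_request("pd.merge", 10)` requires the flask server and the celery worker from the previous step to be running.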

Finally,

# start the streamlit app on localhost:8501
streamlit run streamlit_app.py

View details about the app using: ./app.sh show

How to train CodeScholar:

Refer to the training README for a detailed description of how to train CodeScholar.

Reproducibility of CodeScholar Evaluation:

Refer to the evaluation README for a detailed description of how to reproduce the evaluation results reported in the paper.

