AarushSah/README.md

Evals Are All You Need

Pinned

  1. Set_Eval Public

    A novel benchmark for probing the visual reasoning capabilities of large language models

    Python 2

  2. eris-eval Public

    LLM evaluation framework that assesses model performance through simulated debates

    Python 1

  3. prompt-optimizer Public

    Automates the process of prompt engineering using Anthropic's Claude language model.

    Python 66 7

  4. LLM-PCI Public

    Project Injector for Long-Context LLMs

    Python 4 1

  5. BookTrailers Public

    Easy way to create book trailers for libraries. Powered by Alpaca.

    Python 2

  6. llm-file-categorizer Public

    Folder sorter powered by Claude 3 Haiku and Opus

    Python