Build relationship graphs using an LLM in a Retrieval-Augmented Generation (RAG) framework with pgvector as the vector database

Overview

A tool to build relationship graphs using a large language model (LLM). It supports adding context to the query using Retrieval-Augmented Generation (RAG). The context is built from an internal knowledge base. Context embeddings are stored in and retrieved from a vector database. Relationships are stored in the same database.

Tool Features

Store context in the vector database.
Retrieve context from the vector database and supplement the query with it, improving the quality of the LLM response.
Along with the LLM response, visualize the relationships in the document(s) and highlight related documents and images.
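
To make the retrieve-and-supplement feature concrete, here is a minimal sketch of the idea in plain Python. It is not the code from this repository: the function names, the toy 3-dimensional vectors, and the prompt layout are all assumptions for illustration. In the tool, the embeddings come from an embedding model and the similarity search runs in pgvector.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_context(query_vec, store, k=2):
    """Return the k stored texts whose embeddings are most similar to the query."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


def build_prompt(question, context_chunks):
    """Prepend the retrieved context to the user question."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return f"Context:\n{context}\n\nQuestion: {question}"


# Toy (text, embedding) pairs; the vectors are made up for illustration.
store = [
    ("Cetirizine is a second generation antihistamine.", [0.9, 0.1, 0.0]),
    ("PostgreSQL is a relational database.", [0.0, 0.2, 0.9]),
    ("Levocetirizine is the enantiomer of cetirizine.", [0.8, 0.3, 0.1]),
]
query_vec = [1.0, 0.2, 0.0]  # stand-in for the embedded user question

chunks = retrieve_context(query_vec, store, k=2)
prompt = build_prompt("What is cetirizine used for?", chunks)
print(prompt)
```

The augmented prompt, rather than the bare question, is what would be sent to the LLM.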

Installation

Prerequisites

Python 3.10 or later
See requirements.txt for the required Python libraries

Supported Database

PostgreSQL. Supports PostgreSQL 11+. Tested on 14.10.

Vector Database

pgvector

Scripts

pgdb_setup.sh: Installs PostgreSQL 14.10 on Ubuntu.
pgvector.sql: Configures the PostgreSQL database as a vector database.
setup.sh: Installs the required Python packages and configures the vector database. Assumes the PostgreSQL database runs on the same host. Review the file before running it.
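
The contents of pgvector.sql are not reproduced here. As a rough sketch of what configuring PostgreSQL as a vector database usually involves, the Python helpers below generate typical pgvector statements: enabling the `vector` extension, creating a table with a `vector` column, and querying by cosine distance with the `<=>` operator. The table name `doc_chunks`, the column names, and the dimension 384 are assumptions, not values taken from this repository.

```python
def pgvector_setup_sql(table="doc_chunks", dim=384):
    """Return SQL that enables pgvector and creates an embeddings table.

    The table name and dimension are hypothetical; use the dimension of
    the embedding model that is actually configured.
    """
    return [
        "CREATE EXTENSION IF NOT EXISTS vector;",
        f"CREATE TABLE IF NOT EXISTS {table} ("
        "id bigserial PRIMARY KEY, "
        "content text NOT NULL, "
        f"embedding vector({dim}));",
    ]


def nearest_neighbors_sql(table="doc_chunks"):
    """Return a parameterized query ordered by cosine distance (<=>)."""
    return (
        f"SELECT content FROM {table} "
        "ORDER BY embedding <=> %s::vector LIMIT %s;"
    )


def to_vector_literal(values):
    """Format numbers in pgvector's text input form, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(str(float(v)) for v in values) + "]"


for statement in pgvector_setup_sql():
    print(statement)
print(nearest_neighbors_sql())
print(to_vector_literal([0.1, 0.2, 0.3]))
```

The `%s` placeholders follow the psycopg parameter style; the query would be executed with the vector literal and the row limit as parameters.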

Application

coreconfigs.py: Application configuration. Review and edit this file before running anything else.
store_embeddings.py: Wrapper script that reads the text files, then generates and stores the embeddings and relationships in the pgvector database.
example_query.py: Example that queries the LLM and saves the results as an HTML file.
LLM-RAG-GRAPH.ipynb: Jupyter notebook with a Gradio interface, an alternative way to interact with the LLM and visualize the graph.
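
How relationships are represented internally is not described above. One common way to think about a relationship graph is as a set of (subject, relation, object) triples grouped into an adjacency mapping; the sketch below assumes that representation. The function names and the triples are illustrative, with the triples paraphrased from the sample sentence about cetirizine shown in the run output.

```python
from collections import defaultdict


def build_graph(triples):
    """Group (subject, relation, object) triples into an adjacency mapping."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
    return dict(graph)


def neighbors(graph, node):
    """Return the objects directly related to a node, in insertion order."""
    return [obj for _, obj in graph.get(node, [])]


# Hypothetical triples, as an LLM might extract them from a sentence.
triples = [
    ("cetirizine", "treats", "allergic rhinitis"),
    ("cetirizine", "treats", "chronic urticaria"),
    ("levocetirizine", "enantiomer of", "cetirizine"),
]

graph = build_graph(triples)
print(neighbors(graph, "cetirizine"))
```

A visualization layer would then draw each key as a node and each (relation, object) pair as a labeled edge.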

Getting Started

Application config and run

Download the repository.
Perform the installation steps (see above).
Edit coreconfigs.py to update the PostgreSQL DB connection settings.
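
The setting names used in coreconfigs.py are not listed here, so the following is only a sketch of what a PostgreSQL connection configuration commonly looks like. The dictionary keys follow libpq's keyword/value connection string format, and `PGHOST`, `PGPORT`, `PGDATABASE`, and `PGUSER` are the standard libpq environment variables; the variable and function names are hypothetical.

```python
import os


def build_dsn(settings):
    """Build a libpq keyword/value connection string from a settings dict.

    Values containing spaces would need quoting; this sketch assumes
    simple values only.
    """
    return " ".join(f"{key}={value}" for key, value in settings.items())


# Hypothetical settings; the real names live in coreconfigs.py.
db_settings = {
    "host": os.environ.get("PGHOST", "localhost"),
    "port": os.environ.get("PGPORT", "5432"),
    "dbname": os.environ.get("PGDATABASE", "postgres"),
    "user": os.environ.get("PGUSER", "postgres"),
}

print(build_dsn(db_settings))
```

A string in this form can be passed to a PostgreSQL driver such as psycopg when opening the connection.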

Run store_embeddings.py to store the embeddings and relationships in the pgvector database.

Embedding model ok.
DB connection established.
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████| 8/8 [00:02<00:00,  3.39it/s]
Processing text file: NBK548420.txt
Get relations: Cetirizine and its enantiomer levocetirizine are second generation antihistamines that are used for the treatment of allergic rhinitis, angioedema and chronic urticaria.
...
...
Embeddings commited for file: texts_input\NBK548420.txt

Run example_query.py to test the setup.

python example_query.py
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████| 8/8 [00:04<00:00,  1.99it/s]
WARNING:root:Some parameters are on the meta device device because they were offloaded to the cpu.
Embedding model ok.
DB connection established.
View the html file: user_qry_results.html for the results

...

Example 1

Generated graph full resolution

Example 2: Query with a typo

Generated graph full resolution

ryogesh/llm-rag-graph
