Down the Rabbit Hole - One step closer to Production Grade GraphRAG
Author: Athos Georgiou

Recently, I shared my experiment combining vector search with knowledge graphs for more contextual retrieval. The response was fantastic, and I've been enhancing the system to push GraphRAG knowledge retrieval further. If you'd like to have a read, feel free to check out the previous post.
Building on the Foundation
The original GraphRAG implementation combined Qdrant for vector search, Neo4j for graph relationships, and Qwen2.5 for generation. While that provided a solid foundation, there were clear opportunities for improvement in several areas:
- Entity and relationship extraction - Heavy dependency on LLMs for triplet extraction, risking high costs and/or performance bottlenecks
- Context-awareness - Limited ability to understand relationships and context for complex queries
Although the system performed well for many queries, it struggled with complex questions requiring deep understanding of relationships and context. Additionally, scaling to large document collections was challenging due to performance bottlenecks when using open source models (subject to one's compute capabilities).
One alternative would be to use proprietary APIs, but that would be expensive, prone to hallucinations and bias, and would introduce data privacy risks. Another approach would be to build a system that dynamically constructs an ontology from the input documents and queries, allowing for more context-aware retrieval and reasoning. This would let the system adapt to new domains and questions without requiring manual ontology construction.
Easier said than done, right? Pretty much. I tried using spaCy, NLTK, and a bunch of other libraries to extract entities and relationships, but I kept finding myself reinventing the wheel. Every time, I ended up with massive, messy configuration files that were somehow more complex than the original system. I needed to understand more about what I was trying to do, so that I could ask the right questions. That's when I decided to take my hands off the keyboard and dive headfirst into the rabbit hole of dynamic ontology construction.
While researching the topic, I came across an article by Dr. Irina Adamchic, titled: Build your hybrid-Graph for RAG & GraphRAG applications using the power of NLP. The article talked about using a combination of NLP techniques like Named Entity Recognition, Dependency Parsing, and Relation Extraction to build a dynamic ontology. The idea was to extract entities and relationships from the text and use them to construct a knowledge graph on the fly. This approach would allow the system to adapt to new domains and questions without requiring manual ontology construction.
It was exactly what I was looking for! Although some code snippets are shown in the article, a code repo was not available. So, I decided to build my own implementation of the dynamic ontology construction system, inspired by the ideas and code samples provided in the article. The code is by no means bug free, optimized, or production ready, but I do feel it's one step closer to it and I'm super excited to share it with you. And if you want to get a better understanding of the system, I highly recommend reading the article itself.
So, let's dive in!
Implementing Dynamic Ontology Construction
Rather than creating a Fixed Entity Architecture as described in Adamchic's other article on "Three-Layer Fixed Entity Architecture," I opted for a fully dynamic approach that would be domain-agnostic. The implementation centers around two main components:
1. NLP-Based Term Extraction
The first component, implemented in nlp_graph.py, extracts tokens, bigrams, and trigrams from document chunks and connects them to form a lexical network:
def extract_ngrams(self, text: str) -> Tuple[List[str], List[str], List[str]]:
    """Extract unigrams, bigrams, and trigrams from text"""
    # Tokenize and normalize text
    tokens = [w.lower() for w in nltk.word_tokenize(text) if w.isalnum()]
    # Filter stopwords if required
    if self.remove_stopwords:
        unigrams = [t for t in tokens if t not in STOPWORDS]
    else:
        unigrams = tokens
    # Generate bigrams and trigrams from the full token stream
    bigrams = [' '.join(b) for b in nltk.bigrams(tokens)]
    trigrams = [' '.join(t) for t in nltk.trigrams(tokens)]
    return unigrams, bigrams, trigrams
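As a rough illustration of what this produces, assuming an NLPGraphBuilder instance called nlp_builder with stopword removal enabled (the sentence below is just an example):

unigrams, bigrams, trigrams = nlp_builder.extract_ngrams(
    "Knowledge graphs complement vector search for retrieval"
)
# unigrams -> ['knowledge', 'graphs', 'complement', 'vector', 'search', 'retrieval']
# bigrams  -> ['knowledge graphs', 'graphs complement', ..., 'for retrieval']
# trigrams -> ['knowledge graphs complement', ..., 'search for retrieval']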
For scalability with large document collections, I also integrated Spark NLP support:
class SparkNLPGraphBuilder(NLPGraphBuilder):
    """NLP Graph Builder using Spark NLP for more scalable processing"""
    def __init__(self, neo4j_conn=None, remove_stopwords=True):
        # Initialize Spark session with Spark NLP
        self.spark = sparknlp.start()
        # Define the Spark NLP pipeline stages
        document_assembler = DocumentAssembler().setInputCol("text").setOutputCol("document")
        tokenizer = Tokenizer().setInputCols(["document"]).setOutputCol("token")
        normalizer = Normalizer().setInputCols(["token"]).setOutputCol("normalized").setLowercase(True)
        # Generate bigrams and trigrams (unigrams are the normalized tokens themselves)
        bigram_generator = NGramGenerator().setInputCols(["normalized"]).setOutputCol("bigrams").setN(2)
        trigram_generator = NGramGenerator().setInputCols(["normalized"]).setOutputCol("trigrams").setN(3)
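The excerpt above only sets up the individual annotators. As a minimal sketch of how they could be wired together and applied within the same __init__, assuming a standard Spark ML Pipeline (the repo may assemble things differently, and the sample sentence is just an example):

from pyspark.ml import Pipeline

# Combine the stages defined above into one pipeline
pipeline = Pipeline(stages=[
    document_assembler, tokenizer, normalizer,
    bigram_generator, trigram_generator,
])

# Apply it to a DataFrame of chunk texts; the resulting "normalized",
# "bigrams", and "trigrams" columns feed the lexical graph
chunks_df = self.spark.createDataFrame(
    [("GraphRAG combines vector search with knowledge graphs",)], ["text"]
)
terms_df = pipeline.fit(chunks_df).transform(chunks_df)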
This approach creates a rich lexical network where terms (tokens, n-grams) are linked to document chunks. The graph structure allows querying not only by exact matches but by traversing related terms, enabling more flexible retrieval than simple keyword matching.
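To make the traversal idea concrete, here's a rough Cypher sketch; the Term and Chunk labels, the HAS_TERM relationship, and the neo4j_conn variable are illustrative placeholders rather than the exact schema in nlp_graph.py:

# Illustrative only: link an extracted term to the chunk it appears in
neo4j_conn.run_query(
    """
    MERGE (t:Term {text: $term})
    MERGE (c:Chunk {id: $chunk_id})
    MERGE (c)-[:HAS_TERM]->(t)
    """,
    {"term": "knowledge graph", "chunk_id": "doc1-chunk-3"},
)

# Illustrative only: start from a query term, hop to its chunks,
# and surface other terms those chunks mention
results = neo4j_conn.run_query(
    """
    MATCH (t:Term {text: $term})<-[:HAS_TERM]-(c:Chunk)-[:HAS_TERM]->(related:Term)
    RETURN c.id AS chunk_id, collect(DISTINCT related.text)[..5] AS related_terms
    """,
    {"term": "knowledge graph"},
)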
2. Triplet Extraction for Semantic Relationships
The second core component, implemented in triplets.py, extracts subject-predicate-object triplets to build a semantic knowledge layer:
class TripletExtractor:
    """Extract subject-relation-object triplets from text and map them into a Neo4j knowledge graph."""
    def __init__(self, neo4j_conn=None, model_name=None):
        # Use pretrained T5 model for triplet extraction
        model_name = model_name or DEFAULT_TRIPLET_MODEL  # "bew/t5_sentence_to_triplet_xl"
        self.tokenizer = AutoTokenizer.from_pretrained(model_name)
        self.model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    def extract_triplets(self, sentence: str) -> List[Tuple[str, str, str]]:
        """Extract triplets from a sentence using the T5 model."""
        # Encode input sentence and generate model output
        inputs = self.tokenizer(sentence, return_tensors="pt")
        outputs = self.model.generate(**inputs, max_length=64)
        triplet_text = self.tokenizer.decode(outputs[0], skip_special_tokens=False)
        triplets = []
        # Process model output to extract structured triplets
        for segment in triplet_text.split("<triplet>"):
            if segment.strip():
                triple_content = segment.split("</triplet>")[0] if "</triplet>" in segment else segment
                triple_content = triple_content.replace("<pad>", "")
                if "<relation>" in triple_content and "<object>" in triple_content:
                    subj = triple_content.split("<relation>")[0].strip()
                    rel = triple_content.split("<relation>")[1].split("<object>")[0].strip()
                    obj = triple_content.split("<object>")[1].strip()
                    if subj and rel and obj:
                        triplets.append((subj, rel, obj))
        return triplets
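A quick usage sketch; the exact triplets returned depend entirely on what the T5 model generates for a given sentence:

# Hypothetical usage; output varies with the model
extractor = TripletExtractor()
for subj, rel, obj in extractor.extract_triplets("Marie Curie discovered polonium."):
    print(f"({subj}) -[{rel}]-> ({obj})")
# e.g. (marie curie) -[discovered]-> (polonium), depending on the model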
These triplets are then stored in the Neo4j graph to create semantic relationships between entities:
def process_triplet(self, triplet: Tuple[str, str, str]) -> Any:
    """Process a single triplet and merge it into the Neo4j graph."""
    subject, predicate, object_ = triplet
    # Compute embeddings for each component
    subject_emb = embed_text(subject)
    predicate_emb = embed_text(predicate)
    object_emb = embed_text(object_)
    # Turn the free-text predicate into a valid Neo4j relationship type
    sanitized_relation = self.sanitize_relation(predicate)
    # Store in Neo4j with vector embeddings for similarity search
    self.neo4j.run_query(
        f"""
        MERGE (s:Entity {{name: $subject}})
        ON CREATE SET s.embedding = $subject_emb
        MERGE (o:Entity {{name: $object}})
        ON CREATE SET o.embedding = $object_emb
        MERGE (s)-[r:{sanitized_relation}]->(o)
        ON CREATE SET r.source = $predicate, r.embedding = $predicate_emb
        """,
        {
            "subject": subject,
            "predicate": predicate,
            "object": object_,
            "subject_emb": subject_emb.tolist(),
            "predicate_emb": predicate_emb.tolist(),
            "object_emb": object_emb.tolist(),
        },
    )
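The snippet relies on sanitize_relation to turn a free-text predicate into something Neo4j accepts as a relationship type. Its implementation isn't shown above, but a minimal sketch might look like this:

import re

def sanitize_relation(self, predicate: str) -> str:
    """Illustrative sketch: uppercase the predicate, replace anything
    non-alphanumeric with underscores, and avoid a leading digit."""
    relation = re.sub(r"[^A-Za-z0-9]+", "_", predicate.strip()).strip("_").upper()
    if not relation:
        return "RELATED_TO"
    if relation[0].isdigit():
        relation = f"REL_{relation}"
    return relation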
The Hybrid Retrieval Mechanism
With both lexical and semantic layers in place, I implemented a hybrid retrieval approach that leverages both vector similarity and graph traversal:
- Vector-based Search: Embeds the query using the same model as the documents and performs ANN search to find semantically similar chunks. No surprise here, I used Qdrant, as it's one of the best open-source vector databases, if not the best
- Term-based Graph Traversal: Identifies key terms in the query and traverses the knowledge graph to find relevant chunks connected to these terms. I used Neo4j for the same reasons as above: state of the art and open source
- Entity-Relationship Exploration: Identifies entities mentioned in the query and follows relationships in the knowledge graph to find indirectly relevant content
- Context-aware Result Processing: For each matched chunk, retrieves surrounding chunks (PREV/NEXT) to provide coherent context that preserves document flow
The hybrid scoring mechanism combines vector similarity and graph relevance scores:
def hybrid_retrieval(query, top_k=5, with_context=True, context_size=2):
    # Vector search component
    vector_matches = vector_search(query, top_k)
    # Graph-based enhancements
    graph_matches = graph_search(query, top_k)
    # Combine and score results
    combined_results = merge_and_score(vector_matches, graph_matches)
    # Add context if requested
    if with_context:
        results_with_context = add_context(combined_results, context_size)
        return results_with_context
    return combined_results
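The actual scoring lives in the retrieval module; as a rough sketch of the idea (the weights and this merge_and_score signature are illustrative, not the repo's exact code), the combination can be a weighted sum over normalized scores:

def merge_and_score(vector_matches, graph_matches, vector_weight=0.6, graph_weight=0.4):
    """Illustrative sketch: combine vector similarity and graph relevance into one ranking.
    Each match is assumed to be a (chunk_id, score) pair with scores normalized to [0, 1]."""
    scores = {}
    for chunk_id, score in vector_matches:
        scores[chunk_id] = scores.get(chunk_id, 0.0) + vector_weight * score
    for chunk_id, score in graph_matches:
        scores[chunk_id] = scores.get(chunk_id, 0.0) + graph_weight * score
    # Chunks surfaced by both retrievers accumulate both contributions and rank higher
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)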
How does this thing look?
Here's a quick peek at the system in action, after ingesting the 3 sample files you can find in the repo:
Neo4j Graph Dashboard, running at http://localhost:7474/browser/, showing:
- Documents
- Document Chunks
- Terms
- Entities
- How it all comes together

Qdrant Vector Database dashboard, running at http://localhost:6333/dashboard, showing:
- Tokens Collection
Features and Architecture
The system includes several core features:
- Document ingestion from plain text or PDF files
- Automated chunking and embedding of text segments
- Document chain modeling with NEXT/PREV relationships between sequential chunks (see the sketch after this list)
- Context-aware retrieval that provides surrounding chunks for better context
- NLP-powered knowledge graph construction
- Hybrid retrieval combining vector search and graph traversal
- Interactive querying interface with context customization
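As a sketch of how the document chain and context-aware retrieval fit together (the Chunk label, id property, and neo4j_conn variable are illustrative here, not necessarily the repo's exact names), sequential chunks can be linked during ingestion and expanded around a match at query time:

# Illustrative only: link sequential chunks of a document into a chain
neo4j_conn.run_query(
    """
    MATCH (a:Chunk {id: $chunk_id}), (b:Chunk {id: $next_chunk_id})
    MERGE (a)-[:NEXT]->(b)
    MERGE (b)-[:PREV]->(a)
    """,
    {"chunk_id": "doc1-chunk-3", "next_chunk_id": "doc1-chunk-4"},
)

# Illustrative only: expand a matched chunk with up to two neighbours on each side
context = neo4j_conn.run_query(
    """
    MATCH (c:Chunk {id: $chunk_id})
    OPTIONAL MATCH (c)-[:NEXT*1..2]->(after:Chunk)
    OPTIONAL MATCH (c)-[:PREV*1..2]->(before:Chunk)
    RETURN collect(DISTINCT before.text) AS before, c.text AS matched, collect(DISTINCT after.text) AS after
    """,
    {"chunk_id": "doc1-chunk-3"},
)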
The architecture follows a modular design with components for document ingestion, vector indexing, term graph construction, entity extraction, triplet extraction, hybrid retrieval, and a CLI interface.
Performance and Benefits
This architecture offers several key advantages:
- Domain Adaptability: The system adapts to new domains without requiring predefined ontologies or entity lists
- Cost Efficiency: The hybrid approach reduces reliance on expensive LLM calls for every step.
- Enhanced Context: The document chain model and context-aware retrieval provide more coherent information, which can then be provided to an LLM
- Improved Reasoning: Multi-hop exploration through the graph enables finding information that is not explicitly mentioned in the query
- Scalability: Support for Spark NLP enables processing large document collections efficiently
- Performance: This application significantly outperforms my previous implementation in terms of speed and accuracy.
What could be improved?
- The User Interface: The system currently provides only a CLI interface for querying and exploring results. A proper UI would make it more user-friendly
- Optimization: The code could be optimized for performance and efficiency, especially for large-scale document collections
- The Chat Experience: The system could be integrated with an AI Assistant to provide a more conversational experience
- Evals: The system should be evaluated on a variety of datasets to measure its performance and effectiveness. Any takers?
Final thoughts
The journey from a simple GraphRAG system to a dynamic ontology construction system has been exciting and challenging. I've learned a lot about NLP, graph databases, and knowledge representation along the way. I'm excited to see how this system evolves and how it can be applied to real-world problems. If you're interested in exploring the code or collaborating on this project, feel free to reach out! I'd love to hear your thoughts and ideas.
Special thanks to Dr. Irina Adamchic for inspiring this project and sharing her knowledge with the community. I look forward to learning more from her and others in the field of NLP and knowledge graphs.
Until next time, happy coding!
Oh, snap! Almost forgot! Here's the code. The README should help you get started.
See ya!