Archives
All the articles I've archived.
-
Diminishing Returns and the Art of Knowing When to Stop
I trained three generations of ColQwen3.5, each with more sophisticated optimization than the last. The most optimized version barely beat the previous one on the primary benchmark (+0.0011 nDCG@5). Individual tasks reshuffled substantially, with per-task swings an order of magnitude larger than the aggregate gain.
-
Closing the AI Value Gap: Insights from Research
Enterprise AI adoption has reached 88%, yet only 5% of pilots deliver measurable impact. Research from MIT, BCG, and RAND reveals what separates successful implementations from the rest. It's not the technology.
-
Implementing Spatially-Grounded Document Retrieval via Patch-to-Region Propagation
A deep dive into my recent research on spatially-grounded document retrieval using ColPali models and OCR bounding boxes, enabling precise region-level retrieval at inference time without additional training.
-
Snappy: Your Vision Retrieval Buddy!
How Snappy evolved from the nextjs-fastapi-colpali template into a vision-first document retrieval system
-
You too can run the Vidore Benchmark with less than 32GB of GPU VRAM
Quick, practical notes to run the Vidore benchmark smoothly on a single 32GB GPU: dtype, batch size, and common OOM fixes.
-
The Most Beautiful RAG: Starring ColPali, Qdrant, Minio and Friends
Updated: An end-to-end, page-level Vision RAG template with ColPali-style embeddings, Qdrant multivector retrieval (with optional binary quantization), and MinIO-backed storage, dockerized and API-first.
-
ColQwen2.5 FastAPI Integration
A little-script to create a FastAPI server for ColQwen2.5
-
Audio RAG with ColQwen2.5-Omni
An audio RAG system that processes video URLs and answers questions about their content using ColQwen2.5-Omni and OpenAI audio
-
The Most Beautiful RAG: Starring Colnomic, Qdrant, Minio and Friends
Updated: Introducing the first project in my little-scripts monorepo - a simple yet beautiful RAG implementation using Colnomic, Qdrant and Nomic
-
Mapping Worlds into Graphs with Qdrant, Neo4j, RF-DETR, BLIP-2 and Kung Fu
Diving deeper into the GraphRAG rabbit hole, I explore how to transform real-world video data into knowledge graphs using RF-DETR for object detection and BLIP-2 for intelligent entity description - setting the foundation for context-aware retrieval systems.
-
Down the Rabbit Hole - One step closer to Production Grade GraphRAG
After my initial experiment with GraphRAG using Qdrant, Neo4j, and Ollama, I set out to build a more dynamic and context-aware system. This post dives into the details of how I constructed a dynamic ontology for NLP GraphRAG.
-
GraphRAG with Qdrant, Neo4j, and Ollama (Using Qwen2.5:3b and Nomic text embeddings)
I've been playing with a new approach to RAG systems - combining vector search with knowledge graphs for more contextual, relationship-aware answers. Here's what I've built, how it works, and why you might want to try it yourself.
-
Crazy good Observability using Grafana Alloy
Learn how to quickly set up a complete Grafana Alloy observability stack with just a few commands.
-
Superfast Telemetry Setup for Next.js with OpenTelemetry, Prometheus, and Grafana
Learn how to quickly set up a super cool telemetry system for your Next.js application using OpenTelemetry, Prometheus, and Grafana
-
It must've been love. Is it over now?
It must've been love. But is it over now?
-
Raising Artificial Intelligence
Artificial Intelligence, especially Large Language Models like GPT-4, can be viewed through the parent-child relationship lens, reflecting the care and responsibility akin to raising a child. This perspective helps balance AI’s capabilities with societal impacts, ethical considerations, and risk management, without implying AI sentience or diminishing human complexities.
-
Is Generative AI the Answer to Everything?
Is Generative AI the Answer to Everything? No, but it's a powerful tool that can be augmented with traditional methods and new technologies to address a broader spectrum of challenges.
-
How much would it cost to store a 1 hour, 60fps 4k Video in a RAG model?
Updated: A simple experiment to calculate the cost of storing a 1 hour, 60fps 4k Video in a RAG model. For no practical reason whatsoever.
-
Naive NoSQL Conversational History Retrieval for Dummies
Persistent memory in Generative AI is a crucial component that allows the AI to remember and recall information from previous interactions. In this article, we'll explore the 'Naive NoSQL Conversational History Retrieval Strategy'
-
So, I've been doing stuff...
2023 was an exciting year for learning. I went from a hibernating bear to a busy bee, and I've gotten back in touch with my long-lost passion: Artificial Intelligence. It's time to reflect on the past and look forward to the future.
-
A brief analysis on RAG with Pinecone Serverless and Unstructured.io
Updated: In this article, I provide a brief analysis of the performance of a RAG model using Pinecone Serverless and Unstructured.io, and of the streaming chat experience. This analysis was done using a Next.js-based template named [Titanium](https://github.com/athrael-soju/Titanium), which already incorporates several advanced Generative AI features.
-
Integrating Vision using the latest OpenAI API
In this article, I'll be integrating Vision into the AI chat assistant I've been building in the previous articles. This work will include creating new UI components for the Vision API, as well as creating new API routes to support the new functionality.
-
Thoughts on the Latest OpenAI APIs and starting a New Project
Updated: Lately I've been playing around with the latest OpenAI APIs to see what the buzz is all about. I thought it would be a good time to start a new project from scratch, and I've been working on it for the last month or so. It's still in its early stages, but I think I've learned enough to have a decent perspective on what OpenAI has to offer and how it can be used to build interesting applications.
-
Integrating multi-user Assistants using the latest OpenAI API
Updated: In this article, I'll be sharing my experience integrating multi-user assistants with the OpenAI API. This includes building the UI components for the user Assistant, adding functionality to delete all Assistant-related data and files, and going over file uploads and how to handle them in the Assistant.
-
Integrating Next-Auth in a Streaming AI Chat Assistant Using Material-UI
This guide covers the step-by-step process of integrating next-auth for authentication in a Next.js project, using Material-UI for styling. It includes acquiring OAuth credentials from GitHub and Google, configuring environment variables, setting up authentication providers, and implementing Material-UI components for a responsive user interface.
-
Creating an OpenAI Law Copilot - A Guide to Building an AI Legal Assistant
A guide to building an AI legal assistant using OpenAI's API to handle various legal tasks efficiently.
-
Creating a Customized Input Component in a Streaming AI Chat Assistant Using Material-UI
Creating a customized input component for a streaming AI chat assistant using Material-UI and React.
-
Integrating Markdown in Streaming Chat for AI Assistants
A guide to integrating Markdown in streaming chat for AI assistants.
-
Athrael.net - A Tech Blog about everything, everywhere, all at once. 'Pun intended'
Inaugural post introducing Athrael.net, a space where AI meets real-world applications, driven by a seasoned software engineer’s journey and insights.