Tag: vision
All the articles with the tag "vision".
-
Implementing Spatially-Grounded Document Retrieval via Patch-to-Region Propagation
A deep dive into my recent research on spatially-grounded document retrieval, which combines ColPali models with OCR bounding boxes to enable precise region-level retrieval at inference time, without additional training.
-
Snappy: Your Vision Retrieval Buddy!
How Snappy evolved from the nextjs-fastapi-colpali template into a vision-first document retrieval system.
-
The Most Beautiful RAG: Starring ColPali, Qdrant, Minio and Friends
An end-to-end, page-level Vision RAG template with ColPali-style embeddings, Qdrant multivector retrieval (with optional binary quantization), and MinIO-backed storage, all dockerized and API-first.
-
Integrating Vision using the latest OpenAI API
In this article, I integrate Vision into the AI chat assistant I've been building in previous articles. The work includes creating new UI components for the Vision API and new API routes to support the new functionality.
Athrael.net