Tag: vision

All the articles with the tag "vision".

Implementing Spatially-Grounded Document Retrieval via Patch-to-Region Propagation

2 Dec, 2025

A deep dive into my recent research on spatially-grounded document retrieval using ColPali models and OCR bounding boxes, enabling precise region-level retrieval at inference time without additional training.

Snappy: Your Vision Retrieval Buddy!

27 Oct, 2025

How Snappy evolved from the nextjs-fastapi-colpali template into a vision-first document retrieval system.

The Most Beautiful RAG: Starring ColPali, Qdrant, Minio and Friends

Updated: 1 Sep, 2025

An end-to-end, page-level Vision RAG template with ColPali-style embeddings, Qdrant multivector retrieval (with optional binary quantization), and MinIO-backed storage, all dockerized and API-first.

Integrating Vision using the latest OpenAI API

30 Jan, 2024

In this article, I integrate Vision into the AI chat assistant I have been building over the previous articles. The work includes creating new UI components for the Vision API and new API routes to support the new functionality.
