
Reflections on Intercom’s decision to stick with Elasticsearch
Cosine similarity - the duct tape of AI. Convenient but often misused. Let's find out how to use it better.
AI systems aren’t perfect (GASP!) and these are some of the reasons why.
A significant challenge in information retrieval today is determining the most efficient method for nearest-neighbor vector search, especially with the growing complexity of dense and sparse retrieval models. Practitioners must navigate a wide range of options for indexing and retrieval methods, including HNSW (Hierarchical Navigable Small-World) graphs, flat indexes, and inverted indexes. These methods offer different trade-offs in terms of speed, scalability, and quality of retrieval results. As datasets become larger and more complex, the absence of clear operational guidance makes it difficult for practitioners to optimize their systems, particularly for applications requiring high performance, such as search engines and
Open source vector databases are among the top options out there for AI development, including some you may already be familiar with or even have on hand.
Performance testing shows integrating Tantivy’s full-text search engine library into vector search significantly improves speed and performance.
Pinecone, the vector database startup founded by Edo Liberty, the former head of Amazon's AI Labs, has long been at the forefront of helping businesses
Cosine similarity can measure the proximity between two documents by transforming words into vectors within a vector space.
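As a minimal sketch of that idea, the snippet below turns short documents into word-count vectors and compares them with cosine similarity. The helper names (`bag_of_words`, `cosine_similarity`) and the sample sentences are illustrative assumptions, not taken from the linked article, and real systems typically use learned embeddings rather than raw word counts.

```python
import math


def bag_of_words(text, vocab):
    """Map a document to a word-count vector over a fixed vocabulary (hypothetical helper)."""
    words = text.lower().split()
    return [words.count(term) for term in vocab]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Made-up sample documents, for illustration only.
docs = [
    "vector search with cosine similarity",
    "cosine similarity for vector search",
    "serverless database pricing",
]
vocab = sorted({word for doc in docs for word in doc.lower().split()})
vectors = [bag_of_words(doc, vocab) for doc in docs]

sim_related = cosine_similarity(vectors[0], vectors[1])    # four of five words shared: about 0.8
sim_unrelated = cosine_similarity(vectors[0], vectors[2])  # no shared words: 0.0
```

Because the measure depends only on the angle between vectors, a long and a short document with the same word proportions score as identical, which is one reason it is convenient and also easy to misuse.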
Vector databases have become increasingly prominent, especially in applications that involve machine learning, image processing, and similarity searches. Unlike traditional databases that store data as scalar values (numbers and strings), vector databases are designed to handle multidimensional data points, typically represented as vectors. These vectors can be used to model complex items like images, videos, and text in a format that machines can interpret for tasks such as content recommendation, anomaly detection, and more. Let’s explore 14 different vector databases and provide a comparative analysis of several key parameters. Faiss (Facebook AI Similarity Search) Faiss, developed by Facebook AI, is
I built a magical meme search engine using SigLIP/CLIP to encode images as vectors. It was a fun way to learn about this powerful technology. I'm sharing the code so you can build your own and discover forgotten gems in your photo library. Let's unleash the power of AI on our images!
Everyone is talking about vectors these days. Cosines, ANN searches, normalizations, sentence...
Today Pinecone launched a serverless vector database architecture that CEO Edo Liberty calls a 'significant' breakthrough for the industry.
Vector databases for word embeddings have become increasingly popular with the proliferation of large language models. Machine learning models encode data as embeddings, which are then stored in a vector database. This allows for very fast similarity search, essential for many AI uses such as recommendation systems, image recognition, and NLP. A vector database captures the essence of complex data by representing each data point as a multidimensional vector. Indexing techniques like k-d trees and hashing make it possible to retrieve related vectors quickly. To transform big data analytics, this architecture generates highly scalable, efficient solutions for
A hands-on dive into scalar quantization (integer quantization) and product quantization with Python.
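For orientation, here is a minimal sketch of scalar (integer) quantization only, assuming a single global min/max range mapped onto 8-bit codes. The function names are hypothetical and this is not the linked article's implementation; product quantization, which splits vectors into sub-vectors and learns a codebook for each, is not shown.

```python
def fit_scalar_quantizer(vectors):
    """Derive an offset and step size that map the observed value range onto 0..255."""
    flat = [x for vec in vectors for x in vec]
    lo, hi = min(flat), max(flat)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 if every value is identical
    return lo, scale


def quantize(vec, lo, scale):
    """Encode each float as one byte (uint8), clamping to the fitted range."""
    return bytes(min(255, max(0, round((x - lo) / scale))) for x in vec)


def dequantize(codes, lo, scale):
    """Approximately reconstruct the floats; the rounding error is not recoverable."""
    return [lo + c * scale for c in codes]


# Made-up sample vectors, for illustration only.
vectors = [[-1.0, 0.0, 0.5], [2.0, 1.5, -0.25]]
lo, scale = fit_scalar_quantizer(vectors)
codes = [quantize(vec, lo, scale) for vec in vectors]
restored = [dequantize(c, lo, scale) for c in codes]
```

Each value now takes one byte instead of a 4-byte float32, at the cost of a reconstruction error of at most half a step (`scale / 2`) per dimension for values inside the fitted range.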
A detailed comparison of Milvus, Pinecone, Vespa, Weaviate, Vald, GSI and Qdrant
Discover Vector Databases: How They Work, Examples, Use Cases, Pros & Cons, Selection and Implementation. They combine the capabilities of traditional databases and standalone vector indexes while specializing in vector embeddings.
The Similarity Engine's use cases include item-to-item similarity for text and image modality and user-to-item personalized recommendations based on a user’s historical behavior data.
A vector database is a type of database that stores data as high-dimensional vectors, which are...