@@ -17,14 +17,42 @@ import IpSampleImage from './images/ip-sample.png';
17
17
18
18
<Authors frontMatter={frontMatter} />
19
19
20
-
:::tip GITHUB CODE
20
+
## What you will learn in this tutorial
21
21
22
-
Below is a command to clone the source code used in this tutorial
22
+
This tutorial is a comprehensive guide to using Redis for vector similarity search in a Node.js environment. It is aimed at software developers experienced in the Node.js/JavaScript ecosystem and covers the knowledge and techniques required for advanced vector operations. Here's what's covered:
- [About Vectors](./#what-is-a-vector-in-machine-learning): Delve into the foundational concept of vectors in machine learning.
27
+
- [Vector Databases](./#what-is-a-vector-database): Understand specialized databases designed to handle vector data efficiently.
28
+
- [Vector Similarity](./#what-is-vector-similarity): Grasp the concept and significance of comparing vectors. Discover some use cases where vector similarity plays a pivotal role, from recommendation systems to content retrieval.
29
+
30
+
- [Vector Generation](./#generating-vectors):
31
+
32
+
  - [Textual Content](./#sentence-text-vector): Learn techniques to generate vectors from textual data.
33
+
  - [Imagery Content](./#image-vector): Understand how images can be represented as vectors and how they're processed.
34
+
35
+
- [Redis Database Setup](./#database-setup):
36
+
37
+
  - [Data Seeding](./#sample-data-seeding): Get hands-on with populating your Redis database with vector data.
38
+
  - [Index Creation](./#create-vector-index): Understand the process of indexing vector fields in Redis, optimizing for both accuracy and performance.
39
+
40
+
- Advanced Vector Queries in Redis:
41
+
42
+
  - [KNN (k-Nearest Neighbors) Queries](./#what-is-vector-knn-query): Dive into the concept of KNN and its implementation in Redis to retrieve vectors most similar to a query vector.
43
+
  - [Range Queries](./#what-is-vector-range-query): Discover how to retrieve vectors within a certain distance or range from a target vector.
44
+
45
+
- [Vector Similarity Calculations](./#how-to-calculate-vector-similarity): Optional reading if you want to understand the math behind vector similarity search.
46
+
47
+
  - [Euclidean Distance](./#euclidean-distance-l2-norm): Understand the L2 norm method for calculating similarity.
48
+
  - [Cosine Similarity](./#cosine-similarity): Explore the angle between vectors and its importance in vector space.
49
+
  - [Inner Product](./#inner-product): Learn about another essential metric in understanding vector similarities.
50
+
51
+
- [Additional Resources](./#further-reading): Take your learning further with other resources related to vectors in Redis.
52
+
53
+
## Vectors introduction
26
54
27
-
## What is a vector in machine learning?
55
+
### What is a vector in machine learning?
28
56
29
57
In the context of machine learning, a vector is a mathematical representation of data. It is an ordered list of numbers that encode the features or attributes of a piece of data.
30
58
@@ -41,7 +69,7 @@ Now, product 1 `Puma Men Race Black Watch` might be represented as the vector `[
41
69
42
70
In a more complex scenario, like natural language processing (NLP), words or entire sentences can be converted into dense vectors (often referred to as embeddings) that capture the semantic meaning of the text. Vectors play a foundational role in many machine learning algorithms, particularly those that involve distance measurements, such as clustering and classification algorithms.
43
71
44
-
## What is a vector database?
72
+
### What is a vector database?
45
73
46
74
A vector database is a specialized system optimized for storing and searching vectors. Designed explicitly for efficiency, these databases play a crucial role in powering vector search applications, including recommendation systems, image search, and textual content retrieval. Often referred to as vector stores, vector indexes, or vector search engines, these databases employ vector similarity algorithms to identify vectors that closely match a given query vector.
47
75
@@ -51,12 +79,12 @@ A vector database is a specialized system optimized for storing and searching ve
51
79
52
80
:::
53
81
54
-
## What is vector similarity?
82
+
### What is vector similarity?
55
83
56
84
Vector similarity is a measure that quantifies how alike two vectors are, typically by evaluating the `distance` or `angle` between them in a multi-dimensional space.
57
85
When vectors represent data points, such as texts or images, the similarity score can indicate how similar the underlying data points are in terms of their features or content.
58
86
59
-
### Use cases for vector similarity
87
+
**Use cases for vector similarity:**
60
88
61
89
- **Recommendation Systems**: If you have vectors representing user preferences or item profiles, you can quickly find items that are most similar to a user's preference vector.
62
90
- **Image Search**: Store vectors representing image features, and then retrieve images most similar to a given image's vector.
@@ -81,7 +109,7 @@ Below is a command to the clone the source code used in this tutorial
To generate sentence embeddings, we'll make use of a Hugging Face model titled [Xenova/all-distilroberta-v1](https://huggingface.co/Xenova/all-distilroberta-v1). It's a version of [sentence-transformers/all-distilroberta-v1](https://huggingface.co/sentence-transformers/all-distilroberta-v1) with ONNX weights, made compatible with Transformers.js.
87
115
@@ -484,7 +512,7 @@ KNN, or k-Nearest Neighbors, is an algorithm used in both classification and reg
484
512
485
513
### Vector KNN query with Redis
486
514
487
-
Redis allows you to index and then search for vectors [using the KNN approach](https://redis.io/docs/stack/search/reference/vectors/#pure-knn-queries).
515
+
Redis allows you to index and then search for vectors [using the KNN approach](https://redis.io/docs/interact/search-and-query/search/vectors/).
488
516
489
517
Below, you'll find a Node.js code snippet that illustrates how to perform a `KNN query` for any provided `search text`:
KNN queries can be combined with standard Redis search functionalities using <u>[hybrid knn queries](https://redis.io/docs/interact/search-and-query/search/vectors/#hybrid-knn-queries).</u>
618
+
KNN queries can be combined with standard Redis search functionalities using <u>[Hybrid queries](https://redis.io/docs/interact/search-and-query/search/vectors/#hybrid-knn-queries).</u>
591
619
:::
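
To make the shape of these queries concrete, here is a minimal sketch of how a pure KNN query string and a hybrid one could be assembled in Node.js. The helper names (`float32Buffer`, `buildKnnQuery`), the field names (`productDescriptionEmbeddings`, `brandName`), and the parameter name `searchBlob` are assumptions made for illustration; they are not taken from the tutorial's source code.

```javascript
// Convert a numeric array into raw FLOAT32 bytes, the binary form
// a vector query parameter is sent in.
function float32Buffer(values) {
  return Buffer.from(new Float32Array(values).buffer);
}

// Build a KNN query string.
// `filter` is '*' for a pure KNN query, or a regular search
// expression (for example a tag filter) for a hybrid query.
function buildKnnQuery({
  k,
  vectorField,
  filter = '*',
  blobParam = 'searchBlob', // hypothetical parameter name
  scoreAlias = 'score',
}) {
  const prefix = filter === '*' ? '*' : `(${filter})`;
  return `${prefix}=>[KNN ${k} @${vectorField} $${blobParam} AS ${scoreAlias}]`;
}

// Pure KNN: the 3 nearest neighbours across the whole index.
const pureKnn = buildKnnQuery({
  k: 3,
  vectorField: 'productDescriptionEmbeddings', // hypothetical field
});

// Hybrid: restrict candidates with a tag filter first, then run KNN.
const hybridKnn = buildKnnQuery({
  k: 3,
  vectorField: 'productDescriptionEmbeddings', // hypothetical field
  filter: '@brandName:{nike}', // hypothetical tag field and value
});

console.log(pureKnn);
console.log(hybridKnn);
console.log(float32Buffer([0.1, 0.2, 0.3]).length); // 3 floats x 4 bytes
```

The query string and the buffer would then be handed to the search call as the query and its parameter value. Vector queries require query dialect 2 or later.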
592
620
593
621
## What is vector range query?
@@ -701,7 +729,7 @@ Hopefully this tutorial has helped you visualize how to use Redis for vector sim
701
729
702
730
Several techniques are available to assess vector similarity, with some of the most prevalent ones being:
703
731
704
-
#### Euclidean Distance (L2 norm)
732
+
### Euclidean Distance (L2 norm)
705
733
706
734
**Euclidean Distance (L2 norm)** calculates the straight-line distance between two points in a multi-dimensional space. Lower values indicate closer proximity, and hence higher similarity.
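
As a minimal sketch of the calculation, the distance is the square root of the summed squared differences between corresponding entries. The function name `euclideanDistance` and the sample points are illustrative only.

```javascript
// Euclidean (L2) distance between two equal-length numeric arrays.
function euclideanDistance(a, b) {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same number of dimensions');
  }
  let sumOfSquares = 0;
  for (let i = 0; i < a.length; i++) {
    const diff = a[i] - b[i];
    sumOfSquares += diff * diff;
  }
  return Math.sqrt(sumOfSquares);
}

// Hypothetical 2D points: a 3-4-5 right triangle.
console.log(euclideanDistance([0, 0], [3, 4])); // 5
```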
707
735
@@ -725,7 +753,7 @@ As an example, we will use a 2D chart made with [chart.js](https://www.chartjs.o
725
753
726
754

727
755
728
-
#### Cosine Similarity
756
+
### Cosine Similarity
729
757
730
758
**Cosine Similarity** measures the cosine of the angle between two vectors. The cosine similarity value ranges between -1 and 1. A value closer to 1 implies a smaller angle and higher similarity, while a value closer to -1 implies a larger angle and lower similarity. Cosine similarity is particularly popular in NLP when dealing with text vectors.
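
A minimal sketch of the calculation: the dot product of the two vectors divided by the product of their magnitudes. The function name `cosineSimilarity` and the sample vectors are illustrative only.

```javascript
// Cosine similarity between two equal-length numeric arrays.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same number of dimensions');
  }
  let dot = 0;
  let sumSquaresA = 0;
  let sumSquaresB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    sumSquaresA += a[i] * a[i];
    sumSquaresB += b[i] * b[i];
  }
  if (sumSquaresA === 0 || sumSquaresB === 0) {
    throw new Error('Cosine similarity is undefined for a zero vector');
  }
  return dot / (Math.sqrt(sumSquaresA) * Math.sqrt(sumSquaresB));
}

// Hypothetical 2D vectors.
console.log(cosineSimilarity([1, 0], [2, 0])); // same direction: 1
console.log(cosineSimilarity([1, 0], [0, 1])); // perpendicular: 0
console.log(cosineSimilarity([1, 0], [-3, 0])); // opposite direction: -1
```

Note that the result depends only on direction: `[1, 0]` and `[2, 0]` differ in length but still score 1.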
731
759
@@ -748,7 +776,7 @@ Using [chart.js](https://www.chartjs.org/), we've crafted a 2D chart of `Price v
748
776
749
777

750
778
751
-
#### Inner Product
779
+
### Inner Product
752
780
753
781
**Inner Product (dot product)** isn't a distance metric in the traditional sense but can be used to calculate similarity, especially when vectors are normalized (have a magnitude of 1). It's the sum of the products of the corresponding entries of the two vectors.
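
A minimal sketch of the calculation, together with a helper that scales a vector to magnitude 1. The function names `innerProduct` and `normalize` and the sample vectors are illustrative only.

```javascript
// Inner (dot) product of two equal-length numeric arrays.
function innerProduct(a, b) {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same number of dimensions');
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += a[i] * b[i];
  }
  return sum;
}

// Scale a vector so that its magnitude becomes 1.
function normalize(v) {
  const magnitude = Math.sqrt(innerProduct(v, v));
  if (magnitude === 0) {
    throw new Error('Cannot normalize a zero vector');
  }
  return v.map((x) => x / magnitude);
}

// 1*4 + 2*5 + 3*6 = 32
console.log(innerProduct([1, 2, 3], [4, 5, 6])); // 32

// Once both vectors are normalized, their inner product
// equals their cosine similarity.
console.log(innerProduct(normalize([3, 4]), normalize([4, 3])));
```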