docs/howtos/solutions/vector/getting-started-vector/index-getting-started-vector.mdx (295 additions & 9 deletions)
@@ -46,6 +46,12 @@ In a more complex scenario, like natural language processing (NLP), words or ent
Vector similarity is a measure that quantifies how alike two vectors are, typically by evaluating the `distance` or `angle` between them in a multi-dimensional space.
When vectors represent data points, such as texts or images, the similarity score can indicate how similar the underlying data points are in terms of their features or content.
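
As a minimal sketch of the `distance` and `angle` ideas above, the snippet below computes Euclidean distance and cosine similarity for plain number arrays. The function names and the sample vectors are illustrative assumptions, not part of the tutorial's code.

```ts
// Euclidean (L2) distance: a smaller value means the vectors are closer.
function euclideanDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same dimension');
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const diff = a[i] - b[i];
    sum += diff * diff;
  }
  return Math.sqrt(sum);
}

// Cosine similarity: the cosine of the angle between the vectors.
// Close to 1 means same direction, 0 means orthogonal, -1 means opposite.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same dimension');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error('Cosine similarity is undefined for a zero vector');
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical 3-dimensional "embeddings" (real ones have hundreds of dimensions).
const queryVector = [1, 2, 3];
const sameDirection = [2, 4, 6];
const unrelated = [-3, 0, 1];

console.log(euclideanDistance(queryVector, sameDirection));
console.log(cosineSimilarity(queryVector, sameDirection));
console.log(cosineSimilarity(queryVector, unrelated));
```

Note that `sameDirection` is far from `queryVector` by distance yet maximally similar by angle, which is why the choice of metric matters.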

### Use cases for vector similarity:

- **Recommendation Systems**: If you have vectors representing user preferences or item profiles, you can quickly find items that are most similar to a user's preference vector.
- **Image Search**: Store vectors representing image features, and then retrieve images most similar to a given image's vector.
- **Textual Content Retrieval**: Store vectors representing textual content (e.g., articles, product descriptions) and find the most relevant texts for a given query vector.

## How to calculate vector similarity?

There are several ways to calculate vector similarity, but some of the most common methods include:
      console.log(`product ${product._id} added to redis`);
    }
  }
}
```

Data view in RedisInsight

:::tip
Download <u>[RedisInsight](https://redis.com/redis-enterprise/redis-insight/)</u> to view your Redis data or to play with raw Redis commands in the workbench. Learn more about <u>[RedisInsight in tutorials](/explore/redisinsight/)</u>.
:::

### Create vector index

The implementation below indexes different field types in Redis, including vector fields such as `productDescriptionEmbeddings` and `productImageEmbeddings`.

```ts title="src/redis-index.ts"
import {
  createClient,
  SchemaFieldTypes,
  VectorAlgorithms,
  RediSearchSchema,
} from 'redis';

const PRODUCTS_KEY_PREFIX = 'products';
const PRODUCTS_INDEX_KEY = 'idx:products';
const REDIS_URI = 'redis://localhost:6379';
let nodeRedisClient = null;

const getNodeRedisClient = async () => {
  if (!nodeRedisClient) {
    nodeRedisClient = createClient({ url: REDIS_URI });
    await nodeRedisClient.connect();
  }
  return nodeRedisClient;
};

const createRedisIndex = async () => {
  /*    (RAW COMMAND)
  FT.CREATE idx:products
  ON JSON
    PREFIX 1 "products:"
  SCHEMA
    "$.productDisplayName" as productDisplayName TEXT NOSTEM SORTABLE
    "$.brandName" as brandName TEXT NOSTEM SORTABLE
    "$.price" as price NUMERIC SORTABLE
    "$.masterCategory" as "masterCategory" TAG
    "$.subCategory" as subCategory TAG
    "$.productDescriptionEmbeddings" as productDescriptionEmbeddings VECTOR "FLAT" 10
            "TYPE" FLOAT32
            "DIM" 768
            "DISTANCE_METRIC" "L2"
            "INITIAL_CAP" 111
            "BLOCK_SIZE" 111
    "$.productDescription" as productDescription TEXT NOSTEM SORTABLE
    "$.imageURL" as imageURL TEXT NOSTEM
    "$.productImageEmbeddings" as productImageEmbeddings VECTOR "HNSW" 8
            "TYPE" FLOAT32
            "DIM" 1024
            "DISTANCE_METRIC" "COSINE"
            "INITIAL_CAP" 111

  */
  const nodeRedisClient = await getNodeRedisClient();

  const schema: RediSearchSchema = {
    '$.productDisplayName': {
      type: SchemaFieldTypes.TEXT,
      NOSTEM: true,
      SORTABLE: true,
      AS: 'productDisplayName',
    },
    '$.brandName': {
      type: SchemaFieldTypes.TEXT,
      NOSTEM: true,
      SORTABLE: true,
      AS: 'brandName',
    },
    '$.price': {
      type: SchemaFieldTypes.NUMERIC,
      SORTABLE: true,
      AS: 'price',
    },
    '$.masterCategory': {
      type: SchemaFieldTypes.TAG,
      AS: 'masterCategory',
    },
    '$.subCategory': {
      type: SchemaFieldTypes.TAG,
      AS: 'subCategory',
    },
    '$.productDescriptionEmbeddings': {
      type: SchemaFieldTypes.VECTOR,
      TYPE: 'FLOAT32',
      ALGORITHM: VectorAlgorithms.FLAT,
      DIM: 768,
      DISTANCE_METRIC: 'L2',
      INITIAL_CAP: 111,
      BLOCK_SIZE: 111,
      AS: 'productDescriptionEmbeddings',
    },
    '$.productDescription': {
      type: SchemaFieldTypes.TEXT,
      NOSTEM: true,
      SORTABLE: true,
      AS: 'productDescription',
    },
    '$.imageURL': {
      type: SchemaFieldTypes.TEXT,
      NOSTEM: true,
      AS: 'imageURL',
    },
    '$.productImageEmbeddings': {
      type: SchemaFieldTypes.VECTOR,
      TYPE: 'FLOAT32',
      ALGORITHM: VectorAlgorithms.HNSW, //Hierarchical Navigable Small World graphs
FLAT: When you index your vectors in a "FLAT" manner, you store them as they are, without any additional structure or hierarchy. A query against a FLAT index performs a linear scan through all the vectors to find the most similar ones. This approach is more accurate, but much slower and more compute-intensive, so it is best suited to smaller datasets.

HNSW (Hierarchical Navigable Small World): HNSW is a graph-based method for indexing high-dimensional data. For bigger datasets, comparing the query with every single vector in the index becomes slow, so the probabilistic approach of the HNSW algorithm provides very fast search results at the cost of some accuracy.
:::

## What is vector search by KNN?

KNN, or k-Nearest Neighbors, is an algorithm used in both classification and regression tasks, but "KNN search" typically refers to the task of finding the "k" points in a dataset that are closest (most similar) to a given query point. In the context of vector search, this means identifying the "k" vectors in our database that are most similar to a given query vector, usually based on a metric such as cosine similarity or Euclidean distance.

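To make the idea concrete, here is a minimal brute-force sketch of KNN search over an in-memory list, using Euclidean distance. It is only an illustration of the concept (essentially the linear scan a FLAT index performs), not how Redis implements it; the `Item` type, the `knnSearch` function, and the sample data are hypothetical.

```ts
type Item = { id: string; vector: number[] };
type Neighbor = { id: string; distance: number };

// Euclidean (L2) distance between two vectors of equal dimension.
function l2Distance(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same dimension');
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const diff = a[i] - b[i];
    sum += diff * diff;
  }
  return Math.sqrt(sum);
}

// Score every item against the query, sort by distance, keep the k closest.
function knnSearch(items: Item[], query: number[], k: number): Neighbor[] {
  return items
    .map((item) => ({ id: item.id, distance: l2Distance(item.vector, query) }))
    .sort((x, y) => x.distance - y.distance)
    .slice(0, k);
}

// Hypothetical 2-dimensional product vectors.
const items: Item[] = [
  { id: 'p1', vector: [1, 1.2] },
  { id: 'p2', vector: [5, 5] },
  { id: 'p3', vector: [0.9, 1] },
  { id: 'p4', vector: [-4, 2] },
];

const nearest = knnSearch(items, [1, 1], 2);
console.log(nearest.map((n) => n.id)); // the two ids closest to [1, 1]: p3, then p1
```

Because every item is compared with the query, the cost grows linearly with the dataset size, which is the trade-off that approximate indexes such as HNSW are designed to avoid.
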
Redis provides support for vector search, allowing you to index and then search for vectors [using the KNN approach](https://redis.io/docs/stack/search/reference/vectors/#pure-knn-queries).
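
As a hedged sketch of what a pure KNN query could look like, the helpers below build the query string and serialise the query vector as FLOAT32 bytes, matching the `TYPE` used in the index above. The helper names, the `searchBlob` parameter name, and the `score` alias are assumptions chosen for illustration; the field name is taken from the index definition in this tutorial.

```ts
// Serialise a query embedding as little-endian FLOAT32 bytes.
function toFloat32Bytes(vector: number[]): Uint8Array {
  return new Uint8Array(new Float32Array(vector).buffer);
}

// Build a pure KNN query: "*" matches all documents, then the k nearest
// neighbours of the vector passed as $paramName are selected.
function buildKnnQuery(
  vectorField: string,
  k: number,
  paramName = 'searchBlob', // assumed parameter name
  scoreAlias = 'score', // assumed alias for the returned distance
): string {
  if (!Number.isInteger(k) || k <= 0) {
    throw new Error('k must be a positive integer');
  }
  return `*=>[KNN ${k} @${vectorField} $${paramName} AS ${scoreAlias}]`;
}

const query = buildKnnQuery('productDescriptionEmbeddings', 5);
const blob = toFloat32Bytes([0.1, 0.2, 0.3]); // a real query needs 768 dimensions here

console.log(query);
console.log(blob.length); // 4 bytes per FLOAT32 value

// With a connected node-redis v4 client, these values could then be passed
// roughly as follows (untested sketch, assuming the idx:products index exists):
//
//   await client.ft.search('idx:products', query, {
//     PARAMS: { searchBlob: Buffer.from(blob) },
//     SORTBY: { BY: 'score', DIRECTION: 'ASC' },
//     DIALECT: 2,
//   });
```

Query parameters such as `$searchBlob` require query dialect 2 or later, which is why `DIALECT: 2` appears in the commented call.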