Creating Tools (fka Chains)
Datasets (Vectors & Metadata)
Datasets
Install VecDB
pip install vecdb
Login and create dataset
Python
import relevanceai as rai
rai.login()
ds = rai.Dataset(
id = "hello_world"
)
Different ways to insert data into your dataset:
Automatically vectorizes the text
field:
Python
ds.insert(
documents=[
{"_id": "doc1", "text": "tiger", "category": "land"},
{"_id": "doc2", "text": "lion", "category": "land"},
{"_id": "doc3", "text": "tigr", "category": "land"},
{"_id": "doc4", "text": "car", "category": "land"},
]
)
Now its ready for search:
Python
ds.search(text="vroom", page_size=1)
Changing models
Lets change the model to “text-embedding-ada-002”:
Python
ds.insert(
documents=[
{"_id": "doc1", "text": "tiger", "category": "land"},
{"_id": "doc2", "text": "lion", "category": "land"},
{"_id": "doc3", "text": "tigr", "category": "land"},
{"_id": "doc4", "text": "car", "category": "land"},
],
encoders=[{
"model_name":"text-embedding-ada-002",
"field" : "text"
}]
)
We support these models [‘image_text’, ‘text_image’, ‘all-mpnet-base-v2’, ‘clip-vit-b-32-image’, ‘clip-vit-b-32-text’, ‘clip-vit-l-14-image’, ‘clip-vit-l-14-text’, ‘sentence-transformers’, ‘text-embedding-ada-002’, ‘cohere-small’, ‘cohere-large’, ‘cohere-multilingual-22-12’]:
Python
ds.search(text="vroom", model="text-embedding-ada-002")
Bring your own vectors:
Python
ds.insert(
documents=[
{"_id": "doc1", "text": "tiger", "category": "land", "text_vector_" : [0.1, 0.1, 0.1]},
{"_id": "doc2", "text": "lion", "category": "land", "text_vector_" : [0.2, 0.2, 0.2]},
{"_id": "doc3", "text": "tigr", "category": "land", "text_vector_" : [0.3, 0.3, 0.3]},
{"_id": "doc4", "text": "car", "category": "land", "text_vector_" : [0.4, 0.4, 0.4]},
]
)
Its ready for search:
Python
ds.search(
vector=[0.1, 0.1, 0.1]
)
Turn the search into a step
Note this only works for the hosted embeddings. To do this for your own embeddings it’ll have to be done outside of the chain.
Python
from relevanceai.params import StringParam
query = StringParam("query")
ds.search(
text=query,
return_as_step=True
)
You can list your datasets
Python
from relevanceai import rai
rai.list_datasets()
For more complex search
Note: This search can’t be added to a chain yet. This search can do a lot of things from searching with vectors.
Python
ds.db.search(...)