I tried creating a Chroma DB from precomputed embedding values, but it's not working. I have my own embedding model deployed on a separate server: if I pass a list of texts to it, I get back a list of embedding vectors. How do I create a Chroma or FAISS DB from these precomputed embeddings? Everything I find online shows how to build the DB by handing it the embedding model for inference, which I was able to do:
from langchain_community.vectorstores import Chroma
from langchain_huggingface import HuggingFaceEmbeddings

modelPath = 'models/model_v2'
# Model configuration options: use the CPU for computations
model_kwargs = {'device': 'cpu'}
# Encoding options: do not normalize the embeddings
encode_kwargs = {'normalize_embeddings': False}
# Initialize HuggingFaceEmbeddings with the specified parameters
embeddings = HuggingFaceEmbeddings(
    model_name=modelPath,        # path to the pre-trained model
    model_kwargs=model_kwargs,   # model configuration options
    encode_kwargs=encode_kwargs  # encoding options
)
# 'documents' is a list of LangChain Document objects prepared earlier
db = Chroma.from_documents(documents, embeddings)
I was able to create the Chroma DB like that. I can also compute the embeddings myself, but then I don't know what to do next:
from sentence_transformers import SentenceTransformer
sentences = ["This is an example sentence", "Each sentence is converted"]
model = SentenceTransformer(modelPath)
embeddings = model.encode(sentences)
print(embeddings)
In the code above I have both the texts and their embeddings, so I should be able to create a Chroma DB from them. What is the manual process for adding both the text and the embedding values? Please help, I'm stuck here and can't move forward.
And please don't ask why I want this; it's a project requirement. The embedding model runs on a different server, and for any input text I receive the embedding output.