98 questions
2
votes
1
answer
2k
views
ImportError: cannot import name 'RetrievalQA' from 'langchain.chains' in Python project [duplicate]
I’m trying to use LangChain in my Python project. My current retriever.py contains:
from langchain.chains import RetrievalQA
But when I run my code, I get:
ImportError: cannot import name '...
Advice
3
votes
0
replies
94
views
How can I efficiently run a RAG pipeline fully in Node.js using OpenAI embeddings and FAISS without relying on Python?
I am working with a small Retrieval-Augmented Generation (RAG) setup and I want to run the entire pipeline purely in Node.js without using any Python-based services.
Workflow I am going to follow:
...
0
votes
0
answers
88
views
How can I persist a db for MultiVectorRetriever?
I am trying to build a RAG from PDFs where I extract the text and tables. I want to use a persistent db to store the chunks, tables, embeddings, etc. and then reload the db and use the ...
0
votes
1
answer
63
views
LangChain FAISS similarity_search returns empty list despite populated index
I’m trying to use LangChain with FAISS to build a simple document retriever. I’ve indexed several documents, but when I call similarity_search, I always get an empty list.
from langchain.embeddings....
0
votes
0
answers
89
views
How to fetch specific file following the pattern for RAG in AWS bedrock?
I have created a knowledgebase in AWS and attached an S3 datasource to it. Now I want to run queries against specific files using RAG.
When you create a datasource in AWS it creates serverless ...
0
votes
1
answer
715
views
Error raised by bedrock service: when calling the InvokeModel operation: Malformed input request
ValueError: Error raised by bedrock service: An error occurred (ValidationException) when calling the InvokeModel operation: Malformed input request: #: required key [messages] not found, please ...
0
votes
0
answers
55
views
KeyFrame detection in python
I'm building a RAG system for a platform where the primary content consists of videos and slides. My approach involves extracting keyframes from videos using OpenCV:
diff = cv2.absdiff(prev_image, ...
0
votes
0
answers
112
views
How to expand context window based on metadata of the vector-store collection
I have a working RAG code, using Langchain and Milvus. Now I'd like to add the feature to look at the metadata of each of the extracted k documents, and do the following:
find the paragraph_id of ...
0
votes
1
answer
2k
views
BM25Retriever + ChromaDB Hybrid Search Optimization using LangChain
For those who have integrated the ChromaDB client with the Langchain framework, I am proposing the following approach to implement the Hybrid search (Vector Search + BM25Retriever):
from ...
0
votes
0
answers
262
views
RAG on Mac (M3) with langchain (RetrievalQA): code runs indefinitely
I'm trying to run a RAG system on my Mac M3 Pro (18 GB RAM) using LangChain and `Llama-3.2-3B-Instruct` in a Jupyter notebook (and the vector storage is Milvus).
When I am invoking RetrievalQA....
0
votes
1
answer
1k
views
ModuleNotFoundError: No module named 'huggingface_hub.inference._types'
I am running a RAG pipeline, with LlamaIndex and quantized LLama3-8B-Instruct. I just installed these libraries:
!pip install --upgrade huggingface_hub
!pip install --upgrade peft
!pip install llama-...
1
vote
1
answer
184
views
Creating an index in PyMilvus 2.5.x does not actually index any rows
I am trying to create an index on text embeddings for a RAG system with Milvus 2.5.x as vector database in Python. I have already created the collections and populated them. My dataset size is quite ...
1
vote
0
answers
37
views
Llamaindex Bug: ToolInteractiveReflectionAgentWorker not doing corrective reflection
I tried exactly the code here line by line, but with different contents for the tool (shouldn't matter):
https://docs.llamaindex.ai/en/stable/examples/agent/introspective_agent_toxicity_reduction/
...
1
vote
0
answers
425
views
code walkthrough of chain syntax in langchain [duplicate]
I am following a RAG tutorial from: https://medium.com/@vndee.huynh/build-your-own-rag-and-run-it-locally-langchain-ollama-streamlit-181d42805895
In the tutorial there is a section that creates a ...
0
votes
1
answer
1k
views
Getting Tokens Usage Metadata from Gemini LLM calls in LangChain RAG RunnableSequence
I would like to have the token utilisation of my RAG chain each time it is invoked.
No matter what I do, I can't seem to find the right way to output the total tokens from the Gemini model I'm using.
...
6
votes
0
answers
444
views
Best Approach to Evaluate a Graph RAG Pipeline Using Metrics?
I’ve developed a Graph RAG (Retrieval-Augmented Generation) pipeline that performs reasoning over a knowledge graph. Given a user query, the pipeline retrieves relevant nodes and relationships in the ...
-1
votes
1
answer
126
views
RAG | ChromaDB retrieves the old vectors after the first attempt, not those of the new document
I am building a RAG chatbot with Gradio. For the first PDF file it gives the correct response to the question asked, but if I upload a new PDF and ask any ...
0
votes
1
answer
5k
views
Stream output using VLLM
I am working on a RAG app, where I use LLMs to analyze various documents. I'm looking to improve the UX by streaming responses in real time.
A snippet of my code:
params = SamplingParams(temperature=...
0
votes
0
answers
562
views
Unpredictable, bad performance of vector similarity search in Postgres database with `pgvector`
Problem
I have a Postgres query in the context of a Retrieval Augmented Generation (RAG) application that is wrapped in a database function which shows poor, unpredictable and varying performance. I ...
1
vote
0
answers
220
views
Unable to create a vectorstore retriever using Chroma
I am trying to implement RAG with the GPT-3.5 API. However, my code execution gets stuck while trying to create the retriever. I didn't get this issue on Google Colab but I started getting this issue ...
0
votes
1
answer
612
views
How to merge multiple (at least two) existing LlamaIndex VectorStoreIndex instances?
I'm working with LlamaIndex and have created two separate VectorStoreIndex instances, each from different documents. Now, I want to merge these two indexes into a single index. Here's my current setup:...
-3
votes
3
answers
2k
views
RAG using Langchain / Chroma - Unable to save more than 99 Records to Database
I'm using the following code to load the content of markdown files (only one file, in my case), split it into chunks and then embed and store the chunks one by one. My file is split into 801 chunks. ...
0
votes
1
answer
1k
views
DSPy: How to get the number of tokens available for the input fields?
This is a cross-post from Issue #1245 of DSPy GitHub Repo. There were no responses in the past week, and I am working on a project with a tight schedule.
When running a DSPy module with a given ...
1
vote
0
answers
652
views
Converting PDFs to Markdown for Higher Quality Embeddings with Langchain.js
I am working on RAG LLM projects with Langchain.js using Node.js. Most of the data I retrieve are PDFs and a bit of JSON.
For higher quality, I would like to convert my PDFs into Markdown before ...
1
vote
0
answers
303
views
Running entirely local RAG system in Colab over GDrive files?
I am trying to run an entirely local RAG using Colab on my google drive, without sending any tokens to an external language model API. I downloaded the model into a Drive folder (here just called path,...
1
vote
1
answer
1k
views
LlamaParse not able to parse documents inside directory
Whenever I try to use LlamaParse I get an error that states the file_input must be a file path string, file bytes, or buffer object.
parser = LlamaParse(result_type="markdown")
...
0
votes
2
answers
532
views
Huggingface library not being able to replace separators in create_documents: "AttributeError: 'dict' object has no attribute 'replace'"
I'm a beginner in the chatbot developer world and am currently building RAG code to create a context-based chatbot, but I keep getting this error. I believe it happens when the text is being split, ...
1
vote
0
answers
277
views
ChromaDB terminates Flask without exception
I'm creating an API with Flask. The other side will send me a file and I will save it to a Chroma database on my side. Chroma.add terminates my program without any exception. When I save a smaller ...
0
votes
2
answers
635
views
BedrockEmbeddings - botocore.errorfactory.ModelTimeoutException
I am trying to get vector embeddings at scale for documents.
Importing with: from langchain_community.embeddings import BedrockEmbeddings
Using embeddings = BedrockEmbeddings( ...
1
vote
1
answer
569
views
Need clarification for a custom RAG project using Mistral 7B Instruct
I am a Langchain beginner.
I am tasked with setting up an AI assistant for an app of a fake theater, let's call it SignStage, that has two Halls A and B and each play is staged twice a day in the ...
0
votes
1
answer
551
views
Tooling with Langchain Bedrock for RAG AI-Chat Generation
I have a function that takes in a Language Model, a vector store, a question, and tools, and returns a response. At the moment the tools argument is not being added because based on this example the ...
3
votes
1
answer
361
views
how to create embedding for 4bit quantized llama3 model using huggingface and langchain
I am trying to do RAG using LangChain and Hugging Face:
from langchain_huggingface import HuggingFaceEmbeddings
model_name = "unsloth/llama-3-8b-Instruct-bnb-4bit"
model_kwargs = {'device':...
1
vote
1
answer
1k
views
is there a way to filter and exclude documents when doing similarity search in a vector db using langchain?
So far my research only shows me how to filter to a specific document or page, but it doesn't show how to exclude some documents from the search.
results_with_scores = db....
0
votes
2
answers
969
views
langchain DirectoryLoader stuck when reading .md files
Trying to create embeddings from .md files but DirectoryLoader is stuck. This works for pdf files but not for .md.
I am using the below code to create a vector db in chroma, this works perfectly when ...
0
votes
1
answer
196
views
Why is this KNN vector query to a Google Spanner database taking over 30 seconds?
I've got a document database with about 6,000 records. I've successfully used vector database backends for RAG queries with these records. I'd like to move from a vector-specific database to Google'...
0
votes
1
answer
525
views
JsonOutputParser error while querying through create_retrieval_chain in langchain
I am creating a Django API that takes a PDF doc, queries it using RAG, and generates the output via an LLM. I want the output as JSON and I am using JsonOutputParser but I am ...
1
vote
1
answer
337
views
Unable to get expected results using BM25 or any search functions in Weaviate
I have created a collection in Weaviate, and ingested some documents into the Weaviate database using LlamaIndex. When I used the default search, I found that it was retrieving wrong documents the ...
2
votes
1
answer
1k
views
Using a different chain, i.e., create_retrieval_chain in custom tools due to RetrievalQA deprecation
I am using RetrievalQA to define custom tools for my RAG. According to the official documentation, RetrievalQA will be deprecated soon, and it is recommended to use other chains such as ...
0
votes
1
answer
3k
views
BadRequestError: Context length exceeded the 8192 token limit, resulting in error code 400
I am building a chat flow in Azure AI Studio. The goal is to have 3 index lookups and have the LLM compare the differences.
However, if I set top_k to 3, I get the following error as the LLM ...
1
vote
0
answers
192
views
How to add individual documents to chromadb using langchain (while still using chunks)?
I would like to be able to add and remove documents from chromadb using langchain without creating a new vectorstore every time. I understand that you can do this by referencing document ids, but how ...
1
vote
0
answers
46
views
Training or finetuning RETRO model with my own dataset using lucidrains RETRO-pytorch
While running the TrainingWrapper script of lucidrains/RETRO-pytorch in Google Colab, I get the exception: No embeddings found in folder .tmp/embeddings. The log says there's a file saved in that ...
0
votes
1
answer
266
views
Retrieving relevant documents for specific queries
I am trying to retrieve the top 5 relevant documents related to a user's query using the RAG-Token model. I'm using a custom knowledge base and I tried adjusting the retrieval parameters.
This is the ...
0
votes
2
answers
375
views
RAG with LlamaIndex SubDocument, how to persist embeddings
I'm building a RAG model with some documents.
Testing the LlamaIndex SubDocSummaryPack, it seems to be a good choice for chunking documents instead of simply chunking the original information.
After using ...
5
votes
0
answers
4k
views
Insert thousands of documents into a chroma db
I have thousands of text files that I would like to add to a Chroma DB. I noticed that when I searched a certain number of documents, the search query no longer worked properly. I can no longer get ...
1
vote
1
answer
2k
views
How to load web articles into RAG LLM for embedding
I watched this tutorial (https://youtu.be/2TJxpyO3ei4) on setting up RAG (retrieval augmented generation) using LLMs (I used a local embedding model and a local model for queries). I want to be able ...
0
votes
1
answer
710
views
Connect Chainlit to existing ChromaDb
I am trying to create a RAG application using chainlit.
This is the code I got from an existing tutorial, and it works fine. The only problem is that the user has to choose a PDF file every time. I want ...
1
vote
0
answers
89
views
Retriever using LLM - capture context data
The code below gets a response from a RAG LLM.
def response_llm(prompt, text1, text2, int1, int2):
    if len(text1)>1:
        prompt = prompt + "\n text1: " + text1
    if ...
0
votes
0
answers
540
views
"unstructured" and langchain's "HTMLHeaderTextSplitter" ignores "pre" and/or "code" HTML tags
I want to read a webpage and split it into chunks to feed a vector database in a RAG pipeline. This webpage has Python code examples on it, but I cannot create chunks with that code text, it is ...
-2
votes
1
answer
159
views
Issues with LLM Retrieving Passwords from Provided Passages
I'm using a language model (LLM) and providing it with a passage that contains the password for a specific website. Later, I'm asking the LLM to retrieve the password from the passage, similar to a ...
1
vote
0
answers
1k
views
BM25 + PgVector Dense retriever doesn't give expected accuracy in hybrid searching
I am building a prototype for fetching the relevant documents for an input question (should search based on keywords and context). For this, I have the data frames of vector embeddings (all-mpnet-...