Newest 'retrieval-augmented-generation' Questions

2 votes

1 answer

2k views

ImportError: cannot import name 'RetrievalQA' from 'langchain.chains' in Python project [duplicate]

I’m trying to use LangChain in my Python project. My current retriever.py contains: from langchain.chains import RetrievalQA But when I run my code, I get: ImportError: cannot import name '...

Anilss

13

asked Nov 9 at 11:59

Advice

3 votes

0 replies

94 views

How can I efficiently run a RAG pipeline fully in Node.js using OpenAI embeddings and FAISS without relying on Python?

I am working with a small Retrieval-Augmented Generation (RAG) setup and I want to run the entire pipeline purely in Node.js without using any Python-based services. Workflow I am going to follow : ...

Gnaneshwar P

167

asked Oct 29 at 11:00

0 votes

0 answers

88 views

How can I persist a db for MultiVectorRetriever?

I am trying to build a RAG from PDFs where I extract the text and tables. I want to use a persistent db in order to store the chunks, tables, embeddings, etc. and then reload the db and use the ...

AndCh

339

asked Aug 25 at 8:13

0 votes

1 answer

63 views

LangChain FAISS similarity_search returns empty list despite populated index

I’m trying to use LangChain with FAISS to build a simple document retriever. I’ve indexed several documents, but when I call similarity_search, I always get an empty list. from langchain.embeddings....

Tony Guo

1

asked Jul 22 at 20:19

0 votes

0 answers

89 views

How to fetch specific file following the pattern for RAG in AWS bedrock?

I have created a knowledge base in AWS and attached an S3 data source to it. Now I want to query specific files using RAG. When you create a data source in AWS it creates serverless ...

Makarand

636

asked Apr 14 at 3:52

0 votes

1 answer

715 views

Error raised by bedrock service: when calling the InvokeModel operation: Malformed input request

ValueError: Error raised by bedrock service: An error occurred (ValidationException) when calling the InvokeModel operation: Malformed input request: #: required key [messages] not found, please ...

DIVYANSH TRIVEDI

1

asked Mar 30 at 18:53

0 votes

0 answers

55 views

KeyFrame detection in python

I'm building a RAG system for a platform where the primary content consists of videos and slides. My approach involves extracting keyframes from videos using OpenCV diff = cv2.absdiff(prev_image, ...

Daniel

13

asked Mar 24 at 15:40

0 votes

0 answers

112 views

How to expand context window based on metadata of the vector-store collection

I have a working RAG code, using Langchain and Milvus. Now I'd like to add the feature to look at the metadata of each of the extracted k documents, and do the following: find the paragraph_id of ...

ArieAI

512

asked Mar 2 at 11:30

0 votes

1 answer

2k views

BM25Retriever + ChromaDB Hybrid Search Optimization using LangChain

For those who have integrated the ChromaDB client with the Langchain framework, I am proposing the following approach to implement the Hybrid search (Vector Search + BM25Retriever): from ...

Diallo Francis Patrick

177

asked Mar 1 at 14:31

0 votes

0 answers

262 views

RAG on Mac (M3) with langchain (RetrievalQA): code runs indefinitely

I'm trying to run a RAG system on my Mac M3 Pro (18 GB RAM) using langchain and `Llama-3.2-3B-Instruct` in a Jupyter notebook (and the vector storage is Milvus). When I am invoking RetrievalQA....

ArieAI

512

asked Jan 13 at 10:32

0 votes

1 answer

1k views

ModuleNotFoundError: No module named 'huggingface_hub.inference._types'

I am running a RAG pipeline, with LlamaIndex and quantized LLama3-8B-Instruct. I just installed these libraries: !pip install --upgrade huggingface_hub !pip install --upgrade peft !pip install llama-...

Hoang Cuong Nguyen

451

asked Dec 21, 2024 at 4:34

1 vote

1 answer

184 views

Creating an index in PyMilvus 2.5.x does not actually index any rows

I am trying to create an index on text embeddings for a RAG system with Milvus 2.5.x as vector database in Python. I have already created the collections and populated them. My dataset size is quite ...

Liqs

197

asked Dec 17, 2024 at 14:10

1 vote

0 answers

37 views

Llamaindex Bug: ToolInteractiveReflectionAgentWorker not doing corrective reflection

I followed the code here line by line, but with different contents for the tool (which shouldn't matter): https://docs.llamaindex.ai/en/stable/examples/agent/introspective_agent_toxicity_reduction/ ...

Burny

11

asked Oct 19, 2024 at 10:57

1 vote

0 answers

425 views

code walkthrough of chain syntax in langchain [duplicate]

I am following a RAG tutorial from: https://medium.com/@vndee.huynh/build-your-own-rag-and-run-it-locally-langchain-ollama-streamlit-181d42805895 In the tutorial there is a section that creates a ...

Null Salad

1,070

asked Oct 17, 2024 at 20:28

0 votes

1 answer

1k views

Getting Tokens Usage Metadata from Gemini LLM calls in LangChain RAG RunnableSequence

I would like to have the token utilisation of my RAG chain each time it is invoked. No matter what I do, I can't seem to find the right way to output the total tokens from the Gemini model I'm using. ...

Matheus Torquato

1,669

asked Sep 30, 2024 at 15:04

6 votes

0 answers

444 views

Best Approach to Evaluate a Graph RAG Pipeline Using Metrics?

I’ve developed a Graph RAG (Retrieval-Augmented Generation) pipeline that performs reasoning over a knowledge graph. Given a user query, the pipeline retrieves relevant nodes and relationships in the ...

LLM_Enthusiast

87

asked Aug 17, 2024 at 3:44

-1 votes

1 answer

126 views

RAG | chromadb is retrieving the old vectors after first attempt, not the new document

I am building a RAG chatbot with Gradio. The issue is that for the first PDF file it gives the correct response to the question asked about that file, but if I upload a new PDF and ask any ...

Husnain Izhar

1

asked Aug 2, 2024 at 13:11

0 votes

1 answer

5k views

Stream output using VLLM

I am working on a RAG app, where I use LLMs to analyze various documents. I'm looking to improve the UX by streaming responses in real time. a snippet of my code: params = SamplingParams(temperature=...

Cihan Yalçın

53

asked Jul 31, 2024 at 9:54

0 votes

0 answers

562 views

Unpredictable, bad performance of vector similarity search in Postgres database with `pgvector`

Problem I have a Postgres query in the context of a Retrieval Augmented Generation (RAG) application that is wrapped in a database function which shows poor, unpredictable and varying performance. I ...

1awuesterose

577

asked Jul 25, 2024 at 14:20

1 vote

0 answers

220 views

Unable to create a vectorstore retriever using Chroma

I am trying to implement RAG with the GPT-3.5 API. However, my code execution gets stuck while trying to create the retriever. I didn't get this issue on Google Colab but I started getting this issue ...

S R

31

asked Jul 25, 2024 at 12:06

0 votes

1 answer

612 views

How to merge multiple (at least two) existing LlamaIndex VectorStoreIndex instances?

I'm working with LlamaIndex and have created two separate VectorStoreIndex instances, each from different documents. Now, I want to merge these two indexes into a single index. Here's my current setup:...

林抿均

53

asked Jul 16, 2024 at 12:07

-3 votes

3 answers

2k views

RAG using Langchain / Chroma - Unable to save more than 99 Records to Database

I'm using the following code to load the content of markdown files (only one file, in my case), split it into chunks and then embed and store the chunks one by one. My file is split into 801 chunks. ...

hassaanq

7

asked Jul 15, 2024 at 7:06

0 votes

1 answer

1k views

DSPy: How to get the number of tokens available for the input fields?

This is a cross-post from Issue #1245 of DSPy GitHub Repo. There were no responses in the past week, and I am working on a project with a tight schedule. When running a DSPy module with a given ...

Tom Lin

110

asked Jul 13, 2024 at 8:25

1 vote

0 answers

652 views

Converting PDFs to Markdown for Higher Quality Embeddings with Langchain.js

I am working on RAG LLM projects with Langchain.js using Node.js. Most of the data I retrieve are PDFs and a bit of JSON. For higher quality, I would like to convert my PDFs into Markdown before ...

Uiyoung Kim

11

asked Jul 8, 2024 at 13:12

1 vote

0 answers

303 views

Running entirely local RAG system in Colab over GDrive files?

I am trying to run an entirely local RAG using Colab on my google drive, without sending any tokens to an external language model API. I downloaded the model into a Drive folder (here just called path,...

Groovatys_rainbow

11

asked Jul 7, 2024 at 20:34

1 vote

1 answer

1k views

LlamaParse not able to parse documents inside directory

Whenever I try to use LlamaParse I get an error that states the file_input must be a file path string, file bytes, or buffer object. parser = LlamaParse(result_type="markdown") ...

verstandskies

11

asked Jul 4, 2024 at 13:46

0 votes

2 answers

532 views

Huggingface library not being able to replace separators in create_documents: "AttributeError: 'dict' object has no attribute 'replace'"

I'm a beginner in the chatbot developer world and am currently building RAG code to create a context-based chatbot, but I keep getting this error. I believe it happens when the text is being split, ...

user25991121

1

asked Jul 4, 2024 at 0:39

1 vote

0 answers

277 views

ChromaDB terminates Flask without exception

I'm creating an API with Flask. The other side will send me a file and I will save it to a Chroma database on my side. Chroma.add terminates my program without any exception. When I save a smaller ...

StaEx_G

13

asked Jul 3, 2024 at 2:45

0 votes

2 answers

635 views

BedrockEmbeddings - botocore.errorfactory.ModelTimeoutException

I am trying to get vector embeddings at scale for documents. I import the package with from langchain_community.embeddings import BedrockEmbeddings. Using embeddings = BedrockEmbeddings( ...

Benny

7,238

asked Jun 30, 2024 at 16:45

1 vote

1 answer

569 views

Need clarification for a custom RAG project using Mistral 7B Instruct

I am a Langchain beginner. I am tasked with setting up an AI assistant for an app of a fake theater, let's call it SignStage, that has two Halls A and B and each play is staged twice a day in the ...

NIKOMAHOS

21

asked Jun 20, 2024 at 8:30

0 votes

1 answer

551 views

Tooling with Langchain Bedrock for RAG AI-Chat Generation

I have a function that takes in a Language Model, a vector store, a question and tools, and returns a response. At the moment the tools argument is not being added because based on this example the ...

DaviesTobi alex

670

asked Jun 17, 2024 at 21:39

3 votes

1 answer

361 views

how to create embedding for 4bit quantized llama3 model using huggingface and langchain

I am trying to do RAG using langchain and huggingface, from langchain_huggingface import HuggingFaceEmbeddings model_name = "unsloth/llama-3-8b-Instruct-bnb-4bit" model_kwargs = {'device':...

brian chow

31

asked Jun 17, 2024 at 19:51

1 vote

1 answer

1k views

is there a way to filter and exclude documents when doing similarity search in a vector db using langchain?

So far my research only shows me how to filter to a specific document or page but it doesn't show how to exclude some documents from the search. results_with_scores = db....

JosephNgugiMuiruri

105

asked Jun 17, 2024 at 7:24

0 votes

2 answers

969 views

langchain DirectoryLoader stuck when reading .md files

Trying to create embeddings from .md files but DirectoryLoader is stuck. This works for pdf files but not for .md. I am using the below code to create a vector db in chroma, this works perfectly when ...

ranguy

21

asked Jun 16, 2024 at 12:17

0 votes

1 answer

196 views

Why is this KNN vector query to a Google Spanner database taking over 30 seconds?

I've got a document database with about 6,000 records. I've successfully used vector database backends for RAG queries with these records. I'd like to move from a vector-specific database to Google'...

Paul Vincent Craven

2,066

asked Jun 14, 2024 at 20:45

0 votes

1 answer

525 views

JsonOutputParser error while querying through create_retrieval_chain in langchain

I am creating a Django API that takes a PDF document and, using RAG, runs a query against the document and generates the output via an LLM. I want the output as JSON and I am using JsonOutputParser but I am ...

Aliasgar Taksali

1

asked Jun 14, 2024 at 13:56

1 vote

1 answer

337 views

Unable to get expected results using BM25 or any search functions in Weaviate

I have created a collection in Weaviate, and ingested some documents into the Weaviate database using LlamaIndex. When I used the default search, I found that it was retrieving wrong documents the ...

SoftwearEnginear

13

asked Jun 13, 2024 at 8:22

2 votes

1 answer

1k views

Using a different chain, i.e., create_retrieval_chain in custom tools due to RetrievalQA deprecation

I am using RetrievalQA to define custom tools for my RAG. According to the official documentation, RetrievalQA will be deprecated soon, and it is recommended to use other chains such as ...

Skyward

21

asked Jun 11, 2024 at 14:51

0 votes

1 answer

3k views

BadRequestError: Context length exceeded the 8192 token limit, resulting in error code 400

I am building a chat flow in Azure AI studio. The goal is to have 3 index lookups and have the LLM compare the differences. However, if I set top_k to 3, I get the following error as the LLM ...

GKecheng

13

asked Jun 10, 2024 at 11:30

1 vote

0 answers

192 views

How to add individual documents to chromadb using langchain (while still using chunks)?

I would like to be able to add and remove documents from chromadb using langchain without creating a new vectorstore every time. I understand that you can do this by referencing document ids, but how ...

Cody Kletter

11

asked Jun 9, 2024 at 6:34

1 vote

0 answers

46 views

Training or finetuning RETRO model with my own dataset using lucidrains RETRO-pytorch

While running the TrainingWrapper script of lucidrains/RETRO-pytorch in Google Colab, I get the exception: No embeddings found in folder .tmp/embeddings. The log says there's a file saved in that ...

Zahin Mohammad

11

asked Jun 9, 2024 at 4:18

0 votes

1 answer

266 views

Retrieving relevant documents for specific queries

I am trying to retrieve the top 5 relevant documents related to a user's query using the RAG-Token model. I'm using a custom knowledge base and I tried adjusting the retrieval parameters. This is the ...

Rhett

1

asked Jun 9, 2024 at 0:15

0 votes

2 answers

375 views

RAG with LlamaIndex SubDocument, how to persist embeddings

I'm building a RAG model with some documents. While testing it, LlamaIndex's SubDocSummaryPack seems to be a good choice for document chunking instead of simply chunking the original information. After using ...

Diego

11

asked Jun 7, 2024 at 16:56

5 votes

0 answers

4k views

Insert thousands of documents into a chroma db

I have thousands of text files that I would like to add to a Chroma DB. I noticed that when I searched a certain number of documents, the search query no longer worked properly. I can no longer get ...

ogre

89

asked Jun 7, 2024 at 4:39

1 vote

1 answer

2k views

How to load web articles into RAG LLM for embedding

I watched this tutorial (https://youtu.be/2TJxpyO3ei4) on setting up RAG (retrieval augmented generation) using LLMs (I used a local embedding model and a local model for queries). I want to be able ...

Nero

111

asked Jun 6, 2024 at 9:36

0 votes

1 answer

710 views

Connect Chainlit to existing ChromaDb

I am trying to create a RAG application using chainlit. This is the code I got from an existing tutorial, which is working fine. The only problem is that the user has to choose a pdf file every time. I want ...

raju

7,004

asked Jun 2, 2024 at 7:43

1 vote

0 answers

89 views

Retriever using LLM - capture context data

This is the code shown below for getting response from RAG LLM. def response_llm(prompt, text1, text2, int1, int2): if len(text1)>1: prompt = prompt + "\n text1: " + text1 if ...

Vallalar_dev

47

asked May 31, 2024 at 10:19

0 votes

0 answers

540 views

"unstructured" and langchain's "HTMLHeaderTextSplitter" ignores "pre" and/or "code" HTML tags

I want to read a webpage and split it into chunks to feed a vector database in a RAG pipeline. This webpage has python code examples on it, but I cannot create chunks with that code text, it is ...

Abraham Martín Expósito

29

asked May 29, 2024 at 11:02

-2 votes

1 answer

159 views

Issues with LLM Retrieving Passwords from Provided Passages

I'm using a language model (LLM) and providing it with a passage that contains the password for a specific website. Later, I'm asking the LLM to retrieve the password from the passage, similar to a ...

Sanjay Mythili

1

asked May 23, 2024 at 5:54

1 vote

0 answers

1k views

BM25 + PgVector Dense retriever doesn't give expected accuracy in hybrid searching

I am building a prototype for fetching the relevant documents for an input question (should search based on keywords and context). For this, I have the data frames of vector embeddings (all-mpnet-...

Bhavya

65

asked May 21, 2024 at 7:47
