4 votes
2 answers
507 views

I'm trying to install the LLaMA 3.1 8B model by following the instructions in the llama-models GitHub README. When I run the command: llama model download --source meta --model-id CHOSEN_MODEL_ID (...
alwayssaewoo
0 votes
0 answers
49 views

I am doing some tests using Ollama on a local computer, with Llama 3.2, which consist of prompting a task against a document. I read that after having reached the maximum context, I should restart the ...
user305883 • 1,739
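A minimal sketch for the question above, assuming the ollama Python client: instead of restarting when the context fills up, a larger context window can be requested per call via the num_ctx option (the model name and window size here are illustrative).

```python
import ollama

# Rather than restarting after the context is exhausted, ask Ollama for a
# larger context window on the call itself (default is often quite small).
response = ollama.chat(
    model="llama3.2",
    messages=[{"role": "user", "content": "Summarize this document: ..."}],
    options={"num_ctx": 8192},  # tokens of context for this request
)
print(response["message"]["content"])
```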
0 votes
0 answers
46 views

I'm trying to extract API integration parameters like Authorization headers, query params, and request body fields from API documentation. This is essentially a custom NER task. I’ve experimented with ...
Rukhma • 1
0 votes
1 answer
139 views

I am using llama stack (https://llama-stack.readthedocs.io/en/latest/) with Ollama as the model provider. At first I used tool calling from models directly downloaded from Ollama. ...
andrealorenzetti
0 votes
0 answers
99 views

I'm using a locally hosted model (llama3.2) with Ollama and trying to replicate functionality similar to bind_tools (to create and run tools with the LLM) for tool calling. This is my model service ...
Ahmad Ali
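A hedged sketch of one way to get bind_tools-like behavior without LangChain: the ollama Python client accepts a `tools` list of JSON-schema function definitions. The tool name and schema below are made up for illustration.

```python
import ollama

def get_weather(city: str) -> str:
    """Dummy tool implementation for the sketch."""
    return f"Sunny in {city}"

response = ollama.chat(
    model="llama3.2",
    messages=[{"role": "user", "content": "What is the weather in Paris?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
)

# Dispatch any tool calls the model emitted; arguments arrive as a dict.
for call in response["message"].get("tool_calls") or []:
    if call["function"]["name"] == "get_weather":
        print(get_weather(**call["function"]["arguments"]))
```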
1 vote
0 answers
235 views

I'm following codes from links: https://github.com/jalr4ever/Tiny-OAI-MCP-Agent/blob/main/mcp_client.py https://github.com/philschmid/mcp-openai-gemini-llama-example/blob/master/...
Akshay Kulkarni
0 votes
1 answer
122 views

So I'm trying to toss together a little demo that is essentially: 1) generate some text live and save to a file (I've got this working), 2) have a local instance of an LLM running (Llama3 in this case)...
PoGaMi • 133
0 votes
0 answers
565 views

I am teaching myself LLM programming by developing a RAG application. I am running Llama 3.2 on my laptop using Ollama, and using a mix of SQLite and langchain. I can pass a context to the llm along ...
punkish • 15.6k
0 votes
0 answers
28 views

I am learning to fine-tune Llama3.1 on a custom dataset. I have converted my dataset to a Hugging Face dataset. Evaluating directly with the model gives an accuracy of 80%. Now when I am trying to fine ...
Jagatha Pugazhendhi
0 votes
0 answers
330 views

I'm extracting Inputs, Outputs, and Summaries from large legacy codebases (COBOL, RPG), but facing repetition issues, especially when generating bullet points. Summaries work fine, but sections like ...
Saurav Srivastava
0 votes
1 answer
129 views

I am communicating with ollama (llama3.1b) and have it respond with a tool call that I can resolve. However, I am struggling with the final call to ollama that would resolve the original question. I ...
Michaela.Merz
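A sketch of the usual round trip for the question above, assuming the ollama Python client: after resolving the tool call, append both the assistant turn and a "tool" turn with the result, then call chat again. The `add` tool is invented for illustration.

```python
import ollama

add_tool = {
    "type": "function",
    "function": {
        "name": "add",
        "description": "Add two integers",
        "parameters": {
            "type": "object",
            "properties": {"a": {"type": "integer"}, "b": {"type": "integer"}},
            "required": ["a", "b"],
        },
    },
}

messages = [{"role": "user", "content": "What is 2 + 2?"}]

# First call: the model replies with a tool call rather than text.
first = ollama.chat(model="llama3.1", messages=messages, tools=[add_tool])

# Keep the assistant turn, then append a "tool" turn carrying the result.
messages.append(first["message"])
for call in first["message"].get("tool_calls") or []:
    args = call["function"]["arguments"]
    messages.append({"role": "tool", "content": str(args["a"] + args["b"])})

# Final call: the model folds the tool result into a normal answer.
final = ollama.chat(model="llama3.1", messages=messages)
print(final["message"]["content"])
```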
1 vote
1 answer
436 views

On a Windows 11 machine, I am trying to get a JSON response from the llama3 model on my local Ollama installation in a Jupyter notebook, but it does not work. Steps I tried: the snippet below works ...
Pri • 11
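A minimal sketch for the JSON question above, assuming the ollama Python client: passing format="json" constrains the output to valid JSON, and it helps to ask for JSON in the prompt as well.

```python
import json
import ollama

response = ollama.chat(
    model="llama3",
    messages=[{
        "role": "user",
        "content": "List three llama facts. Respond in JSON with a 'facts' array.",
    }],
    format="json",  # forces the reply to be a valid JSON string
)
facts = json.loads(response["message"]["content"])
print(facts)
```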
0 votes
1 answer
225 views

I am trying to make Llama3 Instruct use function calls from tools. It does work, but now it answers only with function calls! If I ask something like "who are you?" or "what is an Apple device?" it ...
Kodr.F • 14.5k
1 vote
0 answers
3k views

I'm integrating the Groq API in my Flask application to classify social media posts using a model based on DeepSeek r1 (e.g., deepseek-r1-distill-llama-70b). I build a prompt by combining multiple ...
Towsif Ahamed Labib
0 votes
0 answers
135 views

I have a collection of news articles and I want to produce some new (unbiased) news articles using meta-llama/Meta-Llama-3-8B-Instruct. The articles are in a huggingface Dataset and to feed the ...
Xhulio Xhelilai
0 votes
1 answer
131 views

The end-point implementation is like so: @app.post("/api/chat/{question}", dependencies=[Depends(sessionValidator)]) async def chat(question: str = Path(...), my_chatbot=Depends(...
Sun Bee • 1,840
0 votes
2 answers
2k views

I have deployed Llama 3.1 70B and Llama 3.1 8B on my system and it works perfectly for the 8B model. When I tested it for 70B, it underutilized the GPU and took a lot of time to respond. Here are the ...
JAMSHAID • 1,375
1 vote
0 answers
334 views

I am using the LLaMA 3.1 70B Instruct model via AWS Bedrock with LangChain for agent-based function calling. While testing, I observed the following issues: Inconsistent Tool Calling: The model often ...
Nilesh Malode
0 votes
1 answer
364 views

I'm integrating a SQL agent with LangChain in a Node.js application using the AWS Bedrock model (us.meta.llama3-2-1b-instruct-v1:0) for natural language to SQL conversion. Database: PostgreSQL ...
Nibin • 3,960
0 votes
1 answer
313 views

I'm working on a project that uses llama_index to retrieve document information in Jupyter Notebook, but I'm experiencing very slow query response times (around 15 minutes per query). I'm using the ...
Kavinila
0 votes
2 answers
1k views

I'm testing a local GPT with Ollama running on a Flask server. I've developed an interface to chat using the Llama3.2 model. I've managed to create the chat history, and the chatbot answers according to ...
P. Frau • 71
8 votes
2 answers
2k views

I am following along a LangChain tutorial for LangGraph. They are using OpenAI models in the tutorial. However, I want to use my local Ollama models. I am using Llama 3.2 as that supports tool ...
Neha • 179
0 votes
2 answers
957 views

When I run ollama run llama3.2 after it is installed, this error shows up: llama runner process has terminated: exit status 0xc0000135. Can anyone tell me the issue? Using ollama I installed llama3.2 ...
Arhan • 3
0 votes
1 answer
2k views

HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': 'meta-llama/llama3.1/8b-instruct-fp16'. Use repo_type argument if needed. tokenizer = AutoTokenizer....
James Brittain
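A note on the error above: 'meta-llama/llama3.1/8b-instruct-fp16' looks like an Ollama-style tag, but Hugging Face repo ids have at most one slash. A sketch, assuming the intended model is the Llama 3.1 8B Instruct repo on the Hub:

```python
from transformers import AutoTokenizer

# HF repo ids are "namespace/repo_name" — one slash only.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")
```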
0 votes
0 answers
217 views

I have been trying to Dockerize Ollama and consequently load the Llama3.1 model into the Google Cloud Run deployment. While Ollama is running as expected in Cloud Run, the model is not loaded as ...
wayne • 1
0 votes
1 answer
376 views

I'm working on a chatbot application using Amazon Bedrock with the Llama 3 model. I'm using Streamlit for the frontend and LangChain for managing the conversation. However, I'm encountering an issue ...
rahul raj
0 votes
1 answer
83 views

I am facing an issue with my Django project which runs inside a Docker container along with Redis and Celery. I am using Redis and Celery to manage queues and other processes. The problem is that my ...
Mehmet Yıldırım
2 votes
1 answer
5k views

I'm currently running the Llama 3.1:8B model using the Ollama Docker container. My context window has the following structure: Bot Personality, Bot Directives, Conversation (an array of messages). I ...
Claus • 5,762
2 votes
0 answers
93 views

I have a custom llama3:8b model which I have created using a model file with specific instructions. I need steps/resources to do the following which I could not find: Deploy this llama model on cloud ...
Toji • 250
2 votes
1 answer
3k views

I installed the Llama 3.1 8B model through Meta's GitHub page, but I can't get their example code to work. I'm running the following code in the same directory as the Meta-Llama-3.1-8B folder: import ...
MatthewScarpino
1 vote
0 answers
631 views

I'm encountering a RuntimeError while trying to load a state_dict for LlamaForCausalLM. The error message indicates a size mismatch: RuntimeError: Error(s) in loading state_dict for LlamaForCausalLM: ...
DigiSpocDeera
1 vote
0 answers
50 views

You can also use the Llama3 model on SageMaker JumpStart as below: from sagemaker.jumpstart.model import JumpStartModel model = JumpStartModel(model_id = "meta-textgeneration-llama-3-70b-instruct&...
celsofranssa
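A hedged completion of the truncated snippet above, assuming the standard JumpStart deploy/predict flow (Llama models on JumpStart require accepting the EULA at deploy time):

```python
from sagemaker.jumpstart.model import JumpStartModel

model = JumpStartModel(model_id="meta-textgeneration-llama-3-70b-instruct")
# Gated Meta models need explicit EULA acceptance when deploying.
predictor = model.deploy(accept_eula=True)

response = predictor.predict({
    "inputs": "What is the capital of France?",
    "parameters": {"max_new_tokens": 64},
})
print(response)
```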
0 votes
1 answer
360 views

I am currently trying to do cross-validation with the Llama-3 LLM in Google Colab, and I am facing the issue that the GPU memory runs out well before I am able to finish my experiments. My code is ...
Silvia A
0 votes
0 answers
908 views

I am working on a knowledge graph and all connections to the neo4j browser succeed (using Neo4j Desktop on Windows, not Docker-deployed). However, with llama3 I am running the same notebooks as in property ...
Kcndze • 29
0 votes
1 answer
604 views

I'm trying to create a service using the llama3-70b model by combining langchain and llama-cpp-python on a server workstation. While the model works well with short prompts (question1, question2), it ...
bibiibibin
2 votes
1 answer
3k views

I'm fine-tuning llama3 using unsloth. I trained my model and saved it successfully, but when I tried loading it using AutoPeftModelForCausalLM.from_pretrained and then used TextStreamer from transformers ...
Sarra Ben Messaoud
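A sketch of the load-and-stream flow named in the question above, assuming a saved LoRA checkpoint; "outputs/lora_model" is a placeholder path, not the asker's actual directory.

```python
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer, TextStreamer

# Loads the LoRA adapters together with the base model in one call.
model = AutoPeftModelForCausalLM.from_pretrained("outputs/lora_model", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("outputs/lora_model")

inputs = tokenizer("Tell me about llamas.", return_tensors="pt").to(model.device)
# TextStreamer prints tokens to stdout as they are generated.
streamer = TextStreamer(tokenizer, skip_prompt=True)
model.generate(**inputs, streamer=streamer, max_new_tokens=128)
```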
0 votes
1 answer
611 views

I'm working with LlamaIndex and have created two separate VectorStoreIndex instances, each from different documents. Now, I want to merge these two indexes into a single index. Here's my current setup:...
林抿均
1 vote
0 answers
391 views

I've tried loading Huggingface transformers models to MPS in two different ways: llm = AutoModelForCausalLM.from_pretrained( "meta-llama/Meta-Llama-3-8B-Instruct", torch_dtype=torch....
Owen D • 85
1 vote
1 answer
5k views

I want to set my eos_token_id and pad_token_id. I googled a lot, and most suggestions are to use e.g. tokenizer.pad_token_id (like from here https://huggingface.co/meta-llama/Meta-Llama-3-8B/discussions/...
yts61 • 1,669
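A minimal sketch of the common workaround for the question above: Llama 3 ships without a pad token, so reuse the eos token as pad and mirror the id onto the model config.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")
model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")

# Llama 3 has no dedicated pad token; reuse eos and keep configs in sync.
tokenizer.pad_token = tokenizer.eos_token
model.config.pad_token_id = tokenizer.pad_token_id
model.generation_config.pad_token_id = tokenizer.pad_token_id
```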
0 votes
0 answers
194 views

I'm trying to load an 8-bit quantized version of llama3 from llama.cpp on my local laptop (Linux), but the process is getting killed because it exceeds memory. Is there any way around this? I've already ...
Anagha • 1
0 votes
1 answer
523 views

I managed to run the Llama server with the following command: ./llama-server -m models/7B/ggml-model.gguf -c 2048 My request looks like this: time curl --request POST --url http://localhost:8080/...
didinko • 572
0 votes
1 answer
297 views

I'm trying to get llama3-70b to find all sequences that match a given list. The list contains multiple terms (which range from one word to twelve words). I want the model to match all terms in a given ...
joshpopelka20
0 votes
0 answers
846 views

I am new to LLMs. I created a local RAG using LlamaIndex with llama3 to load our documents, and I am using ChromaDb to persist the embeddings. I am not clear on how to specify a specific embedding ...
tigger tigger
2 votes
0 answers
3k views

I encountered an error when downloading a model from Hugging Face. It was working on Google Colab, but not on my Windows machine. I am using Python 3.10.0. The error code is shown below: E:\...
Aswin Jimmy
1 vote
0 answers
50 views

When I'm trying to use the Llama3-8B tuning guide from https://pytorch.org/torchtune/0.1/tutorials/llama3.html, it gives me this error: W0608 08:41:38.766000 10904 torch\distributed\elastic\multiprocessing\...
graph User
0 votes
1 answer
5k views

I am now trying to fine-tune a llama3 model. I am using unsloth: from unsloth import FastLanguageModel. Then I load the Llama3 model: model, tokenizer = FastLanguageModel.from_pretrained( model_name = &...
yts61 • 1,669
0 votes
1 answer
200 views

I am testing llama3 using the simple code below: import ollama message = "What is football" # connect to Llama3 model try: response_stream = ollama.chat( model="llama3&...
Nived Puthumana Meleppattu
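A hedged completion of the truncated snippet above, assuming the intent was a streaming chat call: with stream=True, ollama.chat yields chunks whose text lives in chunk["message"]["content"].

```python
import ollama

message = "What is football"

# connect to Llama3 model and stream the reply token by token
try:
    response_stream = ollama.chat(
        model="llama3",
        messages=[{"role": "user", "content": message}],
        stream=True,
    )
    for chunk in response_stream:
        print(chunk["message"]["content"], end="", flush=True)
except Exception as e:
    print(f"Request failed: {e}")
```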
0 votes
1 answer
522 views

I wanted to use llama-index locally with ollama and llama3:8b to index a UTF-8 JSON file. I don't have a GPU. I use uncharted to convert docs into JSON. Now, if it is not possible to use llama-index ...
Asif Rahman
1 vote
1 answer
429 views

I'm trying to fine-tune the llama3 model with torchtune. These are the steps I've already done: 1. pip install torch 2. pip install torchtune 3. tune download meta-llama/Meta-Llama-3-8B --output-dir ...
Ahad Porkar • 1,718