I can install llama-cpp-python with cuBLAS support using pip as follows:
CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python
However, I don't know how to install it with cuBLAS support when using Poetry. The installation itself succeeds, but cuBLAS acceleration is not available afterwards.
I have confirmed that cuBLAS works in my environment when the package is installed with pip.
I added the llama-cpp-python dependency to my pyproject.toml file as follows:
[tool.poetry.dependencies]
python = ">=3.10, <3.13"
...
llama-cpp-python = "^0.2.13"
...
I tried:
CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 poetry install
and:
export CMAKE_ARGS="-DLLAMA_CUBLAS=on"
export FORCE_CMAKE=1
poetry install
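One possible explanation, which is an assumption and not something confirmed above, is that Poetry reuses an already built CPU-only wheel instead of compiling the package again with the CMake flags. A minimal workaround sketch under that assumption is to call pip inside the Poetry-managed virtualenv and force a rebuild without the cache. The function name `reinstall_llama_cublas` is hypothetical, and the pinned version 0.2.13 is only the lower bound of the constraint in pyproject.toml; the version actually recorded in poetry.lock may differ.

```shell
# Sketch only: rebuild llama-cpp-python inside the Poetry virtualenv.
# reinstall_llama_cublas is a hypothetical helper name.
# The pinned version is an assumption; match it to the one in poetry.lock.
reinstall_llama_cublas() {
  # Same build flags as the working pip command, passed to pip via "poetry run".
  # --force-reinstall replaces the existing install (and its dependencies);
  # --no-cache-dir stops pip from reusing a previously built wheel.
  CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 \
    poetry run pip install --force-reinstall --no-cache-dir \
    "llama-cpp-python==0.2.13"
}
```

Calling `reinstall_llama_cublas` from the project directory after `poetry install` would replace the package in that environment. Whether this actually enables cuBLAS depends on the cause being a cached build, which has not been verified here.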