Reference:
https://zhuanlan.zhihu.com/p/638427280
Model download:
https://huggingface.co/nyanko7/LLaMA-7B/tree/main
After downloading, create a 7B directory under llama.cpp-master\models\ and place the model files there (the path must match the one passed to convert.py below).
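A minimal sketch of the expected layout, run from the llama.cpp-master root. The filenames are the standard LLaMA-7B checkpoint set from the linked Hugging Face repo; adjust if your download differs:

```shell
# run from the llama.cpp-master root
mkdir -p models/7B
# place consolidated.00.pth and params.json in models/7B/,
# and tokenizer.model in models/ (convert.py looks for it one level up)
```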
1. Convert the 7B model to ggml FP16 format
convert.py is in the llama.cpp-master root directory:
python3 convert.py models/7B/
2. Quantize the model to 4 bits (using the q4_0 method)
quantize.exe is under llama.cpp-master\build\bin\Release; after quantization the model shrinks from about 13 GB to under 4 GB.
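Step 2 names the binary but not the invocation. A minimal sketch of the quantize call, assuming the FP16 file produced by step 1 is named ggml-model-f16.bin (the default in llama.cpp builds of this era; newer builds use .gguf and rename the tool llama-quantize):

```shell
# Windows, from the llama.cpp-master root; output filename is an assumption
build\bin\Release\quantize.exe models\7B\ggml-model-f16.bin models\7B\ggml-model-q4_0.bin q4_0
```

On Linux/macOS the equivalent is the ./quantize binary under build/bin with the same three arguments: input file, output file, quantization method.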