前言#

技术要点：Kaggle+SakuraLLM+Ngrok
项目地址

部署参考1
部署参考2
部署参考3

1.登录并注册Kaggle#

官网 (绑定手机号 2024.6.26可以绑定)
如果绑定不上 (大概率是人机验证的问题)

2.登录并注册Ngrok#

Ngrok官网获取到token

顺便创建个免费域名

3.在Kaggle部署sakuraLLM#

先创建一个note

将GPU服务器选项打开，验证手机号

然后复制代码到你的note中↓↓↓

# 改了下tgw-ngrok-api的代码，搞了可以自动下载并选择模型，api那边还使用了ngrok的静态域名（Domains）。
# ngrok得token，需要填写
TOKEN = "     <<<<这里填写你的ngroktoken>>>>     "

# ngrok的静态域名（Domains）。在ngrok上面可以免费开一个静态域名
DOMAIN = "     <<<<这里填写ngroktoken静态域名>>>>    "

# 模型名称 一开始会自动下载这个模型，基本上就是huggingface上的项目全名
MODEL = "SakuraLLM/Sakura-14B-Qwen2beta-v0.9.2-GGUF"

# 模型里面的某个文件具体名称，有的话，就只会下载这个模型文件
# SPECIFIC_FILE = "sakura-13b-lnovel-v0.9b-Q8_0.gguf"
SPECIFIC_FILE = "sakura-14b-qwen2beta-v0.9.2-q4km.gguf"

# 启动时默认使用的模型名称 一般时指model名称，或者specific_file名称。
DEFAULT_MODEL = "sakura-14b-qwen2beta-v0.9.2-q4km.gguf"

# 运行tgw的参数，可以自己改动
RUN_CONTENT = "--api --trust-remote-code --model " + DEFAULT_MODEL + " --n-gpu-layers 256 --tensorcores"

%mkdir /kaggle/working
%cd /kaggle/working
!git clone https://github.com/oobabooga/text-generation-webui.git
%cd text-generation-webui
!git checkout e98d1086f53dc9c9baa4d17f08b9660244b67d0f
%cd ..
!conda create -n textgen python=3.11 -y
import os
import shutil

!source /opt/conda/bin/activate textgen && pip install requests
!source /opt/conda/bin/activate textgen && pip install tqdm

!source /opt/conda/bin/activate textgen && python /kaggle/working/text-generation-webui/download-model.py $MODEL --specific-file $SPECIFIC_FILE
def move_files(source_dir, destination_dir):
    for root, dirs, files in os.walk(source_dir):
        for file in files:
            source_path = os.path.join(root, file)
            destination_path = os.path.join(destination_dir, os.path.relpath(source_path, source_dir))
            shutil.move(source_path, destination_path)

move_files("/kaggle/working/models", "/kaggle/working/text-generation-webui/models")
%ls /kaggle/working/text-generation-webui/models/

# 输出模型文件夹内容 假如默认加载模型失败，看看是不是这里输出的内容和默认模型名称是否不同
print("输出模型文件夹内容")
for filename in os.listdir('/kaggle/working/text-generation-webui/models'):
    print(filename)
!source /opt/conda/bin/activate textgen && pip3 install torch==2.1.* torchvision==0.16.* torchaudio==2.1.* --index-url https://download.pytorch.org/whl/cu121
!source /opt/conda/bin/activate textgen && conda install -y -c "nvidia/label/cuda-12.1.1" cuda-runtime
%cd /kaggle/working/
%cd text-generation-webui
!source /opt/conda/bin/activate textgen && pip install -r requirements.txt
!source /opt/conda/bin/activate textgen && pip install -r extensions/openai/requirements.txt
!source /opt/conda/bin/activate textgen && pip install -r extensions/coqui_tts/requirements.txt
!source /opt/conda/bin/activate textgen && pip install -r extensions/whisper_stt/requirements.txt
import os

os.chdir("/kaggle/working/text-generation-webui/")

print("**********RUN_CONTENT:" + RUN_CONTENT)
with open("CMD_FLAGS.txt", "w") as file:
    file.write(RUN_CONTENT)

print("文件修改完成。")
port1 = 7860
port2 = 5000
HOME_FOLDER = "/kaggle/working/"

!pip install pyngrok==6.1.0
from pyngrok import ngrok, conf
import gc

gc.collect()

try:
    ngrok.set_auth_token(TOKEN)
    ngrok.kill()
    public_url1 = ngrok.connect(port1).public_url
    # domain 是ngrok送的固定域名 给api用了
    public_url2 = ngrok.connect(port2, domain=DOMAIN).public_url
    print(f"**********************后台访问网址: {public_url1}")
    print(f"**********************API访问网址: {public_url2}")

    # Replace this line with the command to start the text-generation-webui
    !source /opt/conda/bin/activate textgen && python server.py $RUN_CONTENT
    
    
    print(f"**********************后台访问网址: {public_url1}")
    print(f"**********************API访问网址: {public_url2}")

except Exception as e:
    print(f"Error starting ngrok tunnel: {e}")

然后选择显卡和打开网络

然后点击启动

然后是10分钟左右的漫长等待…

部署成功！！！

4.测试sakuraLLM模型#

测试网站
点击”添加Sakura翻译器”，链接输入之前提到的翻译器链接，然后点击添加。
添加完成后，点击测试。如果一切顺利，将弹窗显示测试样例的翻译结果。