Set Up Your Own Llama 3 8B Server in 10 Minutes

Tags: Llama3, Ollama

Published 2024-04-24

Recommended image: ollama:llama3

Recommended machine type: c8_m31_1 * NVIDIA T4

Llama 3

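Meta describes Llama 3 as the most capable openly available large language model to date.

Meta Llama 3 is a family of state-of-the-art models developed by Meta, available in 8B and 70B parameter sizes, each in a pre-trained and an instruction-tuned variant.

The instruction-tuned Llama 3 models are fine-tuned and optimized for dialogue/chat use cases, and they outperform many existing open-source chat models on common benchmarks.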
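To see why the 8B size is a comfortable fit for a single 16 GB NVIDIA T4, it helps to estimate the size of the weights alone. The sketch below is back-of-the-envelope arithmetic, assuming a nominal 8e9 parameters and ignoring the KV cache, activations, and runtime overhead. The bit widths are illustrative and say nothing about how the recommended image stores the model; `weight_memory_gb` is a hypothetical helper name.

```python
def weight_memory_gb(n_params, bits_per_weight):
    """Approximate size of the model weights alone, in GB (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Nominal 8B parameters at a few illustrative precisions
for bits in (16, 8, 4):
    print(f"8B @ {bits}-bit: {weight_memory_gb(8e9, bits):.1f} GB")
# 8B @ 16-bit: 16.0 GB
# 8B @ 8-bit: 8.0 GB
# 8B @ 4-bit: 4.0 GB
```

At 16 bits per weight the weights alone would already fill the card, so a quantized build is what leaves headroom for the rest of the inference state.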


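Ollama is a local inference framework and client that can deploy large language models such as Llama 3, Phi 3, Mistral, and Gemma with a single command.

Go to https://ollama.com and follow the instructions there to install Ollama. The install command in the next cell is commented out; uncomment it if your environment does not already provide Ollama.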


# !curl -fsSL https://ollama.com/install.sh | sh
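Before moving on, it can be worth confirming that the `ollama` binary is actually reachable from the notebook. A minimal sketch using only the standard library; `ollama_installed` is a hypothetical helper name, and the check only looks for an executable called `ollama` on `PATH`.

```python
import shutil


def ollama_installed():
    """Return True if an executable named `ollama` is found on PATH."""
    return shutil.which("ollama") is not None


print("ollama on PATH:", ollama_installed())
```

If this prints `False`, run the install command above first.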


Start the Ollama server in the background; its log is written to nohup.out.


%%bash

nohup ollama serve > nohup.out 2>&1 &
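Because the server is started in the background, the next cell can run before it accepts connections. The following is a minimal sketch of a readiness check, assuming the default address `http://127.0.0.1:11434` and that a plain GET on the root path answers with HTTP 200 once the server is up. The helper names `server_is_up` and `wait_for_server` are illustrative and not part of Ollama.

```python
import time
import urllib.error
import urllib.request


def server_is_up(base_url="http://127.0.0.1:11434"):
    """Return True if a GET on the server root answers with status 200."""
    try:
        with urllib.request.urlopen(base_url, timeout=2) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, etc.: the server is not ready yet
        return False


def wait_for_server(probe=server_is_up, timeout_s=30.0, interval_s=0.5):
    """Poll `probe` until it returns True or `timeout_s` seconds elapse."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if probe():
            return True
        time.sleep(interval_s)
    return False
```

Calling `wait_for_server()` in the cell after `ollama serve` returns `True` as soon as the server responds, or `False` after 30 seconds, in which case nohup.out is the place to look.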


Run the script below to start chatting with Llama 3 🤩


import json

import requests

# NOTE: ollama must be running for this to work, start the ollama app or run `ollama serve`
model = "llama3"  # TODO: update this for whatever model you wish to use


def chat(messages):
    # Send the whole conversation; `ollama serve` listens on 127.0.0.1:11434 by default
    r = requests.post(
        "http://127.0.0.1:11434/api/chat",
        json={"model": model, "messages": messages, "stream": True},
        stream=True,  # read the HTTP response incrementally instead of buffering it
    )
    r.raise_for_status()
    output = ""
    message = {"role": "assistant"}  # fallback if no content chunk arrives

    for line in r.iter_lines():
        if not line:
            continue  # skip keep-alive blank lines
        body = json.loads(line)
        if "error" in body:
            raise Exception(body["error"])
        if body.get("done") is False:
            message = body.get("message", {})
            content = message.get("content", "")
            output += content
            # the response streams one token at a time, print that as we receive it
            print(content, end="", flush=True)

        if body.get("done", False):
            break

    # return the full reply so it can be appended to the conversation history
    message["content"] = output
    return message


def main():
    messages = []

    while True:
        user_input = input("Enter a prompt: ")
        if not user_input:
            break  # an empty prompt ends the session
        print()
        messages.append({"role": "user", "content": user_input})
        message = chat(messages)
        messages.append(message)
        print("\n\n")


if __name__ == "__main__":
    main()

Enter a prompt:

😊 Hello! I'm Chatbot, welcome to this chat space! Is there anything you'd like to talk about? 🤔

Enter a prompt:
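The streaming loop in the script can also be written as a pure function over the response lines, which makes it easy to test without a running server. The sketch below assumes each line is a JSON object carrying a `message` and a `done` flag, as the script does; `assemble_stream` and the hand-written `sample` chunks are hypothetical and only mimic that shape.

```python
import json


def assemble_stream(lines):
    """Join the content of streamed chat chunks into one assistant message."""
    parts = []
    role = "assistant"
    for line in lines:
        if not line:  # skip keep-alive blank lines
            continue
        body = json.loads(line)
        if "error" in body:
            raise RuntimeError(body["error"])
        message = body.get("message") or {}
        role = message.get("role", role)
        parts.append(message.get("content", ""))
        if body.get("done"):
            break
    return {"role": role, "content": "".join(parts)}


# Hand-written chunks in the assumed shape, not captured from a real server
sample = [
    b'{"message": {"role": "assistant", "content": "Hel"}, "done": false}',
    b'{"message": {"role": "assistant", "content": "lo"}, "done": false}',
    b'{"message": {"role": "assistant", "content": ""}, "done": true}',
]
print(assemble_stream(sample))
# {'role': 'assistant', 'content': 'Hello'}
```

In the script, the same function could be fed `r.iter_lines()` directly when printing tokens as they arrive is not needed.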
