GPT4All: A free-to-use, locally running, privacy-aware chatbot built on C++-based GPT models, with a UI and no GPU required

Date: 2024-10-23 08:34:43

 

Contents

GPT4All: github.com/nomic-ai/gpt4all

GPT4All Chat UI

Quick Test

Prompt: How to write a multiple Thread code In Java ?

Output:

How GPT4All Works

Running LLMs on CPU

Sideloading any ggML model

GPT4All: Currently Supported Free, Open-Source Large Models

Performance Benchmarks

Plugins

LocalDocs Beta Plugin (Chat With Your Data)

Enabling LocalDocs

LocalDocs Capabilities

How LocalDocs Works

Troubleshooting and FAQ

LocalDocs Roadmap

Server Mode


GPT4All: github.com/nomic-ai/gpt4all

A free-to-use, locally running, privacy-aware chatbot. No GPU or internet required.


GPT4All Chat UI

 

The GPT4All Chat Client lets you easily interact with any local large language model.

It is optimized to run 7-13B parameter LLMs on the CPUs of any computer running OSX/Windows/Linux.

Quick Test

Prompt: How to write a multiple Thread code In Java ?

Output:

To write multithreaded code in Java, you can use the `Thread` class. Here's an example of how to create two threads that print numbers from 1 to 10:

public class MultipleThreads {
    public static void main(String[] args) throws InterruptedException {
        // Each thread prints the numbers 1 through 10.
        Thread thread1 = new Thread(() -> {
            for (int i = 1; i <= 10; i++) {
                System.out.println("Thread 1: " + i);
            }
        });
        Thread thread2 = new Thread(() -> {
            for (int i = 1; i <= 10; i++) {
                System.out.println("Thread 2: " + i);
            }
        });
        thread1.start();
        thread2.start();
        ThreadUtils.join(thread1, thread2);  // wait for both threads to finish
    }
}

class ThreadUtils {
    public static void join(Thread t1, Thread t2) throws InterruptedException {
        if (t1 != null && t2 != null) {
            try {
                t1.join();
                t2.join();
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }
}

In this example, we create two threads using the `Thread` class and its constructor. We then start both threads using the `start()` method. Finally, we use the `join()` method from the `ThreadUtils` class to wait for both threads to finish before the program exits.

How GPT4All Works

GPT4All is an ecosystem for training and deploying powerful, customized large language models that run locally on consumer-grade CPUs.

The goal is simple: be the best instruction-tuned, assistant-style language model that any person or enterprise can freely use, distribute, and build on.

A GPT4All model is a 3GB - 8GB file that you can download and plug into the GPT4All open-source ecosystem software. Nomic AI supports and maintains this software ecosystem to enforce quality and security, alongside spearheading the effort to allow any person or enterprise to easily train and deploy their own on-edge large language models.
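
For example, such a model file can also be loaded outside the chat UI with the official gpt4all Python bindings (pip install gpt4all). A minimal sketch follows; the model name here is an assumption and should be replaced with whichever file you actually downloaded:

# Minimal sketch using the gpt4all Python bindings.
# The model name is an assumption; substitute any GPT4All model you have downloaded.
# max_tokens is the parameter name in recent releases of the bindings.
from gpt4all import GPT4All

model = GPT4All("ggml-gpt4all-j-v1.3-groovy")  # loads the model file, fetching it if absent
print(model.generate("Who is Michael Jordan?", max_tokens=50))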

Running LLMs on CPU

The GPT4All Chat UI supports models from all newer versions of ggML, including the LLaMA, MPT, and GPT-J architectures. The falcon and replit architectures will also be supported soon.

GPT4All maintains an official list of recommended models. You can submit a pull request to add new models, and if accepted they will show up in the official download dialog.

Sideloading any ggML model

If a model is compatible with the gpt4all-backend, you can sideload it into GPT4All Chat by:

  1. Downloading your model in ggML format. It should be a 3-8 GB file similar to the ones here.
  2. Identifying your GPT4All Chat downloads folder. This is the path listed at the bottom of the download dialog.
  3. Prefixing your downloaded model file with the string ggml- and placing it into the GPT4All Chat downloads folder (a sketch of this step follows the list).
  4. Restarting your chat app. Your model should appear in the download dialog.
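
As a sketch of steps 2 and 3, here is what the rename-and-move looks like in Python; both paths below are hypothetical examples, not the actual locations on your machine:

# Sketch of the sideloading rename-and-move step.
# Both paths are hypothetical; use the downloads path shown in your download dialog.
from pathlib import Path
import shutil

downloaded = Path("~/Downloads/my-model-q4_0.bin").expanduser()  # hypothetical model file
chat_downloads = Path("~/Library/Application Support/nomic.ai/GPT4All").expanduser()  # hypothetical downloads folder

target = chat_downloads / ("ggml-" + downloaded.name)  # prefix the file name with "ggml-"
shutil.move(str(downloaded), str(target))
print("Sideloaded to", target, "- restart GPT4All Chat to see it in the download dialog.")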

GPT4All: Currently Supported Free, Open-Source Large Models

This page lists the currently supported model instances (which can be downloaded locally, loaded into CPU memory, and run for inference):


| Model | BoolQ | PIQA | HellaSwag | WinoGrande | ARC-e | ARC-c | OBQA | Avg |
|---|---|---|---|---|---|---|---|---|
| GPT4All-J 6B v1.0 | 73.4 | 74.8 | 63.4 | 64.7 | 54.9 | 36 | 40.2 | 58.2 |
| GPT4All-J v1.1-breezy | 74 | 75.1 | 63.2 | 63.6 | 55.4 | 34.9 | 38.4 | 57.8 |
| GPT4All-J v1.2-jazzy | 74.8 | 74.9 | 63.6 | 63.8 | 56.6 | 35.3 | 41 | 58.6 |
| GPT4All-J v1.3-groovy | 73.6 | 74.3 | 63.8 | 63.5 | 57.7 | 35 | 38.8 | 58.1 |
| GPT4All-J Lora 6B | 68.6 | 75.8 | 66.2 | 63.5 | 56.4 | 35.7 | 40.2 | 58.1 |
| GPT4All LLaMa Lora 7B | 73.1 | 77.6 | 72.1 | 67.8 | 51.1 | 40.4 | 40.2 | 60.3 |
| GPT4All 13B snoozy | 83.3 | 79.2 | 75 | 71.3 | 60.9 | 44.2 | 43.4 | 65.3 |
| GPT4All Falcon | 77.6 | 79.8 | 74.9 | 70.1 | 67.9 | 43.4 | 42.6 | 65.2 |
| Nous-Hermes | 79.5 | 78.9 | 80 | 71.9 | 74.2 | 50.9 | 46.4 | 68.8 |
| Dolly 6B | 68.8 | 77.3 | 67.6 | 63.9 | 62.9 | 38.7 | 41.2 | 60.1 |
| Dolly 12B | 56.7 | 75.4 | 71 | 62.2 | 64.6 | 38.5 | 40.4 | 58.4 |
| Alpaca 7B | 73.9 | 77.2 | 73.9 | 66.1 | 59.8 | 43.3 | 43.4 | 62.5 |
| Alpaca Lora 7B | 74.3 | 79.3 | 74 | 68.8 | 56.6 | 43.9 | 42.6 | 62.8 |
| GPT-J 6.7B | 65.4 | 76.2 | 66.2 | 64.1 | 62.2 | 36.6 | 38.2 | 58.4 |
| LLama 7B | 73.1 | 77.4 | 73 | 66.9 | 52.5 | 41.4 | 42.4 | 61.0 |
| LLama 13B | 68.5 | 79.1 | 76.2 | 70.1 | 60 | 44.6 | 42.2 | 63.0 |
| Pythia 6.7B | 63.5 | 76.3 | 64 | 61.1 | 61.3 | 35.2 | 37.2 | 56.9 |
| Pythia 12B | 67.7 | 76.6 | 67.3 | 63.8 | 63.9 | 34.8 | 38 | 58.9 |
| Fastchat T5 | 81.5 | 64.6 | 46.3 | 61.8 | 49.3 | 33.3 | 39.4 | 53.7 |
| Fastchat Vicuña 7B | 76.6 | 77.2 | 70.7 | 67.3 | 53.5 | 41.2 | 40.8 | 61.0 |
| Fastchat Vicuña 13B | 81.5 | 76.8 | 73.3 | 66.7 | 57.4 | 42.7 | 43.6 | 63.1 |
| StableVicuña RLHF | 82.3 | 78.6 | 74.1 | 70.9 | 61 | 43.5 | 44.4 | 65.0 |
| StableLM Tuned | 62.5 | 71.2 | 53.6 | 54.8 | 52.4 | 31.1 | 33.4 | 51.3 |
| StableLM Base | 60.1 | 67.4 | 41.2 | 50.1 | 44.9 | 27 | 32 | 46.1 |
| Koala 13B | 76.5 | 77.9 | 72.6 | 68.8 | 54.3 | 41 | 42.8 | 62.0 |
| Open Assistant Pythia 12B | 67.9 | 78 | 68.1 | 65 | 64.2 | 40.4 | 43.2 | 61.0 |
| Mosaic MPT7B | 74.8 | 79.3 | 76.3 | 68.6 | 70 | 42.2 | 42.6 | 64.8 |
| Mosaic mpt-instruct | 74.3 | 80.4 | 77.2 | 67.8 | 72.2 | 44.6 | 43 | 65.6 |
| Mosaic mpt-chat | 77.1 | 78.2 | 74.5 | 67.5 | 69.4 | 43.3 | 44.2 | 64.9 |
| Wizard 7B | 78.4 | 77.2 | 69.9 | 66.5 | 56.8 | 40.5 | 42.6 | 61.7 |
| Wizard 7B Uncensored | 77.7 | 74.2 | 68 | 65.2 | 53.5 | 38.7 | 41.6 | 59.8 |
| Wizard 13B Uncensored | 78.4 | 75.5 | 72.1 | 69.5 | 57.5 | 40.4 | 44 | 62.5 |
| GPT4-x-Vicuna-13b | 81.3 | 75 | 75.2 | 65 | 58.7 | 43.9 | 43.6 | 63.2 |
| Falcon 7b | 73.6 | 80.7 | 76.3 | 67.3 | 71 | 43.3 | 44.4 | 65.2 |
| Falcon 7b instruct | 70.9 | 78.6 | 69.8 | 66.7 | 67.9 | 42.7 | 41.2 | 62.5 |
| text-davinci-003 | 88.1 | 83.8 | 83.4 | 75.8 | 83.9 | 63.9 | 51 | 75.7 |

Performance Benchmarks

See also: BLOOM vs. WizardLM: Which LLM is Better? (Sapling)

Plugins

GPT4All Chat plugins allow you to expand the capabilities of local LLMs.

LocalDocs Beta Plugin (Chat With Your Data)

LocalDocs is a GPT4All plugin that allows you to chat with your local files and data. It lets you utilize powerful local LLMs to chat with private data without any data leaving your computer or server. When using LocalDocs, your LLM will cite the sources that most likely contributed to a given output. Note that even an LLM equipped with LocalDocs can hallucinate. If the LocalDocs plugin decides to utilize your documents to help answer a prompt, you will see references appear below the response.

GPT4All-Snoozy with LocalDocs. Try GPT4All-Groovy for a faster experience!

Enabling LocalDocs
  1. Install the latest version of GPT4All Chat from the GPT4All website.
  2. Go to the Settings > LocalDocs tab.
  3. Configure a collection (folder) on your computer that contains the files your LLM should have access to. You can alter the contents of the folder/directory at any time. As you add more files to your collection, your LLM will dynamically be able to access them.
  4. Spin up a chat session with any LLM (including external ones like ChatGPT, but be warned: your data will then leave your machine!).
  5. At the top right, click the database icon and select which collection you want your LLM to know about during your chat session.
LocalDocs Capabilities

LocalDocs allows your LLM to have context about the contents of your documentation collection. Not all prompts/questions will utilize your document collection for context. If LocalDocs was used in your LLM's response, you will see references to the document snippets that LocalDocs used.

LocalDocs can:

  • Query your documents based upon your prompt/question. If your documents contain answers that may help answer your question/prompt, LocalDocs will try to utilize snippets of your documents to provide context.

LocalDocs cannot:

  • Answer general metadata queries (e.g., What documents do you know about?, Tell me about my documents).
  • Summarize a single document (e.g., Summarize my magna carta PDF).

See the Troubleshooting section for common issues.

How LocalDocs Works

LocalDocs works by maintaining an index of all data in the directory your collection is linked to. This index consists of small chunks of each document that the LLM can receive as additional input when you ask it a question. The general technique this plugin uses is called Retrieval Augmented Generation.

These document chunks help your LLM respond to queries with knowledge about the contents of your data. The number of chunks and the size of each chunk can be configured in the LocalDocs plugin settings tab. For indexing speed, LocalDocs uses pre-deep-learning n-gram and TF-IDF based retrieval when deciding which document chunks your LLM should use as context. You'll find it's of comparable quality to embedding-based retrieval approaches but orders of magnitude faster at ingesting data.
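
As an illustrative sketch of this kind of TF-IDF chunk retrieval (not GPT4All's actual implementation; the chunk size, n-gram range, and scoring here are assumptions), consider:

# Illustrative TF-IDF chunk retrieval for Retrieval Augmented Generation.
# This is NOT GPT4All's code; chunk size and n-gram range are assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def chunk(text, size=500):
    # Split a document into fixed-size character chunks.
    return [text[i:i + size] for i in range(0, len(text), size)]

documents = ["...contents of file A...", "...contents of file B..."]  # placeholder data
chunks = [c for doc in documents for c in chunk(doc)]

vectorizer = TfidfVectorizer(ngram_range=(1, 2))  # word unigrams and bigrams
index = vectorizer.fit_transform(chunks)          # the persistent chunk index

def retrieve(question, k=3):
    # Return the k chunks most similar to the question; these would be
    # prepended to the prompt as extra context for the LLM.
    scores = cosine_similarity(vectorizer.transform([question]), index)[0]
    return [chunks[i] for i in scores.argsort()[::-1][:k]]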

LocalDocs supports the following file types:

 
 
  1. ["txt", "doc", "docx", "pdf", "rtf", "odt", "html", "htm", "xls", "xlsx", "csv", "ods", "ppt", "pptx", "odp", "xml", "json", "log", "md", "tex", "asc", "wks",
  2. "wpd", "wps", "wri", "xhtml", "xht", "xslt", "yaml", "yml", "dtd", "sgml", "tsv", "strings", "resx",
  3. "plist", "properties", "ini", "config", "bat", "sh", "ps1", "cmd", "awk", "sed", "vbs", "ics", "mht",
  4. "mhtml", "epub", "djvu", "azw", "azw3", "mobi", "fb2", "prc", "lit", "lrf", "tcr", "pdb", "oxps",
  5. "xps", "pages", "numbers", "key", "keynote", "abw", "zabw", "123", "wk1", "wk3", "wk4", "wk5", "wq1",
  6. "wq2", "xlw", "xlr", "dif", "slk", "sylk", "wb1", "wb2", "wb3", "qpw", "wdb", "wks", "wku", "wr1",
  7. "wrk", "xlk", "xlt", "xltm", "xltx", "xlsm", "xla", "xlam", "xll", "xld", "xlv", "xlw", "xlc", "xlm",
  8. "xlt", "xln"]

Troubleshooting and FAQ

My LocalDocs plugin isn't using my documents

  • Make sure LocalDocs is enabled for your chat session (the DB icon on the top-right should have a border).
  • Try to modify your prompt to be more specific and use terminology that is in your document. This will increase the likelihood that LocalDocs matches document snippets for your question.
  • If your document collection is large, wait 1-2 minutes for it to finish indexing.
LocalDocs Roadmap
  • Embedding-based semantic search for retrieval.
  • A customized model fine-tuned with retrieval in the loop.
  • Plugin compatibility with chat client server mode.

Server Mode

GPT4All Chat comes with a built-in server mode allowing you to programmatically interact with any supported local LLM through a very familiar HTTP API. You can find the API documentation here.

Enabling server mode in the chat client will spin up an HTTP server on localhost port 4891 (the reverse of 1984). You can enable the web server via GPT4All Chat > Settings > Enable web server.

Begin using local LLMs in your AI-powered apps by changing a single line of code: the base path for requests.

 
 
import openai

openai.api_base = "http://localhost:4891/v1"
# openai.api_base = "https://api.openai.com/v1"  # the hosted OpenAI endpoint
openai.api_key = "not needed for a local LLM"

# Set up the prompt and other parameters for the API request
prompt = "Who is Michael Jordan?"

# model = "gpt-3.5-turbo"
# model = "mpt-7b-chat"
model = "gpt4all-j-v1.3-groovy"

# Make the API request
response = openai.Completion.create(
    model=model,
    prompt=prompt,
    max_tokens=50,
    temperature=0.28,
    top_p=0.95,
    n=1,
    echo=True,
    stream=False,
)

# Print the generated completion
print(response)

which gives the following response:

 
 
{
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "logprobs": null,
            "text": "Who is Michael Jordan?\nMichael Jordan is a former professional basketball player who played for the Chicago Bulls in the NBA. He was born on December 30, 1963, and retired from playing basketball in 1998."
        }
    ],
    "created": 1684260896,
    "id": "foobarbaz",
    "model": "gpt4all-j-v1.3-groovy",
    "object": "text_completion",
    "usage": {
        "completion_tokens": 35,
        "prompt_tokens": 39,
        "total_tokens": 74
    }
}
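
To pull just the generated text out of this response in the Python example above, index into the choices array:

# Extract the completion text from the response object shown above.
print(response["choices"][0]["text"])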
