How to set Ollama environment variables on Windows

Date: 2025-02-22 22:36:12

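Ollama is a convenient tool for demoing large language models, but sometimes we need to change its configuration (for example, how long a model stays resident on the GPU). First, run: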

ollama serve -h

The help output lists the environment variables that can be set:

Environment Variables:
      OLLAMA_DEBUG               Show additional debug information (e.g. OLLAMA_DEBUG=1)
      OLLAMA_HOST                IP Address for the ollama server (default 127.0.0.1:11434)
      OLLAMA_KEEP_ALIVE          The duration that models stay loaded in memory (default "5m")
      OLLAMA_MAX_LOADED_MODELS   Maximum number of loaded models (default 1)
      OLLAMA_MAX_QUEUE           Maximum number of queued requests
      OLLAMA_MODELS              The path to the models directory
      OLLAMA_NUM_PARALLEL        Maximum number of parallel requests (default 1)
      OLLAMA_NOPRUNE             Do not prune model blobs on startup
      OLLAMA_ORIGINS             A comma separated list of allowed origins
      OLLAMA_TMPDIR              Location for temporary files

To change how long a model stays loaded, set OLLAMA_KEEP_ALIVE. What unit does this variable use? See this page: /ollama/ollama/blob/main/docs/#how-do-i-keep-a-model-loaded-in-memory-or-make-it-unload-immediately

According to that page, the duration can be given in several ways:

  • a duration string (such as "10m" or "24h")
  • a number in seconds (such as 3600)
  • any negative number which will keep the model loaded in memory (e.g. -1 or "-1m")
  • '0' which will unload the model immediately after generating a response
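
To make the four cases above concrete, here is a minimal Python sketch of how such a value could be interpreted. It is not Ollama's own parser: the function name keep_alive_seconds is hypothetical, and it assumes only the h, m and s units, which covers the examples in the list.

```python
import re
from typing import Optional, Union

# Assumption: only hours, minutes and seconds are handled in this sketch.
_UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0}
_NUMBER = re.compile(r"[+-]?\d+(?:\.\d+)?")
_PART = re.compile(r"(\d+(?:\.\d+)?)([hms])")


def keep_alive_seconds(value: Union[str, int, float]) -> Optional[float]:
    """Interpret a keep-alive value (hypothetical helper, not Ollama's code).

    Returns the number of seconds the model stays loaded,
    0.0 for "unload immediately", or None for "keep loaded indefinitely".
    """
    text = str(value).strip()
    if _NUMBER.fullmatch(text):
        # A bare number is a count of seconds, e.g. 3600.
        seconds = float(text)
    else:
        # Otherwise expect a duration string such as "10m", "24h" or "1h30m".
        sign = -1.0 if text.startswith("-") else 1.0
        body = text[1:] if text[:1] in ("+", "-") else text
        parts = _PART.findall(body)
        # Reject input with anything left over between or around the parts.
        if not parts or "".join(n + u for n, u in parts) != body:
            raise ValueError(f"unrecognized keep-alive value: {value!r}")
        seconds = sign * sum(float(n) * _UNIT_SECONDS[u] for n, u in parts)
    # Any negative value means the model is never unloaded.
    return None if seconds < 0 else seconds


if __name__ == "__main__":
    for example in ("10m", "24h", 3600, -1, "-1m", "0"):
        print(example, "->", keep_alive_seconds(example))
```

A result of None stands for "never unload", and 0.0 stands for "unload right after the response".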

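For example, in the Windows environment variables we can set OLLAMA_KEEP_ALIVE to 1h and OLLAMA_NUM_PARALLEL to 2. This allows two concurrent requests and keeps the model loaded for one hour (ollama ps will then show 59 minutes). That is all for this short note.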
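
Instead of clicking through the system settings dialog, the same two values could be stored from a script. The sketch below is one possible approach, not part of Ollama: it assumes the standard Windows setx command is available, and the helper names setx_commands and apply_settings are made up for illustration. setx writes user-level variables that are only seen by processes started afterwards.

```python
import subprocess
import sys
from typing import Dict, List, Union


def setx_commands(settings: Dict[str, Union[str, int]]) -> List[List[str]]:
    """Build one `setx NAME VALUE` command per variable (hypothetical helper)."""
    return [["setx", name, str(value)] for name, value in settings.items()]


def apply_settings(settings: Dict[str, Union[str, int]]) -> None:
    """Run the commands; setx stores the values for the current user."""
    for command in setx_commands(settings):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # The values used in the example above.
    desired = {"OLLAMA_KEEP_ALIVE": "1h", "OLLAMA_NUM_PARALLEL": 2}
    if sys.platform == "win32":
        apply_settings(desired)
    else:
        # setx only exists on Windows, so elsewhere just show the commands.
        for command in setx_commands(desired):
            print(" ".join(command))
```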

One more note: I found that on Windows the system has to be restarted before the environment variable above actually takes effect.
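
A process inherits its environment variables when it starts, so one rough way to check whether a change has arrived is to open a new terminal and see which values a freshly started process gets. The following Python sketch does only that; the helper name visible_settings is hypothetical, and it says nothing about what an already running Ollama server sees.

```python
import os
from typing import Dict, Mapping, Optional, Sequence

# The two variables changed in this note.
OLLAMA_VARS = ("OLLAMA_KEEP_ALIVE", "OLLAMA_NUM_PARALLEL")


def visible_settings(
    environ: Optional[Mapping[str, str]] = None,
    names: Sequence[str] = OLLAMA_VARS,
) -> Dict[str, Optional[str]]:
    """Return the values this process sees, or None for unset variables."""
    env = os.environ if environ is None else environ
    return {name: env.get(name) for name in names}


if __name__ == "__main__":
    for name, value in visible_settings().items():
        print(f"{name} = {value if value is not None else '(not set)'}")
```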