IT-AI-Ollama

From wiki.samerhijazi.net
Revision as of 12:35, 7 May 2026 by Samerhijazi

Settings

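# These variables are read by the Ollama server, so set them in the environment that starts `ollama serve`.
export OLLAMA_FLASH_ATTENTION=true
export OLLAMA_KV_CACHE_TYPE=q8_0      # KV cache precision: f16 for 7B–13B models, q8_0 for 34B and 70B models; a quantized cache requires flash attention
export OLLAMA_CONTEXT_LENGTH=32768    # context window in tokens; use 65536 for large codebase work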
export OLLAMA_NUM_PARALLEL=1
export OLLAMA_MAX_LOADED_MODELS=1
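export OLLAMA_NUM_THREAD=10           # may have no effect: the thread count is normally set with the num_thread model parameter
export OLLAMA_ORIGINS="*"             # accepts requests from any origin; restrict this on shared or exposed hosts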
ollama run gemma4:31b "what is a black hole"
ollama run deepseek-coder-v2:16b "what is a black hole"
ollama run qwen3-coder:30b-a3b-q4_K_M "what is a black hole"
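
The cache type and context length above trade memory for quality and window size. A rough way to see the effect is to estimate the KV cache size as 2 (keys and values) x layers x KV heads x head dimension x context tokens x bytes per element. The sketch below is only an approximation under stated assumptions: the helper name kv_cache_mib is hypothetical, the model dimensions in the usage lines (48 layers, 8 KV heads, head dimension 128) are made-up illustration values rather than those of any model listed above, and the per-element sizes assume the usual block layouts (f16 = 2 bytes, q8_0 = 34 bytes per 32 elements, q4_0 = 18 bytes per 32 elements).

```shell
# Rough KV cache estimate in MiB (illustrative helper, not part of Ollama).
# Usage: kv_cache_mib LAYERS KV_HEADS HEAD_DIM CONTEXT_TOKENS CACHE_TYPE
kv_cache_mib() {
  layers=$1; kv_heads=$2; head_dim=$3; ctx=$4; type=$5
  # Bytes per element, expressed as a fraction num/den to stay in integer math.
  case "$type" in
    f16)  num=2;  den=1  ;;
    q8_0) num=34; den=32 ;;
    q4_0) num=18; den=32 ;;
    *) echo "unknown cache type: $type" >&2; return 1 ;;
  esac
  # Keys and values are both cached, hence the factor 2.
  elems=$((2 * layers * kv_heads * head_dim * ctx))
  echo $((elems * num / den / 1024 / 1024))
}

# Hypothetical model dimensions, for illustration only.
kv_cache_mib 48 8 128 32768 f16    # prints 6144
kv_cache_mib 48 8 128 32768 q8_0   # prints 3264
```

With these assumed dimensions, q8_0 roughly halves the cache compared with f16, and doubling the context to 65536 doubles either figure. Actual usage depends on the model architecture and on how the server allocates the cache, so treat the numbers as an order-of-magnitude guide.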