Model Gallery

7 models from 1 repositories

Filter by type:

Filter by tags:

llm-compiler-13b-imat
LLM Compiler is a state-of-the-art LLM that builds upon Code Llama with improved performance for code optimization and compiler reasoning. LLM Compiler is free for both research and commercial use. LLM Compiler is available in two flavors: LLM Compiler, the foundational models, pretrained on over 500B tokens of LLVM-IR, x86_84, ARM, and CUDA assembly codes and trained to predict the effect of LLVM optimizations; and LLM Compiler FTD, which is further fine-tuned to predict the best optimizations for code in LLVM assembly to reduce code size, and to disassemble assembly code to LLVM-IR.

Repository: localaiLicense: other

llm-compiler-13b-ftd
LLM Compiler is a state-of-the-art LLM that builds upon Code Llama with improved performance for code optimization and compiler reasoning. LLM Compiler is free for both research and commercial use. LLM Compiler is available in two flavors: LLM Compiler, the foundational models, pretrained on over 500B tokens of LLVM-IR, x86_84, ARM, and CUDA assembly codes and trained to predict the effect of LLVM optimizations; and LLM Compiler FTD, which is further fine-tuned to predict the best optimizations for code in LLVM assembly to reduce code size, and to disassemble assembly code to LLVM-IR.

Repository: localaiLicense: other

llm-compiler-7b-imat-GGUF
LLM Compiler is a state-of-the-art LLM that builds upon Code Llama with improved performance for code optimization and compiler reasoning. LLM Compiler is free for both research and commercial use. LLM Compiler is available in two flavors: LLM Compiler, the foundational models, pretrained on over 500B tokens of LLVM-IR, x86_84, ARM, and CUDA assembly codes and trained to predict the effect of LLVM optimizations; and LLM Compiler FTD, which is further fine-tuned to predict the best optimizations for code in LLVM assembly to reduce code size, and to disassemble assembly code to LLVM-IR.

Repository: localaiLicense: other

llm-compiler-7b-ftd-imat
LLM Compiler is a state-of-the-art LLM that builds upon Code Llama with improved performance for code optimization and compiler reasoning. LLM Compiler is free for both research and commercial use. LLM Compiler is available in two flavors: LLM Compiler, the foundational models, pretrained on over 500B tokens of LLVM-IR, x86_84, ARM, and CUDA assembly codes and trained to predict the effect of LLVM optimizations; and LLM Compiler FTD, which is further fine-tuned to predict the best optimizations for code in LLVM assembly to reduce code size, and to disassemble assembly code to LLVM-IR.

Repository: localaiLicense: other

deepseek-v4-flash-q2
DeepSeek V4 Flash (IQ2XXS GGUF, ~81 GB) - only loadable via the ds4 backend. Requires >=128 GB RAM. Metal (Darwin) or CUDA (Linux). See https://github.com/antirez/ds4 for details.

Repository: localai

deepseek-v4-flash-q2-q4
DeepSeek V4 Flash (mixed q2/q4 GGUF, ~91 GB) - only loadable via the ds4 backend. The last 6 expert layers are kept at Q4_K (the rest IQ2XXS), trading a little extra memory for higher quality than the pure-q2 build while still fitting in RAM on a 128 GB machine. imatrix-tuned. Metal (Darwin) or CUDA (Linux). See https://github.com/antirez/ds4 for details.

Repository: localai

deepseek-v4-flash-q2-mtp
DeepSeek V4 Flash (IQ2XXS GGUF, ~81 GB) paired with the optional MTP speculative-decoding weights (~3.5 GB) for a slight speedup. Only loadable via the ds4 backend; requires >=128 GB RAM. MTP helps only with greedy decoding (temperature 0), so the override pins temperature to 0. Metal (Darwin) or CUDA (Linux). See https://github.com/antirez/ds4 for details.

Repository: localai