Tags: Mozilla-Ocho/llamafile
Tags
Release llamafile v0.8.2

- Upgrade to cosmocc 3.3.6
- Remove warnings from cuda build
- Fix bug in llamafile_trapping_enabled
- Refactor the new vectorized expf() code
- iqk_mul_mat() only needs codegen for AVX2
- Be less gung ho about the -ngl flag in README
- Restore shell scriptability fix for new tokenizer
- Suppress divide by zero errors in llama_print_timings()
- Cut back on tinyBLAS CPU multiple output type kernels
- Cut back NVIDIA fat binary releases to -arch=all-major
- Remove GA (won't rely on slow broken irregular cloud dev tools)
- Cut flash_attn_ext from release binaries (use --recompile to have it)