Tags · Mozilla-Ocho/llamafile

0.8.6

Release llamafile v0.8.6

May 25, 2024
81cfbcf

0.8.5

Release llamafile v0.8.5

May 25, 2024
b79ecf4

0.8.4

Release llamafile v0.8.4

May 10, 2024
30cdd9c

0.8.3

Release llamafile v0.8.3

May 10, 2024
ae34574

0.8.2

Release llamafile v0.8.2

- Upgrade to cosmocc 3.3.6
- Remove warnings from cuda build
- Fix bug in llamafile_trapping_enabled
- Refactor the new vectorized expf() code
- iqk_mul_mat() only needs codegen for AVX2
- Be less gung ho about the -ngl flag in README
- Restore shell scriptability fix for new tokenizer
- Suppress divide by zero errors in llama_print_timings()
- Cut back on tinyBLAS CPU multiple output type kernels
- Cut back NVIDIA fat binary releases to -arch=all-major
- Remove GA (won't rely on slow broken irregular cloud dev tools)
- Cut flash_attn_ext from release binaries (use --recompile to have it)