llamafile/llama.cpp at main · Mozilla-Ocho/llamafile

Name
imatrix
llama-bench
llava
main
perplexity
quantize
server
BUILD.mk
LICENSE
README.llamafile
base64.h
build-info.cpp
common.cpp
common.h
console.cpp
console.h
ggml-alloc.c
ggml-alloc.h
ggml-backend-impl.h
ggml-backend.c
ggml-backend.h
ggml-common.h
ggml-cuda.cu
ggml-cuda.h
ggml-impl.h
ggml-metal.h
ggml-metal.m
ggml-metal.metal
ggml-quants-amd-avx.c
ggml-quants-amd-avx2.c
ggml-quants-amd-avx512.c
ggml-quants-arm80.c
ggml-quants.cpp
ggml-quants.h
ggml-quants.inc
ggml-quants.py
ggml-vector-amd-avx.c
ggml-vector-amd-avx2.c
ggml-vector-amd-avx512.c
ggml-vector-amd-avx512bf16.c
ggml-vector-amd-f16c.c
ggml-vector-amd-fma.c
ggml-vector-arm80.c
ggml-vector-arm82.c
ggml-vector.cpp
ggml-vector.h
ggml-vector.inc
ggml-vector.py
ggml.c
ggml.h
grammar-parser.cpp
grammar-parser.h
json-schema-to-grammar.cpp
json-schema-to-grammar.h
json.h
llama.cpp
llama.h
llamafile.h
log.h
sampling.cpp
sampling.h
stb_image.c
stb_image.h
unicode-data.cpp
unicode-data.h
unicode.cpp
unicode.h

README.llamafile

DESCRIPTION

  llama.cpp is a machine learning library for large language models

LICENSE

  MIT

ORIGIN

  ggerganov/llama.cpp#4406
  152da28ae54139e3754189b9e6e1c28e11277502
  2024-05-23

LOCAL MODIFICATIONS

  - Remove MAP_POPULATE because it makes mmap(tinyllama) block for 100ms
  - Refactor ggml.c, llama.cpp, and llava to use llamafile_open() APIs
  - Unify main, server, and llava-cli into single llamafile program
  - Make cuBLAS / hipBLAS optional by introducing tinyBLAS library
  - Add support to main() programs for Cosmo /zip/.args files
  - Introduce pledge() SECCOMP sandboxing to improve security
  - Call exit() rather than abort() when GGML_ASSERT() fails
  - Clamp bf16/f32 values before passing to K quantizers
  - Make GPU logger callback API safer and less generic
  - Write log to /dev/null when main.log fails to open
  - Make main and llava-cli print timings on ctrl-c
  - Make embeddings CLI program shell scriptable
  - Avoid bind() conflicts on port 8080 w/ server
  - Use runtime dispatching for matmul quants
  - Remove operating system #ifdef statements
  - Remove stdout logging from LLaVA
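
  As an illustration of the "Write log to /dev/null when main.log fails
  to open" item above, the fallback could look roughly like the sketch
  below. This is not the code in this directory: the helper name
  open_log_or_devnull() is hypothetical, and the sketch assumes a POSIX
  system where /dev/null exists.

```c
#include <stdio.h>

/* Hypothetical helper, not the llamafile implementation: try to open the
   requested log file for appending. If that fails (missing directory,
   read-only filesystem, ...), fall back to the null device so that later
   fprintf() calls still have a valid stream to write to. */
FILE *open_log_or_devnull(const char *path) {
    FILE *f = fopen(path, "a");
    if (f == NULL) {
        /* Assumption: POSIX null device is present and writable. */
        f = fopen("/dev/null", "w");
    }
    return f; /* may still be NULL if /dev/null cannot be opened either */
}
```

  A caller would use the returned stream exactly as it would a regular
  log file, for example open_log_or_devnull("main.log"), and would need
  no separate "logging disabled" code path.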
