awesome-yyh: starred repositories (sorted by recently starred)
Scalable toolkit for efficient model alignment
Examples for using ONNX Runtime for machine learning inferencing.
Examples for using ONNX Runtime for model training.
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
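The low-bit quantization these toolkits provide ultimately maps floating-point tensors onto a small integer grid and back. A minimal, library-free sketch of symmetric per-tensor INT8 quantization (the function names are illustrative, not from any repo listed here):

```python
def quantize_int8(xs):
    # Symmetric per-tensor quantization: pick scale = max|x| / 127 so the
    # largest magnitude lands on the edge of the signed 8-bit range.
    scale = max(abs(x) for x in xs) / 127 or 1.0  # guard against all-zero input
    q = [max(-128, min(127, round(x / scale))) for x in xs]
    return q, scale

def dequantize(q, scale):
    # Map the integers back to approximate floats.
    return [v * scale for v in q]
```

Real toolkits add per-channel scales, zero-points for asymmetric ranges, and calibration over activation statistics, but the round-and-clamp core is the same.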
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
Tutel MoE: An Optimized Mixture-of-Experts Implementation
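DeepSeek-V2 and Tutel both center on sparse mixture-of-experts routing: a gate scores every expert, only the top-k actually run, and their outputs are blended by the renormalized gate probabilities. A minimal pure-Python sketch under those assumptions (all names hypothetical; real implementations batch and parallelize this):

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    e = [math.exp(x - m) for x in xs]
    s = sum(e)
    return [v / s for v in e]

def moe_forward(x, experts, gate_weights, k=2):
    # Gate: one linear score per expert, turned into routing probabilities.
    scores = [sum(w * xi for w, xi in zip(ws, x)) for ws in gate_weights]
    probs = softmax(scores)
    # Keep only the top-k experts and renormalize their probabilities.
    topk = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in topk)
    out = [0.0] * len(x)
    for i in topk:
        y = experts[i](x)  # only the selected experts are evaluated
        out = [o + (probs[i] / norm) * yi for o, yi in zip(out, y)]
    return out
```

The sparsity is the point: compute scales with k, not with the total number of experts.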
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
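PEFT's flagship method, LoRA, freezes the base weight W and learns a low-rank pair (A, B), so the layer computes x·W plus a scaled low-rank correction (x·A)·B with far fewer trainable parameters. A pure-Python sketch of that forward pass (helper names are illustrative, not PEFT's API):

```python
def matmul(X, Y):
    # Naive matrix multiply over lists of lists; fine for a toy example.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def lora_forward(x, W, A, B, scale=1.0):
    # Frozen path x @ W plus trainable low-rank path (x @ A) @ B.
    # A has shape (d_in, r) and B has shape (r, d_out), with r << min(d_in, d_out),
    # so only d_in*r + r*d_out parameters are trained instead of d_in*d_out.
    base = matmul(x, W)
    delta = matmul(matmul(x, A), B)
    return [[b + scale * d for b, d in zip(rb, rd)] for rb, rd in zip(base, delta)]
```

Because the update is additive, B·A can be merged into W after training, leaving zero inference overhead.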
Reference implementation of Megalodon 7B model
A chatbot built on large language models, with integrations for WeChat Official Accounts, WeChat Work, Feishu, DingTalk, and more. Selectable backends include GPT-3.5/GPT-4o/GPT-4/Claude/ERNIE Bot/iFlytek Spark/Tongyi Qianwen/Gemini/GLM-4/Kimi/LinkAI. It handles text, voice, and images, can access the operating system and the internet, and supports customized enterprise customer service built on your own knowledge base.
A Native-PyTorch Library for LLM Fine-tuning
Devika is an agentic AI software engineer that can understand high-level human instructions, break them down into steps, research relevant information, and write code to achieve the given objective.
Lightweight, standalone C++ inference engine for Google's Gemma models.
Interact with your documents using the power of GPT, 100% privately, no data leaks
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently.
Official inference library for Mistral models
The official PyTorch implementation of Google's Gemma models
Development repository for the Triton language and compiler
Fine-tuning ChatGLM-6B for specific downstream tasks, covering Freeze, LoRA, P-tuning, and other methods.
Fast and memory-efficient exact attention
Fast inference engine for Transformer models
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
Large Language Model Text Generation Inference