Scalable toolkit for efficient model alignment

Python 413 44 Updated Jun 29, 2024

Examples for using ONNX Runtime for machine learning inferencing.

C++ 1,027 308 Updated Jun 21, 2024

Examples for using ONNX Runtime for model training.

C# 290 59 Updated Jun 25, 2024

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

C++ 13,403 2,744 Updated Jun 29, 2024

A tactical playbook that systematically teaches you how to maximize the performance of deep learning models.

2,118 191 Updated May 27, 2023

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

Python 2,066 248 Updated Jun 28, 2024

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

2,801 102 Updated Jun 26, 2024

Tutel MoE: An Optimized Mixture-of-Experts Implementation

Python 683 84 Updated Jun 6, 2024

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

Python 14,835 1,411 Updated Jun 28, 2024

Reference implementation of Megalodon 7B model

Cuda 487 50 Updated Apr 18, 2024

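A chatbot built on large language models that can be connected to WeChat Official Accounts, WeCom (Enterprise WeChat) apps, Feishu, DingTalk, and more. Selectable models include GPT3.5/GPT-4o/GPT4.0/Claude/ERNIE Bot/iFlytek Spark/Tongyi Qianwen/Gemini/GLM-4/Kimi/LinkAI. It handles text, voice, and images, can access the operating system and the internet, and supports building a customized enterprise customer-service assistant on your own knowledge base.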

Python 27,723 7,402 Updated Jun 27, 2024

A Native-PyTorch Library for LLM Fine-tuning

Python 3,520 282 Updated Jun 29, 2024

Devika is an Agentic AI Software Engineer that can understand high-level human instructions, break them down into steps, research relevant information, and write code to achieve the given objective…

Python 17,798 2,318 Updated Jun 28, 2024

Cool Papers - Immersive Paper Discovery

HTML 277 3 Updated Jun 1, 2024

Lightweight, standalone C++ inference engine for Google's Gemma models.

C++ 5,705 479 Updated Jun 28, 2024

Python bindings for llama.cpp

Python 7,063 835 Updated Jun 25, 2024

LLM inference in C/C++

C++ 60,953 8,698 Updated Jun 29, 2024

Interact with your documents using the power of GPT, 100% privately, no data leaks

Python 52,829 7,105 Updated Jun 15, 2024

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++ 7,339 794 Updated Jun 27, 2024

Official inference library for Mistral models

Jupyter Notebook 9,114 797 Updated Jun 22, 2024

Inference code for Llama models

Python 54,075 9,301 Updated May 15, 2024

The official PyTorch implementation of Google's Gemma models

Python 5,113 485 Updated Jun 27, 2024

Development repository for the Triton language and compiler

C++ 11,830 1,394 Updated Jun 29, 2024

NCCL Tests

Cuda 728 216 Updated Jun 14, 2024

Fine-tuning for specific downstream tasks based on the ChatGLM-6B model, covering Freeze, LoRA, P-tuning, and other methods.

Python 5 Updated May 7, 2023

Fast and memory-efficient exact attention

Python 11,767 1,042 Updated Jun 27, 2024

Minimalist ML framework for Rust

Rust 14,216 794 Updated Jun 29, 2024

Fast inference engine for Transformer models

C++ 3,005 267 Updated Jun 28, 2024

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Python 35,504 4,359 Updated Jun 28, 2024

Large Language Model Text Generation Inference

Python 8,307 939 Updated Jun 28, 2024