- Berkeley, CA
- @simon_mo_
Stars
llama3 implementation one matrix multiplication at a time
A high-throughput and memory-efficient inference and serving engine for LLMs
A library for building fast, reliable and evolvable network services.
A user-friendly CMS for static site generators.
A static site generator for data apps, dashboards, reports, and more. Observable Framework combines JavaScript on the front-end for interactive graphics with any language on the back-end for data a…
FlashInfer: Kernel Library for LLM Serving
Building a quick conversation-based search demo with Lepton AI.
a unified scheduler for online and offline tasks
simple markdown editor with inline comments, on the latest automerge stack
Ensō is a high-performance streaming interface for NIC-application communication.
Kepler (Kubernetes-based Efficient Power Level Exporter) uses eBPF to probe performance counters and other system stats, uses ML models to estimate workload energy consumption based on these stats, …
a Rust library for parsing, validating, and modifying Dockerfiles
Module, Model, and Tensor Serialization/Deserialization
SQL databases in Python, designed for simplicity, compatibility, and robustness.
Large Language Model Text Generation Inference
LLM papers I'm reading, mostly on inference and model compression
A fast, clean, responsive Hugo theme.
A lightweight quadtree implementation for javascript
得意黑 Smiley Sans: a Chinese sans-serif (Heiti) typeface that seeks a balance between humanist feel and geometric character
🔥 Blazing fast bulk data transfers between any cloud 🔥
Standalone Kubelet Tutorial