Releases: databricks/megablocks

v0.5.1

11 Jan 22:14
f05609c

What's Changed

New Contributors

Full Changelog: v0.5.0...v0.5.1

v0.5.0

08 Dec 16:51
0460181

What's New

Several improvements that avoid CPU <-> GPU device synchronizations, plus GLU support and support for some new models 👀

What's Changed

New Contributors

Full Changelog: v0.4.0...v0.5.0

v0.4.0

24 Oct 22:44
6a71b18

What's Changed

Full Changelog: v0.3.3...v0.4.0

v0.3.3

17 Oct 21:58
52aa1b2

What's Changed

  • Enable running MegaBlocks MoE without bias by @vchiley in #31

Full Changelog: v0.3.2...v0.3.3

v0.3.2

10 Oct 22:32

What's Changed

  • Support for bfloat16
  • Optimizations for top_k > 1
  • Support for fully-sharded data parallelism
  • Support tensor model parallelism when expert_parallel_world_size > num_experts
  • Optimizations for activation memory
  • Support activation quantization (thanks @dblalock!)
  • Optimizations for SM90 (Hopper)
  • Lots of bug fixes, cleanup and small optimizations

New Contributors

Full Changelog: v0.1...v0.3.2

Version 0.1

01 May 15:14
Pre-release

Initial release documenting repository state prior to MLSys'23 camera-ready publication.