Block Sparse Matrices for Smaller and Faster Language Models

Post Content