Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding

January 2024

Publication

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2024