Machine Learning Mastery

Webinar: Double LLM Inference Speeds with Speculative Decoding

Machine Learning Mastery sent this email to their subscribers on May 20, 2024.

151 Calle de San Francisco, Suite 200 - PMB 5072, San Juan, PR 00901


Predibase

2X LLM Inference Speeds with Speculative Decoding
Thursday, May 23, 2024, 10:00 am PT / 1:00 pm ET

Join us on Thursday, May 23rd at 10:00 am PT to learn how fine-tuning an open-source LLM with speculative decoding typically increases inference throughput by more than 2x without sacrificing performance.

We're excited to bring Medusa into the Predibase platform in our next release, and we invite you to join our upcoming webinar for an early preview and technical Q&A with our CTO, Travis Addair, and ML Engineer, Arnav Garg.

We'll talk through two scenarios for using speculative decoding:

1. Increasing the throughput of an open-source LLM base model by fine-tuning when you don't have labeled data (i.e., only inputs).
2. Increasing both the throughput and the performance of an open-source LLM for task-specific applications by fine-tuning with labeled data (i.e., input/output pairs).

We'll also cover key considerations, such as how to structure fine-tuning jobs to optimize for faster inference and more performant model generation, and we'll open up the conversation for Q&A so teams know how to get started with Medusa on their own models.

Register Now
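To make the speed-up mechanism concrete: speculative decoding has a cheap "draft" model propose several tokens ahead, and the expensive "target" model verify the whole proposal in a single pass, accepting the longest agreeing prefix. The sketch below is a toy illustration of that greedy draft-and-verify loop; the two lookup-table "models" and the function names are stand-ins invented for this example, not Predibase or Medusa APIs.

```python
def draft_next(context):
    # Hypothetical cheap draft model: predicts the next token from the last one.
    table = {"the": "cat", "cat": "sat", "sat": "on", "on": "a"}
    return table.get(context[-1], "<eos>")

def target_next(context):
    # Hypothetical expensive target model: mostly agrees with the draft,
    # but prefers "the" after "on".
    table = {"the": "cat", "cat": "sat", "sat": "on", "on": "the"}
    return table.get(context[-1], "<eos>")

def speculative_step(context, k=4):
    """Propose k tokens with the draft model, then verify with the target.

    Accepts the longest prefix where both models agree; on the first
    disagreement, takes the target model's token and stops. The target
    thus always advances by at least one token per step, and by up to
    k + 1 when the draft guesses well.
    """
    proposal = list(context)
    for _ in range(k):
        proposal.append(draft_next(proposal))

    accepted = list(context)
    for tok in proposal[len(context):]:
        if target_next(accepted) == tok:
            accepted.append(tok)  # draft and target agree: keep the token
        else:
            accepted.append(target_next(accepted))  # take the target's token
            break
    return accepted

print(speculative_step(["the"], k=4))
# -> ['the', 'cat', 'sat', 'on', 'the']
```

Here three draft tokens are accepted and a fourth is corrected, so one verification pass yields four new tokens instead of one; real systems verify the proposal with a single batched forward pass of the target model, which is where the throughput gain comes from.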