From the course: Large Language Models on AWS: Building and Deploying Open-Source LLMs
Implications of Amdahl’s law: A walkthrough
- [Instructor] Let's take a look at parallel compilation with llama.cpp. This is really common when you're dealing with large language models: you git clone a project and compile it locally on your machine, so it's important to understand some of the implications of compiling. First up, we look at some real data from my Lambda box, which has a Threadripper, a 24-core, 48-thread Threadripper. What it really exposes is Amdahl's law in practice, via compilation. On the x axis we have parallel jobs. In this case that's the -j flag passed to make, and every time you raise that number, you add more threads. Now, some of the threads may be IO bound, so CPU isn't the limiting factor for them, but eventually you start to run out of gains from adding threads. The blue line here tracks compilation, and it reads against the left y axis. The green line shows CPU utilization, and it reads against the right y axis. The yellow reference line here…
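The diminishing returns described in the transcript can be sketched with the standard Amdahl's law formula, speedup = 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel and n is the number of parallel jobs. The sketch below is only illustrative: the function name `amdahl_speedup` and the value p = 0.95 are assumptions chosen for the example, not figures measured from the llama.cpp build shown in the video.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Theoretical speedup predicted by Amdahl's law for a given worker count."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # Hypothetical parallel fraction; NOT measured from the llama.cpp build.
    p = 0.95
    for jobs in (1, 2, 4, 8, 16, 24, 48):
        print(f"-j{jobs:<3} theoretical speedup ~ {amdahl_speedup(p, jobs):.2f}x")
```

Under this assumed p, the serial 5% caps the speedup below 20x no matter how high the job count goes, which is why each additional job buys less than the one before it.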
Contents
- Implications of Amdahl’s law: A walkthrough (4m 5s)
- Compiling llama.cpp demo (4m 17s)
- GGUF file format (3m 18s)
- Python UV scripting (3m 55s)
- Python UV packaging overview (1m 59s)
- Key concepts in llama.cpp walkthrough (4m 37s)
- GGUF quantized llama.cpp end-to-end demo (4m 3s)
- Llama.cpp on AWS G5 demo (4m 20s)