
NERSC End-to-End LLM Bootcamp, May 2024

May 7, 2024

The NERSC End-to-End LLM Bootcamp will be held virtually over three days, May 7-9, 2024. It is co-organized by NERSC with support from the OpenACC Organization and NVIDIA.

The End-to-End LLM (Large Language Model) Bootcamp takes a real-world perspective, following the data processing, development, and deployment pipeline. Attendees walk through preprocessing the SQuAD (Stanford Question Answering Dataset) dataset for a question answering task, training a BERT (Bidirectional Encoder Representations from Transformers) model on it, and applying a prompt learning strategy using NVIDIA® NeMo™ and NVIDIA Megatron, a transformer-based language model. Attendees will also learn to optimize an LLM using NVIDIA TensorRT™, an SDK for high-performance deep learning inference; add guardrails to the LLM's prompts and responses using NeMo Guardrails; and deploy the AI pipeline using NVIDIA Triton™ Inference Server, open-source software that standardizes AI model deployment and execution across every workload.
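As a rough illustration of the preprocessing step mentioned above, the sketch below flattens SQuAD-style nested JSON into one record per question. It assumes the publicly distributed SQuAD layout (`data` > `paragraphs` > `qas` > `answers`, with `text` and `answer_start` fields). The function name `flatten_squad` and the tiny sample are hypothetical and are not taken from the bootcamp labs, which may preprocess the data differently.

```python
def flatten_squad(squad: dict) -> list[dict]:
    """Flatten SQuAD-style JSON into one flat record per answerable question."""
    records = []
    for article in squad["data"]:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                answers = qa.get("answers", [])
                if not answers:
                    # SQuAD 2.0 includes unanswerable questions; skip them here.
                    continue
                # Keep only the first listed answer for simplicity.
                answer = answers[0]
                start = answer["answer_start"]
                end = start + len(answer["text"])  # exclusive character offset
                records.append({
                    "id": qa["id"],
                    "question": qa["question"],
                    "context": context,
                    "answer_text": answer["text"],
                    "answer_start": start,
                    "answer_end": end,
                })
    return records


# Hand-made sample in the assumed SQuAD layout (not real SQuAD data).
context = "The bootcamp runs for three days."
sample = {
    "data": [{
        "title": "example",
        "paragraphs": [{
            "context": context,
            "qas": [
                {
                    "id": "q1",
                    "question": "How long does the bootcamp run?",
                    "answers": [{
                        "text": "three days",
                        "answer_start": context.index("three days"),
                    }],
                },
                # An unanswerable question, as found in SQuAD 2.0.
                {"id": "q2", "question": "Who teaches it?", "answers": []},
            ],
        }],
    }]
}

records = flatten_squad(sample)
```

Each record carries character offsets into the context, which is the form an extractive question answering model typically needs before the offsets are mapped to token positions.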

This virtual bootcamp is a hands-on learning experience where you will be guided through step-by-step instructions with teaching assistants on hand to help throughout. 

Eligibility and Prerequisites

This event is open to the general HPC community. However, due to limited capacity, active NERSC, ALCF, and OLCF users will receive priority consideration.

Prerequisites:

  • Basic experience with Python and PyTorch. 
  • No GPU programming experience is required.

Agenda

All times are in the Pacific time zone.

Day 0: Monday, May 6th // 9:00 AM - 10:00 AM
09:00 AM - 10:00 AM Cluster Connection Session
Day 1: Tuesday, May 7th // 9:00 AM - 1:30 PM
09:00 AM - 09:10 AM Welcome and Introduction
09:10 AM - 09:20 AM Cluster Overview
09:20 AM - 10:00 AM Overview of Large Language Models and NeMo Megatron (Lecture)
10:00 AM - 10:30 AM NeMo Fundamentals (Lab)
10:30 AM - 10:45 AM Break
10:45 AM - 12:15 PM NeMo Question Answering (Lab)
12:15 PM - 12:45 PM Wrap up Q&A
12:45 PM - 01:30 PM LLM Projects Discussion
Day 2: Wednesday, May 8th // 9:00 AM - 1:30 PM
09:00 AM - 09:45 AM Prompt Tuning/P-Tuning (Lecture)
09:45 AM - 10:30 AM Prompt Tuning/P-Tuning (Lab)
10:30 AM - 10:45 AM Break
10:45 AM - 11:00 AM NeMo Megatron-GPT 1.3B: Language Model Inferencing (Lab)
11:00 AM - 11:35 AM Llama 2 7B Inference using TensorRT-LLM (Lecture and Lab)
11:35 AM - 12:05 PM Llama 2 7B Deployment using Triton Inference Server (Lab)
12:05 PM - 12:30 PM Wrap up Q&A
12:30 PM - 01:30 PM LLM Projects Discussion
Day 3: Thursday, May 9th // 9:00 AM - 1:30 PM
09:00 AM - 10:00 AM NeMo Guardrails Overview and Architecture (Lecture)
10:00 AM - 10:20 AM NeMo Guardrails Topical Rails (Lab)
10:20 AM - 10:40 AM NeMo Guardrails Jailbreak Rails (Lab)
10:40 AM - 11:00 AM NeMo Guardrails Grounding Rails (Lab)
11:00 AM - 11:15 AM Break
11:15 AM - 11:35 AM NeMo Guardrails Moderation Rails (Lab)
11:35 AM - 12:20 PM NeMo Guardrails LangChain and Prompt Templates (Lab)
12:20 PM - 12:50 PM NeMo LLM Service Demo
12:50 PM - 01:30 PM Wrap up Q&A

Application

The application deadline is April 24, 2024. To apply and for more details, please visit the event page. Please note that acceptance is not confirmed until you have received a confirmation email.

Presentation Materials

Please find the presentation materials (password-protected) for this event here.