Deploying and Scaling AI Applications with the NVIDIA TensorRT Inference Server on Kubernetes
