20 Apr 2020

Benchmarking Triton (TensorRT) Inference Server for Transformer Models

Nitish Shirish Keskar · #engineering

Summary: We investigate NVIDIA's Triton (TensorRT) Inference Server as a way of hosting Transformer Language Models. The blog is roughly divided into two parts: (i) instructions for setting up your own inference server, and (ii) benchmarking experiments. The instructions are intended to be detailed and standalone, but readers interested solely in
