
20 Apr 2020

Benchmarking Triton (TensorRT) Inference Server for Transformer Models

Nitish Shirish Keskar · #engineering

Summary: We investigate NVIDIA's Triton (TensorRT) Inference Server as a way of hosting Transformer language models. The post is roughly divided into two parts: (i) instructions for setting up your own inference server, and (ii) benchmarking experiments. The instructions are intended to be detailed and standalone, but readers interested solely in …

08 Nov 2017

Weighted Transformer Network for Machine Translation

Nitish Shirish Keskar · #research

Most neural architectures for machine translation use an encoder-decoder model consisting of either convolutional or recurrent layers. The encoder layers map the input to a latent space and the decoder, in turn, uses this latent representation to map the inputs to the targets.
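The encoder-decoder pattern the teaser describes can be sketched in a few lines. This is a toy illustration only, with hypothetical dimensions and random linear maps standing in for the learned convolutional, recurrent, or attention layers a real translation model would use:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: input features, latent width, target vocabulary.
d_in, d_latent, d_out = 8, 16, 10

W_enc = rng.normal(size=(d_in, d_latent))   # stand-in "encoder" weights
W_dec = rng.normal(size=(d_latent, d_out))  # stand-in "decoder" weights

def encode(x):
    # Map the input sequence into a latent space.
    return np.tanh(x @ W_enc)

def decode(z):
    # Map the latent representation to target-vocabulary scores.
    return z @ W_dec

x = rng.normal(size=(5, d_in))   # a length-5 input "sentence"
latent = encode(x)
scores = decode(latent)
print(latent.shape, scores.shape)  # (5, 16) (5, 10)
```

In an actual model the two maps are deep networks trained jointly, and the decoder additionally conditions on previously generated target tokens; the sketch only shows the input → latent → target data flow.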
