Improving Graph Convolutional Networks with Lessons from Transformers
Michael Sollami · #deeplearning · Transformer-inspired tips for enhancing the design of neural networks that process graph-structured data.
Celebrating the Winners of the Third Annual Salesforce AI Research Grant
Audrey Cook · #news · We are proud to announce the 2020 winners of our Salesforce AI Research Grant. Each of our winners will receive a $50K grant to advance their work and help us shape the future of AI.
Salesforce Research at EMNLP 2020
Denna Mafie · #research · This year marks the 24th annual Empirical Methods in Natural Language Processing (EMNLP) conference, reimagined for the first time ever in a fully virtual format. EMNLP is a leading conference in Natural Language Processing, covering a broad spectrum of diverse research areas concerned with computational …
Talk to Your Data: One Model, Any Relational Database
Victoria Lin · #natural language interface · We introduce Photon, a live demo of a natural language interface to databases, based on our latest research in neural semantic parsing. 🔗 https://naturalsql.com/
Announcing the Annual Salesforce AI Research Grant
Audrey Cook · #news · Our Salesforce Research team is inviting university faculty, non-profit organizations, and NGOs to apply for our Salesforce AI Research Grant.
SimpleTOD: A Simple Language Model For Task Oriented Dialogue
Ehsan Hosseini-Asl · #research · We propose a simple causal (unidirectional) language model for task-oriented dialogue. SimpleTOD models the inherent dependencies between the sub-tasks of task-oriented dialogue by optimizing for all of them in an end-to-end manner.
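To make the end-to-end framing concrete: the sub-tasks of task-oriented dialogue (state tracking, policy, response generation) can be cast as one token sequence for a causal language model. The sketch below is illustrative only; the special-token names and toy annotations are assumptions, not SimpleTOD's actual vocabulary or schema.

```python
# Illustrative sketch of casting dialogue sub-tasks as a single sequence
# that a causal (left-to-right) LM can be trained on end-to-end.
def build_training_sequence(context, belief, action, response):
    return " ".join([
        "<|context|>", context,
        "<|belief|>", belief,      # dialogue state tracking target
        "<|action|>", action,      # system action / policy target
        "<|response|>", response,  # response generation target
    ])

seq = build_training_sequence(
    "user: book a table for two tonight",
    "restaurant { people = 2, time = tonight }",
    "restaurant-request(name)",
    "Which restaurant would you like?",
)
print(seq)  # one left-to-right sequence; all sub-tasks optimized jointly
```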
It’s Morphin’ Time! Combating Linguistic Discrimination with Inflectional Perturbations
Samson Tan · #research · Morpheus exposes the potential allocative harms of popular pretrained NLP models by simulating inflectional variation. We propose adversarial fine-tuning to mitigate the effects of training only on error-free Standard English data.
Double Hard-Debias: Tailoring Word Embeddings for Gender Bias Mitigation
Tianlu Wang · #research · Word embeddings inherit strong gender bias from data, which can be further amplified by downstream models. We propose purifying word embeddings against corpus regularities such as word frequency prior to inferring and removing the gender subspace, which significantly improves debiasing performance.
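For intuition about "removing the gender subspace": the classic hard-debias step strips each embedding's component along an estimated gender direction. The sketch below shows only that projection, with the direction estimated from a single word pair for simplicity; it is not the paper's full double-hard procedure, which additionally corrects for frequency effects.

```python
import numpy as np

def remove_gender_subspace(E, he, she):
    """Hard-debias-style projection: remove each row's component along a
    1-D gender direction estimated from one word pair (illustrative)."""
    g = he - she
    g = g / np.linalg.norm(g)       # unit gender direction
    return E - np.outer(E @ g, g)   # project every embedding off g

# Toy usage: random 50-d vectors standing in for real embeddings.
rng = np.random.default_rng(0)
E, he, she = rng.normal(size=(1000, 50)), rng.normal(size=50), rng.normal(size=50)
E_debiased = remove_gender_subspace(E, he, she)
```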
(Re)Discovering Protein Structure and Function Through Language Modeling
Jesse Vig · #research · In our study, we show how a language model, trained simply to predict a masked (hidden) amino acid in a protein sequence, recovers high-level structural and functional properties of proteins.
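To make the training objective concrete, here is a minimal sketch of how a masked-prediction example might be constructed from a protein sequence. The sequence, mask token, and vocabulary are illustrative, not the study's actual pipeline.

```python
import random

# Illustrative vocabulary: the 20 standard amino acids plus a mask token.
AMINO_ACIDS = list("ACDEFGHIKLMNPQRSTVWY")
MASK = "<mask>"

def make_masked_example(sequence, rng=random):
    """Hide one residue; the model's target is to recover it from context."""
    pos = rng.randrange(len(sequence))
    tokens = list(sequence)
    target = tokens[pos]   # the hidden amino acid to predict
    tokens[pos] = MASK     # the model only sees the masked input
    return tokens, pos, target

# Example: a short, made-up protein fragment.
inputs, pos, target = make_masked_example("MKTAYIAKQR")
print(inputs, "-> predict", target, "at position", pos)
```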
Prototypical Contrastive Learning: Pushing the Frontiers of Unsupervised Learning
Junnan Li · #artificial intelligence · Prototypical Contrastive Learning unifies clustering and contrastive self-supervised learning to push the frontiers of unsupervised learning.
Explaining Solutions to Physical Reasoning Tasks
Nazneen Rajani · #research · We show that deep neural models can describe commonsense physics in a way that is valid, sufficient, and generalizable. Our ESPRIT framework is trained on a new dataset of physics simulations and descriptions that we collected and have open-sourced.
The AI Economist: Improving Equality and Productivity with AI-Driven Tax Policies
Stephan Zheng · #research · In this work, we focus on the opportunity to use AI to promote social welfare through the design of optimal tax policies in dynamic economies.
ProGen: Using AI to Generate Proteins
Ali Madani · #research · In our study, we demonstrate that an artificial intelligence (AI) model can learn the language of biology in order to generate proteins in a controllable fashion.
Learning to Retrieve Reasoning Paths from the Wikipedia Graph
Akari Asai · #research · Our graph-based trainable retriever-reader framework retrieves evidence paragraphs from Wikipedia to answer open-domain questions. We show state-of-the-art performance on HotpotQA, SQuAD Open, and Natural Questions Open without any architectural changes.
ERASER: A Benchmark to Evaluate Rationalized NLP Models
Nazneen Rajani · #research · Many NLP applications today deploy state-of-the-art deep neural networks that are essentially black boxes. One of the goals of Explainable AI (XAI) is to have AI models reveal why and how they make their predictions so that those predictions are interpretable by a human. But work in this direction has been …
Introducing a Conditional Transformer Language Model for Controllable Generation
Richard Socher · #research · Large-scale language models show promising text generation capabilities, but users cannot control the content or style of the generated text, nor train the models for multiple supervised language generation tasks.
Leveraging Language Models for Commonsense Reasoning in Neural Networks
Nazneen Rajani · #research · Commonsense reasoning, which draws upon world knowledge derived from spatial and temporal relations, the laws of physics, causes and effects, and social conventions, is a feature of human intelligence.
Celebrating the Inaugural Salesforce Research Deep Learning Grant Winners
Salesforce Research · #research · This summer, Salesforce Research announced our inaugural deep learning research grant for university researchers and faculty, non-profit organizations, and NGOs. Our goal is to identify and support diverse individuals with innovative ideas to join us in shaping the future of AI.
Identifying Generalization Properties in Neural Networks
Huan Wang · #research · It has been empirically observed that different local optima obtained from training deep neural networks do not generalize in the same way on unseen data sets, even when they achieve the same training loss.
The Natural Language Decathlon
Bryan McCann · #research · Deep learning has significantly improved state-of-the-art performance for natural language processing tasks like machine translation, summarization, question answering, and text classification.
A Multi-Discriminator CycleGAN for Unsupervised Non-Parallel Speech Domain Adaptation
Ehsan Hosseini-Asl · #research · In the same way that human decisions can be influenced by cognitive biases, decisions made by artificially intelligent systems can be vulnerable to algorithmic biases.
A Domain Specific Language for Automated RNN Architecture Search
Stephen Merity · #research · When humans design novel neural architectures, they go through a surprisingly large amount of trial and error. This holds true almost regardless of how much experience in deep learning the person might have!
Interpretable Counting for Visual Question Answering
Alex Trott · #research · Learning to answer open-ended questions about images, a task known as visual question answering (VQA), has received much attention over the last several years. VQA has been put forth as a benchmark for complete scene understanding and flexible reasoning, two fundamental goals of AI.
Improving End-to-End Speech Recognition Models
Yingbo Zhou · #research · Speech recognition has been successfully deployed on various smart devices and is changing the way we interact with them. Traditional phonetic-based recognition approaches require training separate components such as pronunciation, acoustic, and language models.
Thinking out Loud: Hierarchical and Interpretable Multi-task Reinforcement Learning
Caiming Xiong · #research · Deep reinforcement learning (deep RL) is a popular and successful family of methods for teaching computers tasks ranging from playing Go and Atari games to controlling industrial robots.
Weighted Transformer Network for Machine Translation
Nitish Shirish Keskar · #research · Most neural architectures for machine translation use an encoder-decoder model consisting of either convolutional or recurrent layers. The encoder layers map the input to a latent space, and the decoder, in turn, uses this latent representation to map the inputs to the targets.
Fully-Parallel Text Generation for Neural Machine Translation
James Bradbury · #research · Over the past few years, neural networks have driven rapid improvements in accuracy and quality for natural language tasks like text classification and question answering.
How to Talk to Your Database
Victor Zhong · #research · A vast amount of today’s information is stored in relational databases. These databases provide the foundation of systems such as medical records, financial markets, and electronic commerce.
Learned in Translation: Contextualized Word Vectors
Bryan McCann · #research · Word vectors are sometimes initialized to lists of random numbers before a model is trained for a specific task, but it is also quite common to initialize a model's word vectors with those obtained from methods like word2vec, GloVe, or FastText.
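As a concrete illustration of the second approach, a common pattern is to build the embedding matrix from a pretrained GloVe text file. This is a minimal sketch, assuming the standard `glove.6B.100d.txt` layout (a word followed by its vector values on each line) and a toy vocabulary; it is not the post's actual code.

```python
import numpy as np

def load_pretrained_embeddings(path, vocab, dim=100):
    """Fill an embedding matrix with pretrained vectors where available;
    words missing from the pretrained file keep a small random init."""
    rng = np.random.default_rng(0)
    matrix = rng.normal(scale=0.1, size=(len(vocab), dim))
    word_to_row = {w: i for i, w in enumerate(vocab)}
    with open(path, encoding="utf-8") as f:
        for line in f:
            word, *values = line.rstrip().split(" ")
            if word in word_to_row:
                matrix[word_to_row[word]] = np.asarray(values, dtype=np.float32)
    return matrix

# Toy usage: the resulting matrix initializes a model's embedding layer.
vocab = ["the", "database", "question", "<unk>"]
# embeddings = load_pretrained_embeddings("glove.6B.100d.txt", vocab)
```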
Your TL;DR by an AI: A Deep Reinforced Model for Abstractive Summarization
Romain Paulus · #research · The last few decades have witnessed a fundamental change in the challenge of taking in new information. The bottleneck is no longer access to information; now it’s our ability to keep up. We all have to read more and more to stay up to date with our jobs, the news, and social media.
Learning When to Skim and When to Read
Alexander Rosenberg Johansen · #research · The rise of Machine Learning, Deep Learning, and Artificial Intelligence more generally has been undeniable, and it has already had a massive impact on the field of computer science.
Knowing When to Look: Adaptive Attention via a Visual Sentinel for Image Captioning
Caiming Xiong · #research · Automatically generating captions for images has emerged as a prominent interdisciplinary research problem in both academia and industry. It can aid visually impaired users and make it easy for users to organize and navigate through large amounts of typically unstructured visual data.
A way out of the Odyssey: Analyzing and Combining Recent Insights for LSTMs
Salesforce Research · #research · LSTMs have become a basic building block for many deep NLP models. In recent years, many improvements and variations have been proposed for deep sequence models in general, and LSTMs in particular.
Multiple Different Natural Language Processing Tasks in a Single Deep Model
Kazuma Hashimoto · #research · Humans learn natural languages, such as English, from basic grammar to complex semantics, all within a single brain. How can we build a single model that handles a variety of natural language processing (NLP) tasks in the same way?
State-of-the-Art Deep Learning Model for Question Answering
Victor Zhong · #research · We introduce the Dynamic Coattention Network, a state-of-the-art neural network designed to automatically answer questions about documents.
New Neural Network Building Block Allows Faster and More Accurate Text Understanding
James Bradbury · #research · In deep learning, there are two very different ways to process input (like an image or a document). Typically, images are processed all at once, with the same kind of computation happening for every part of the image simultaneously.
Teaching Neural Networks to Point to Improve Language Modeling and Translation
Stephen Merity · #research · Imagine you were a young child and wanted to ask about something. Being so young (and assuming you are not exceedingly precocious), how would you describe a new object, the name of which you have yet to learn? The intuitive answer: point to it!
The WikiText Long Term Dependency Language Modeling Dataset
Stephen Merity · #research · The WikiText language modeling dataset is a collection of over 100 million tokens extracted from the set of verified Good and Featured articles on Wikipedia. The dataset is available under the Creative Commons Attribution-ShareAlike License.
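Since the release ships as pre-tokenized, whitespace-separated text, reading a split is a one-liner. A minimal sketch, assuming the standard file layout of the distribution (`wiki.train.tokens`, `wiki.valid.tokens`, `wiki.test.tokens`); the path shown is hypothetical.

```python
def read_wikitext(path):
    """Read a pre-tokenized WikiText split into a flat list of tokens."""
    with open(path, encoding="utf-8") as f:
        return f.read().split()

# Hypothetical usage once the archive is extracted locally:
# tokens = read_wikitext("wikitext-103/wiki.train.tokens")
# vocab = sorted(set(tokens))
```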
MetaMind Neural Machine Translation System for WMT 2016
James Bradbury · #research · Neural Machine Translation (NMT) systems, introduced only in 2013, have achieved state-of-the-art results in many MT tasks. MetaMind’s submissions to WMT ’16 seek to push the state of the art in one such task, English→German news-domain translation.
Dynamic Memory Networks for Visual and Textual Question Answering
Caiming Xiong · #research · Neural network architectures with memory and attention mechanisms exhibit certain reasoning capabilities required for question answering. One such architecture, the dynamic memory network (DMN), obtained high accuracy on a variety of language tasks.
New Deep Learning Model Understands and Answers Questions
Salesforce Research · #research · Today, we published new state-of-the-art results on a variety of natural language processing (NLP) tasks. Our model, which we call the Dynamic Memory Network (DMN), combines two lines of recent work on memory and attention mechanisms in deep learning.