This work presents a novel objective function for the unsupervised training of neural network sentence encoders. It exploits signals from paragraph-level discourse coherence to train these models to understand text. Our objective is purely …
In this work, we present the Grounded Recurrent Neural Network (GRNN), a recurrent neural network architecture for multi-label prediction which explicitly ties labels to specific dimensions of the recurrent hidden state (we call this process …
We consider multi-class classification where the predictor has a hierarchical structure that allows for a very large number of labels both at train and test time. The predictive power of such models can heavily depend on the structure of the tree, …
We describe a simple neural language model that relies only on character-level inputs. Predictions are still made at the word-level. Our model employs a convolutional neural network (CNN) and a highway network over characters, whose output is given …
Language modelling is a fundamental building block of natural language processing. However, in practice the size of the vocabulary limits thedistributions applicable for this task: specifically, one has to either resort to local optimization methods, …
This paper introduces algorithms for non-projective parsing based on dual decomposition. We focus on parsing algorithms for non-projective head automata, a generalization of head-automata models to non-projective structures. The dual decomposition …
This paper introduces dual decomposition as a framework for deriving inference algorithms for NLP problems. The approach relies on standard dynamic-programming algorithms as oracle solvers for sub-problems, together with a simple method for forcing …