Neural Classification of Linguistic Coherence using Long Short-Term Memories
Pashutan Modaresi, Matthias Liebeck and Stefan Conrad
Order of Sentences
The order of sentences is what makes a text semantically meaningful.
Correct order:
  Hi! My name is Alan. I am a computer scientist!
Permuted orders:
  My name is Alan. Hi! I am a computer scientist!
  I am a computer scientist! My name is Alan. Hi!
Humans vs. Machines
Discourse coherence, linguistic contradiction, linguistic redundancy, pragmatics.
Question: Is there a need to teach all these abilities to a machine?
Sentence Ordering
Given a pair of sentences s and s' (of sizes m and m'), predict their order: binary labels {0, 1} or ternary labels {-1, 0, +1}.
Example pair: "Hi!" / "My name is Alan."
Question: What about the sizes of m and m'? Should they be equal?
Many Applications!
Our focus was text summarization in the news domain.
Question: What are the other applications of sentence ordering?
Treat the problem as a classification task and minimize the negative log-likelihood
    L = -(1/N) Σ_{n=1}^{N} log p_n
where N is the number of instances and p_n is the class probability of the n-th pair.
Question: Why do we use the negative log-likelihood and not the log-likelihood?
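A minimal sketch of this loss on a toy batch (the probabilities, labels, and variable names below are illustrative, not the authors' code):

```python
import torch

# probs[n, c]: predicted probability of class c for the n-th pair (rows sum to 1)
probs = torch.tensor([[0.7, 0.2, 0.1],
                      [0.1, 0.8, 0.1]])
labels = torch.tensor([0, 1])          # gold class index of each pair

# p_n = probability assigned to the correct class of the n-th pair
p_n = probs[torch.arange(len(labels)), labels]
loss = -p_n.log().mean()               # negative log-likelihood, averaged over N
print(loss.item())
```

(Minimizing the negative log-likelihood is equivalent to maximizing the log-likelihood; the sign simply matches the convention that optimizers minimize.)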
Deep Neural Architecture
[Diagram: sentences S1 and S2 are fed through LSTM and Dropout layers and classified into {+1, 0, -1}]
Deep Neural Architecture: One-Hot Encoding
The input sentences S1 and S2 enter the network as one-hot encoded word vectors.
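A small sketch of one-hot encoding a tokenized sentence (the toy vocabulary and tokenization are assumptions for illustration):

```python
import torch

vocab = {"hi": 0, "my": 1, "name": 2, "is": 3, "alan": 4}  # toy vocabulary
tokens = ["hi", "my", "name", "is", "alan"]

ids = torch.tensor([vocab[t] for t in tokens])
one_hot = torch.nn.functional.one_hot(ids, num_classes=len(vocab)).float()
print(one_hot.shape)  # (sequence length, vocabulary size)
```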
Deep Neural Architecture: Embedding
Tip: An embedding is a simple matrix multiplication of the embedding matrix E with the input vector; initialize the matrix E.
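The tip can be checked directly: multiplying a one-hot vector by an embedding matrix E selects one row of E, which is exactly what an embedding lookup does (sizes below are arbitrary):

```python
import torch

vocab_size, emb_dim = 5, 3
E = torch.randn(vocab_size, emb_dim)           # embedding matrix E, to be initialized/learned

one_hot = torch.zeros(vocab_size)
one_hot[2] = 1.0                               # one-hot vector for word id 2

by_matmul = one_hot @ E                        # simple matrix multiplication
by_lookup = E[2]                               # equivalent row lookup
print(torch.allclose(by_matmul, by_lookup))    # True
```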
Deep Neural Architecture: Merge
Tip: Concatenate the embeddings of S1 and S2.
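A one-line sketch of the merge step, assuming both sentences have already been embedded; whether the concatenation runs along the time axis (shown here) or the feature axis is an implementation choice not fixed by the slide:

```python
import torch

emb_s1 = torch.randn(4, 8)   # embedded sentence S1 (4 tokens, dim 8)
emb_s2 = torch.randn(6, 8)   # embedded sentence S2 (6 tokens, dim 8)

merged = torch.cat([emb_s1, emb_s2], dim=0)  # concatenate along the time axis
print(merged.shape)          # torch.Size([10, 8])
```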
Deep Neural Architecture: Long Short-Term Memory
Tip: An LSTM is just a special kind of RNN that addresses the difficulties of plain RNNs.
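A minimal usage sketch of an LSTM layer over the merged sequence (all sizes are placeholders):

```python
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)

x = torch.randn(2, 10, 8)            # (batch, time steps, features)
outputs, (h_n, c_n) = lstm(x)        # outputs: per-step hidden states; h_n: final hidden state
print(outputs.shape, h_n.shape)      # torch.Size([2, 10, 16]) torch.Size([1, 2, 16])
```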
Deep Neural Architecture: Dropout
Tip: Dropout sets a random subset of its inputs to zero; it is a form of regularization.
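A sketch of dropout in isolation: during training it zeroes a random subset of activations (and rescales the survivors), while at evaluation time it is the identity:

```python
import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)   # each element is zeroed with probability 0.5 during training

x = torch.ones(1, 8)
drop.train()
print(drop(x))             # roughly half the entries are 0, survivors scaled by 1/(1-p)
drop.eval()
print(drop(x))             # identity at evaluation time
```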
Deep Neural Architecture: Softmax
Tip: The final softmax layer turns the network's output scores into a probability distribution over the classes {+1, 0, -1}.
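Putting the pieces together, a minimal PyTorch sketch of such an architecture; the layer sizes, the stacking of two LSTM+Dropout blocks, and the merge strategy are assumptions read off the slides, not the authors' exact configuration:

```python
import torch
import torch.nn as nn

class CoherenceClassifier(nn.Module):
    def __init__(self, vocab_size=10000, emb_dim=128, hidden=64, num_classes=3):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        self.lstm1 = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.drop1 = nn.Dropout(0.5)
        self.lstm2 = nn.LSTM(hidden, hidden, batch_first=True)
        self.drop2 = nn.Dropout(0.5)
        self.out = nn.Linear(hidden, num_classes)     # classes: +1, 0, -1

    def forward(self, s1_ids, s2_ids):
        merged = torch.cat([s1_ids, s2_ids], dim=1)   # merge the two sentences along time
        x = self.embedding(merged)
        x, _ = self.lstm1(x)
        x = self.drop1(x)
        _, (h_n, _) = self.lstm2(x)
        x = self.drop2(h_n[-1])                       # final hidden state as sequence summary
        return torch.softmax(self.out(x), dim=-1)     # class probabilities

model = CoherenceClassifier()
s1 = torch.randint(0, 10000, (2, 5))   # two toy sentence pairs, word ids only
s2 = torch.randint(0, 10000, (2, 7))
print(model(s1, s2).shape)             # torch.Size([2, 3])
```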
Data
How to collect the required data to train the network?
Binary: correct order vs. wrong order
Ternary: correct order (+1), wrong order (-1), missing context (0)
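Labels of this kind can be produced automatically from any corpus of well-formed texts, simply by keeping or permuting the original sentence order. A sketch of such a self-supervised labelling scheme; how the "missing context" class is constructed here is an assumption for illustration:

```python
import random

def make_pairs(sentences):
    """Build labelled pairs from a document given as an ordered list of sentences."""
    pairs = []
    for a, b in zip(sentences, sentences[1:]):
        pairs.append((a, b, +1))                      # correct order
        pairs.append((b, a, -1))                      # wrong order
    # missing context: two sentences that are not adjacent in the original text
    if len(sentences) > 2:
        i = random.randrange(len(sentences) - 2)
        j = random.randrange(i + 2, len(sentences))
        pairs.append((sentences[i], sentences[j], 0))
    return pairs

doc = ["Hi!", "My name is Alan.", "I am a computer scientist!"]
for s1, s2, label in make_pairs(doc):
    print(label, "|", s1, "|", s2)
```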
Baseline - SVM (Macro-Averaged F1)
          English              German
          Binary   Ternary     Binary   Ternary
SVM       0.24     0.16        0.25     0.16
SVMs: not really appropriate for sequential modeling.
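Macro-averaged F1 computes the F1 score per class and then takes their unweighted mean, so frequent classes do not dominate the score. A quick scikit-learn sketch on toy labels:

```python
from sklearn.metrics import f1_score

y_true = [+1, -1, 0, +1, 0, -1]
y_pred = [+1, -1, 0, -1, 0, +1]

print(f1_score(y_true, y_pred, average="macro"))
```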
Lessons Learned
● Use appropriate tools for sequence modeling
● RNNs are slow. First train on a subset of the data
● Train deep models with lots of data points
● Find a way to automatically annotate data
● Use regularization (be generous)
Thank You For Your Attention