Model
A conditioned language model:
$$P(x_1, \ldots, x_n \mid c) = \prod_{t=1}^{n} P(x_t \mid x_1, \ldots, x_{t-1}, c)$$
Condition each word on the history, as well as on a context c.
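As a minimal sketch of this factorization, the sequence log-probability is the sum of the per-word conditional log-probabilities. The function `toy_next_word_probs` and its probability values are hypothetical stand-ins for a trained model, not something taken from the slides.

```python
import math

def toy_next_word_probs(history, context):
    """Hypothetical stand-in for a trained model: P(word | history, context).

    The history is ignored here to keep the toy model tiny.
    """
    if context == "positive":
        return {"good": 0.6, "bad": 0.1, "</s>": 0.3}
    return {"good": 0.1, "bad": 0.6, "</s>": 0.3}

def sequence_log_prob(words, context, next_word_probs=toy_next_word_probs):
    """Sum log P(x_t | x_1..x_{t-1}, c) over the sequence (chain rule)."""
    log_prob = 0.0
    history = []
    for word in words:
        probs = next_word_probs(tuple(history), context)
        log_prob += math.log(probs[word])
        history.append(word)
    return log_prob

positive_score = sequence_log_prob(["good", "</s>"], "positive")
negative_score = sequence_log_prob(["good", "</s>"], "negative")
```

The same word sequence receives a different score under each context, which is the sense in which the model is conditioned on c.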
Model
In our case, c is a concatenation of the embedding vectors of the parameter values.
c: Personal:False  Sentiment:Positive  Length:≤10  Descriptive:True  Professional:True  Theme:Plot
[Diagram: starting from a start token, the model generates one word at a time: "An entertaining family-friendly and visually attractive story ."]
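A minimal sketch of how such a context vector could be assembled, assuming one embedding vector per (parameter, value) pair. The embedding size, the random initialization, and the names `build_embedding_tables` and `context_vector` are illustrative assumptions; in a real model the embeddings would be learned during training.

```python
import random

EMBEDDING_DIM = 4  # assumed size, chosen only for illustration

PARAMETER_VALUES = {
    "Personal": [True, False],
    "Sentiment": ["Positive", "Neutral", "Negative"],
    "Length": ["<=10", "11-20", "21-40", ">40"],
    "Descriptive": [True, False],
    "Professional": [True, False],
    "Theme": ["Plot", "Acting", "Production", "Effects", "Other"],
}

def build_embedding_tables(parameter_values, dim=EMBEDDING_DIM, seed=0):
    """Create one random embedding vector per (parameter, value) pair."""
    rng = random.Random(seed)
    return {
        (name, value): [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        for name, values in parameter_values.items()
        for value in values
    }

def context_vector(assignment, tables, parameter_order):
    """Concatenate the embeddings of the chosen values in a fixed order."""
    c = []
    for name in parameter_order:
        c.extend(tables[(name, assignment[name])])
    return c

ORDER = list(PARAMETER_VALUES)
tables = build_embedding_tables(PARAMETER_VALUES)
c = context_vector(
    {"Personal": False, "Sentiment": "Positive", "Length": "<=10",
     "Descriptive": True, "Professional": True, "Theme": "Plot"},
    tables,
    ORDER,
)
```

With six parameters and the assumed dimension of 4, c has 24 entries.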
The model is simple, but… we need training data annotated with the appropriate values.
Rotten Tomatoes website. 7,500 movies. 1,002,625 movie reviews.
[Diagram: heuristics over the review text and its metadata extract the parameter values, which are then used to train the model.]
Professional
On Rotten Tomatoes, critic reviews are separated from audience reviews.
Critic reviews: Professional
Audience reviews: Non-Professional
Some of the non-professional reviewers are considered “super reviewers”. Their reviews are also labeled Professional.
Sentiment
Based on: sentiment scores
We normalized the critics' scores to a 0-5 scale:
Negative: 0-2
Neutral: 3
Positive: 4-5
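The binning above can be sketched as follows. How the original scores were rescaled is not stated on the slide, so `normalize_score`, including its linear rescaling and rounding to the nearest integer, is an assumption of this sketch; only the three bins come from the slide.

```python
def normalize_score(score, max_score):
    """Rescale a score from [0, max_score] to an integer on the 0-5 scale.

    Assumption: linear rescaling followed by rounding to the nearest integer.
    """
    if max_score <= 0 or not 0 <= score <= max_score:
        raise ValueError("score must lie in [0, max_score]")
    return round(5 * score / max_score)

def sentiment_label(normalized_score):
    """Map a 0-5 score to the slide's bins: 0-2, 3, 4-5."""
    if normalized_score <= 2:
        return "Negative"
    if normalized_score == 3:
        return "Neutral"
    return "Positive"

label_a = sentiment_label(normalize_score(8, 10))  # 8/10 rescales to 4
label_b = sentiment_label(normalize_score(3, 5))   # already on the 0-5 scale
label_c = sentiment_label(normalize_score(1, 4))   # 1/4 rescales to 1.25, rounds to 1
```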
[Diagram: besides the metadata, the heuristics draw on the text itself: content words, function words, and POS tags.]
Theme
Based on: content words
To determine the value of the theme parameter, we searched for words that are related to the 4 topics and are common in our data set:
Plot: Story, Storytelling, Plot, Script, Manuscript, Tale, Scene
Acting: Acting, Cast, Performance, Play, Role, Miscasting, Actor
Effects: Effects, Song, Music, Voice, Visual, Soundtrack, Shot
Production: Director, Directed, Production, co-production
Each sentence was labeled with the category that has the most words in the sentence. Sentences that do not include any words from our lists are labeled as Other.
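A minimal sketch of this labeling rule. The assignment of words to themes is read off the flattened slide table and may not match the original columns exactly; the tokenization and the tie-breaking rule (the first theme in dictionary order wins) are assumptions, since the slide specifies neither.

```python
import re

# Word lists as read from the slide; the column assignment is an assumption.
THEME_WORDS = {
    "Plot": {"story", "storytelling", "plot", "script", "manuscript", "tale", "scene"},
    "Acting": {"acting", "cast", "performance", "play", "role", "miscasting", "actor"},
    "Effects": {"effects", "song", "music", "voice", "visual", "soundtrack", "shot"},
    "Production": {"director", "directed", "production", "co-production"},
}

def label_theme(sentence):
    """Return the theme with the most matching words, or "Other" if none match."""
    tokens = re.findall(r"[a-z\-]+", sentence.lower())
    counts = {
        theme: sum(token in words for token in tokens)
        for theme, words in THEME_WORDS.items()
    }
    # max() keeps the first theme on ties (assumed tie-breaking rule).
    best_theme = max(counts, key=counts.get)
    return best_theme if counts[best_theme] > 0 else "Other"

theme_a = label_theme("The cast gives a great performance despite a thin plot.")
theme_b = label_theme("It was fine, nothing more.")
```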
Personal Voice
Based on: personal pronouns
To determine whether a review is written in a personal voice, we search for words that express subjectivity.
Personal = True: the text contains "I" or "My"
Personal = False: other cases
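This heuristic can be sketched as a simple token lookup. Only the two words shown on the slide are used; the case-insensitive matching and the regular-expression tokenization are assumptions of this sketch.

```python
import re

PERSONAL_WORDS = {"i", "my"}  # the two words listed on the slide

def is_personal(sentence):
    """True if any whole token is one of the personal-voice words."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return any(token in PERSONAL_WORDS for token in tokens)

personal_example = is_personal("Ultimately, I can honestly say that this movie is dull.")
impersonal_example = is_personal("The film is a refreshing take on family drama.")
```

Matching whole tokens avoids false positives from words that merely contain the letter "i", such as "film".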
Descriptiveness
Based on: distribution of part-of-speech tags
We assume that descriptive texts make heavy use of adjectives.
Descriptive = True: % JJ ≥ 35
Descriptive = False: other cases
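A minimal sketch of the threshold test, assuming the sentence has already been tagged by some POS tagger and is passed in as a list of tags. Counting only the exact tag `JJ` (and not, say, comparative or superlative adjective tags) is a literal reading of the slide and an assumption here; the `adjective_tags` parameter is hypothetical and allows widening it.

```python
def is_descriptive(pos_tags, threshold_percent=35.0, adjective_tags=("JJ",)):
    """True if adjectives make up at least threshold_percent of the tags."""
    if not pos_tags:
        return False  # assumption: an empty sentence is not descriptive
    n_adjectives = sum(tag in adjective_tags for tag in pos_tags)
    return 100.0 * n_adjectives / len(pos_tags) >= threshold_percent

# 2 of 5 tags are JJ (40%), which is above the threshold.
descriptive_example = is_descriptive(["DT", "JJ", "JJ", "NN", "."])
# 1 of 5 tags is JJ (20%), which is below the threshold.
plain_example = is_descriptive(["DT", "JJ", "NN", "VBZ", "RB"])
```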
Length
Based on: length in words
≤ 10 words
11-20 words
21-40 words
> 40 words
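The four bins can be sketched as below. Counting words by whitespace splitting is an assumption; the slides do not say how tokens were counted.

```python
def length_bin(sentence):
    """Map a sentence to one of the four length bins from the slide."""
    n_words = len(sentence.split())  # assumption: whitespace tokenization
    if n_words <= 10:
        return "<=10"
    if n_words <= 20:
        return "11-20"
    if n_words <= 40:
        return "21-40"
    return ">40"

short_bin = length_bin("An entertaining family-friendly and visually attractive story .")
```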
Dataset Statistics
Our final dataset includes 2,773,435 sentences.
We divided the dataset into training (~2.7M), development (~2K), and test (~2K) sets.
Each sentence is labeled with the 6 parameters.
[Diagram: going from text to parameter values (extract) is easy; going from parameter values to text is hard, and is the job of the conditioned language model.]
Does this work?
Examples of Generated Sentences
Example 1: “Ultimately, I can honestly say that this movie is full of stupid stupid and stupid stupid stupid stupid stupid.”
Parameters: Professional: False, Personal: True, Length: 11-20, Descriptive: True, Theme: Other, Sentiment: Negative
Example 2: “The film’s simple, and a refreshing take on the complex family drama of the regions of human intelligence.”
Parameters: Professional: True, Personal: False, Length: 11-20, Descriptive: False, Theme: Other, Sentiment: Positive
We would like to quantitatively measure our model's capabilities.
Evaluation
• Evaluating LM Quality (Perplexity)
• Evaluating the Generated Sentences
Evaluating LM Quality
Sanity Check 1. Conditioned vs. Unconditioned
Does knowing the parameters indeed help in achieving better language modeling results?
Perplexity:
                  Dev    Test
Not-conditioned   25.8   24.4
Conditioned       24.8   23.3
Knowing the correct parameter values indeed results in better perplexity!
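For reference, perplexity is the exponential of the average negative log-probability per token, so lower is better. The sketch below uses the standard definition with natural logarithms; the function name `perplexity` and the toy inputs are illustrative and are not the authors' evaluation code.

```python
import math

def perplexity(token_log_probs):
    """exp of the mean negative natural-log probability per token."""
    if not token_log_probs:
        raise ValueError("need at least one token")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))

# A model that assigns probability 1/4 to every token has perplexity 4.
uniform_ppl = perplexity([math.log(0.25)] * 8)
# Assigning higher probability to the observed tokens lowers perplexity.
sharper_ppl = perplexity([math.log(0.5)] * 8)
```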
Baseline 2. Conditioned vs. Dedicated LMs
Is our model effective compared to training a separate unconditioned LM on a subset of the data (dedicated LM)?
[Diagram: the data set is split by parameter value, e.g. Sentiment:Positive, Sentiment:Neutral, Sentiment:Negative, with one dedicated LM per subset.]
When generating text, we would choose the model that corresponds to the requested value.
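The dedicated-LM baseline can be sketched as "partition, train one model per subset, select by requested value". In this sketch a unigram word counter stands in for an unconditioned LM purely for illustration; the function name and the toy data are hypothetical.

```python
from collections import Counter, defaultdict

def train_dedicated_models(labeled_sentences, parameter):
    """Split the data by the value of one parameter and fit one model per subset.

    A unigram Counter is a stand-in for a real unconditioned LM.
    """
    subsets = defaultdict(list)
    for sentence, labels in labeled_sentences:
        subsets[labels[parameter]].append(sentence)
    return {
        value: Counter(word for s in sentences for word in s.lower().split())
        for value, sentences in subsets.items()
    }

toy_data = [
    ("a great story", {"Sentiment": "Positive"}),
    ("great cast", {"Sentiment": "Positive"}),
    ("a dull story", {"Sentiment": "Negative"}),
]
models = train_dedicated_models(toy_data, "Sentiment")
chosen_model = models["Positive"]  # pick the model for the requested value
```

Each dedicated model sees only its own subset of the data, whereas the conditioned model is trained on all of it.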