log ( parseProb ) (Alex) log ( parseProb / trigramProb ) (Anoop) - PowerPoint PPT Presentation

Features
• “Implicit” Syntax
• Shallow Syntax (POS, chunks)
• Deep Syntax (trees)
• Tricky Syntax (tree fragments)
Syntax for Statistical MT JHU 2003 WS

Deep Syntax
• What is deep? Use of parser output
• Why parser? Grammaticality can be measured by parse trees
• How to use parser output?
  – simple features
  – model-based features
  – dependency-based features
  – tree fragments

Simple features: Parser score
Motivation: grammatical sentences should have higher parse prob.
Feature Functions:
• log ( parseProb ) (Alex)
• log ( parseProb / trigramProb ) (Anoop)
Result: worse than baseline
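A minimal sketch of the two feature functions above, assuming each hypothesis already carries a parser log-probability and a trigram LM log-probability in the same log base. The function name and the sample trigram value are hypothetical; the feature names follow the labels used on the Results slide.

```python
def parser_score_features(parse_logprob, trigram_logprob):
    """Return the two parser-score features for one hypothesis.

    Both inputs are log-probabilities, so the ratio
    log(parseProb / trigramProb) becomes a difference of logs.
    """
    return {
        "ParseProb": parse_logprob,
        "ParseProbDivLM": parse_logprob - trigram_logprob,
    }


# -147.2 is the "produced" parser log-prob reported in these slides;
# -120.0 is a made-up trigram log-prob, for illustration only.
features = parser_score_features(-147.2, -120.0)
```

Working in log space avoids underflow, since sentence-level probabilities are far too small to divide directly.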

Does the parser give high probability to grammatical sentences?
Parser LogProb for produced/oracle/reference sentences (Shankar)

            log ( parseProb )
produced    -147.2
oracle      -148.5
ref 1       -148.0
ref 2       -157.5
ref 3       -155.6
ref 4       -158.6

Other simple parse-tree features
Motivation: grammatical sentences should have specific tree shape.
Feature Functions: (Anoop)
• right branching factor
• tree depth
• num. of PPs
• VP probs
• ...
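A rough sketch of two of the tree-shape features, tree depth and number of PPs. It assumes a parse tree stored as nested tuples of the form (label, child, ...) with words as plain strings; that representation, the function names, and the sample tree are all illustrative. The right branching factor and VP probs are left out because the slides do not define them.

```python
def tree_depth(tree):
    """Depth of a tree; a bare word counts as depth 0."""
    if isinstance(tree, str):
        return 0
    return 1 + max((tree_depth(child) for child in tree[1:]), default=0)


def count_label(tree, label):
    """Number of nodes in the tree whose label equals `label`."""
    if isinstance(tree, str):
        return 0
    here = 1 if tree[0] == label else 0
    return here + sum(count_label(child, label) for child in tree[1:])


# Hypothetical parse tree, not taken from the workshop data.
tree = ("S",
        ("NP", "Headquarters"),
        ("VP",
         ("VB", "gave"),
         ("NP", "supplies"),
         ("PP", ("IN", "to"), ("NP", "troops"))))

depth = tree_depth(tree)
num_pps = count_label(tree, "PP")
```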

Model-based features
Translation Model as Feature Function
• Originally developed as a standalone model P ( f | e )
  – Syntax-based model for parse trees
• P ( f | e ) can be used as a feature value
  – Tree-based models represent systematic difference between two languages’ grammar
    ∗ e.g. SVO vs. verb-final word order
    ∗ constituents (e.g. NP) tend to move as a unit
• Better translation should yield higher probs
• featureVal = log [ P ( f | e )]
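One way such a feature value might be used is sketched below: each n-best hypothesis carries named feature values, and the hypothesis with the highest weighted sum is chosen. The weighted-sum combination, the feature names, the weights, and the numbers are assumptions for illustration, not the workshop's actual setup.

```python
def rescore(nbest, weights):
    """Return the hypothesis with the highest weighted feature sum."""
    def score(hyp):
        return sum(weights[name] * value
                   for name, value in hyp["features"].items())
    return max(nbest, key=score)


# Hypothetical 2-best list; "tm_logprob" stands in for log P(f | e).
nbest = [
    {"text": "hypothesis a", "features": {"baseline": -10.0, "tm_logprob": -5.0}},
    {"text": "hypothesis b", "features": {"baseline": -10.5, "tm_logprob": -3.0}},
]
weights = {"baseline": 1.0, "tm_logprob": 0.5}  # made-up weights

best = rescore(nbest, weights)
```

Here hypothesis b wins (-12.0 against -12.5) even though its baseline score is lower, which is the effect a useful extra feature is meant to have.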

Syntax-based Translation Model
Tree-based probability model for translation
• Early work:
  – Inversion Transduction Grammar [Wu 1997]
  – Bilingual Head Automata [Alshawi et al. 2000]
• Tree-to-String [Yamada & Knight 2001]
• Tree-to-Tree [Gildea 2003]

Syntax-based Translation Model (cont)
Probabilistic operations on a parse tree:
• Reorder
• Insert
• Translate
• Merge
• Clone
Parameters are estimated from training pairs (Tree/Tree, Tree/String) using the EM algorithm.

Tree-to-String Alignment (Yamada & Knight 2001)
[Figure: parse tree S → NP1 NP2 NP3 VB4 over "Chu-Ka Kong-Keup-Mul 103 Tae-Tae Sa-Ryeong-Pu Cu", reordered to S → NP3 VB4 NP2 NP1 over "Sa-Ryeong-Pu Cu 103 Tae-Tae Chu-Ka Kong-Keup-Mul"]
re-order step: P_r ( 3, 4, 2, 1 | S ⇒ NP NP NP VB )
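The re-order step can be sketched as applying a child permutation and looking up its probability given the parent rule. The table layout and the probability value below are made up for illustration; in the model these parameters are estimated from data.

```python
def reorder(children, order):
    """Apply a 1-based child order such as (3, 4, 2, 1) to a list of children."""
    return [children[i - 1] for i in order]


# Hypothetical reorder table: P_r(order | parent rule). The 0.5 is invented.
reorder_prob = {("S -> NP NP NP VB", (3, 4, 2, 1)): 0.5}

children = ["NP1", "NP2", "NP3", "VB4"]
order = (3, 4, 2, 1)

reordered = reorder(children, order)
prob = reorder_prob[("S -> NP NP NP VB", order)]
```

With the order (3, 4, 2, 1), the verb-final child sequence becomes NP3 VB4 NP2 NP1, matching the reordered tree on the slide.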

Tree-to-String Alignment 2
[Figure: reordered tree S → NP VB NP NP over "Sa-Ryeong-Pu Cu 103 Tae-Tae Chu-Ka Kong-Keup-Mul" with the word "the" inserted]
insertion step: P_ins ( the ) P ( ins | NP )
[Figure: translated tree S → NP VB NP NP over "Headquarters gave the 103rd battalion additional supplies"]
translation step: P_t ( give | Cu )

Tree-to-Tree Alignment
[Figure: four panels labeled "Chinese tree", "Merge/Split nodes", "Lexical Translation", and "Reorder", showing a tree over the Chinese words xianzhu, chengjiu, chengshi, jianshe, shisi, Zhongguo, bianjing, kaifang, jingji, ge being transformed; English words shown include marked, cities, achievements, ’s, 14, open, border, economic, China]

Cloning example
[Figure: Korean parse tree (nodes S, VP, NP, LV, NNC, NNX, VV) over the words Myeoch, Su-Kap, Khyeol-Re, Ssik, Ci-Keup, Pat, Ci, aligned to the English words how, many, pairs, gloves, each, you, issued, with several nodes aligned to NULL]

Problems
• n-best lists don’t contain big word jumps
  – reordering at upper nodes is useless
• English/Chinese word order is almost the same
  – both SVO in general
  – but in Chinese the relative clause comes before the noun
• Computationally expensive
  – use word-level alignment from MT output
  – limit by sentence length and fanout
  – break up long sentences into small fragments (machete)

Experiments
Tree-to-String (Kenji, Anoop)
• Trained on 3M words of parallel text
  – English side parsed by Collins
• Max sentence length 20 Chinese characters
  – 273/993 sentences covered
Tree-to-Tree (Dan, Katherine)
• Trained on 40,000 biparsed FBIS sentences
• Max fan-out 6, max sentence length 60
  – 525/993 sentences covered

Results

                  BLEU%
Baseline          31.6
ParseProb         31.6
ParseProbDivLM    31.0
RightBranching    31.6
TreeDepth         31.5
numPPs            31.3
VPProb            31.3
Tree-to-String    31.7
Tree-to-Tree      31.6

Lessons / Directions
• Feature combination: BLEU 31.6 → 33.2
• But two thirds of improvement from lexical probs (IBM model 1)
• Hard to use off-the-shelf taggers, parsers, etc.
• Limitations of rescoring n-best lists: syntax-based decoders
• Problems with evaluation metric:
  – human evaluation
  – syntax-based measures
