Learning to Translate with Multiple Objectives
Kevin Duh (NAIST), Katsuhito Sudoh (NTT), Xianchao Wu (Baidu), Hajime Tsukada (NTT), Masaaki Nagata (NTT)
How many metrics have been proposed for MT evaluation?
RIBES, DepOverlap, IMPACT, RTE, NIST, TER, BLEU, WER, RED, ParaEval, GTM, PER, TESLA, METEOR, SemPos, NCT, SEPIA
How many metrics are used for MT optimization?
BLEU
Metrics for Evaluation: RIBES, DepOverlap, IMPACT, WER, TER, BLEU, NIST, RED, GTM, RTE, TESLA, ParaEval, PER, METEOR, SemPos, NCT, SEPIA
Metrics for Optimization: BLEU
Each metric has its strengths.
→ Optimize with multiple metrics
Outline
1. Motivation
2. Basic Concepts: Pareto optimality
3. Multiobjective optimization in MT
4. Experiments
Multiobjective optimization
$$\max_{w} \; [F_1(w), F_2(w), \ldots, F_K(w)]$$
Find one w that simultaneously optimizes K objectives.
But what does it mean to be "optimum"?
Multiobjective optimization of your ACL Hotel
Hotel               | Customer Reviews | Distance to Conference Center | Price (KRW)
The Shilla Jeju     | 4.5 stars        | 5 minutes                     | 230,000
Hotel Lotte Jeju    | 4.5 stars        | 5 minutes                     | 200,000
Poonglim Resort     | 3 stars          | 10 minutes                    | 120,000
Hana Hotel          | 3 stars          | 5 minutes                     | 120,000
Gyulhyanngi Pension | 2 stars          | 10 minutes                    | 90,000
Vilfredo Pareto, Economist (1848-1923): "You're irrational! That choice is not Pareto Optimal!"
How to define optimality
[Figure: scatter plot of points A-G; axes: Objective 1 and Objective 2, each ranging from 0.1 to 0.5]
A point p is weakly Pareto-optimal iff there does not exist another point q such that $F_k(q) > F_k(p)$ for all k.
[Figure: scatter plot of points A-G; axes: Objective 1 and Objective 2, each ranging from 0.1 to 0.5]
A point p is Pareto-optimal iff there does not exist a q such that $F_k(q) \ge F_k(p)$ for all k and $F_k(q) > F_k(p)$ for at least one k.
[Figure: scatter plot of points A-G; axes: Objective 1 and Objective 2, each ranging from 0.1 to 0.5; annotations mark "Pareto & Weakly-Pareto" points and "Weakly-Pareto" points]
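The two definitions differ only in how strict the comparison is. A minimal Python sketch, assuming every objective is to be maximized; the function names, point names, and coordinates are hypothetical and are not the A-G points of the figure:

```python
from typing import Iterable, Sequence

Point = Sequence[float]


def strictly_better_everywhere(q: Point, p: Point) -> bool:
    """True if F_k(q) > F_k(p) for all k."""
    return all(qk > pk for qk, pk in zip(q, p))


def dominates(q: Point, p: Point) -> bool:
    """True if F_k(q) >= F_k(p) for all k and F_k(q) > F_k(p) for some k."""
    return (all(qk >= pk for qk, pk in zip(q, p))
            and any(qk > pk for qk, pk in zip(q, p)))


def is_weakly_pareto_optimal(p: Point, points: Iterable[Point]) -> bool:
    """No other point beats p on every objective."""
    return not any(strictly_better_everywhere(q, p) for q in points)


def is_pareto_optimal(p: Point, points: Iterable[Point]) -> bool:
    """No other point dominates p."""
    return not any(dominates(q, p) for q in points)


# Hypothetical points given as (objective 1, objective 2).
points = {"a": (0.1, 0.5), "b": (0.3, 0.5), "c": (0.5, 0.2), "d": (0.2, 0.2)}
candidates = list(points.values())

for name, p in points.items():
    print(name,
          is_weakly_pareto_optimal(p, candidates),
          is_pareto_optimal(p, candidates))
```

In this toy set, point a is weakly Pareto-optimal but not Pareto-optimal: b ties it on objective 2 and beats it on objective 1, yet no point beats it on both. Point d is neither, because b is better on both objectives.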
Given a set of points, the subset of Pareto-optimal points forms the Pareto Frontier.
[Figure: scatter plot of points A-G; axes: Objective 1 and Objective 2, each ranging from 0.1 to 0.5]
Optimization in Machine Translation
[Diagram: tuning loop with the components Sentence from Development Set, Decode, N-best, Reference, Evaluation Metrics, Optimization, Weights]
Baseline: Linear Combination
$$\max_{w} \sum_{k=1}^{K} \alpha_k F_k(w), \qquad \alpha_k \ge 0, \quad \sum_{k=1}^{K} \alpha_k = 1$$
$\alpha_k$: importance of each objective
Advantages:
1. Single-objective tools can be used
2. Sufficiency: If w* is a solution, then it is weakly Pareto
Disadvantages:
1. How to set α?
2. No necessary conditions: some Pareto points can never be obtained, whatever the setting of α.
Pareto points not on Convex Hull are missed
[Figure: scatter plot of points A-G; axes: Objective 1 and Objective 2, each ranging from 0.1 to 0.5; annotated with the ranges $0 \le \alpha_1 \le 0.5$, $0.5 \le \alpha_1 \le 1$, and $\alpha_1 = 1$]
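The missed-point effect can be reproduced on a toy example. The sketch below is only an illustration under stated assumptions: it picks the best of a fixed set of candidate points for each α_1 (with α_2 = 1 - α_1) rather than optimizing over w, and the three points and the helper name `linear_combination_winner` are hypothetical. All three points are Pareto-optimal, but p3 lies inside the convex hull of p1 and p2.

```python
from typing import Dict, Tuple

# Hypothetical candidates as (objective 1, objective 2); higher is better.
points: Dict[str, Tuple[float, float]] = {
    "p1": (0.1, 0.5),
    "p2": (0.5, 0.1),
    "p3": (0.25, 0.25),  # Pareto-optimal, but not on the convex hull
}


def linear_combination_winner(points: Dict[str, Tuple[float, float]],
                              alpha1: float) -> str:
    """Return the point maximizing alpha1 * F_1 + (1 - alpha1) * F_2."""
    alpha2 = 1.0 - alpha1
    return max(points,
               key=lambda name: alpha1 * points[name][0] + alpha2 * points[name][1])


# Sweep alpha1 over 0.00, 0.01, ..., 1.00 and record every winner.
winners = {linear_combination_winner(points, i / 100) for i in range(101)}
print(sorted(winners))  # ['p1', 'p2']
```

The combined score of p3 is 0.25 for every α_1, while the better of p1 and p2 always scores at least 0.3, so no setting of α selects p3.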
New method: Directly optimize Pareto Front
Step 1: Compute Pareto Frontier on N-best List
  Complexity O(#objectives * N^2)
Step 2: Find w separating Pareto vs. Non-Pareto
[Figure: scatter plot of points A-G; axes: Objective 1 and Objective 2, each ranging from 0.1 to 0.5]
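Step 1 can be sketched as a plain double loop, which matches the stated O(#objectives * N^2) cost: each of the N hypotheses is compared against every other hypothesis on all K metrics. This is a minimal sketch, assuming higher scores are better; the function name `pareto_frontier` and the (BLEU, RIBES) values are hypothetical.

```python
from typing import List, Sequence


def pareto_frontier(scores: Sequence[Sequence[float]]) -> List[int]:
    """Return indices of hypotheses that no other hypothesis dominates.

    scores[i][k] is the value of metric k for hypothesis i.
    """
    frontier = []
    for i, p in enumerate(scores):
        dominated = False
        for j, q in enumerate(scores):
            if j == i:
                continue
            # q dominates p: at least as good everywhere, better somewhere.
            if (all(qk >= pk for qk, pk in zip(q, p))
                    and any(qk > pk for qk, pk in zip(q, p))):
                dominated = True
                break
        if not dominated:
            frontier.append(i)
    return frontier


# Hypothetical sentence-level (BLEU, RIBES) scores for a 5-best list.
nbest_scores = [
    (0.30, 0.70),
    (0.28, 0.75),
    (0.25, 0.72),
    (0.30, 0.65),
    (0.20, 0.60),
]
print(pareto_frontier(nbest_scores))  # [0, 1]
```

Hypotheses with identical scores do not dominate each other under this definition, so exact ties all stay on the frontier.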
Multi-objective Pairwise Ranking Optimization
$$\min_{w} \; \|w\|^2 + c \sum_{ij} \xi_{ij}$$
$$\text{s.t.} \quad w^T \Phi(x, y_i) - w^T \Phi(x, y_j) \ge 1 - \xi_{ij} \qquad \forall \, y_i \in \text{ParetoFront}, \; y_j \notin \text{ParetoFront}$$
$\|w\|^2$: regularizer; $\xi_{ij}$: slack; $\Phi$: feature vector; $x$: input sentence; $y_i$: good hypothesis; $y_j$: poor hypothesis
i.e., the score of a Pareto hypothesis should be higher than that of non-Pareto hypotheses
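One way to see what this objective asks for is to rewrite the slack terms as hinge losses (which assumes the usual soft-margin condition that slacks are non-negative) and minimize by subgradient descent. The sketch below is only an illustration of that idea, not the optimizer used in the experiments; the function names, the hyperparameters `c`, `lr`, `epochs`, and the toy feature vectors are all hypothetical.

```python
from typing import List, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def train_pairwise_ranker(features: Sequence[Sequence[float]],
                          pareto_idx: Sequence[int],
                          c: float = 1.0,
                          lr: float = 0.01,
                          epochs: int = 200) -> List[float]:
    """Minimize ||w||^2 + c * sum_ij max(0, 1 - w.(Phi_i - Phi_j)).

    i ranges over Pareto hypotheses, j over non-Pareto hypotheses.
    """
    pareto = set(pareto_idx)
    # One difference vector Phi(x, y_i) - Phi(x, y_j) per (Pareto, non-Pareto) pair.
    diffs = [[a - b for a, b in zip(features[i], features[j])]
             for i in sorted(pareto)
             for j in range(len(features)) if j not in pareto]
    w = [0.0] * len(features[0])
    for _ in range(epochs):
        grad = [2.0 * wk for wk in w]  # gradient of the regularizer
        for d in diffs:
            if dot(w, d) < 1.0:  # margin violated: hinge term is active
                grad = [gk - c * dk for gk, dk in zip(grad, d)]
        w = [wk - lr * gk for wk, gk in zip(w, grad)]
    return w


# Hypothetical feature vectors for a 4-best list; hypotheses 0 and 1
# are assumed to be on the Pareto frontier.
features = [[1.0, 0.2], [0.9, 0.3], [0.2, 0.8], [0.1, 0.9]]
pareto_idx = [0, 1]
w = train_pairwise_ranker(features, pareto_idx)
model_scores = [dot(w, f) for f in features]
print(model_scores)
```

After training on this toy list, both Pareto hypotheses receive a higher model score than both non-Pareto hypotheses, which is the ranking the constraints encode.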
Experiment Setup
Task 1: NIST Zh-En
  Optimize BLEU & NTER
  NTER = max(1 - TER, 0)
  Moses decoder, 7M train sentences, 1.6k dev, 8 features
Task 2: PubMed En-Ja
  Optimize BLEU & RIBES
  RIBES = permutation metric [Isozaki, EMNLP10]
  Moses decoder, 0.2M train sentences, 2k dev, 14 features
• Compare Linear Combination vs. Pareto
  – Both use pairwise rank optimization, but with different objectives.
  – For Linear Combination, multiple α settings (α_1 = {1, 0.7, 0.5, 0.3, 0})
  – 5 runs, 20 iterations each. Collect/visualize set of solutions.
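The NTER definition turns TER, an error rate where lower is better and which can exceed 1, into a score where higher is better, so it can be maximized alongside BLEU. A one-function sketch of the formula as stated in the setup (the function name `nter` is hypothetical):

```python
def nter(ter: float) -> float:
    """NTER = max(1 - TER, 0), clipped at zero when TER exceeds 1."""
    return max(1.0 - ter, 0.0)


for ter in (0.0, 0.4, 1.3):
    print(ter, nter(ter))
```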
Result Visualization
[Figure: result plot with example solutions labeled (α_1 = 0.5, α_2 = 0.5) and (α_1 = 1, α_2 = 0)]
NIST Result and PubMed Result
[Figures: result plots for the NIST task and the PubMed task]
OBSERVATIONS:
1. Pareto > Linear Combination for any α
2. Metric tunability: Pareto outperforms single-objective optimization of RIBES
Analysis: Number of Pareto Points
Analysis: Metric Tunability
Sampling of 10k random w's
[Figure: plot with axes RIBES and BLEU]
Summary & Final Thoughts
Metrics for Evaluation: RIBES, DepOverlap, IMPACT, WER, TER, BLEU, NIST, RED, GTM, RTE, TESLA, ParaEval, PER, METEOR, SemPos, NCT, SEPIA
Metrics for Optimization: BLEU
Metrics for Evaluation and Optimization: RIBES, DepOverlap, IMPACT, WER, TER, BLEU, NIST, RED, GTM, RTE, TESLA, ParaEval, PER, METEOR, SemPos, NCT, SEPIA
Vilfredo Pareto (1848-1923)