Predicting structures: Practical concerns (CS 6355: Structured Prediction)

  1. Predicting structures: Practical concerns. CS 6355: Structured Prediction

  2. So far…
     • What are structures?
       – A graph
       – A collection of parts that are scored jointly
       – A collection of interconnected decisions
     • Conditional models
       – We want to convert some input to an output
       – Model the conditional distribution of the output
       – Score groups of inter-connected variables
     • Algorithms for learning
       – Local vs. global learning
       – Different algorithms
     • Inference algorithms
       – Predicting the final output
       – Different algorithms, tradeoffs

  8. This lecture: Using the tools: Practical concerns
     • We want to solve a task. Many choices ahead!
     • What are structures? (What is the graph?)
       – A graph
       – A collection of parts that are scored jointly
       – A collection of interconnected decisions
     • Conditional models (Modeling our problem? Identifying variables? Identifying groups that are scored together, i.e. factors? What are features?)
       – We want to convert some input to an output
       – Model the conditional distribution of the output
       – Score groups of inter-connected variables
     • Algorithms for learning (The best way to learn?)
       – Local vs. global learning
       – Different algorithms
     • Inference algorithms (What inference algorithm?)
       – Predicting the final output
       – Different algorithms, tradeoffs

  9. Modeling your problem
     • Understand the problem: What should your program produce?
       – Is there data? Very often, the answer is no. ☹
     • What are the decisions/random variables that constitute the output?
     • How do they interact? Identifying factors/parts
       – Some interactions are natural, some are spurious (specific to your small collection of data)
       – Some interactions make inference impossible for computational reasons
       – What are the feature representations?
     • Learning
       – What are the scoring functions?
       – Should every scoring function be jointly learned?
       – Perhaps learn sub-sections independently and put them together with inference at the end
       – Which learning algorithm?
     • Inference
       – What algorithm? How expensive is it?
       – Exact or approximate?

  12. Example 0: Named Entity Recognition
      Goal: To identify persons, locations and organizations in text
      [Organization: Facebook] CEO [Person: Mark Zuckerberg] announced new privacy features in the conference in [Location: San Francisco]
      Design choices:
      1. What is the set of decisions the predictor needs to make?
      2. How do these decisions interact? Factors?
      3. Features? Factor potentials/scoring functions?
      4. Learning? Inference?

  13. Example 0: Named Entity Recognition
      Goal: To identify persons, locations and organizations in text
      What is the set of decisions the predictor needs to make?
      One option: Label spans of text. Each candidate span is a row, each label (PER, LOC, ORG, NONE) is a column, and every row has one ✓ and three ✗:
      – Facebook
      – Facebook CEO
      – Facebook CEO Mark
      – Facebook CEO Mark Zuckerberg
      – …
      – Mark Zuckerberg
      – …
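The span-labeling option can be sketched as follows. This is a minimal illustration, not the lecture's code: the function name `enumerate_spans` and the `max_len` cutoff are assumptions introduced here, and spans are written as `(start, end)` token offsets with `end` exclusive.

```python
from typing import List, Tuple

# Label set taken from the slide's table columns.
LABELS = ["PER", "LOC", "ORG", "NONE"]


def enumerate_spans(tokens: List[str], max_len: int) -> List[Tuple[int, int]]:
    """Return every (start, end) span, end exclusive, of at most max_len tokens.

    max_len is a hypothetical cutoff; without it there are n(n+1)/2 spans.
    """
    spans = []
    for start in range(len(tokens)):
        last_end = min(start + max_len, len(tokens))
        for end in range(start + 1, last_end + 1):
            spans.append((start, end))
    return spans


tokens = "Facebook CEO Mark Zuckerberg announced".split()
spans = enumerate_spans(tokens, max_len=4)

# Each span is one decision: it receives exactly one of the four labels.
for start, end in spans[:4]:
    print(" ".join(tokens[start:end]), "->", LABELS)
```

Every span becomes one output variable with four possible values, so the number of decisions grows quadratically with sentence length unless span length is capped.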

  15. Example 0: Named Entity Recognition
      Goal: To identify persons, locations and organizations in text
      How do the decisions interact?
      A single word can have only one label. In the span-by-label table (columns PER, LOC, ORG, NONE; one row per candidate span), the ✓ entries for "Facebook" and for "Facebook CEO" are disallowed together, because both spans contain the word "Facebook". All other entries are still undecided (?).
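The "disallowed together" interaction can be written as a consistency check over a candidate labeling. The sketch below assumes spans are `(start, end)` token offsets with `end` exclusive and that NONE means "not an entity"; the helper names are hypothetical.

```python
from typing import Dict, Tuple

Span = Tuple[int, int]  # (start, end), end exclusive


def spans_overlap(a: Span, b: Span) -> bool:
    """True if the two spans share at least one token."""
    return a[0] < b[1] and b[0] < a[1]


def is_consistent(labeling: Dict[Span, str]) -> bool:
    """A labeling is consistent if no two entity spans share a word."""
    entity_spans = [span for span, label in labeling.items() if label != "NONE"]
    for i, a in enumerate(entity_spans):
        for b in entity_spans[i + 1:]:
            if spans_overlap(a, b):
                return False
    return True


# Tokens: 0=Facebook, 1=CEO, 2=Mark, 3=Zuckerberg
bad = {(0, 1): "ORG", (0, 2): "ORG"}   # "Facebook" and "Facebook CEO"
good = {(0, 1): "ORG", (0, 2): "NONE", (2, 4): "PER"}
print(is_consistent(bad), is_consistent(good))
```

The pairwise check is quadratic in the number of entity spans, which is acceptable for validating one labeling but is not itself an inference procedure.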

  16. Example 0: Named Entity Recognition
      Goal: To identify persons, locations and organizations in text
      Features? Factor potentials/scoring functions?
      Score(span, label)
      • Could be linear in features
      • Could be a neural network
      (The slide repeats the span-by-label table, with the ✓ entries for "Facebook" and "Facebook CEO" marked as disallowed together.)
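A scoring function that is linear in features could look like the sketch below. The feature templates and weight values are invented for illustration and are not from the lecture; only the form Score(span, label) = w · φ(span, label) reflects the slide.

```python
from collections import Counter
from typing import Dict, List, Tuple


def features(tokens: List[str], span: Tuple[int, int], label: str) -> Counter:
    """Hypothetical sparse feature vector phi(span, label)."""
    start, end = span
    words = tokens[start:end]
    feats = Counter()
    feats[f"label={label}"] += 1
    feats[f"first_word={words[0].lower()}|label={label}"] += 1
    all_caps = all(w[0].isupper() for w in words)
    feats[f"all_capitalized={all_caps}|label={label}"] += 1
    feats[f"length={len(words)}|label={label}"] += 1
    return feats


def score(weights: Dict[str, float], tokens: List[str],
          span: Tuple[int, int], label: str) -> float:
    """Linear score: dot product of the weights with phi(span, label)."""
    phi = features(tokens, span, label)
    return sum(weights.get(name, 0.0) * value for name, value in phi.items())


tokens = ["Mark", "Zuckerberg", "announced"]
# Illustrative weights; in practice these would be learned.
weights = {"all_capitalized=True|label=PER": 1.5, "length=2|label=PER": 0.5}
print(score(weights, tokens, (0, 2), "PER"))
```

Swapping in a neural network would change only the body of `score`; the rest of the pipeline still consumes one number per (span, label) pair.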

  17. Example 0: Named Entity Recognition
      Goal: To identify persons, locations and organizations in text
      Learning and inference
      • Various learning regimes
      • Various inference algorithms
      (The slide repeats the span-by-label table, with the ✓ entries for "Facebook" and "Facebook CEO" marked as disallowed together.)
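As one example of an inference algorithm for the span-labeling model, the sketch below picks the highest-scoring set of non-overlapping entity spans with a dynamic program (the same idea as weighted interval scheduling). It is not taken from the slides. It assumes each candidate span already carries its best entity label and a score measured relative to labeling it NONE, so unlisted spans contribute zero; the function name and input format are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def best_nonoverlapping(
    n: int, span_scores: Dict[Tuple[int, int], Tuple[str, float]]
) -> Tuple[float, List[Tuple[int, int, str]]]:
    """Exact inference over n tokens.

    span_scores maps (start, end), end exclusive, to (label, score).
    best[i] is the best total score using only the first i tokens.
    """
    best = [0.0] * (n + 1)
    back: List[Optional[Tuple[int, str]]] = [None] * (n + 1)
    for end in range(1, n + 1):
        # Option 1: token end-1 is not the last word of any entity.
        best[end] = best[end - 1]
        # Option 2: some entity span finishes exactly at `end`.
        for start in range(end):
            if (start, end) in span_scores:
                label, s = span_scores[(start, end)]
                if best[start] + s > best[end]:
                    best[end] = best[start] + s
                    back[end] = (start, label)
    # Follow back-pointers from the end of the sentence.
    entities = []
    end = n
    while end > 0:
        if back[end] is None:
            end -= 1
        else:
            start, label = back[end]
            entities.append((start, end, label))
            end = start
    entities.reverse()
    return best[n], entities


# Tokens: 0=Facebook, 1=CEO, 2=Mark, 3=Zuckerberg (scores are made up)
scores = {(0, 1): ("ORG", 2.0), (0, 2): ("ORG", 1.0),
          (2, 3): ("PER", 1.0), (2, 4): ("PER", 3.0)}
total, entities = best_nonoverlapping(4, scores)
print(total, entities)
```

The run time is quadratic in sentence length, so this particular constraint keeps exact inference cheap; richer interactions between spans may force approximate methods.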

  20. Example 0: Named Entity Recognition
      Goal: To identify persons, locations and organizations in text
      A different modeling choice: One label per word
      Facebook CEO Mark Zuckerberg announced new privacy features in the conference in San Francisco
      Labels:
      – B-org = Start of organization
      – B-loc = Start of location
      – B-per = Start of person
      – I-loc = In location
      – I-per = In person
      – O = Not a named entity
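The one-label-per-word encoding can be derived mechanically from labeled spans. The sketch below assumes the entity spans from the earlier annotated sentence (Facebook as organization, Mark Zuckerberg as person, San Francisco as location); the per-word tags themselves are not in the extracted slide text, so the output is an illustration of the scheme rather than a copy of the slide. The function name `spans_to_bio` is hypothetical.

```python
from typing import List, Tuple


def spans_to_bio(n: int, entities: List[Tuple[int, int, str]]) -> List[str]:
    """Convert non-overlapping (start, end, label) spans, end exclusive, to B-/I-/O tags."""
    tags = ["O"] * n
    for start, end, label in entities:
        tags[start] = f"B-{label}"        # first word of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"        # remaining words of the entity
    return tags


sentence = ("Facebook CEO Mark Zuckerberg announced new privacy features "
            "in the conference in San Francisco")
tokens = sentence.split()
entities = [(0, 1, "org"), (2, 4, "per"), (12, 14, "loc")]
tags = spans_to_bio(len(tokens), entities)

for word, tag in zip(tokens, tags):
    print(f"{word}\t{tag}")
```

With this encoding the "one label per word" constraint holds by construction, and the interactions move to adjacent tags (for example, an I-per tag should follow B-per or I-per).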
