String Extension Learning
Jeffrey Heinz, heinz@udel.edu
University of Delaware
Association for Computational Linguistics, Uppsala, Sweden, July 13, 2010
Outline: Intro, Defining learning, S.E. language classes, Examples, Lattices, Conclusion

How can something learn?
1. How do people generalize beyond their experience?
2. How can any thing that computes generalize beyond its experience?
• Linguistics
• Computer Science
• Artificial Intelligence
• Natural Language Processing
• Psychology
• Language Acquisition
• Philosophy
• ...

This talk
1. Provide a recipe for constructing classes of languages (families of possible generalizations): $f \to \mathcal{L}_f$.
2. Show that each of these classes is identifiable in the limit from positive data (Gold 1967) by learners that are incremental, globally consistent, locally conservative, and set-driven.
3. Reveal the lattice structure underlying these classes of languages and how the learner "climbs the lattice" (Kasprzik and Koetzing 2010).
Each string in the language maps to a part of the grammar, which in turn generates a set of strings in the language.

String Extension Learners: Advantages
1. Distribution-free learning is a more difficult learning criterion than non-distribution-free learning (Gold 1967, Horning 1969, Valiant 1984, Angluin 1988, Blumer et al. 1989).
2. Learners are very simple (useful pedagogically).
3. Learners are efficient in the size of the learning sample if $f$ is efficient in the size of the word.
4. String extension learnable classes include ones of infinite size and ones that contain context-sensitive languages.

String Extension Learners: Advantages (continued)
5. Provides a unified learning-theoretic analysis of many previously discussed learnable classes:
   • Locally k-Testable
   • Piecewise k-Testable
   • Strictly k-Local
   • Strictly k-Piecewise
   • Strongly Testable
   • Definite
   (McNaughton and Papert 1971; Rogers and Pullum, to appear; Simon 1975; Rogers et al. 2009; Beauquier and Pin 1991; Brzozowski 1962)
6. Probabilistic versions of many of these classes exist (e.g., the previous talk).
7. Different perspective: modular learning.

Identification in the Limit from Positive Data
• A text $t$ is an infinite sequence $t(0), t(1), \ldots$ with each $t(i) \in \Sigma^* \cup \{\#\}$. The symbol # is a pause.
• $t[i]$ denotes the finite sequence $t(0), t(1), \ldots, t(i)$.
• The content of text $t$ is the set $\{ t(i) : i \in \mathbb{N} \}$.
• $t$ is a positive text for a language $L$ iff $content(t) = L$.
• Following Jain et al. 1999, let SEQ be the set of all possible $t[i]$.
• A learner is a function $\phi : SEQ \to \mathcal{G}$. The elements of $\mathcal{G}$ generate languages in some well-defined way.

Identification in the Limit from Positive Data (continued)
• A learner $\phi$ converges on a text $t$ iff there exist $i \in \mathbb{N}$ and a grammar $G$ such that for all $j > i$, $\phi(t[j]) = G$.
• A learner $\phi$ identifies a language $L$ in the limit iff for any positive text $t$ for $L$, $\phi$ converges on $t$ to a grammar $G$ and $L(G) = L$.
• A learner $\phi$ identifies a class of languages $\mathcal{L}$ in the limit iff for any $L \in \mathcal{L}$, $\phi$ identifies $L$ in the limit.
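The definitions above quantify over infinite texts, so code can only inspect finite prefixes. The following is a minimal sketch under that limitation: it computes the content of a prefix and the indices at which a learner changes its hypothesis. The names `content`, `mind_changes`, and `toy_learner` are illustrative and do not come from the talk, and dropping the pause symbol from the content is an assumption made here so that the content can equal a language over the alphabet.

```python
PAUSE = "#"

def content(prefix):
    """Set of words in a finite prefix t[i] (assumption: pauses are left out)."""
    return {w for w in prefix if w != PAUSE}

def mind_changes(learner, prefix):
    """Indices i at which learner(t[i]) differs from learner(t[i-1])."""
    changes = []
    previous = learner([])                # hypothesis before any data
    for i in range(len(prefix)):
        current = learner(prefix[:i + 1])
        if current != previous:
            changes.append(i)
        previous = current
    return changes

def toy_learner(prefix):
    """Illustrative learner: guess exactly the set of words seen so far."""
    return frozenset(content(prefix))

prefix = ["a", "#", "ab", "a", "ab"]
print(content(prefix))
print(mind_changes(toy_learner, prefix))  # [0, 2]
```

A prefix with no further mind changes only suggests convergence; the definition requires the hypothesis to stay fixed on the entire infinite text.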

An Example of String Extension Learning
k-factor languages: a word is well-formed iff its contiguous subsequences of length $k$ are well-formed.
Functions: $\mathrm{fac}_k(w) = \{ x \in \Sigma^k : \exists u, v \in \Sigma^* \text{ such that } w = uxv \}$ (1)
Grammars: $G \in \mathcal{P}(\Sigma^k)$ (2)
Languages: $L(G) = \{ w : \mathrm{fac}_k(w) \subseteq G \}$ (3)
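Equations (1) to (3) translate almost directly into code. The sketch below is one possible rendering in Python, with words as strings and grammars as sets of strings; the function names `fac` and `in_language` are chosen here for illustration.

```python
def fac(k, w):
    """Equation (1): the set of length-k contiguous substrings of w."""
    # For len(w) < k the range is empty, so the result is the empty set.
    return {w[i:i + k] for i in range(len(w) - k + 1)}

def in_language(grammar, k, w):
    """Equation (3): w is in L(G) iff every k-factor of w is in G."""
    return fac(k, w) <= grammar

# A grammar in P(Sigma^2) for Sigma = {a, b}: every 2-factor except "ba".
G = {"aa", "ab", "bb"}

print(in_language(G, 2, "aabbb"))  # True: factors are aa, ab, bb
print(in_language(G, 2, "abab"))   # False: "ba" is a factor
```

Under this definition a word shorter than k has no k-factors, so it belongs to every language L(G).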

Example: Illustration of k-factor Learning
Let $L = \Sigma^* \setminus \Sigma^* ba \Sigma^*$.

time   Word w   fac_2(w)    Grammar G       Language of G
-1                          ∅               ∅
0      aaaa     {aa}        {aa}            a*
1      aab      {aa, ab}    {aa, ab}        a* ∪ a*b
2      a        ∅           {aa, ab}        a* ∪ a*b
3      bbb      {bb}        {aa, ab, bb}    Σ* \ Σ*baΣ*
4      abbb     {ab, bb}    {aa, ab, bb}    Σ* \ Σ*baΣ*
...
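The grammar column of this illustration can be recomputed by accumulating the 2-factors of each word in turn. The following is a small sketch of that bookkeeping; `fac2` and `trace` are names introduced here, and the "Language of G" column is not computed.

```python
def fac2(w):
    """2-factors of w: all contiguous substrings of length 2."""
    return {w[i:i + 2] for i in range(len(w) - 1)}

def trace(words):
    """Return (time, word, fac_2(word), grammar so far) for each word."""
    grammar = set()                      # the grammar at time -1 is empty
    rows = []
    for time, w in enumerate(words):
        grammar = grammar | fac2(w)      # add the factors of the new word
        rows.append((time, w, sorted(fac2(w)), sorted(grammar)))
    return rows

for row in trace(["aaaa", "aab", "a", "bbb", "abbb"]):
    print(row)
```

The grammar stops changing once the word at time 3 has contributed "bb", which is the point at which the hypothesis matches the target language.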

What matters?
k-factor languages: a word is well-formed iff its contiguous subsequences of length $k$ are well-formed.
Functions: $\mathrm{fac}_k(w) = \{ x \in \Sigma^k : \exists u, v \in \Sigma^* \text{ such that } w = uxv \}$ (1)
Grammars: $G \in \mathcal{P}(\Sigma^k)$ (2)
Languages: $L(G) = \{ w : \mathrm{fac}_k(w) \subseteq G \}$ (3)

Definitions of String Extension Functions, Grammars, and Languages
• Let $f$ be a total function with domain $\Sigma^*$ and codomain the finite powerset of $A$, written $\mathcal{P}_{fin}(A)$.
Functions: $f : \Sigma^* \to \mathcal{P}_{fin}(A)$ (4)
Grammars: $G \in \mathcal{P}_{fin}(A)$ (5)
Languages: $L_f(G) = \{ w \in \Sigma^* : f(w) \subseteq G \}$ (6)
Classes: $\mathcal{L}_f = \{ L_f(G) : G \in \mathcal{P}_{fin}(A) \}$ (7)
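Because definition (6) depends on the function f only through the test f(w) ⊆ G, the same membership code serves different choices of f. The sketch below illustrates this with two functions: 2-factors, and length-2 (not necessarily contiguous) subsequences. The second function is a simplification chosen here for illustration, not the talk's definition of any particular class, and all names are hypothetical.

```python
from itertools import combinations, product

def make_language(f, grammar):
    """Membership predicate for L_f(G) = { w : f(w) is a subset of G }."""
    grammar = frozenset(grammar)
    return lambda w: f(w) <= grammar

def factors2(w):
    """Contiguous substrings of length 2."""
    return {w[i:i + 2] for i in range(len(w) - 1)}

def subsequences2(w):
    """Subsequences of length 2, contiguous or not (illustrative choice of f)."""
    return {"".join(pair) for pair in combinations(w, 2)}

# Same grammar for both: every pair over {a, b, c} except "ba".
allowed = {x + y for x, y in product("abc", repeat=2)} - {"ba"}

no_ba_factor = make_language(factors2, allowed)
no_ba_subseq = make_language(subsequences2, allowed)

print(no_ba_factor("bca"))  # True: the factors are bc and ca
print(no_ba_subseq("bca"))  # False: b ... a is a subsequence
```

The grammar is identical in both cases; only f changes, and with it the language that the grammar generates.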

Structure in $\mathcal{L}_f$
Theorem (Closure under intersection). For any $f \in SEF$, $\mathcal{L}_f$ is closed under intersection.
• String extension language classes are not in general closed under union, complement, or reversal (counterexamples are given later as examples).
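The slide states the closure theorem without proof. One way to see it, offered here as a sketch rather than as the author's argument, is to unfold the definition $L_f(G) = \{ w \in \Sigma^* : f(w) \subseteq G \}$ for two grammars $G_1, G_2 \in \mathcal{P}_{fin}(A)$:

```latex
\begin{align*}
w \in L_f(G_1) \cap L_f(G_2)
  &\iff f(w) \subseteq G_1 \text{ and } f(w) \subseteq G_2 \\
  &\iff f(w) \subseteq G_1 \cap G_2 \\
  &\iff w \in L_f(G_1 \cap G_2).
\end{align*}
```

Since $G_1 \cap G_2$ is again a finite subset of $A$, it is a grammar, so $L_f(G_1) \cap L_f(G_2) = L_f(G_1 \cap G_2)$ belongs to $\mathcal{L}_f$.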

Extending the domain of f
$f(L) = \bigcup_{w \in L} f(w)$ (8)
Lemma (Monotonicity). Let $L, L' \in \mathcal{L}_f$. Then $L \subseteq L'$ iff $f(L) \subseteq f(L')$.

Characteristic Sample for each $L \in \mathcal{L}_f$
Lemma (Smallest L in $\mathcal{L}_f$). For any finite $L_0 \subseteq \Sigma^*$, $L = L(f(L_0))$ is the smallest language in $\mathcal{L}_f$ containing $L_0$.
Theorem (Characteristic Sample). For all $L \in \mathcal{L}_f$, there is a finite sample $S$ such that $L$ is the smallest language in $\mathcal{L}_f$ containing $S$. $S$ is called a characteristic sample of $L$ in $\mathcal{L}_f$.
Corollary. For all $L \in \mathcal{L}_f$, a characteristic sample is $f(L)$.
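The lemma can be explored on a small case. The sketch below extends f to a finite sample by taking the union of f over its words, then checks that the resulting grammar accepts every word of the sample and that any other grammar accepting the sample contains it. It uses 2-factors as f, and `f_of_sample` and `accepts` are names introduced here for illustration.

```python
def factors2(w):
    """Contiguous substrings of length 2 (the example f)."""
    return {w[i:i + 2] for i in range(len(w) - 1)}

def f_of_sample(f, sample):
    """f extended to a finite set of words: the union of f(w) for w in the sample."""
    result = set()
    for w in sample:
        result |= f(w)
    return result

def accepts(f, grammar, sample):
    """True iff every word of the sample is in L_f(grammar)."""
    return all(f(w) <= grammar for w in sample)

sample = {"aab", "bbb"}
smallest = f_of_sample(factors2, sample)

print(sorted(smallest))                    # ['aa', 'ab', 'bb']
print(accepts(factors2, smallest, sample)) # True

# Any grammar that accepts the sample must contain `smallest`.
bigger = {"aa", "ab", "bb", "ba"}
print(accepts(factors2, bigger, sample) and smallest <= bigger)  # True
```

Removing any element from `smallest` makes some word of the sample ill-formed, which is the sense in which the generated language is the smallest one containing the sample.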

The String Extension Learner
Definition. For all $f \in SEF$, define $\phi_f$ as follows:
$$
\phi_f(t[i]) =
\begin{cases}
\emptyset & \text{if } i = -1 \\
\phi_f(t[i-1]) & \text{if } t(i) = \# \\
\phi_f(t[i-1]) \cup f(t(i)) & \text{otherwise}
\end{cases}
$$
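The three cases of the definition correspond to an empty starting grammar, skipping pauses, and taking a union with f of the new word. A minimal iterative sketch follows; it is written as a loop rather than a recursion, uses 2-factors as the example f, and the names `learner` and `factors2` are chosen here.

```python
PAUSE = "#"

def factors2(w):
    """Contiguous substrings of length 2 (the example f)."""
    return {w[i:i + 2] for i in range(len(w) - 1)}

def learner(f, prefix):
    """phi_f applied to a finite prefix t[i], given as a list of words and pauses."""
    grammar = frozenset()                 # case i = -1: the empty grammar
    for item in prefix:
        if item == PAUSE:
            continue                      # case t(i) = #: keep the grammar
        grammar = grammar | f(item)       # otherwise: add f(t(i))
    return grammar

g1 = learner(factors2, ["aab", "#", "bbb"])
g2 = learner(factors2, ["bbb", "aab", "aab", "#"])

print(sorted(g1))   # ['aa', 'ab', 'bb']
print(g1 == g2)     # True
```

The two prefixes contain the same set of words in a different order and with different repetitions, and the learner returns the same grammar for both, which illustrates the set-driven property mentioned earlier in the talk.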
