
Papillon Project, Mathieu Mangeot & David Thevenin



  1. Online Generic Editing of Heterogeneous Dictionary Entries in Papillon Project • Mathieu Mangeot & David Thevenin • Work done at NII, Tokyo, Japan • Now looking for a position...

  2. My Motivation • Dictionaries are a Key Element of almost every NLP System • But Construction Costs are High • => Lowering the Costs by Facilitating Construction & Maintenance: • Building Dedicated Environments • Pooling Resources by Reusing Existing Ones • Development by Voluntary Contributors • Resulting Data Publicly Available

  3. Outline • The Situation: Manipulation of XML Dictionaries with Heterogeneous Entry Structures • The Problem: How to Edit them Online? • Our Solution: Using an HMI Tool • 2 Examples: Papillon & GDEF Dicts • Conclusion and Future Work

  4. Papillon Platform • http://www.papillon-dictionary.org • [Diagram: a User browses the Online Dict Server; dictionaries are imported into the server. Dictionaries shown: Papillon, DiCo, GDEF Dict, WaDoKu, FeM, JMDict, Ding, Cedict, SAIKAM, VietDict]

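  5. Papillon Platform • [Diagram: a Contributor edits entries on the Online Dict Server and a Specialist checks them; dictionaries are imported into the server. Dictionaries shown: Papillon, DiCo, GDEF Dict, WaDoKu, FeM, JMDict, Ding, Cedict, SAIKAM, VietDict]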

  6. Outline • The Situation: Manipulation of XML Dictionaries with Heterogeneous Entry Structures • The Problem: How to Edit them Online? • Our Solution: Using an HMI Tool • 2 Examples: Papillon & GDEF Dicts • Conclusion and Future Work

  7. Requirements for Editing • Editor Available Online • Heterogeneous Entry Structures • Adaptive Interfaces • To the User (Novice, Specialist) • To the Platform (PDA, Workstation)

  8. The Best: Ad Hoc Editor

  9. Drawbacks • Ad Hoc for a Particular Structure • Must be Reimplemented if the Entry Structure Changes • Local and Platform-Dependent • Users Cannot Contribute Online

  10. Distributed & Democratic • [Diagram labels: Distribution to the Lexicographers; RTF Files; Conversion with a LISP Program; Data base]

  11. With Word™!

  12. Drawbacks • Not Usable for Complex Structures • One Type of Information Per Line • No Complete Syntax Checking • Real-Time Editing Not Possible • Delay Necessary for Conversion & Transport

  13. Online: with HTML

  14. Drawbacks • Not Dynamically Adaptable • Need to Write One Interface for Each Entry Structure • Lack of Interactors • Only Buttons, Text Boxes, Check Boxes & Pop-up Menus

  15. Outline • The Situation: Manipulation of XML Dictionaries with Heterogeneous Entry Structures • The Problem: How to Edit them Online? • Our Solution: Using an HMI Tool • 2 Examples: Papillon & GDEF Dicts • Conclusion and Future Work

  16. Our Solution • None of the Previous Solutions Satisfies our Requirements • An Idea • Using HMI Techniques & Tools for Automatically Generating Interfaces • Generation Based on the Data Structure and the User Profile

  17. ArtStudio: a Multitarget Generation Framework • Author: David Thevenin • [Diagram labels: Task, Concept, Instance; Initial description, Transit description, Final description; Abstract UI, Concrete UI, Final UI; Platform, User, Environment]

  18. Our Implementation • Necessary files: Concept Model (XML Schema), Instances Model, CUI Model • Automatic Generator • Generated UIs: Web/HTML, Mobile/WML

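  19. A Simple Entry • [Diagram: an entry element with child elements head, pos and two example elements. Values: head (word) = scientifique; pos = adj; examples = journées scientifiques, journal scientifique] • Legend: XML element; link to a child element; element textual content; link to the element value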

  20. Concepts Model: an XML Schema • [Diagram: C_entry has child concepts C_head (word, TextBox), C_pos (PopUp Menu) and C_list (examples, List showing example1, example2, example3); C_list has the child concept C_example (TextBox). Each concept is linked to its instance: I_entry, I_head, I_pos, I_list] • Legend: concept (C_); instance (I_); link to a child concept; link to the interactor used by the concept; link to the instance
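
The concept-to-interactor links on this slide can be pictured as a small tree. The following is only a sketch of one possible in-memory reading of the diagram, not the ArtStudio format: the `Concept` class, the `interactors` helper, and the "Group" interactor assigned to `C_entry` are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One node of the concepts model (hypothetical representation)."""
    name: str        # concept name as shown on the slide, e.g. "C_head"
    interactor: str  # interactor linked to the concept, e.g. "TextBox"
    children: list = field(default_factory=list)


# Assumed reading of the slide; "Group" for C_entry is not in the source.
concepts = Concept("C_entry", "Group", [
    Concept("C_head", "TextBox"),
    Concept("C_pos", "PopUp Menu"),
    Concept("C_list", "List", [Concept("C_example", "TextBox")]),
])


def interactors(concept, depth=0):
    """Yield (depth, concept name, interactor) in top-down order."""
    yield depth, concept.name, concept.interactor
    for child in concept.children:
        yield from interactors(child, depth + 1)


for depth, name, widget in interactors(concepts):
    print("  " * depth + f"{name} -> {widget}")
```

Walking the tree this way gives a generator everything it needs to know about which interactor to emit for each part of an entry.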

  21. Instances Model • XML: <entry><hv>scientifique</hv> <pos>adj</pos> <ex>journées scientifiques</ex> <ex>journal scientifique</ex></entry> • [Diagram: I_entry has child instances I_head (word: scientifique), I_pos (adj) and I_examples (list); I_examples has two I_example children: journées scientifiques, journal scientifique] • Legend: instance; link to a child instance; link to the instance value
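
As a minimal sketch of how the entry above maps onto instances, the XML from the slide can be read with Python's standard `xml.etree.ElementTree`. The element names (`hv`, `pos`, `ex`) come from the slide; the `load_instances` function and the dictionary layout keyed by instance names are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# The entry shown on the slide.
ENTRY_XML = (
    "<entry><hv>scientifique</hv><pos>adj</pos>"
    "<ex>journées scientifiques</ex><ex>journal scientifique</ex></entry>"
)


def load_instances(xml_text):
    """Map the entry's elements to instance values (hypothetical layout)."""
    entry = ET.fromstring(xml_text)
    return {
        "I_head": entry.findtext("hv"),
        "I_pos": entry.findtext("pos"),
        "I_examples": [ex.text for ex in entry.findall("ex")],
    }


instances = load_instances(ENTRY_XML)
print(instances)
```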

  22. CUI Model • XML Document • Describes the Graphical User Interface • Interactors and their Position • Target-Dependent • One Model for each Target: • Editing, Visualisation, Mobile Phone
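
The idea of one CUI model per target can be sketched as follows. On the slide the CUI model is an XML document; here it is replaced by a plain Python dictionary to keep the example short. The interactor names (`textbox`, `popup`, `label`), the `POS_CHOICES` list, and the `render` function are all hypothetical.

```python
from html import escape

entry = {
    "head": "scientifique",
    "pos": "adj",
    "examples": ["journées scientifiques", "journal scientifique"],
}

POS_CHOICES = ["adj", "noun", "verb"]  # hypothetical menu content

# One model per target: ordered (field, interactor) pairs.
CUI_MODELS = {
    "editing": [("head", "textbox"), ("pos", "popup"),
                ("examples", "textbox_list")],
    "visualisation": [("head", "label"), ("pos", "label"),
                      ("examples", "label_list")],
}


def widget(kind, name, value):
    """Render a single interactor as an HTML fragment."""
    if kind == "textbox":
        return f'<input type="text" name="{name}" value="{escape(value)}">'
    if kind == "popup":
        options = "".join(
            f'<option{" selected" if c == value else ""}>{escape(c)}</option>'
            for c in POS_CHOICES
        )
        return f'<select name="{name}">{options}</select>'
    if kind == "label":
        return f"<span>{escape(value)}</span>"
    raise ValueError(f"unknown interactor: {kind}")


def render(entry, target):
    """Render the same entry with the CUI model of the given target."""
    parts = []
    for name, kind in CUI_MODELS[target]:
        value = entry[name]
        if kind.endswith("_list"):
            base = kind[: -len("_list")]
            parts.extend(widget(base, name, v) for v in value)
        else:
            parts.append(widget(kind, name, value))
    return "\n".join(parts)


print(render(entry, "editing"))
print(render(entry, "visualisation"))
```

The entry data stays the same across targets; only the model selected in `CUI_MODELS` changes, which is the property the slide describes as target-dependent.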

  23. Outline • The Situation: Manipulation of XML Dictionaries with Heterogeneous Entry Structures • The Problem: How to Edit them Online? • Our Solution: Using an HMI Tool • 2 Examples: Papillon & GDEF Dicts • Conclusion and Future Work

  24. Papillon Dictionary • Multilingual Dictionary with a Pivot Structure • Monolingual Entries linked to a Pivot Volume • Microstructure based on the Meaning-Text Theory • Very Complex: semantic formula, government pattern, lexical functions, etc.

  25. Editing Interface

  26. Other Views • Consultation • Mobile Phone

  27. GDEF Dictionary • Bilingual Estonian-French Dictionary • Project Leader: Antoine Chalvin (INALCO, Paris) • Microstructure based on the Lemma

  28. Editing Interface

  29. Outline • The Situation: Manipulation of XML Dictionaries with Heterogeneous Entry Structures • The Problem: How to Edit them Online? • Our Solution: Using an HMI Tool • 2 Examples: Papillon & GDEF Dicts • Conclusion and Future Work

  30. Conclusion • Innovative Solution • Generic: Multi-Dictionaries • Efficient: already 152 entries for GDEF Dict (2 people, 2 months) • Multitarget: Editing, Consultation, Mobile Phone • Multipurpose: can be adapted for other types of data

  31. Future Work • To Find a Position! • Implementing more Features of XML Schema: • Basic Types: boolean, date, etc. • Complex Structures: choice, etc. • Automating the Process: • Generation of the Interface Model from the XML Schema
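
The last item, generating the interface model from the XML Schema, could look roughly like the sketch below. It is not the authors' generator: the schema is a hypothetical one written for the simple entry of slide 21, and the rules (enumeration gives a PopUp Menu, `maxOccurs="unbounded"` gives a list, anything else a TextBox) are assumptions chosen to match the interactors named on slide 20.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema for the simple entry (hv, pos, ex).
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="entry">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="hv" type="xs:string"/>
        <xs:element name="pos">
          <xs:simpleType>
            <xs:restriction base="xs:string">
              <xs:enumeration value="adj"/>
              <xs:enumeration value="noun"/>
            </xs:restriction>
          </xs:simpleType>
        </xs:element>
        <xs:element name="ex" type="xs:string" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""


def default_interactors(schema_text):
    """Pick a default interactor for each child element of the entry."""
    root = ET.fromstring(schema_text)
    entry = root.find(f"{XS}element")
    model = {}
    for el in entry.iter(f"{XS}element"):
        if el is entry:
            continue  # skip the top-level entry element itself
        name = el.get("name")
        if el.find(f".//{XS}enumeration") is not None:
            model[name] = "PopUp Menu"       # closed list of values
        elif el.get("maxOccurs") == "unbounded":
            model[name] = "List of TextBox"  # repeatable element
        else:
            model[name] = "TextBox"          # free text
    return model


interface_model = default_interactors(SCHEMA)
print(interface_model)
```

Schema features listed above as not yet implemented (boolean, date, choice) are deliberately left out of this sketch.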
