capturing natural interactions
Nick Campbell, Trinity College, Dublin
Clarin/FLaReNet Workshop@KTH, November 26th, 2009
introduction
• Speech recognition and synthesis technologies can now be considered mature, but their simple incorporation into speech-based human-computer interfaces reveals shortcomings in their capabilities.
• Perhaps the biggest reason for this is that each technology was designed explicitly to convert between text and spoken modalities, without taking into consideration the complexities of human spoken interaction as the joint creation of mutually understood meaning.
• If the goal of this meeting is to facilitate data collection to improve these interfaces and make them more intelligent and human-like (through a better understanding of human interaction and communication), then we might make a start by designing improved techniques for efficiently capturing, storing, annotating, and distributing large corpora of natural spoken interactions.
Overview of the Talk
• Speech & Multimodal Databases
• Annotating, Viewing & Distributing Data
• Two-Way Dissemination (crowd-sourcing)
• plus, because I am now at Trinity, working again in the Humanities, a selection of 18th Century poetic thought (but with an engineering bias)!
Speech & Multimodal Databases
• my primary interest: collecting natural speech, modelling human conversational interactions
• age 12: Wordsworth, 1798: “we murder to dissect” . . . “quit your books; let Nature be your teacher”
• important design notes for corpus gatherers!
THE TABLES TURNED, 1798
William Wordsworth, Complete Poetical Works

UP! up! my Friend, and quit your books;
Or surely you'll grow double:
Up! up! my Friend, and clear your looks;
Why all this toil and trouble?

The sun, above the mountain's head,
A freshening lustre mellow
Through all the long green fields has spread,
His first sweet evening yellow.

Books! 'tis a dull and endless strife:
Come, hear the woodland linnet,
How sweet his music! on my life,
There's more of wisdom in it.

And hark! how blithe the throstle sings!
He, too, is no mean preacher:
Come forth into the light of things,
Let Nature be your teacher.

She has a world of ready wealth,
Our minds and hearts to bless--
Spontaneous wisdom breathed by health,
Truth breathed by cheerfulness.

One impulse from a vernal wood
May teach you more of man,
Of moral evil and of good,
Than all the sages can.

Sweet is the lore which Nature brings;
Our meddling intellect
Mis-shapes the beauteous forms of things:--
We murder to dissect.

Enough of Science and of Art;
Close up those barren leaves;
Come forth, and bring with you a heart
That watches and receives.
multifaceted behaviour
• by constraining a corpus, we limit the types of interaction that it can illustrate
• only by releasing these constraints on participant behaviour can we gather a corpus that will teach us something new about human conversational interaction
dimensions of speech
example: what is a “turn”?
contact management?
bias in corpora
• the proposed ISO standard also illustrates the inherent bias in existing corpora:
• e.g., Tables 1 & 2 in Annex F show considerable differences in “Contact Management” between corpora
• “Our conclusion is that Contact Management could be considered as an ‘optional’ dimension, since this aspect of communication is not reflected in most existing dialogue act annotation schemes (6 out of 18). It was noticed, however, that for some types of dialogues, e.g. phone conversations or teleconferences (as in the OVIS corpus), this aspect may be important.”
• only 0.1% in AMI, vs 12.3% in OVIS ...
• Results from survey of dimensions and communicative functions in existing annotation schemas
Annotating, Viewing and Distributing New Data
There are presently several tools for the manual annotation of data, each storing its results in a prescribed format that is easy to disseminate, but my experience of working with these tools, and of talking with people who use them regularly, is that the task is tedious and the framework often restrictive. Rather than prescribe a standard at this time, we might benefit more from creating a support group through which people who annotate data regularly can communicate and share samples, tools, and formats for rapid, assisted evolution. My LREC 2010 paper ("A Software Toolkit for Viewing Annotated Multimodal Data Interactively over the Web") may be relevant here.
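As an illustration of the kind of exchange I have in mind, the sketch below reads and writes a simple time-aligned annotation tier as a tab-separated file (start time, end time, label). The file format, class, and function names are hypothetical examples for discussion, not the format used by the LREC 2010 toolkit or by any existing annotation tool.

```python
# A minimal sketch of a hypothetical exchange format for annotation tiers:
# one segment per line as "start<TAB>end<TAB>label". Names and format are
# illustrative only, not a proposed standard.
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    start: float  # seconds from the start of the recording
    end: float
    label: str


def read_tier(path: str) -> List[Segment]:
    """Read one annotation tier from a tab-separated file."""
    segments = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and comments
            start, end, label = line.split("\t", 2)
            segments.append(Segment(float(start), float(end), label))
    return segments


def write_tier(path: str, segments: List[Segment]) -> None:
    """Write a tier back out in the same exchange format."""
    with open(path, "w", encoding="utf-8") as f:
        for s in segments:
            f.write(f"{s.start:.3f}\t{s.end:.3f}\t{s.label}\n")
```

Something this small is easy for different annotators to import and export, whatever tool they actually work in, which is the point of sharing formats rather than prescribing them.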
A Software Toolkit for Viewing Annotated Multimodal Data Interactively over the Web
section headings (LREC 2010):
• introduction
• the FreeTalk multimodal corpus
• assembling complex data
• viewing complex data interactively
• details of the software
• downloading & use
• summary & conclusion
flash-based data interface
flash movies & dataplots
• we archive ALL originals, and link the various derived annotations, data streams, and compressed video versions; the flash movie format (xxx.flv) appears to offer the most efficient service and access software
• interactive pages at www.speech-data.jp
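Purely as an illustration of how originals and derived material might be linked, the sketch below writes a small per-session manifest in JSON; the directory layout, file names, and fields are invented for this example and are not the actual organisation used at www.speech-data.jp.

```python
# A sketch of one possible way to link an archived original recording to its
# derived annotations and compressed web copies: a small JSON manifest per
# session. All paths and field names here are hypothetical.
import json

manifest = {
    "session": "freetalk_session1",              # hypothetical session id
    "original": "raw/session1_cam1.dv",          # archived master, never modified
    "derived": {
        "video_flv": "web/session1_cam1.flv",    # compressed copy for web playback
        "audio_wav": "audio/session1_mix.wav",
        "annotation_tiers": [
            "annotations/session1_turns.tsv",
            "annotations/session1_laughter.tsv",
        ],
    },
}

with open("session1_manifest.json", "w", encoding="utf-8") as f:
    json.dump(manifest, f, indent=2)
```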
Two-Way Dissemination
By sharing a corpus, we stand to gain added annotation levels. We should also examine crowd-sourcing in this respect. As with our own FreeTalk corpus (www.speech-data.jp), by making the initial data public and co-operating worldwide with interested partners, the annotations can be grown as researchers with different interests contribute their own layers of knowledge. Since the world of multimodal corpora is still young, perhaps the most we might expect from this initial meeting is the opening up of channels whereby the exchange of sources and resources might take place.
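One hedged sketch of the mechanics of such two-way dissemination: if contributors were to return annotation layers as files named <session>.<layer>.tsv (a naming convention invented here purely for illustration), the contributed layers could be collected back into a single index per session.

```python
# A sketch of collecting independently contributed annotation layers.
# The "<session>.<layer>.tsv" naming convention is hypothetical.
import os
from collections import defaultdict


def index_contributions(root: str) -> dict:
    """Map each session id to the annotation layers contributed for it."""
    index = defaultdict(dict)
    for fname in sorted(os.listdir(root)):
        parts = fname.split(".")
        if len(parts) != 3 or parts[2] != "tsv":
            continue  # ignore files that do not follow the naming convention
        session, layer = parts[0], parts[1]
        index[session][layer] = os.path.join(root, fname)
    return dict(index)


if __name__ == "__main__":
    # e.g. a directory containing session1.turns.tsv, session1.gaze.tsv, ...
    for session, layers in index_contributions("contributed_tiers").items():
        print(session, "->", ", ".join(sorted(layers)))
```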
a growing community
• we don't yet have clearly defined "interface standards", but we try to keep a flexible, open-minded approach
• different people are working on the corpus, each from their own viewpoint, using different software and both 'top-down' (theory-driven) and 'bottom-up' (data-driven) approaches
• we are hoping for a happy marriage of both
summary
• we do not yet know how to properly create a 'balanced' and 'representative' speech corpus
• we do not yet know how to integrate & manage complex multimodal data packages
• we do not yet know the best ways to disseminate and share these types of data
• so maybe it is a bit early to propose standards
• but we can gain a lot by encouraging exchange and interchange of related annotations & data
• thank you ...