Rapid Advances in Computer Science and Opportunities for Society European CS Presentation, October 2010 Alfred Spector VP, Research and Special Initiatives
Rapid Advances in Computer Science & Opportunities for Society Abstract Information and Communication Technologies have had a rapid impact on society, and, amazingly, the pace of innovation continues to accelerate. This innovation is catalyzed by ever-increasing hardware and networking capabilities, the growth in internet usage, and important advances in basic and applied computer science. In this talk, I will describe some of the research that Google is undertaking (for example, in machine translation, semantic processing, and information management) and discuss some of the likely beneficial impacts on our society, for example in science, the humanities, education, philanthropic activities, and more. I'll conclude my presentation with some interesting challenges from both a technology and a policy point of view.
Outline Google Prodigiousness Advances in the Field: examples (Translation, Speech, Vision, Cloud-based collaboration around structured data, Operations Research, Semantic Processing) Beneficial Societal Impacts: examples (Earth Engine, Google Health, Other Health Efforts, Crisis Response, Digital Humanities, Education) Technical Themes Challenges
Mission Organizing the world's information and making it universally accessible and useful.
Google and Commerce Over 1 million AdWords advertisers worldwide Over 1 million AdSense publishers worldwide Via the Google Ad Network, AdSense publishers reach over 80% of global internet users in 100 countries and 20 languages YouTube is monetizing over a billion video views per week globally In 2009, Google generated $54 billion of economic activity for American businesses, website publishers, and non-profits
Prodigiousness Giga 10^9, Tera 10^12, Peta 10^15, Exa 10^18, Zetta 10^21. Publicized: Bigtable of 70 petabytes, 10M ops/sec. Warehouse computing possibilities? 100 x 10 x 20 x 20 x 40 = 16,000,000 nodes... Some representative numbers: Storage: 10^18 -> 10^20-21; Users: 10^9 -> 10^10; Devices: 10^? -> 10^12; Network: 10^20 now -> 10^21/yr (32 KB/sec for 1B people); Apps: 10^5 -> 10^6-7 or more. E.g., embedded car systems: 30-50 ECUs, 100M lines of code.
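A quick back-of-the-envelope check of two figures on this slide; a minimal sketch in Python, assuming only the numbers stated above (the 16M-node product, ~10^21 bytes/year of network traffic, ~1B users):

```python
# Back-of-the-envelope check of the "Prodigiousness" figures above.
# All inputs are taken from the slide; nothing else is assumed.

nodes = 100 * 10 * 20 * 20 * 40            # hypothetical warehouse dimensions
print(f"{nodes:,} nodes")                  # 16,000,000 nodes

bytes_per_year = 1e21                      # ~10^21 bytes/year of network traffic
users = 1e9                                # ~1 billion people
seconds_per_year = 365 * 24 * 3600
per_user = bytes_per_year / users / seconds_per_year
print(f"~{per_user / 1024:.0f} KB/sec per person")   # roughly 31-32 KB/sec
```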
A variety of science and engineering challenges
Focus on Innovation that Benefits our Users Focus on Research and Engineering Commitment to advancing technology Rich domain of work due to our mission Grand challenge problems Internal consensus that production issues are often as challenging/fun as pure invention Technical leverage 1. Google Common Distributed System 2. A Focus on Services 3. Empiricism and a Holistic Approach to Design
Our Innovation Culture Focus on talent, distributed across the organization. Impacting Google necessitates broad, diverse involvement in science and engineering. Research is done both in our research team and in our engineering organization, organized opportunistically. Teams benefit greatly: from mutual talent, from Google's comparative advantages due to our scale and broad use, and from our service-based architecture (the "ease" of working in vivo).
Ideal Distributed Computing Devices
Research Challenges in Ideal Distributed Computing
Alternative designs that would give better energy efficiency at lower utilization
Server O.S. design aimed at many highly-connected machines in one building
Unifying abstractions for exploiting parallelism beyond inter-transaction parallelism and map-reduce
Latency reduction
A general model of replication, including consistency choices, explained and codified
Machine learning techniques applied to monitoring/controlling such systems
Automatic, dynamic world-wide placement of data & computation to minimize latency and/or cost, given constraints on ...
Building retrieval systems that efficiently and usably deal with ACLs
Holistic models of privacy
The user interface to the user's diverse processing and state
Totally Transparent Processing For all d in D, all l in L, all m in M, and all c in C:
D: the set of all end-user access devices (Personal Computers, Phone, Media Players/Readers, Telematics, Set-top Boxes, Appliances, Health devices, ...)
L: the set of all human languages (Current languages, Historical languages, Other forms of human notation, Possible language specialization, Formal languages, ...)
M: the set of all modalities (Text, Image, Audio, Video, Graphics, Other sensor-based data, ...)
C: the set of all corpora (The normal web, The deep web, Periodicals, Books, Catalogs, Blogs, Geodata, Scientific datasets, Health data, ...)
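A minimal sketch of the quadruple space the slide describes; the four sets below are tiny illustrative samples, not the actual enumerations:

```python
# Sketch of the (device, language, modality, corpus) space from the slide.
# The set contents here are small illustrative samples only.
from itertools import product

D = ["personal computer", "phone", "set-top box"]   # end-user access devices
L = ["English", "Russian", "Latin"]                 # human languages
M = ["text", "image", "audio"]                      # modalities
C = ["the web", "books", "geodata"]                 # corpora

# "Totally transparent processing" aspires to serve every combination.
for d, l, m, c in product(D, L, M, C):
    print(f"serve {m} from {c} in {l} on a {d}")
```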
Totally Transparent Processing
“Hybrid” Intelligence To extend the capability of people, not to operate in isolation. Aggregation of empirical signal is exceedingly valuable. Ex: feedback in information retrieval, e.g., in ranking or spelling correction; machine learning, e.g., image content analysis, speech recognition with semi-supervised learning.
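A minimal sketch of how aggregated user signal can drive spelling correction, assuming a toy query log; the queries and counts below are invented for illustration, and production systems use far richer signals:

```python
# Toy spelling corrector driven by aggregated query frequencies.
# The query-log counts below are invented for illustration only.
from collections import Counter

query_log = Counter({"google maps": 9000, "google mpas": 12, "google mail": 300})

def edits1(s):
    """All strings one simple edit (delete, transpose, replace, insert) away."""
    letters = "abcdefghijklmnopqrstuvwxyz "
    splits = [(s[:i], s[i:]) for i in range(len(s) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in letters]
    inserts = [a + c + b for a, b in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)

def correct(query):
    """Prefer the most frequently observed query within one edit of the input."""
    candidates = edits1(query) | {query}
    return max(candidates, key=lambda q: query_log.get(q, 0))

print(correct("google mpas"))   # -> "google maps"
```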
Research Challenges in Transparent Computing & Hybrid Intelligence Endless applications, with very new user interface implications; addressing limits to data; techniques to integrate user feedback in acceptable fashions; approaches to new signal; explanation, scale, and variance minimization in machine learning; information fusion/learning across diverse signals (the Combination Hypothesis, more generally); usability: devices and subpopulations; privacy
Domains of Application Search engines, translation, speech recognition, vision, remedial education, personal health, epidemiology, economic prediction, societal/environmental optimization, social networking in ever more clever/useful ways, humanities and social sciences, multi-player gaming
Translation
Machine Translation @ Google Statistical Machine Translation: model the translation process with a statistical model, learning from data, both monolingual & bilingual. More data: better translation quality. Computationally expensive approach: models comprise many hundreds of gigabytes of data (Moore's law helps here). Applying syntax information as a signal. Results: much better translation quality, ongoing progress, more research groups, ... 58 languages (so far); recently: Haitian Creole, Urdu, Georgian, ..., Latin
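As a worked equation, the textbook noisy-channel formulation behind statistical MT (the slide does not spell this out, so this is a standard summary rather than the exact production model): given a source sentence f, choose the target sentence e that maximizes

```latex
\hat{e} = \arg\max_{e} P(e \mid f) = \arg\max_{e} P(e)\, P(f \mid e)
```

where P(e) is the monolingual language model and P(f | e) is the translation model learned from bilingual data; production systems typically generalize this to a weighted log-linear combination of many such feature functions.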
Grand Challenges
Long-distance reordering: 'simple' case: SVO, SOV; one approach: parse source & reorder; issue: parsing accuracy for out-of-domain texts
Morphology: translating into morphologically rich languages, e.g. Russian, Hungarian; need: morphology-aware translation models
Reliability: some translation mistakes are more severe than others (hotel - Montreal, Heath Ledger - Tom Cruise); research: how to detect 'crazy' translations?
Finding all training data
How about Poetry? Paper at the EMNLP 2010 conference: "Poetic" Statistical Machine Translation: Rhyme and Meter, D. Genzel, J. Uszkoreit, F. Och, EMNLP, 2010. Approach: enforce meter and rhyme as extra constraints (similar to a language model), e.g. iambic pentameter: stress pattern 0101010101. Produce the most 'probable' translation that obeys the constraints ("function follows form"). Example output (couplet in amphibrachic tetrameter): "An officer stated that three were arrested / and that the equipment is currently tested."
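A minimal sketch of such a meter constraint: checking a candidate line against a target stress pattern. The tiny per-word stress dictionary is an illustrative assumption; the actual system derives stress from pronunciation lexica and applies the constraint during decoding:

```python
# Toy meter check: does a candidate line match a target stress pattern?
# The per-word stress patterns below are a small illustrative dictionary.
STRESS = {
    "an": "0", "officer": "100", "stated": "10", "that": "0",
    "three": "1", "were": "0", "arrested": "010",
}

def stress_pattern(line):
    """Concatenate per-word stress patterns (1 = stressed, 0 = unstressed)."""
    return "".join(STRESS[w] for w in line.lower().split())

def obeys_meter(line, target):
    return stress_pattern(line) == target

# Amphibrachic tetrameter is the pattern "010" repeated four times.
print(obeys_meter("An officer stated that three were arrested", "010" * 4))  # True
```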
Speech
Goals for Speech Technology at Google Much of the world’s information is spoken – we need to recognize it before we can organize it: YouTube transcription and translation (breaking the language barrier for YouTube access) Voicemail transcription Mobile is the fastest growing and most widespread platform for communication and services that has ever existed Spoken input and output is key to usability Our goal is completely ubiquitous availability of speech i/o (every application/service, every usage scenario, every language) How do we get there? Delivery from the cloud – support constant iteration and refinement Operating at large scale – train huge statistical models on huge amounts of data
Training Acoustic Models w/Unsupervised Learning Learning from use - without human transcription Challenges: How do we grow the model to take advantage of the data? (richer models of accent, speaker, noise, etc.) Huge computational demands Infrastructure demands – parallelization – leverage Google software environment Supervised vs. unsupervised training - hours of data vs. error rate
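A minimal sketch of the unsupervised (self-training) loop this slide alludes to; recognize() and train() are hypothetical stand-ins, and the confidence threshold is an illustrative assumption, not a description of the production pipeline:

```python
# Sketch of confidence-filtered self-training for acoustic models.
# recognize() and train() are hypothetical toy stand-ins for the real system.

def recognize(model, utterance):
    """Return a (hypothesized transcript, confidence score) pair."""
    return model.get(utterance, ("", 0.0))

def train(labeled_pairs):
    """Return a new toy 'model' fit on (utterance, transcript) pairs."""
    return {utt: (text, 1.0) for utt, text in labeled_pairs}

def self_train(model, unlabeled, threshold=0.9, rounds=3):
    """Fold high-confidence automatic transcripts back into training data."""
    for _ in range(rounds):
        confident = []
        for utt in unlabeled:
            transcript, score = recognize(model, utt)
            if score >= threshold:          # keep only trusted hypotheses
                confident.append((utt, transcript))
        if not confident:
            break
        model = train(confident)            # retrain on the expanded set
    return model

seed = train([("utt-001", "call me tomorrow")])
print(self_train(seed, ["utt-001", "utt-002"]))
```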
Vision
Computer Vision Advance the state of the art in 3 key areas of image/audio/video analysis and apply the results to our multimedia products. Semantic Interpretation: generate human-understandable descriptions of content (e.g., auto-tagging videos on YouTube, image annotation, porn classification, etc.). Matching: find similar entities in a large corpus (e.g., "find similar" on image search, video fingerprinting for YouTube, etc.). Synthesis: generate better images/video by understanding the statistics of a large corpus of images (e.g., better facades on 3D buildings in Google Earth, automatic shadow removal from aerial images, etc.).
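A minimal sketch of the matching task as nearest-neighbor search over feature vectors; the random descriptors and plain cosine similarity are illustrative assumptions, whereas production systems use learned descriptors and approximate indexing at far larger scale:

```python
# Toy "find similar" over image descriptors via cosine similarity.
# Descriptors are random stand-ins for real image/video features.
import numpy as np

rng = np.random.default_rng(0)
corpus = rng.normal(size=(10_000, 128))           # 10k items, 128-d descriptors
corpus /= np.linalg.norm(corpus, axis=1, keepdims=True)

def find_similar(query, k=5):
    """Return indices of the k corpus items most similar to a 128-d query."""
    q = query / np.linalg.norm(query)
    scores = corpus @ q                           # cosine similarity of unit vectors
    return np.argsort(-scores)[:k]

print(find_similar(rng.normal(size=128)))
```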
Semantic Interpretation sample problem - Video Annotation Video metadata imposes a cognitive cost on the user: they have to type it in, be careful about what keywords they use, and in general try to make their video searchable. Many uploaders don't have the motivation or energy to provide proper metadata. Noisy metadata hurts everyone: spam, misspellings, 1337, acronyms, etc.