Big Table Indexing, session 9 CS6200: Information Retrieval Slides - PowerPoint PPT Presentation

Big Table Indexing, session 9 CS6200: Information Retrieval Slides by: Jesse Anderton

Distributed Storage
BigTable was developed by Google to manage its storage needs. It is a distributed storage system designed to scale across thousands of machines and to continue serving requests as machines fail and are replaced. Storage systems such as BigTable are natural fits for processes distributed with MapReduce. “A Bigtable is a sparse, distributed, persistent multidimensional sorted map.” (Chang et al., 2006)

BigTable Rows
The data in BigTable is logically organized into rows. For instance, the inverted list for a term can be stored in a single row. A single cell is identified by its row key, column, and timestamp. Groups of cells, such as a range of rows or a set of columns within a row, can be fetched or updated efficiently. Only populated cells consume filesystem space: the storage is inherently sparse.
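The cell addressing described above can be sketched as a small in-memory model. This is only an illustration, not BigTable's implementation: the class name `SparseTable`, its methods, and the `term:` / `postings:` key names are assumptions made up for the example.

```python
from collections import defaultdict


class SparseTable:
    """Toy model of a sparse map: (row key, column, timestamp) -> value."""

    def __init__(self):
        # Only cells that are actually written get an entry,
        # so unpopulated cells take up no space.
        self._rows = defaultdict(dict)

    def put(self, row_key, column, timestamp, value):
        """Store one version of one cell."""
        versions = self._rows[row_key].setdefault(column, {})
        versions[timestamp] = value

    def get(self, row_key, column, timestamp=None):
        """Return the value at `timestamp`, or the newest version if omitted."""
        versions = self._rows.get(row_key, {}).get(column)
        if not versions:
            return None
        if timestamp is None:
            timestamp = max(versions)
        return versions.get(timestamp)

    def cell_count(self):
        """Number of populated cell versions across the whole table."""
        return sum(
            len(versions)
            for columns in self._rows.values()
            for versions in columns.values()
        )


# Hypothetical usage: one row per term, one column per document.
table = SparseTable()
table.put("term:bigtable", "postings:doc17", 1, [3, 41])
table.put("term:bigtable", "postings:doc17", 2, [3, 41, 90])
table.put("term:bigtable", "postings:doc52", 1, [7])
```

Here the inverted list for one term lives in a single row, and asking for a cell without a timestamp returns its most recent version.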

BigTable Tablets
BigTable rows reside within logical tables, which have pre-defined column families and group records of a particular type. The rows of a table are partitioned by row key into tablets of roughly 100 to 200 MB each, which are the units in which data is distributed across machines. Tablet data and transaction logs are replicated to several machines in case of failure. If a machine fails, another server can read the tablet data and transaction log and take over with little downtime.
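Because rows are kept in sorted order, finding the tablet for a row key amounts to a search over the key boundaries between tablets. The sketch below assumes a hypothetical `TabletMap` class with made-up split keys and server names; it does not reproduce BigTable's actual metadata scheme.

```python
import bisect


class TabletMap:
    """Hypothetical index from row-key ranges to the server holding each tablet."""

    def __init__(self, split_keys, servers):
        # Tablet i holds keys with split_keys[i-1] <= key < split_keys[i];
        # the first and last tablets are open-ended.
        if list(split_keys) != sorted(split_keys):
            raise ValueError("split keys must be sorted")
        if len(servers) != len(split_keys) + 1:
            raise ValueError("need exactly one server per tablet")
        self.split_keys = list(split_keys)
        self.servers = list(servers)

    def tablet_for(self, row_key):
        """Index of the tablet whose key range contains row_key."""
        return bisect.bisect_right(self.split_keys, row_key)

    def server_for(self, row_key):
        """Server currently responsible for row_key."""
        return self.servers[self.tablet_for(row_key)]

    def reassign(self, failed, replacement):
        """Hand every tablet of a failed server to a replacement server."""
        self.servers = [replacement if s == failed else s for s in self.servers]


# Hypothetical usage: three tablets split at "g" and "p".
tablets = TabletMap(["g", "p"], ["server-a", "server-b", "server-c"])
owner = tablets.server_for("hat")          # falls in the middle tablet
tablets.reassign("server-b", "server-d")   # server-b fails; server-d takes over
```

The reassignment step only changes which server answers for a key range. In the design described above that is possible because the tablet data and its log are already replicated elsewhere.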

BigTable Operations
All operations on a BigTable are row-based: reads and writes address cells by row key, and updates to a single row are applied atomically. Most SQL operations are unavailable: there are no joins or other structured queries. BigTable rows can have massive numbers of columns, and individual cells can contain large amounts of data. For instance, it’s no problem to store a translation of a document into many languages, each in its own column of the same row.
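A minimal sketch of row-based access, using the translation example: every write and read names a single row, and a per-row lock stands in for row-level atomicity. The `RowStore` class, its method names, and the `doc:` / `text:` keys are illustrative assumptions, not an actual BigTable client API.

```python
import threading


class RowStore:
    """Toy store in which every operation targets exactly one row."""

    def __init__(self):
        self._rows = {}
        self._locks = {}
        self._guard = threading.Lock()  # protects the lock table itself

    def _lock_for(self, row_key):
        with self._guard:
            return self._locks.setdefault(row_key, threading.Lock())

    def mutate_row(self, row_key, updates):
        """Apply all column updates to one row as a single step."""
        with self._lock_for(row_key):
            self._rows.setdefault(row_key, {}).update(updates)

    def read_row(self, row_key, columns=None):
        """Return a copy of the row, optionally restricted to some columns."""
        with self._lock_for(row_key):
            row = self._rows.get(row_key, {})
            if columns is None:
                return dict(row)
            return {c: row[c] for c in columns if c in row}


# Hypothetical usage: one document row, one column per language.
store = RowStore()
store.mutate_row("doc:42", {
    "text:en": "hello",
    "text:fr": "bonjour",
    "text:de": "hallo",
})
french = store.read_row("doc:42", ["text:fr"])
```

There is no operation here that spans two rows, which is why anything resembling a join would have to be assembled by the client from separate row reads.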

Wrapping Up
Storage systems such as BigTable are natural fits for distributed algorithm execution. Google invented BigTable to handle its index, document cache, and most of its other massive storage needs. It helped inspire a whole generation of distributed storage systems, known as NoSQL systems. Examples include MongoDB and Couchbase. Next, we’ll consider how to run queries efficiently on an index.
