CASD-TeraLab
Secure Remote Access to Confidential Big Data
Alexandre Marty [alexandre.marty@casd.eu]
Outline
- CASD-TeraLab
- Use Cases
- Live Demo
The Secure Data Access Centre
- A group of tightly-sealed secured servers.
- Sensitive data is hosted only within the Bubble.
- User applications and processing are executed strictly within the Bubble.
- Data insertions/extractions are controlled. Users do not have Internet access from their workspace.
- Hadoop cluster is available for handling Big Data.
- SD-Boxes are the only means of access to the Bubble. Access occurs via the Internet by encrypted channels.
[Diagram labels: Hermetic Bubble, Sensitive Data, Servers & Applications, Insertions, Extractions]
TeraLab
- Publicly funded Big Data & Data Science platform
- Open to:
  - R&D and teaching projects, proofs of concept
  - Public and private sectors
- Everything for Big Data:
  - Powerful and scalable infrastructure
  - Hadoop-based with all Hadoop tools
  - Extensive tools for scientists (R, SAS, machine learning…)
- Turnkey solution with full support and maintenance
Use Cases
- Electricity transmission network data with RTE
  - Impressive variety of data sources
  - Development of innovative apps
- Health data
  - Requires high confidentiality
  - About 250 TB generated each year
- Mobile telecommunications data for tourism statistics
- European data
  - Involvement in European projects: DwB, Eurostat Big Data Task Force
Scanner Data Project
- Work in collaboration with the Consumer Price Index (CPI) team
  - One goal is to improve the CPI calculation
  - Find new opportunities to use the data and develop new methodologies
- Daily sales data from 4 major French distribution companies
  - Very detailed data: products, stores…
  - 5.7 billion rows, 1 TB
- Randomly generated dataset used for this demonstration
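To make the idea of a price index computed from scanner data more concrete, here is a minimal sketch on a small randomly generated dataset. The slides do not say which index formula the CPI team uses or what the real schema looks like. The row layout (day, store, product, unit_price, quantity), the function names and the choice of a Jevons index (geometric mean of price relatives) are all assumptions made for illustration only.

```python
import math
import random
from collections import defaultdict


def generate_rows(n_products=50, n_stores=4, days=(0, 1), seed=42):
    """Generate synthetic daily sales rows (hypothetical schema):
    (day, store, product, unit_price, quantity)."""
    rng = random.Random(seed)
    rows = []
    for product in range(n_products):
        base_price = rng.uniform(0.5, 20.0)
        for store in range(n_stores):
            for day in days:
                # Each store/day price fluctuates by up to 5% around the base price.
                price = round(base_price * rng.uniform(0.95, 1.05), 2)
                quantity = rng.randint(1, 100)
                rows.append((day, store, product, price, quantity))
    return rows


def average_prices(rows, day):
    """Quantity-weighted average price (unit value) per product for one day."""
    revenue = defaultdict(float)
    units = defaultdict(int)
    for d, _store, product, price, quantity in rows:
        if d == day:
            revenue[product] += price * quantity
            units[product] += quantity
    return {p: revenue[p] / units[p] for p in revenue}


def jevons_index(rows, base_day, current_day):
    """Geometric mean of price relatives over products sold on both days."""
    base = average_prices(rows, base_day)
    current = average_prices(rows, current_day)
    common = sorted(set(base) & set(current))
    if not common:
        raise ValueError("no products sold on both days")
    log_sum = sum(math.log(current[p] / base[p]) for p in common)
    return math.exp(log_sum / len(common))


rows = generate_rows()
index = jevons_index(rows, base_day=0, current_day=1)
print(f"{len(rows)} rows, Jevons index day 0 -> day 1: {index:.4f}")
```

At the scale quoted on the slide (billions of rows), the same per-product aggregation would be run as a distributed job on the Hadoop cluster rather than in a single Python process. This sketch only shows the shape of the calculation.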
Live Demo
For More Information
- www.teralab-datascience.fr
- casd.eu
- alexandre.marty@casd.eu