Introduction to Data Stream Mining Albert Bifet March 2012 - PowerPoint PPT Presentation

  1. Introduction to Data Stream Mining Albert Bifet March 2012

  2. Motivation: Data is growing. Source: IDC’s Digital Universe Study (EMC), June 2011.

  3. Motivation: Data is growing

     Memory unit        Size    Binary size
     kilobyte (kB/KB)   10^3    2^10
     megabyte (MB)      10^6    2^20
     gigabyte (GB)      10^9    2^30
     terabyte (TB)      10^12   2^40
     petabyte (PB)      10^15   2^50
     exabyte (EB)       10^18   2^60
     zettabyte (ZB)     10^21   2^70
     yottabyte (YB)     10^24   2^80

  7. Streaming Data: Big Data & Real Time

  8. Big Data: "Big data refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze." (McKinsey Global Institute (MGI) Report on Big Data, 2011)

  10. Methodology: Sampling and distributed systems

  11. Methodology: "Big Data does not need big machines, it needs big intelligence" (Paolo Boldi)

  12. Real-time analytics: We want to analyze what is happening now.

  14. Time and Memory: Number 8 Wire Mentality. Time and memory are the resource dimensions of the process.

  16. Algorithms: Classification, regression, clustering, frequent pattern mining.

  17. Applications
      ◮ sensor data: industry, cities
      ◮ telecom data
      ◮ social networks: Twitter, Facebook, Yahoo
      ◮ marketing: sales, business
      Data may come from: humans, sensors, or machines.

  18. Data Streams: Big Data & Real Time
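
The Methodology slide names sampling as one way to handle data that is too large to store. The slides do not say which sampling scheme is meant, so the sketch below is only an illustration: it uses reservoir sampling (Algorithm R), a standard technique that keeps a uniform random sample of fixed size from a stream of unknown length, using memory proportional to the sample size only. The function name `reservoir_sample` and its parameters are hypothetical and do not come from the presentation.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Keep a uniform random sample of up to k items from an iterable.

    Illustrative sketch (Algorithm R), not taken from the slides.
    Memory use is O(k) regardless of how long the stream is.
    """
    rng = rng or random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Pick a slot in [0, i]; the new item replaces an old one
            # only if the slot falls inside the reservoir.
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    # Sample 5 readings from a simulated stream of 1,000,000 values.
    print(reservoir_sample(range(1_000_000), 5))
```

Each item is seen once and then discarded, which matches the constraint that time and memory are the resource dimensions of the process.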
