Advanced Analytics in Business [D0S07a] Big Data Platforms & Technologies [D0S06a]: Course Introduction


  1. Advanced Analytics in Business [D0S07a] Big Data Platforms & Technologies [D0S06a] Course Introduction

  2. Lecturer: Prof. dr. Seppe vanden Broucke
     - Studied at KU Leuven (Belgium)
     - PhD in Applied Economics at KU Leuven, Belgium in 2014
     - PhD thesis: Advances in Process Mining: Artificial Negative Events and Other Techniques
     - Assistant professor at UGent and lecturer at KU Leuven, Belgium
     - Research: data mining and analytics, process mining, fraud analytics
     - Brussels Airport, FEDNOT research chair holder
     - Co-academic organizer postgraduate studies in Big Data and Analytics
     - Contact: seppe.vandenbroucke@kuleuven.be or seppe.net

  3. Goals of the course. At the end of the course students will:
     - Have insight into how advanced analytics can be used to optimize business decisions in e.g. marketing, finance, logistics, HR, etc.
     - Have insight into issues related to the storage and processing of large datasets
     - Be able to indicate which technologies and approaches are applicable for different types of datasets (including MapReduce, Hadoop, stream processing, etc.)
     - Basically: all the solid fundamentals plus pathways to expand your knowledge so you’re ready to become a (better) data scientist!

     “Information is easy to find, motivation is not”

  4. Practicalities
     - Course from 1pm-4pm at HIW1 00.16
     - Questions? During, before and after course, during break, or by e-mail
     - HOG 03.124: by appointment
     - Slides will be made available on Toledo (http://toledo.kuleuven.be/) before each lecture
     - Background material, frequent questions, etc. will also be posted on Toledo
     - Note: all materials are posted in “Advanced Analytics in Business [D0S07a]”
     - Alternative: http://seppe.net/aa
     - Course recordings are posted after the course (but do still try to go to class)
     - Course material consists primarily of what has been taught during lectures!

  5. Schedule (check online for changes)

     | Date  | Course    | Topic |
     |-------|-----------|-------|
     | 11.02 | Analytics | General introduction; the data science process: introduction to supervised and unsupervised modelling |
     | 18.02 | Analytics | Preprocessing and feature engineering; Assignment 1 made available |
     | 25.02 | Analytics | Supervised modelling: k-NN, (logistic) regression and decision trees |
     | 03.03 | Analytics | Model evaluation; Assignment 2 made available |
     | 10.03 | Big data  | Data science tools and platforms |
     | 10.03 | Analytics | Ensemble modeling: bagging and boosting |
     | 17.03 | Analytics | Unsupervised modelling: clustering, association rules, anomaly detection |
     | 24.03 | Analytics | Advanced techniques: artificial neural networks, deep learning (conceptual), q-learning |
     | 31.03 | Big data  | Introduction to Hadoop and MapReduce; Spark and SparkSQL |
     | 07.04 |           | No course (Easter break); Assignment 3 made available |
     | 14.04 |           | No course (Easter break) |
     | 21.04 | Big data  | Streaming analytics and other big data analytics trends |
     | 28.04 | Analytics | Text mining |
     | 05.05 | Analytics | Social network mining |
     | 05.05 | Big data  | NoSQL, Neo4j and Cypher; Assignment 4 made available |
     | 12.05 |           | Wrap-up: Security, Ethics, where to go from here? |

  6. Prerequisites
     - Basic knowledge of statistics and analytics
     - Basic operating systems skills in Windows or Linux
     - Programming in Python or R, Java
     - Motivation and willingness to work!

  7. Expectations

     What is expected of you:
     - 1 study point = 25-30 hours of work (1h of lecture = 3h of student work)
     - So, 4 study points = 100-120 study hours
     - Attend lectures and pay attention, keep up with material: read course text, check background information
     - Assignments (!)

     What you can expect:
     - High quality lecturing
     - Answers to your questions (if related to course topic…):
       - Typically, by email within 5 working days
       - Right before and after class, during breaks
       - By appointment, requested by email to the lecturer
     - Up to date content and lots of background information for those who’re serious about data science!

  8. Exam
     - The evaluation consists of a lab report (50% of the marks) and a closed-book written exam with both multiple-choice and open questions (50% of the marks)
     - Lab report:
       - Groups of 4-5 students
       - 4 assignments:
         - Research paper discussion
         - Predictive model competition using R or Python
         - Text mining with Spark streaming assignment
         - Social network analytics assignment
       - For each assignment, you describe your results (screenshots, numbers, approach)
       - Deadline for completed report (all assignments): Sunday May 31st
     - We’ll start by forming groups on Toledo after the first (this) lesson: follow up on this as soon as possible!

  9. Any questions?
