
Statistical Methods: Lecture 1, Dennis Dobler, Vrije Universiteit



  1. Statistical Methods: Lecture 1. Dennis Dobler, Vrije Universiteit Amsterdam, October 30, 2017

  2. Lecture Overview
     ◮ Course parameters
     ◮ Course Introduction
     ◮ Introduction to Statistics
     ◮ Summarising and graphing data
     ◮ Describing data

  3. Organisation
     ◮ Lecturer: Dennis Dobler
     ◮ Assistants: Nurzhan, Paul, Francisco, Birgit
     ◮ Lectures: 10, Mon and Wed, see course manual
     ◮ Computer sessions: Tue (1st week: also Thu), bring a fully charged laptop! Division: check Canvas this evening / tomorrow!
     ◮ Exercise classes: Thu

  4. Assessment
     ◮ Assignments: 4 weekly assignments (graded) + Assignment 0 (pass/fail). All have to be handed in, otherwise you fail the course.
     ◮ Midterm Exam: Monday November 20, 16:00–17:45 (instead of lecture), TenT
     ◮ Final Exam: Tuesday December 19, 15:15–18:00, Emergohal (Amstelveen)
     ◮ Exam Grade: Exam = 0.4 × Midterm + 0.6 × Final, if both are at least 5. Otherwise Exam = min{Midterm, Final}.
     ◮ Grade: Grade = 3/4 × Exam + 1/4 × Assignments
     ◮ Pass: if(Exam >= 5.5 & Grade >= 5.5) print('Pass')
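The grading rules on the Assessment slide can be sketched in R as follows. The function names `exam_grade`, `course_grade` and `passes` are hypothetical and not part of the course material. The sketch also assumes that "Assignments" is a single number on the same scale as the exam marks (for instance an average of the graded assignments) and that no rounding is applied; the slide does not specify either point.

```r
# Hypothetical helper functions; only the formulas come from the slide.

# Exam grade: weighted average if both exams are at least 5,
# otherwise the lower of the two marks.
exam_grade <- function(midterm, final) {
  if (midterm >= 5 && final >= 5) {
    0.4 * midterm + 0.6 * final
  } else {
    min(midterm, final)
  }
}

# Course grade: 3/4 exam grade plus 1/4 assignment grade.
# 'assignments' is assumed to be one summary mark (an assumption).
course_grade <- function(exam, assignments) {
  0.75 * exam + 0.25 * assignments
}

# Pass rule from the slide: both the exam grade and the course grade
# must be at least 5.5.
passes <- function(midterm, final, assignments) {
  exam <- exam_grade(midterm, final)
  grade <- course_grade(exam, assignments)
  exam >= 5.5 && grade >= 5.5
}

# Example call with made-up marks.
print(passes(midterm = 6, final = 7, assignments = 8))
```

Note that a high final exam cannot compensate for a midterm below 5: in that case the exam grade is the minimum of the two marks.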

  5. Resources
     ◮ Course manual: available on Canvas
     ◮ Book: Elementary Statistics, by Mario F. Triola, twelfth edition (Pearson New International Edition), ISBN: 9781292039411
     ◮ Many sections are divided into Part 1 (basics) and Part 2 (more advanced): unless stated otherwise during lectures, only Part 1 has to be studied.
     ◮ A copy of the book is permanently available on the course literature shelves in the library on the first floor of the VU main building (1C-02).
     ◮ Lecture slides: available on Canvas (NB: sometimes we treat topics that are not in the book).
     ◮ Software: R, a software package, and RStudio, an IDE for R. Downloadable from r-project.org and rstudio.com
     ◮ R manual: pdf available on Canvas
     ◮ Setting up R: pdf available on Canvas

  6. Assignments
     ◮ Assignments have to be done in groups of 2 students.
     ◮ Published on Wednesday on Canvas (after the lecture), due Wednesday 23:59 the week after.
     ◮ Enrol yourself on Canvas asap (if you have not done so yet).
     ◮ Assignments can only be submitted if you are enrolled in a group.
     ◮ Groups of size 1 are not allowed! Otherwise: two groups of size 1 are randomly merged.
     ◮ Hand in online via Canvas.
     ◮ Deadlines are strict: too late → course failed.
     ◮ Theoretical questions: solve these without R.
     ◮ Other questions: use R. Ask questions about these exercises during computer sessions.
     ◮ On Canvas: how to write your assignment.
     ◮ On average, one day of work.

  7. Exercise classes
     ◮ The exercise classes are a good opportunity to prepare for the exams.
     ◮ Exercises from the Triola book are discussed. Warning: earlier editions of the book have a different numbering!
     ◮ Prepare the exercises before class: this will maximise your learning effect.
     ◮ Division into exercise classes: basically the same as for computer sessions.

  8. Motivation: Cupid in your network
     Source: "Cupid in your network", http://research.facebook.com/

  9. Motivation
     Why do I have to learn statistics? Can't I just crunch the numbers?
     ◮ Answer research questions: test claims/hypotheses
     ◮ Make decisions and/or predictions
     ◮ Statistical literacy
     ◮ Statistics is used (almost) everywhere: business (data/business analyst), medical sciences, politics, sports, economics, ...

  10. Motivation
      Statistical methods are used to:
      ◮ Compare search engines
      ◮ Analyse experiments in human-computer interaction
      ◮ Analyse and interpret survey results
      ◮ Analyse and interpret user data of social media
      ◮ Error analysis of social web
      ◮ Design and analyse data of experiments for social networks
      ◮ Google Analytics

  11. Motivation
      Statistical Methods (the course) can be followed up by:
      ◮ Information Retrieval
      ◮ Human Computer Interaction
      ◮ Machine Learning
      ◮ Data Mining Techniques
      ◮ The Social Web
      ◮ Collective Intelligence
      More advanced statistics courses in Master:
      ◮ Experimental Design and Data Analysis (CS, AI)
      ◮ Research Methods (IS, AI)

  12. Goals and topics
      After this course you should be:
      ◮ familiar with basic principles and techniques of statistics;
      ◮ able to apply them to data using the statistical package R;
      ◮ able to present results from statistical analyses in a clear, concise way;
      ◮ able to interpret and critically evaluate these results.
      The topics you will learn about are:
      ◮ summarising data;
      ◮ basics of probability theory;
      ◮ estimating means and proportions;
      ◮ hypothesis testing for one- and two-sample problems;
      ◮ correlation and linear regression;
      ◮ contingency tables.

  13. What is statistics?
      Statistics is the science of data: the study of collecting, organising, analysing, interpreting and presenting data.
      We use statistics to gain information about a group of objects (i.e. a population) and/or to make decisions and predictions when randomness is involved.
      A census is the collection of data from every member of the population. The population is usually too large for this. Therefore a sample, a selected subcollection of the population, is studied:
      Sample → Data → Analysis → Conclusion about population

  14. 1.2 Statistical and critical thinking
      A statistical study consists of the following steps:
      1. Prepare
         ◮ Context
         ◮ Source
         ◮ Sampling method (how to obtain samples?)
      2. Analyse
         ◮ Graph data
         ◮ Explore data
         ◮ Apply statistical methods
      3. Conclude

  15. 1.2 Statistical and critical thinking
      Recall: a sample is a subcollection of the population. So a different sample gives different data and hence possibly different conclusions about the population!
      A sample should be representative (same characteristics as the population) and unbiased (no systematic difference from the population).
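The point that different samples give different data can be illustrated with a small simulation in R. This is only a sketch: the population is simulated (it is not data from the lecture), and the population size, the sample size of 50 and the seed are arbitrary choices.

```r
# Hypothetical population: 10,000 simulated values (an assumption,
# not data from the lecture).
set.seed(1)
population <- rnorm(10000, mean = 170, sd = 10)

# Draw two independent samples of size 50 without replacement.
sample_a <- sample(population, size = 50)
sample_b <- sample(population, size = 50)

# Compare the two sample means with the population mean.
means <- c(population = mean(population),
           sample_a   = mean(sample_a),
           sample_b   = mean(sample_b))
print(round(means, 2))
```

The two sample means will generally differ from each other and from the population mean, which is why conclusions about the population can depend on the sample that happened to be drawn.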

  16. 1.4 Collecting sample data
      There are different methods to collect sample data:
      ◮ Voluntary response sample: subjects decide themselves whether to be included in the sample.
      ◮ Random sample: each member of the population has an equal probability of being selected.
      ◮ Simple random sample: each sample of size n has an equal probability of being chosen.
      ◮ Systematic sampling: after a starting point, select every k-th member.
      ◮ Convenience sampling: use results that are easily available.
      ◮ Stratified sampling: divide the population into subgroups (strata) such that subjects within a group have the same characteristics, then draw a (simple) random sample from each group.
      ◮ Cluster sampling: divide the population into sections (clusters), then randomly select some of these clusters.
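Four of the sampling methods listed above can be sketched in R with the base function `sample()`. The population below is hypothetical: 100 numbered members with made-up `stratum` and `cluster` labels. Keeping all members of each selected cluster is an assumption of this sketch; the slide only says that clusters are selected at random.

```r
set.seed(42)

# Hypothetical population of 100 members; the labels are made up.
population <- data.frame(
  id      = 1:100,
  stratum = rep(c("A", "B"), each = 50),
  cluster = rep(1:10, each = 10)
)
n <- 10  # desired sample size

# Simple random sample: every subset of size n is equally likely.
srs_ids <- sample(population$id, size = n)

# Systematic sampling: random starting point, then every k-th member.
k <- nrow(population) / n
start <- sample(k, size = 1)
systematic_ids <- seq(from = start, by = k, length.out = n)

# Stratified sampling: a simple random sample of n / 2 from each stratum.
stratified_ids <- unlist(
  lapply(split(population$id, population$stratum),
         function(ids) sample(ids, size = n / 2)),
  use.names = FALSE
)

# Cluster sampling: randomly select 2 clusters and keep all their members
# (keeping every member is an assumption of this sketch).
chosen_clusters <- sample(unique(population$cluster), size = 2)
cluster_ids <- population$id[population$cluster %in% chosen_clusters]
```

Voluntary response and convenience samples are not simulated here, because they depend on who chooses to respond or who is easy to reach rather than on a random mechanism.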
