Load up and look at some data

Load up and look at some data, Python for R Users, Daniel - PowerPoint PPT Presentation


  1. Load up and look at some data
     PYTHON FOR R USERS
     Daniel Chen, Instructor

  2. Overview: 2013 NYC Flights; load and explore; manipulate and plot.

  3. Importing data with list comprehensions: a list comprehension can be used to load data; it saves repetitive typing; it is a condensed way of appending values to a list.

  4. List comprehensions

         import glob
         import pandas as pd

         csv_files = glob.glob('*.csv')
         csv_files

         ['data3.csv', 'data2.csv', 'data1.csv']

         all_dfs = [pd.read_csv(x) for x in csv_files]
         all_dfs[0]

             A   B   C   D
         0  a0  b0  c0  d0
         1  a1  b1  c1  d1
         2  a2  b2  c2  d2
         3  a3  b3  c3  d3
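
The "condensed way of appending values to a list" point can be sketched by writing the same load twice, once as an explicit loop and once as a comprehension. This is only an illustration: the temporary directory, the three tiny CSV files, and the names `tmp_dir` and `loop_dfs` are assumptions made up so the example runs without existing data.

```python
import glob
import os
import tempfile

import pandas as pd

# Hypothetical setup: write three tiny CSV files into a temporary directory.
tmp_dir = tempfile.mkdtemp()
for i in (1, 2, 3):
    pd.DataFrame({"A": [f"a{i}"], "B": [f"b{i}"]}).to_csv(
        os.path.join(tmp_dir, f"data{i}.csv"), index=False
    )

# glob returns matches in arbitrary order, so sort for a reproducible result.
csv_files = sorted(glob.glob(os.path.join(tmp_dir, "*.csv")))

# Explicit loop: start with an empty list and append one DataFrame per file.
loop_dfs = []
for path in csv_files:
    loop_dfs.append(pd.read_csv(path))

# List comprehension: the same list built in a single expression.
all_dfs = [pd.read_csv(path) for path in csv_files]

print(len(all_dfs))
print(all_dfs[0])
```

Sorting the file list is a design choice for this sketch; the unsorted order on the slide (`data3.csv` first) is consistent with `glob` not guaranteeing any ordering.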

  5. Let's practice!

  6. Manipulating data
     Daniel Chen, Instructor

  7. Groupby

         import numpy as np
         import pandas as pd

         df = pd.DataFrame({
             'name': ['John Smith', 'Jane Doe', 'Mary Johnson'],
             'treatment_a': [np.nan, 16, 3],
             'treatment_b': [2, 11, 1]
         })
         df_melt = pd.melt(df, id_vars='name')
         df_melt.groupby('name')['value'].mean()

         name
         Jane Doe        13.5
         John Smith       2.0
         Mary Johnson     2.0
         Name: value, dtype: float64
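
The following sketch may help explain why John Smith's mean on this slide is 2.0 rather than missing: `pd.melt` stacks the two treatment columns into long format, and the grouped mean skips missing values by default. The `counts` variable is an addition for illustration, and `np.nan` is used as the missing-value spelling.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "name": ["John Smith", "Jane Doe", "Mary Johnson"],
    "treatment_a": [np.nan, 16, 3],
    "treatment_b": [2, 11, 1],
})

# melt keeps 'name' as the identifier and stacks the treatment columns
# into 'variable'/'value' pairs: 3 people x 2 treatments = 6 rows.
df_melt = pd.melt(df, id_vars="name")
print(df_melt)

# mean() ignores NaN, so John Smith's mean uses only his treatment_b value.
means = df_melt.groupby("name")["value"].mean()

# count() shows how many non-missing values went into each mean.
counts = df_melt.groupby("name")["value"].count()
print(counts)
```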

  8. Groupby aggregate

         df_melt.groupby('name')['value'].agg(['mean', 'max'])

                       mean   max
         name
         Jane Doe      13.5  16.0
         John Smith     2.0   2.0
         Mary Johnson   2.0   3.0
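
As a small, hedged extension of the aggregate call above: passing a list of function names gives a DataFrame with one column per function, and pandas also accepts keyword arguments to name the output columns. The column names `avg` and `highest` are illustrative choices, not something from the slides.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "name": ["John Smith", "Jane Doe", "Mary Johnson"],
    "treatment_a": [np.nan, 16, 3],
    "treatment_b": [2, 11, 1],
})
df_melt = pd.melt(df, id_vars="name")
grouped = df_melt.groupby("name")["value"]

# A list of function names yields one output column per function.
summary = grouped.agg(["mean", "max"])
print(summary)

# Named aggregation: keyword = output column name, value = function name.
named = grouped.agg(avg="mean", highest="max")
print(named)
```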

  9. Dummy variables: categorical variables need to be encoded as dummy variables (one-hot encoding).

         import numpy as np
         import pandas as pd

         df = pd.DataFrame({
             'status': ['sick', 'healthy', 'sick'],
             'treatment_a': [np.nan, 16, 3],
             'treatment_b': [2, 11, 1]
         })
         df

             status  treatment_a  treatment_b
         0     sick          NaN            2
         1  healthy         16.0           11
         2     sick          3.0            1

  10. Get dummies

         pd.get_dummies(df)

            treatment_a  treatment_b  status_healthy  status_sick
         0          NaN            2               0            1
         1         16.0           11               1            0
         2          3.0            1               0            1
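
One caveat worth sketching: recent pandas releases (2.0 and later) return boolean indicator columns from `pd.get_dummies` by default, so the 0/1 output shown on the slide may appear as `False`/`True`. Passing `dtype=int`, as assumed below, requests integer indicators explicitly. Only the string column `status` is encoded; the numeric treatment columns pass through unchanged.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "status": ["sick", "healthy", "sick"],
    "treatment_a": [np.nan, 16, 3],
    "treatment_b": [2, 11, 1],
})

# dtype=int asks for 0/1 indicator columns instead of booleans.
dummies = pd.get_dummies(df, dtype=int)
print(dummies)
```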

  11. Let's practice!

  12. Wrap-up
      Daniel Chen, Instructor

  13. Review of topics: how R translates into Python; basic types, functions, methods; NumPy and pandas; data manipulation and cleaning techniques; visualization.

  14. R vs Python: one language isn't "better" than the other; broaden your toolkit.

  15. Let's practice!
