Types of joins PAN DAS JOIN S F OR S P READS H EET US ERS John Miller Principal Data Scientist
Three types of joins Types One-to-one One-to-many Many-to-many Join type indicates relationship of tables Like lookup values in VLOOKUP PANDAS JOINS FOR SPREADSHEET USERS
One-to-one Datasets at the same level Lookup column same for both tables Rows match one-to-one PANDAS JOINS FOR SPREADSHEET USERS
One-to-many Rows are unique at different levels Join based on the higher-level column Result based on lower-level column Similar to VLOOKUP based on lower-level data PANDAS JOINS FOR SPREADSHEET USERS
Many-to-many Values are repeated Results in all possible combinations Can be hard to interpret Best supplemented with a "bridge" table PANDAS JOINS FOR SPREADSHEET USERS
Let's practice! PAN DAS JOIN S F OR S P READS H EET US ERS
A closer look at one- to-one joins PAN DAS JOIN S F OR S P READS H EET US ERS John Miller Principal Data Scientist
Basics of pandas.merge() Used with complementary data pd.merge(left, right) Joins on one or more columns Similar to VLOOKUP PANDAS JOINS FOR SPREADSHEET USERS
Left merges pd.merge(df_left, df_right, on='GameKey', how='left') List data frame names for "df_left" and "df_right" placeholders Resulting frame matches rows in left frame PANDAS JOINS FOR SPREADSHEET USERS
Right merges pd.merge(df_left, df_right, on='GameKey', how='right') Same concept as left merge Resulting frame rows match right frame PANDAS JOINS FOR SPREADSHEET USERS
A framework for joins After viewing and understanding the data: Determine the relationship Check for unique values in key column Write merge statement and execute PANDAS JOINS FOR SPREADSHEET USERS
Joining on two keys Determine the relationship Check for unique values in key column Write merge statement and execute pd.merge(df_left, df_right, on=['GameKey', 'PlayId']) PANDAS JOINS FOR SPREADSHEET USERS
Let's practice! PAN DAS JOIN S F OR S P READS H EET US ERS
Combining common data with inner joins PAN DAS JOIN S F OR S P READS H EET US ERS John Miller Principal Data Scientist
Object-oriented expressions pandas-level function: pd.merge(df_left, df_right, on=None) Shorthand, pandas dataframe method: df_left.merge(df_right, on=None) PANDAS JOINS FOR SPREADSHEET USERS
A basic inner join Similar to VLOOKUP followed by removing NA's df1.merge(df2, on='GameKey', how='inner') Returns only rows present in both tables PANDAS JOINS FOR SPREADSHEET USERS
Joining on different names Key columns may have different names No need to rename columns df1.merge(df2, left_on='GameKey', right_on='game-key', how='inner') PANDAS JOINS FOR SPREADSHEET USERS
Let's practice! PAN DAS JOIN S F OR S P READS H EET US ERS
Recommend
More recommend