bbs網(wǎng)站模板怎么創(chuàng)作自己的網(wǎng)站
在SQL和Pandas中,連接查詢(join)是處理數(shù)據(jù)集之間關系的重要工具。下面是SQL中的各種連接查詢類型及其與Pandas中相應操作的對應關系:
1. INNER JOIN
-
SQL:
INNER JOIN
返回兩個表中具有匹配值的行。
-
Pandas:
merge()
方法的how
參數(shù)設置為'inner'
。- 示例代碼:
merged_df = pd.merge(df1, df2, on='key', how='inner')
2. LEFT JOIN / LEFT OUTER JOIN
-
SQL:
LEFT JOIN
返回左表中的所有行,并在右表中找到匹配項時返回相應的行。如果沒有匹配項,則返回NULL
。
-
Pandas:
merge()
方法的how
參數(shù)設置為'left'
。- 示例代碼:
merged_df = pd.merge(df1, df2, on='key', how='left')
3. RIGHT JOIN / RIGHT OUTER JOIN
-
SQL:
RIGHT JOIN
返回右表中的所有行,并在左表中找到匹配項時返回相應的行。如果沒有匹配項,則返回NULL
。
-
Pandas:
merge()
方法的how
參數(shù)設置為'right'
。- 示例代碼:
merged_df = pd.merge(df1, df2, on='key', how='right')
4. FULL OUTER JOIN / FULL JOIN
-
SQL:
FULL OUTER JOIN
返回兩個表中的所有行。對于沒有匹配項的行,缺失的列會被填充為NULL
。
-
Pandas:
merge()
方法的how
參數(shù)設置為'outer'
。- 示例代碼:
merged_df = pd.merge(df1, df2, on='key', how='outer')
5. CROSS JOIN
-
SQL:
CROSS JOIN
返回兩個表的笛卡爾積,即所有可能的行組合。
-
Pandas:
merge()
方法沒有直接對應的方法,但可以通過設置on
參數(shù)為None
并將how
設置為'outer'
來實現(xiàn)。- 示例代碼:
merged_df = pd.merge(df1, df2, how='outer')
6. SEMI JOIN
-
SQL:
SEMI JOIN
返回左表中在右表中有匹配項的行。
-
Pandas:
merge()
方法結(jié)合boolean indexing
可以模擬SEMI JOIN
。- 示例代碼:
semi_joined_df = df1[df1['key'].isin(df2['key'])]
7. ANTI JOIN
-
SQL:
ANTI JOIN
返回左表中在右表中沒有匹配項的行。
-
Pandas:
merge()
方法結(jié)合boolean indexing
可以模擬ANTI JOIN
。- 示例代碼:
anti_joined_df = df1[~df1['key'].isin(df2['key'])]
示例代碼
假設我們有兩個DataFrame df1
和 df2
,我們將演示這些連接操作:
import pandas as pd# 創(chuàng)建示例 DataFrame
data1 = {'key': ['A', 'B', 'C', 'D'],'value1': [1, 2, 3, 4]
}
df1 = pd.DataFrame(data1)data2 = {'key': ['B', 'D', 'E'],'value2': [5, 6, 7]
}
df2 = pd.DataFrame(data2)# INNER JOIN
inner_joined_df = pd.merge(df1, df2, on='key', how='inner')
print("INNER JOIN:")
print(inner_joined_df)# LEFT JOIN
left_joined_df = pd.merge(df1, df2, on='key', how='left')
print("\nLEFT JOIN:")
print(left_joined_df)# RIGHT JOIN
right_joined_df = pd.merge(df1, df2, on='key', how='right')
print("\nRIGHT JOIN:")
print(right_joined_df)# FULL OUTER JOIN
full_outer_joined_df = pd.merge(df1, df2, on='key', how='outer')
print("\nFULL OUTER JOIN:")
print(full_outer_joined_df)# CROSS JOIN
cross_joined_df = pd.merge(df1, df2, how='outer')
print("\nCROSS JOIN:")
print(cross_joined_df)# SEMI JOIN
semi_joined_df = df1[df1['key'].isin(df2['key'])]
print("\nSEMI JOIN:")
print(semi_joined_df)# ANTI JOIN
anti_joined_df = df1[~df1['key'].isin(df2['key'])]
print("\nANTI JOIN:")
print(anti_joined_df)
輸出示例
假設 df1
和 df2
如下所示:
df1:key value1
0 A 1
1 B 2
2 C 3
3 D 4df2:key value2
0 B 5
1 D 6
2 E 7
輸出結(jié)果將會是:
INNER JOIN:key value1 value2
1 B 2 5
3 D 4 6LEFT JOIN:key value1 value2
0 A 1 NaN
1 B 2 5.0
2 C 3 NaN
3 D 4 6.0RIGHT JOIN:key value1 value2
1 B 2.0 5.0
3 D 4.0 6.0
2 E NaN 7.0FULL OUTER JOIN:key value1 value2
0 A 1.0 NaN
1 B 2.0 5.0
2 C 3.0 NaN
3 D 4.0 6.0
4 E NaN 7.0CROSS JOIN:key_x value1 key_y value2
0 A 1 B 5
1 A 1 D 6
2 A 1 E 7
3 B 2 B 5
4 B 2 D 6
5 B 2 E 7
6 C 3 B 5
7 C 3 D 6
8 C 3 E 7
9 D 4 B 5
10 D 4 D 6
11 D 4 E 7SEMI JOIN:key value1
1 B 2
3 D 4ANTI JOIN:key value1
0 A 1
2 C 3