Manipulating `DataFrame`s Using `pandas`

Suppose one DataFrame has columns A and B, and another has columns A and C. How do you merge them into a single DataFrame with columns A, B, and C?

You can do this with `pd.merge()` and the `how='outer'` argument. It joins on the common column A and keeps every row from both DataFrames, filling in `NaN` where one side has no matching value.

Here's an example:

import pandas as pd

# Example data
df1 = pd.DataFrame({
    'A': [1, 2, 3],
    'B': ['X', 'Y', 'Z']
})

df2 = pd.DataFrame({
    'A': [2, 3, 4],
    'C': ['P', 'Q', 'R']
})

# Merge on column 'A'
merged = pd.merge(df1, df2, on='A', how='outer')

print(merged)

Result:

   A    B    C
0  1    X  NaN
1  2    Y    P
2  3    Z    Q
3  4  NaN    R
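
To see how the `how` argument changes the result, the sketch below reuses the example frames and compares `how='inner'` with `how='outer'`. Passing `indicator=True` adds a `_merge` column recording which side each row came from. The variable names `inner` and `outer` are illustrative only.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2, 3], 'B': ['X', 'Y', 'Z']})
df2 = pd.DataFrame({'A': [2, 3, 4], 'C': ['P', 'Q', 'R']})

# Inner join: only keys present in both frames survive
inner = pd.merge(df1, df2, on='A', how='inner')

# Outer join, with a '_merge' column marking each row as
# 'left_only', 'right_only', or 'both'
outer = pd.merge(df1, df2, on='A', how='outer', indicator=True)

print(inner)
print(outer[['A', '_merge']])
```

The `_merge` column is a quick way to check which keys were unmatched before deciding whether an outer join is really what you want.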

Iterate over rows and access columns in a DataFrame

If the column names are valid Python identifiers, use `itertuples()`, which yields namedtuples and is generally faster than `iterrows()`:

for row in df.itertuples():
    print(row.Index, row.A, row.B)  # Access columns with dot notation
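
As a small illustration of why identifier-friendly names matter here, the sketch below uses a made-up frame with a column called `unit price`. `itertuples()` still runs, but it replaces the invalid name with a positional one, so dot access by the original name is not available. The column names and data are assumptions for the example.

```python
import pandas as pd

# Hypothetical data: 'unit price' contains a space, so it is not a valid identifier
df = pd.DataFrame({'A': [1, 2], 'unit price': [1.5, 2.0]})

rows = list(df.itertuples())
# Inspect the generated field names; the invalid one is renamed positionally
print(rows[0]._fields)

# index=False drops the index field; name=None yields plain tuples instead of namedtuples
plain = list(df.itertuples(index=False, name=None))
print(plain)
```

Plain tuples avoid the naming issue altogether, at the cost of accessing values by position.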

If some column names are not valid Python identifiers (e.g., they contain spaces), use `iterrows()`, which yields an index and a Series for each row:

for index, row in df.iterrows():
    print(index, row['A'], row['B'])  # Access columns via indexing
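
A minimal sketch of the `iterrows()` case, using hypothetical column names that contain spaces. Because each row is returned as a single Series, values from integer and float columns are upcast to one common dtype, which is worth keeping in mind if exact types matter.

```python
import pandas as pd

# Hypothetical data: 'unit price' cannot be reached with dot notation
df = pd.DataFrame({'unit price': [1.5, 2.0], 'qty': [2, 3]})

totals = []
for index, row in df.iterrows():
    # row is a Series indexed by column name; the int column has been upcast to float
    totals.append(row['unit price'] * row['qty'])

print(totals)
print(row.dtype)
```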

https://jifengwu2k.github.io/2025/08/12/Manipulating-DataFrame-s-Using-pandas/
Author
Jifeng Wu
Posted on
August 12, 2025
Licensed under