
Subset a df by a string present in the column names

df.loc[:, df.columns.str.startswith('alp')]
df.loc[:, df.columns.str.contains('alp')]
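A minimal sketch of how the two selectors differ, using a made-up frame (the column names are hypothetical): `startswith` matches a prefix only, while `contains` matches the string anywhere in the name and treats it as a regular expression by default.

```python
import pandas as pd

# Hypothetical frame; the column names are made up for illustration.
df = pd.DataFrame({'alpha': [1], 'alpine': [2], 'scalp': [3], 'beta': [4]})

# Prefix match: only names that begin with 'alp'.
prefix_cols = df.loc[:, df.columns.str.startswith('alp')]

# Substring match: 'scalp' also qualifies because 'alp' appears inside it.
substr_cols = df.loc[:, df.columns.str.contains('alp')]

print(list(prefix_cols.columns))
print(list(substr_cols.columns))
```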

Rename columns

df.rename(columns={'oldName1': 'newName1', 'oldName2': 'newName2'}, inplace=True)

Rename columns by position

df.rename(columns={df.columns[1]: "your value"}, inplace=True)
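A small sketch of the positional rename on a made-up frame (column names are hypothetical). It looks up the label at position 1 and renames by that label, so if several columns share that label, all of them would be renamed.

```python
import pandas as pd

# Hypothetical frame with three columns.
df = pd.DataFrame({'a': [1], 'b': [2], 'c': [3]})

# Look up the label at position 1, then rename it.
# Without inplace=True, rename returns a new frame and leaves df untouched.
renamed = df.rename(columns={df.columns[1]: 'middle'})

print(list(renamed.columns))
```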

Check df datatype

df.dtypes

Convert to a specific type

df['year'] = df.year.astype(int)
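A short sketch, assuming a made-up `year` column stored as strings. `astype` returns a new Series, so the result has to be assigned back. For values that may not parse, `pd.to_numeric` with `errors='coerce'` is one alternative that yields NaN instead of raising.

```python
import pandas as pd

# Hypothetical data: years stored as strings.
df = pd.DataFrame({'year': ['1999', '2005', '2020']})

# astype does not modify in place; assign the result back.
df['year'] = df['year'].astype(int)

# Alternative for messy input: unparseable entries become NaN.
messy = pd.Series(['1999', 'n/a', '2020'])
parsed = pd.to_numeric(messy, errors='coerce')

print(df.dtypes)
print(parsed)
```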

From continuous to categorical

pd.cut(df.Age, bins=[0, 2, 17, 65, 99], labels=['Toddler/Baby', 'Child', 'Adult', 'Elderly'])
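A sketch of the same binning on made-up ages. By default the bins are closed on the right, so they read as (0, 2], (2, 17], (17, 65], (65, 99]. Values outside every bin, including 0 here, come back as NaN.

```python
import pandas as pd

# Hypothetical ages, including boundary and out-of-range values.
ages = pd.Series([1, 2, 17, 30, 70, 0, 120])

groups = pd.cut(
    ages,
    bins=[0, 2, 17, 65, 99],
    labels=['Toddler/Baby', 'Child', 'Adult', 'Elderly'],
)

# 0 falls outside (0, 2] and 120 is above 99, so both are NaN.
print(groups)
```

Passing `include_lowest=True` would make the first bin include 0.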

Merge two df based on index

pd.merge(df1, df2, left_index=True, right_index=True)
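A minimal sketch with two made-up frames. `pd.merge` defaults to an inner join, so only index labels present in both frames survive. `DataFrame.join` with `how='inner'` is shown as a shorter way to get the same rows.

```python
import pandas as pd

# Hypothetical frames sharing some index labels.
df1 = pd.DataFrame({'a': [1, 2, 3]}, index=['x', 'y', 'z'])
df2 = pd.DataFrame({'b': [10, 30]}, index=['x', 'z'])

# Inner join on the index: 'y' is dropped because df2 has no such label.
merged = pd.merge(df1, df2, left_index=True, right_index=True)

# join aligns on the index by default; how='inner' matches merge's default.
joined = df1.join(df2, how='inner')

print(merged)
```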

Replace a specific string (regex pattern) in values

df['Column2'] = df.Column2.str.replace(r'b,?', '', regex=True)
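A sketch of why the `regex` argument matters, on made-up values. The pattern `b,?` means "a `b` optionally followed by a comma". With `regex=False` the same text is searched for literally and nothing matches.

```python
import pandas as pd

# Hypothetical values.
s = pd.Series(['ab,c', 'b', 'xbz'])

# Regex replacement: removes 'b' and an optional trailing comma.
cleaned = s.str.replace(r'b,?', '', regex=True)

# Literal replacement: looks for the exact text 'b,?' and finds none.
literal = s.str.replace('b,?', '', regex=False)

print(cleaned.tolist())
print(literal.tolist())
```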

Drop columns matching a regex

df = df[df.columns.drop(list(df.filter(regex='Test')))]
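A sketch of the same drop on a made-up frame (column names are hypothetical). `df.filter(regex=...)` selects the matching columns, and iterating over that result yields their names. A boolean mask on `df.columns` is shown as one alternative.

```python
import pandas as pd

# Hypothetical frame; two column names contain 'Test'.
df = pd.DataFrame({'id': [1], 'Test1': [2], 'score': [3], 'myTest': [4]})

# list(DataFrame) yields column labels, so this is the list of names to drop.
to_drop = list(df.filter(regex='Test'))
kept = df[df.columns.drop(to_drop)]

# Alternative: keep the columns whose names do NOT match the pattern.
kept_mask = df.loc[:, ~df.columns.str.contains('Test')]

print(list(kept.columns))
```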

If a value is NaN, replace it with the value from another column in the same row

https://stackoverflow.com/a/29177664

df['Temp_Rating'] = df.Temp_Rating.fillna(df.Farheit)
del df['Farheit']
df.columns = 'File heat Observations'.split()
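A self-contained sketch of the three steps above. The data values are made up, and the starting column layout (`File`, `heat`, `Temp_Rating`, `Farheit`) is an assumption. `fillna` with a Series aligns on the index, so each missing `Temp_Rating` takes the `Farheit` value from its own row.

```python
import numpy as np
import pandas as pd

# Hypothetical data; the column layout is assumed for illustration.
df = pd.DataFrame({
    'File': [1, 2, 3],
    'heat': ['YES', 'NO', 'YES'],
    'Temp_Rating': [0.5, np.nan, np.nan],
    'Farheit': [9.0, 8.0, 7.0],
})

# Fill gaps row by row from the other column, assigning the result back.
df['Temp_Rating'] = df['Temp_Rating'].fillna(df['Farheit'])

# Drop the helper column and rename the remaining three.
df = df.drop(columns='Farheit')
df.columns = 'File heat Observations'.split()

print(df)
```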

Extract digits from a string

https://stackoverflow.com/a/37683738

df.A.str.extract(r'(\d+)')
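A sketch on made-up strings. `str.extract` returns only the first match of the capture group and NaN where nothing matches. `expand=False` gives a Series rather than a one-column DataFrame. `str.findall` is shown for the case where every run of digits is wanted.

```python
import pandas as pd

# Hypothetical strings.
s = pd.Series(['abc123', 'no digits', 'x7y89'])

# First run of digits per string; missing where there is no match.
first = s.str.extract(r'(\d+)', expand=False)

# All runs of digits per string, as lists.
all_runs = s.str.findall(r'\d+')

print(first)
print(all_runs.tolist())
```

The extracted values are still strings; convert them with `astype` or `pd.to_numeric` if numbers are needed.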

Create column values conditionally from multiple conditions using np.select

https://stackoverflow.com/a/19913845

import numpy as np
import pandas as pd

df = pd.DataFrame({'Type':list('ABBC'), 'Set':list('ZZXY')})
conditions = [
    (df['Set'] == 'Z') & (df['Type'] == 'A'),
    (df['Set'] == 'Z') & (df['Type'] == 'B'),
    (df['Type'] == 'B')]
choices = ['yellow', 'blue', 'purple']
df['color'] = np.select(conditions, choices, default='black')
print(df)
