Tricks
import pandas as pd
df = pd.DataFrame({
    'article': ['a', 'b', 'c'],
    'tags': [['happy', 'fun'], ['sad', 'frustrating', 'stressful'], ['ಠ_ಠ']],
    'views': [100, 10, 10000]
})
df
df.explode('tags')
df = pd.DataFrame({
    'article': ['a', 'b', 'c'],
    'tags': ['happy,fun', 'sad,frustrating,stressful', 'ಠ_ಠ'],
    'views': [100, 10, 10000]
})
df
(
    df
    .assign(tags=df.tags.str.split(","))
    .explode('tags')
)
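Real comma-separated tags often have stray spaces, and explode repeats the original index for every new row. Here is a minimal sketch of one way to tidy both up. The `raw` frame and its values are made up for illustration and are not part of the examples above.

```python
import pandas as pd

# Hypothetical data: note the space after the comma in the first row.
raw = pd.DataFrame({
    'article': ['a', 'b'],
    'tags': ['happy, fun', 'sad'],
    'views': [100, 10],
})

exploded = (
    raw
    .assign(tags=raw.tags.str.split(','))        # string -> list of tags
    .explode('tags')                               # one row per tag
    .assign(tags=lambda d: d.tags.str.strip())     # drop leftover whitespace
    .reset_index(drop=True)                        # replace the repeated index
)
```

Without the `reset_index` step, both rows for article 'a' would keep index 0, which can be surprising when you later select with `.loc`.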
(
    df
    .query('views > 10 & views < 10000')
)
You can also pass Python variables into the query string with @
threshold = 100
(
    df
    .query('views == @threshold')
)
some_list = ['a', 'b']
df[df.article.isin(some_list)]
We can easily invert the filter with ~
df[~df.article.isin(some_list)]
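The isin filter and the @ trick combine nicely: query understands `in` and `not in`, so the same membership test fits in a method chain. Below is a rough sketch. The helper name `split_by_membership` is made up for illustration, and the frame is a cut-down copy of the one above.

```python
import pandas as pd

df = pd.DataFrame({
    'article': ['a', 'b', 'c'],
    'views': [100, 10, 10000],
})

def split_by_membership(frame, allowed):
    """Return (rows whose article is in allowed, rows whose article is not)."""
    # @allowed refers to the local variable of this function
    kept = frame.query('article in @allowed')
    dropped = frame.query('article not in @allowed')
    return kept, dropped

kept, dropped = split_by_membership(df, ['a', 'b'])
```

This should select the same rows as `df[df.article.isin(some_list)]` and `df[~df.article.isin(some_list)]`; which one reads better is mostly a matter of taste.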
Assign
Skipping the basics, since they are covered in the docs: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.assign.html
Create columns whose names come from variables by unpacking a dict into assign. f-strings work for this too (and are probably cleaner than '{}'.format())
col_name = 'cat'
(
    df
    .assign(
        **{f'some_{col_name}': ['haku', 'nagi', 'poki']}
    )
    .style.set_caption("some random dataframe I imagined")
)
You can also set a caption for the dataframe with .style.set_caption(), as in the last step above
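The dict-unpacking trick for dynamic column names scales to several columns at once if you build the dict with a comprehension. This is a small sketch. The `scales` mapping and the resulting `views_k` / `views_m` column names are invented for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'article': ['a', 'b', 'c'],
    'views': [100, 10, 10000],
})

# Hypothetical suffix -> divisor mapping
scales = {'k': 1_000, 'm': 1_000_000}

# Build {column_name: values} with f-string names, then unpack into assign
new_cols = {
    f'views_{suffix}': df.views / factor
    for suffix, factor in scales.items()
}

out = df.assign(**new_cols)
```

Since assign takes keyword arguments, the generated names need to be strings, and the columns are added in the order the dict yields them.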