
import pandas as pd
import numpy as np
import re
aita_url = ""
df = pd.read_csv(aita_url)
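# Treat missing post bodies as empty strings. Assigning back avoids the
# chained inplace fillna, which newer pandas versions warn about and may not apply.
df["body"] = df["body"].fillna("")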
# with pd.read_csv(aita_url, chunksize=1) as rdr:
#     for chunk in rdr:
#         print(chunk["id"])
# Boolean masks: ids that start with "b", ids that end with "q", and either one
idxb = df["id"].str.match(r"^b.*")
idxq = df["id"].str.match(r".*q$")
idx = np.logical_or(idxb, idxq)
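# Overall share of posts judged A-hole (is_asshole is assumed to be coded 0/1)
p = np.mean(df['is_asshole'])
data = {"Label": ["A-hole", "Not A-hole"], "Proportion": [p, 1-p]}
pd.DataFrame(data).round(2)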
        Label  Proportion
0      A-hole        0.24
1  Not A-hole        0.76
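# Flag posts that mention "wife" or "girlfriend" in either capitalization.
# Inside a character class "|" is a literal, so [Ww] is the intended form of [W|w].
# str.match anchors at the start and "." stops at newlines, so only the
# first line of each body is searched.
jdx = df["body"].str.match(r'.*\b([Ww]ife|[Gg]irlfriend)\b.*')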
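A minimal sketch of how a mask of this shape behaves on multi-paragraph text. The `toy` Series and the names `first_line_only` and `whole_body` are made up for illustration and are not part of the dataset. The comparison with `str.contains` is offered as an alternative to consider, not as the method used in this analysis.

```python
import pandas as pd

# Hypothetical posts; the second one mentions the keyword only after a blank line.
toy = pd.Series([
    "My wife and I argued.",
    "Long story.\n\nMy girlfriend left.",
    "A housewife's tale.",
    "No partner mentioned here.",
])

# Non-capturing group, whole-word match, either capitalization.
pattern = r"\b(?:[Ww]ife|[Gg]irlfriend)\b"

# str.match anchors at position 0 and "." does not cross "\n",
# so ".*" can only reach keywords on the first line.
first_line_only = toy.str.match(r".*" + pattern + r".*")

# str.contains searches the entire string, newlines included.
whole_body = toy.str.contains(pattern, regex=True)

print(pd.DataFrame({"first_line_only": first_line_only, "whole_body": whole_body}))
```

The two masks differ only for the second post, and "housewife" is excluded by both because of the leading `\b`. Switching the real mask to `str.contains` would change which posts are selected, so any downstream numbers would need to be recomputed.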
# Show the fourth post flagged by the mask
df.loc[jdx, "body"].iloc[3]
"So I (27M) went for a drink with some friends (2 guys my age and also 2 girls). Just for a preliminary note, I have a girlfriend (26F) and so do the two guys. However our girlfriends weren't with us.\n\nNow anyway, as we were drinking one of the guy friends, we'll call Steve, said that he thinks he is ''in love'' with his girlfriend and that there's ''noone else in the world that he loves more than her''. Now we can call the other guy Joe and the other two girls Bethany and Katy. Joe congratulated him and Bethany and Katy said ''awwwh''. I congratulated him too that he's found a girl he really likes.\n\nThen they turned to me and asked whether I love my girlfriend. I said that I do, but I don't know why I said this, as I was drunk, I just blurted out ''I love her very much, I really do, but...there's noone in the whole world that I love more than myself. Not even my own mother.''\n\nThey looked at me startled and the girls said ''That's such a fucking douchey thing to say''. I quickly changed the topic, but when I think of it, I don't see the issue. It's socially acceptable for a guy to say he loves his girlfriend/wife more than anyone, or it's socially acceptable for a mother to say she loves her son more than anyone (or a father loves his daughter more than anyone). But it's douchey to say I love myself more than anyone?\n\nI 100% feel that I love myself more than anyone. I love myself more than my parents. More than my friends. More than my girlfriend. I don't see why people get triggered at this, AITA?"
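# Share of flagged (wife/girlfriend) posts that were judged A-hole,
# i.e. count(A-hole and flagged) / count(flagged), rounded to two decimals
adx = df["is_asshole"] == 1
np.round(np.sum(np.logical_and(adx, jdx)) / np.sum(jdx), 2)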
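The ratio above is a conditional proportion: among flagged posts, the fraction labeled A-hole. As a hedged sketch on invented data (the `toy` frame and its `mentions` column are hypothetical), the same number can be obtained as the mean of the 0/1 label over the flagged rows, assuming the label has no missing values.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 0/1 verdict and a boolean keyword flag.
toy = pd.DataFrame({
    "is_asshole": [1, 0, 0, 1, 0],
    "mentions": [True, True, False, True, False],
})

adx = toy["is_asshole"] == 1
jdx = toy["mentions"]

# Count-based form, mirroring the expression above.
via_counts = np.sum(np.logical_and(adx, jdx)) / np.sum(jdx)

# Equivalent form: mean of the 0/1 label restricted to flagged rows.
via_mean = toy.loc[jdx, "is_asshole"].mean()

print(np.round(via_counts, 2), np.round(via_mean, 2))
```

Comparing this value with the overall proportion `p` indicates whether flagged posts are judged differently from the baseline, although a difference in proportions alone says nothing about statistical significance.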