A corpus-based approach to finding happiness

  • Rada Mihalcea & Hugo Liu

Proceedings of AAAI-CAAW-06, the Spring Symposia on Computational Approaches to Analyzing Weblogs

What are the sources of happiness and sadness in everyday life? In this paper, we employ ‘linguistic ethnography’ to seek out where happiness lies in our everyday lives by considering a corpus of blogposts from the LiveJournal community annotated with happy and sad moods. By analyzing this corpus, we derive lists of happy and sad words and phrases annotated by their ‘happiness factor.’ Various semantic analyses performed with this wordlist reveal the happiness trajectory of a 24-hour day (3am and 9-10pm are happiest) and of a 7-day week (Wednesdays are saddest), and compare the socialness and human-centeredness of happy descriptions versus sad descriptions. We evaluate our corpus-based approach in a classification task and contrast our wordlist with emotionally annotated wordlists produced by experimental focus groups. Having located happiness temporally and semantically within this corpus of everyday life, the paper concludes by offering a corpus-inspired livable recipe for happiness.