The new words of English – A computational approach

Imagine that approximately 15 new words are introduced into the English language every day. Some of them are rapidly adopted by speakers and added to dictionaries. For example, if you have not heard the word “infomania” yet, check your dictionary! The growing use of online social media such as Twitter gives words a medium for spreading quickly, regardless of the distance between users. Twitter also gives researchers like me a useful environment for big-data studies that identify recently popular words and the patterns by which they spread.

Previous studies suggest that population, distance, and demographic factors in particular have a strong impact on the distribution of emerging words (Eisenstein et al., 2014). It would be interesting to see how these factors shape the interaction between two neighbouring English-speaking countries, Canada and the United States: do newly popular words cross the border, or do they stay in their home country? My aim in this project is to dig into that question.

Previous studies identified a collection of emerging words by analyzing geo-tagged Twitter data (Eisenstein et al., 2014; Grieve et al., 2018). Twitter is an informal communication platform, so most of these emerging words are acronyms, slang, and abbreviations (e.g., “notifs” means “notifications”, “tfw” means “that feel when”, and “af” means “as fuck”). The results of those studies are based on data from 2009 to 2014. I applied a similar procedure to more recent geo-tagged Twitter data (a two-and-a-half-year corpus of tweets from 2015 to 2017). My results suggest an interesting picture: the words that were found to be popular in the previous studies have been losing popularity in our data, and Twitter users no longer seem to be productive in creating new words. I also tracked the geographical distribution of those words over time based on their adoption rate, and the maps mostly revealed two cases: (i) words are likely to die where they were born, and (ii) small cities are quicker both to pick up a word and to drop it (see Figure 1).
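For readers curious about what tracking a word by region and time period can look like in practice, here is a minimal sketch in Python. It is not the procedure used in this project or in the cited studies. The function name `usage_rate`, the record layout (region, period, text), and the toy tweets are all assumptions made for illustration, and the "rate" here is simply the share of tweets in a region and period that contain the word.

```python
from collections import defaultdict


def usage_rate(tweets, word):
    """Share of tweets per (region, period) that contain `word`.

    `tweets` is an iterable of (region, period, text) tuples.
    This layout is a hypothetical one chosen for the sketch.
    """
    totals = defaultdict(int)  # tweets seen per (region, period)
    hits = defaultdict(int)    # tweets containing the word per (region, period)
    for region, period, text in tweets:
        key = (region, period)
        totals[key] += 1
        # Naive whitespace tokenization; real tweets need more careful handling.
        if word in text.lower().split():
            hits[key] += 1
    return {key: hits[key] / totals[key] for key in totals}


# Toy, made-up records used only to show the shape of the input and output.
sample = [
    ("California", "2015-H1", "tired af today"),
    ("California", "2015-H1", "good morning"),
    ("California", "2017-H1", "good night"),
    ("Alberta", "2015-H1", "cold af out here"),
]
rates = usage_rate(sample, "af")
```

Comparing the values in `rates` for the same region across successive periods gives a rough picture of whether a word is spreading or shrinking there, which is the kind of information a map series can then display.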

For more, stay tuned!

Figure 1. The spread and shrinkage pattern of the word “af”.
The first three maps are from Eisenstein et al. (2014); each represents a 50-week interval. The next six maps are from our study; except for the last one, each shows a 6-month period.

The word “af” (meaning “as fuck”) was found in both studies. The maps of Eisenstein et al. (2014) display an increase in the usage of the word, whereas ours show a decrease.

In the first map of Eisenstein et al. (2014), the word “af” mostly appears around Florida and California. There are also a few uses in New York, Arizona, and Texas. In the second and third maps, usage increases around California. It also appears that the word was first picked up in the inland states, with Washington following.

In our first map, usage of the word is widespread; it even crosses the border and reaches Alberta, British Columbia, Manitoba, Prince Edward Island, and Nova Scotia. In the last three maps, on the other hand, usage declines steadily. The word mostly disappears from the central part of the country but stays active in coastal states such as Florida, California, and New York.

References:

Grieve, J., Nini, A., & Guo, D. (2018). Mapping Lexical Innovation on American Social Media. Journal of English Linguistics, 46(4), 293-319. doi: 10.1177/0075424218793191

Eisenstein, J., O’Connor, B., Smith, N., & Xing, E. (2014). Diffusion of Lexical Change in Social Media. PLoS ONE, 9(11), e113114. doi: 10.1371/journal.pone.0113114
