Taboo Words in Technology

November 2015

This is the research that sits behind the project "No Replacements Found".

Recording language is not neutral; it reflects the cultural and social ideologies of the society that records it. Any dictionary has to make decisions about dialectal, scatological and taboo terms, and about whether to include new words. Taboo terms have always been a problem for dictionaries. One of the first dictionaries, compiled by Cockeram in 1623, attempted to distinguish between “vulgar words” and “refined and elegant language”. Other dictionary makers have been less polite and simply ignored anything that they felt was offensive: Samuel Johnson, the renowned dictionary maker, did not include anything that he felt would offend, and Webster excluded any words that had sexual or excremental meanings. “Fuck”, for example, first appeared in a major dictionary in 1965, and for a very long time its definition did not mention sexual intercourse, although the word has had this meaning since medieval times.

But dictionaries are being used and compiled in a different way than they were when Samuel Johnson wrote his. It is becoming rarer and rarer for someone to take a dictionary off a shelf, look up a word and then write down the correct spelling. The dictionary, as a book, is becoming obsolete, just like CDs, videos and cassettes. Instead, dictionaries are much more integrated into how we go about writing: if we are uncertain of a word and type it incorrectly, autocorrect offers a list of options as to what it might be. But there are gaps: when the operating system deems a word offensive, nothing is suggested.

About a year ago, it was reported that there are certain words that Google, Apple and Microsoft are trying to ban. Some of the methodologies behind this were suspect (throwing a load of terms from Urban Dictionary into Bing to see what would and would not come out), and some of the figures that publications such as Wired magazine and even the Daily Mail were quoting seemed excessively high (such as the figure of 20,000 words). However, finding out what these words are, particularly for iOS, was very difficult. I followed steps similar to the methodology given by the Daily Beast. I took the existing dictionary from my machines and created two misspelt dictionaries, one with the first letter of each word changed and one with the last letter changed. I then checked whether the misspelt words would be recognised by iOS, using an API function that returns a word if it can guess it. I then took all of the words that it could not guess and created a master list of the words that failed in both sets. This gives the 20,000 figure that appeared in all of the articles. But if you look at the words, you realise that the vast majority of them are incredibly obscure. I wanted to see which words are both actually used and banned, so I cross-checked the list against the Concise Oxford English Dictionary as a proxy for everyday speech. Once I had done this, the list dropped to around 200 (illustrated left).
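The steps above can be sketched roughly as follows. This is a minimal illustration, not the code used for the project: the `guess` argument is a hypothetical stand-in for the iOS API function, assumed here to take a misspelt word and return a list of suggestions, and the way a letter is "changed" (shifted to the next letter of the alphabet) is likewise an assumption, since the original substitution rule is not recorded here.

```python
import string

LETTERS = string.ascii_lowercase


def shift(ch):
    """Replace a letter with the next one in the alphabet (assumed rule)."""
    i = LETTERS.find(ch.lower())
    if i == -1:
        return ch  # leave non-letters untouched
    return LETTERS[(i + 1) % len(LETTERS)]


def change_first(word):
    """Misspell a word by changing its first letter."""
    return shift(word[0]) + word[1:]


def change_last(word):
    """Misspell a word by changing its last letter."""
    return word[:-1] + shift(word[-1])


def unguessable(words, guess):
    """Words that fail in both misspelt sets.

    `guess` is a hypothetical callable standing in for the spell-check
    API: it takes a misspelt word and returns a list of suggestions.
    A word counts as failed when it is not among the suggestions.
    """
    words = [w.lower() for w in words if w]
    failed_first = {w for w in words if w not in guess(change_first(w))}
    failed_last = {w for w in words if w not in guess(change_last(w))}
    return failed_first & failed_last


def everyday_failures(words, guess, everyday_words):
    """Keep only the failed words that also appear in an everyday word list."""
    return unguessable(words, guess) & {w.lower() for w in everyday_words}
```

Here `everyday_words` plays the role of the Concise Oxford English Dictionary word list, so the result corresponds to the short list of around 200 rather than the full set of failures.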

This anxiety plays into wider social trends around what is considered taboo. During the Seventies, the educational commissioner in Texas refused to list any of the four major American dictionaries for use in schools because they included such seditious terms as “deflower”, “john”, “g-string” and “slut”. While that might seem ridiculous, it is not that far from what is happening with large technology companies and how they are influencing what is perceived as offensive and inoffensive.