Research, tool/analysis review

Asking new questions, part 4 – clusters/N-Grams

The fourth part in this series of analysis reviews covers Clusters and N-Grams, two tools within the AntConc program. They are a brilliant way to learn more about the notable cases that turned up in the Voyant tools.

If you remember when I talked about TermsRadio, there was a significant spike in the number of terms related to protests and to the square in Istanbul where the Gezi protests happened. This, however, does not tell us much about how each term is portrayed. How would we know whether the language around these topics had even changed? It turns out that the Gezi protests seem to have had a lasting effect on the way ‘police’ was discussed.

AntConc has a useful feature that takes in a keyword and a range of words around that term, and spits out the most frequent phrases that contain that keyword. For example, the results below show the clusters around the word ‘police’ before the protests started. The first shows the frequency of phrases that begin with ‘police’, and the second shows the phrases that end with ‘police’.

Screen Shot 2015-07-07 at 17.26.22

Screen Shot 2015-07-07 at 17.13.00
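For readers who want to see what a cluster count does under the hood, here is a minimal Python sketch of the idea. It is not AntConc's code: the function name `clusters`, its parameters, and the sample sentences are all made up for illustration, and the tokenizer is a crude regular expression that ignores sentence boundaries.

```python
import re
from collections import Counter

def clusters(text, keyword, size=2, position="left"):
    """Count phrases of `size` words that contain `keyword`.

    position="left":  the keyword is the first word of the phrase
    position="right": the keyword is the last word of the phrase
    (Hypothetical helper for illustration; not AntConc's implementation.)
    """
    # Crude tokenizer: lowercase words only, punctuation dropped,
    # so phrases can run across sentence boundaries.
    tokens = re.findall(r"[a-z']+", text.lower())
    keyword = keyword.lower()
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != keyword:
            continue
        if position == "left":
            window = tokens[i:i + size]
        else:
            start = i - size + 1
            window = tokens[start:i + 1] if start >= 0 else []
        # Skip windows cut short by the start or end of the text.
        if len(window) == size:
            counts[" ".join(window)] += 1
    return counts

# Invented sample text, not taken from the newspaper corpus.
sample = (
    "Riot police fired tear gas. Police officers detained protesters. "
    "Police officers were seen near the square, and riot police returned."
)

starts = clusters(sample, "police", size=2, position="left")
ends = clusters(sample, "police", size=2, position="right")

print(starts.most_common(3))
print(ends.most_common(3))
```

The two calls mirror the two screenshots: one counts phrases that begin with the keyword, the other counts phrases that end with it.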


The terms surrounding ‘police’ are what one might expect them to be: ‘police departments’, ‘police conducted’, prepositional phrases with police – peaceful phrases. When you click on individual cases, you can view word concordance, which shows each case more in depth, to more clearly see the words surrounding it:

Screen Shot 2015-07-07 at 17.29.26
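The concordance view can be approximated in the same spirit. The sketch below is again only an illustration under my own assumptions (the function `concordance`, the `width` parameter, and the sample text are hypothetical): it finds each occurrence of a phrase and returns a few words of context on either side, in the keyword-in-context style.

```python
import re

def concordance(text, phrase, width=4):
    """Return (left context, hit, right context) for each match of `phrase`.

    Hypothetical helper for illustration; matching is case-insensitive
    and context is measured in words, not characters.
    """
    tokens = re.findall(r"[A-Za-z']+", text)
    lowered = [t.lower() for t in tokens]
    target = phrase.lower().split()
    n = len(target)
    lines = []
    for i in range(len(tokens) - n + 1):
        if lowered[i:i + n] == target:
            left = " ".join(tokens[max(0, i - width):i])
            hit = " ".join(tokens[i:i + n])
            right = " ".join(tokens[i + n:i + n + width])
            lines.append((left, hit, right))
    return lines

# Invented sample text, not taken from the newspaper corpus.
sample = (
    "Yesterday the police department announced a new inquiry. "
    "Later the police department closed the case."
)

kwic = concordance(sample, "police department", width=2)
for left, hit, right in kwic:
    # Right-align the left context so the hits line up in a column.
    print(f"{left:>20} | {hit} | {right}")
```

Lining the hits up in a column is what makes it easy to scan the words on either side of the phrase.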


When you do the same analysis for the first two weeks of the protests, results are as one would expect. ‘Police’ appears with words that represent how they appeared in the protests. Aggressive words like ‘riot police’ and ‘police crackdown’ are more common. This is expected; the Turkish police acted very aggressively against the protesters and it was reported as such. (This is a clear area to expand on with other news outlets.)

Screen Shot 2015-07-08 at 11.09.54 Screen Shot 2015-07-08 at 11.08.52

It may be interesting to ask: why don’t phrases like ‘police conducted’ appear just as often as before? Surely police investigations did not halt around the country once the protests started.

The really interesting results came from the analysis of the articles published two weeks after the height of the protests:

Screen Shot 2015-07-08 at 11.20.26 Screen Shot 2015-07-08 at 11.22.38

As you can see, the language around ‘police’ has not faded away very rapidly. The most important thing to note is that ‘police officers’ occurs more frequently than any other phrase beginning with ‘police’, even compared with the other clusters results. In the image below, the concordance shows that the language around ‘police officers’ differs significantly from the language around ‘police department’ (above).

Does this indicate a shift in the perspective on the police from the department to individuals? What does that mean about the kind of stories that they cover?

Screen Shot 2015-07-08 at 11.28.13

What does this mean about the way the newspaper has shifted its discussion about the police?

I conducted another quick analysis on the word ‘media’, but I am still digging deeper into the results. One noticeable result is the higher frequency of ‘social media’ in the articles published after the protests.

Screen Shot 2015-07-08 at 11.51.36
