RezoViz takes a corpus of text, identifies which words appear together in the same documents, and assigns each pair a weight according to how often that happens. For example, a document that has the words Istanbul, Erdogan, and Minister and a document with Istanbul, Erdogan, and CHP will each add 1 unit of weight to the link between Istanbul and Erdogan. Across many documents, you can begin to see which topics are discussed together. Below is an example of a graph that can be drawn given this data.
The green lines represent links between words, i.e. they connect words that appear in the same document. The numbers in circles represent the ‘weight’ of each link, i.e. how many documents the two words appear in together. In this image, the terms displayed are the ones with the most links (above a chosen threshold). I’ve highlighted the term ‘Turkey’, so that terms linked with ‘Turkey’ appear in red.
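The weighting described above can be sketched in a few lines of Python. This is only an illustration of the idea, not RezoViz's actual code: the function name `cooccurrence_weights` and the toy documents are my own assumptions, and each document is reduced to the set of terms it contains, so that every document holding both terms adds one to the weight of their link.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_weights(documents):
    """Count, for every pair of terms, how many documents contain both.

    `documents` is a list of term lists (a hypothetical input format).
    """
    weights = Counter()
    for terms in documents:
        # Use a set so a term repeated inside one document counts once,
        # and sort so each pair is always stored in the same order.
        for a, b in combinations(sorted(set(terms)), 2):
            weights[(a, b)] += 1
    return weights


docs = [
    ["Istanbul", "Erdogan", "Minister"],
    ["Istanbul", "Erdogan", "CHP"],
]
weights = cooccurrence_weights(docs)
print(weights[("Erdogan", "Istanbul")])  # 2: both documents contain the pair
print(weights[("CHP", "Istanbul")])      # 1: only the second document does
```

Pairs are stored in alphabetical order, so the link between Istanbul and Erdogan is looked up as `("Erdogan", "Istanbul")`.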
Several types of questions can be derived from this kind of representation. Are there unexpected links between terms? Are there no links between terms that would be expected to have links? What can the weights of each link tell us about the importance of links?
Turkey is known for a very nationalistic press, so terms like ‘turkey’ are expected. But why don’t terms like ‘Taksim solidarity’, ‘Kurdish Communities Union’ or ‘Mutlu’ (the then governor of Istanbul province) appear with mentions of ‘Turkey’? To research this further, we would have to dig deeper into the contexts in which they arise, a tool for which could be found here.
The picture below depicts the same data, but showing only the most frequent terms. This means that even terms which have no links to others on the graph are shown. The ones floating alone have a high frequency, but are not linked with other terms that also have a high frequency.
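One way to think about these lone terms is as a simple filter. The sketch below is a guess at the logic, not how RezoViz is implemented: `find_floaters`, `top_n`, and `min_weight` are hypothetical names, and "frequency" is assumed to mean the number of documents a term appears in. It keeps the most frequent terms, draws links only among them, and reports the terms left without any link.

```python
from collections import Counter
from itertools import combinations


def find_floaters(documents, top_n, min_weight):
    """Return frequent terms that have no link to any other frequent term."""
    doc_sets = [set(terms) for terms in documents]
    # Assumption: frequency = number of documents containing the term.
    frequency = Counter(term for terms in doc_sets for term in terms)
    top_terms = {term for term, _ in frequency.most_common(top_n)}

    # Count co-occurrences only among the terms that made the cut.
    weights = Counter()
    for terms in doc_sets:
        for a, b in combinations(sorted(terms & top_terms), 2):
            weights[(a, b)] += 1

    # A term is "linked" if any of its links reaches the threshold.
    linked = set()
    for (a, b), weight in weights.items():
        if weight >= min_weight:
            linked.update((a, b))
    return sorted(top_terms - linked)


docs = [
    ["Turkey", "Erdogan"],
    ["Turkey", "Erdogan"],
    ["Turkey", "Erdogan", "Gezi"],
    ["Football"],
    ["Football"],
    ["Football", "Weather"],
]
floaters = find_floaters(docs, top_n=3, min_weight=2)
print(floaters)  # ['Football']
```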
This can be all the more revealing, because the ‘floaters’ aren’t linked with popular topics, but the newspaper still chooses to publish on them regularly. Why are topics unassociated with popular topics published frequently?
Or this example:
CHP is the main opposition party to the AKP. Yet when I highlight CHP, the leaders of both parties are also highlighted, and with the same weight. Does this mean that the newspaper speaks about the AKP in every article on the CHP, and in terms of their leaders?
Below is the raw analyzed data from the corpus of text, in this case from after the height of the protests. RezoViz identifies the frequent terms as either a person, organization, or location, without extra effort. This may however limit the kinds of things it draws links between, because it could still be revealing to see the link between figures in government and the kind of language that surrounds them, for example. For this reason, it is a kind of collocate tool that only looks at people, places, and organizations that appear anywhere in the same document.
Below is a graph showing the frequency of each term and which “type” it is. As you can see, the program uses a Stanford database of people and places, but it guesses at the rest and is not always right.