Page:The World Within Wikipedia: An Ecology of Mind.pdf/20

Information 2012, 3
248

Although the results in Table 15 are informative, they do not completely capture the qualitative data and intuition behind Table 14, the outlink structure for sleep. What that table describes is much further from a gist-like representation and much closer to an activation-based representation. Assuming that these articles are activated by list words, one can imagine activation flowing from them, through their outlinks and redirect links, to the article for sleep. Put another way, instead of calculating the difference between the entire vectors for the list words and the target word, only the element of each list word's vector that corresponds to the target word is considered. We performed this analysis for Wikipedia outlinks on the DRM lists, using WLM’s model of outlink structure. For each word on the list, we collected all outlinks from all possible senses of that word and counted those pointing to the target article. Since the list words converge on a particular sense (the target word’s sense), taking the most frequently linked-to sense provided a strong predictor of backward associative strength, r(53) = 0.53.
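The outlink-counting procedure described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the link graph, sense inventory, redirect table, and the function names `resolve` and `outlink_activation` are all hypothetical, and choosing the target sense with the most incoming links is one reading of "the most frequent linked-to sense".

```python
from collections import Counter

# Hypothetical toy data (not taken from Wikipedia or the paper).
# Article title -> titles of the articles it links to.
OUTLINKS = {
    "Bed": ["Sleep", "Furniture"],
    "Bed (geology)": ["Stratum"],
    "Rest": ["Sleeping", "Relaxation"],
    "Rest (music)": ["Musical notation"],
    "Dream": ["Sleep", "Night"],
}
# List word -> all candidate article senses for that word.
SENSES = {
    "bed": ["Bed", "Bed (geology)"],
    "rest": ["Rest", "Rest (music)"],
    "dream": ["Dream"],
}
# Redirect title -> canonical article title.
REDIRECTS = {"Sleeping": "Sleep"}


def resolve(title):
    """Follow a redirect, if one exists, to the canonical article title."""
    return REDIRECTS.get(title, title)


def outlink_activation(list_words, target_senses):
    """Count outlinks from every sense of every list word to each target sense.

    Returns (sense, count) for the target sense that receives the most
    links, or (None, 0) if no list word links to any target sense.
    """
    targets = set(target_senses)
    counts = Counter()
    for word in list_words:
        for article in SENSES.get(word, []):
            for link in OUTLINKS.get(article, []):
                destination = resolve(link)
                if destination in targets:
                    counts[destination] += 1
    if not counts:
        return None, 0
    return counts.most_common(1)[0]


best_sense, activation = outlink_activation(
    ["bed", "rest", "dream"], ["Sleep", "Sleep (other sense)"]
)
```

In this toy graph three links reach the article "Sleep" (one of them through the redirect "Sleeping"), so that sense is selected with an activation count of 3.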


We applied this same strategy to COALS and ESA. For COALS, we first identified the dimension associated with the non-presented target word. Then, for each word in the list, we retrieved the associated vector and summed the values at that target dimension. No normalization or SVD projection was used. Likewise, for ESA we found the target dimension and summed the word list vectors on that dimension. In the case of ESA, the standard normalization was used. The correlations obtained using these more activation-aligned models are presented in Table 17, along with the correlation for the WLM outlink-based activation measure described above.
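The target-dimension summation used for COALS and ESA might look like the sketch below. It assumes each word vector is available as a sparse mapping from dimension label to weight; the data, the vector contents, and the function name `target_dimension_activation` are hypothetical. Any model-specific normalization (such as the standard ESA normalization) is assumed to have been applied before the vectors are passed in.

```python
def target_dimension_activation(list_words, target, vectors):
    """Sum, over the list words, the single vector element at the target dimension.

    `vectors` maps each word to a sparse vector (dimension label -> weight).
    Words or dimensions that are missing contribute zero.
    """
    return sum(vectors.get(word, {}).get(target, 0.0) for word in list_words)


# Hypothetical toy vectors; the weights are illustrative only.
toy_vectors = {
    "bed": {"sleep": 0.5, "room": 0.25},
    "rest": {"sleep": 0.25},
    "dream": {"night": 0.5},
}

activation = target_dimension_activation(
    ["bed", "rest", "dream"], "sleep", toy_vectors
)
```

Only the "sleep" element of each list word's vector is read; all other dimensions are ignored, which is what distinguishes this activation-type measure from a comparison of whole vectors.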


Table 17. Spearman rank correlations with backward associative strength for DRM lists, using an activation-type metric (N = 55).


Model               Correlation
COALS Activation    0.23
ESA Activation      0.41
WLM Activation      0.53
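For readers unfamiliar with the statistic reported in Table 17, a Spearman rank correlation can be computed as the Pearson correlation of the ranked values, with tied values sharing their average rank. The sketch below is a generic illustration with hypothetical numbers; it does not use the study's data and is not the authors' analysis code.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences with nonzero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values: one activation score and one backward associative
# strength per list (four made-up lists, not the 55 DRM lists).
toy_activation = [3, 1, 2, 5]
toy_strength = [0.4, 0.1, 0.2, 0.3]
rho = spearman(toy_activation, toy_strength)
```

For these four made-up lists the ranks disagree only on the two highest items, giving a correlation of about 0.8.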


As shown in Table 17, framing the model in terms of activation rather than gist improves correlations considerably for the ESA and WLM Activation models, such that the WLM Activation model has a higher, though not significantly higher, correlation with backward associative strength than the gist-type LDA model in Table 13 and the W3C3 model presented in Table 15. For WLM this is perhaps not surprising because of how the outlinks on Wikipedia pages are generated in the first place: by people. Wikipedia’s guidelines on linking center on the likelihood that a reader will want to read the linked article [1]. It is up to the authors of Wikipedia pages to consider the association between one page’s topic and another before linking them. It seems only natural that some level of backward associative strength would manifest in this process. By the same token, one might argue that WLM Activation provides only a circular definition of backward associative strength, since the word associative processes at work when people link Wikipedia pages are similar to those at work in word association tasks. While this is likely true, it is also evidence that the internal cognitive-linguistic processes involved in word association and DRM are externally represented in Wikipedia’s link structure.

  1. Wikipedia. Manual of Style (linking). 2011. Available online: http://en.wikipedia.org/wiki/Wikipedia:Manual_of_Style_(linking) (accessed on 26 January 2011).