analyses were conducted to test for multicollinearity of COALS, ESA, and WLM by regressing each on the other two. The obtained tolerances, all between 0.49 and 0.60, suggest that the three models are not collinear. The explanation that each model is contributing substantially and equitably to the prediction is further supported by the similar magnitudes of β in Table 6.
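The tolerance of a predictor is 1 − R², where R² comes from regressing that predictor on the remaining ones; values near 0 signal collinearity and values near 1 signal independence. The following is a minimal sketch of that check for the three-predictor case, not the code used in the study. It relies on the closed-form R² for a regression on two predictors, and the function names and the toy rank lists are hypothetical.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def tolerance(target, other_a, other_b):
    """Tolerance of `target`: 1 - R^2 from regressing it on the other two.

    Uses the closed form for two predictors:
    R^2 = (r_ta^2 + r_tb^2 - 2 r_ta r_tb r_ab) / (1 - r_ab^2).
    """
    r_ta = pearson(target, other_a)
    r_tb = pearson(target, other_b)
    r_ab = pearson(other_a, other_b)
    r_squared = (r_ta ** 2 + r_tb ** 2 - 2 * r_ta * r_tb * r_ab) / (1 - r_ab ** 2)
    return 1 - r_squared


# Hypothetical toy ranks, for illustration only (not data from the study).
coals = [1, 2, 3, 4, 5, 6]
esa = [2, 1, 4, 3, 6, 5]
wlm = [1, 3, 2, 5, 4, 6]

tolerances = {
    "COALS": tolerance(coals, esa, wlm),
    "ESA": tolerance(esa, coals, wlm),
    "WLM": tolerance(wlm, coals, esa),
}
```

Each model takes a turn as the regression target, which mirrors the "regressing each on the other two" procedure described above.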
Table 5. Average distance by WordSimilarity-353 semantic category.
Relation Types | COALS Distance | ESA Distance | WLM Distance | W3C3 Distance |
Antonym | 158 | 80 | 104 | 75 |
First hypernym | 40 | 27 | 40 | 36 |
Second hypernym | 53 | 62 | 56 | 48 |
Identical | 0 | 0 | 8 | 0 |
Second is part | 57 | 56 | 79 | 65 |
First is part | 56 | 76 | 45 | 53 |
Siblings | 62 | 55 | 59 | 55 |
Synonyms | 43 | 92 | 55 | 29 |
Topical | 64 | 65 | 67 | 53 |
All | 60 | 64 | 64 | 51 |
Table 6. Regression on ranks of COALS, ESA, and WLM, for human judgment ranks (N = 353).
Feature | B | SE(B) | β |
COALS | 0.358 | 0.046 | 0.358 * |
ESA | 0.270 | 0.045 | 0.269 * |
WLM | 0.304 | 0.045 | 0.303 * |
Notes: R = 0.80, * p < 0.0001.
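The near-identity of B and β in Table 6 follows from regressing ranks on ranks: a standardized coefficient is β = B · SD(x) / SD(y), and rank variables over the same N items have equal standard deviations unless ties intervene. The sketch below illustrates this with a small ordinary least squares fit on centered ranks. It is an illustration under stated assumptions rather than the analysis code behind Table 6; the helper names and the toy ranks are hypothetical.

```python
from statistics import mean, pstdev


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def rank_regression(y, predictors):
    """OLS of y on the predictors (all already ranks).

    Returns (B, beta): unstandardized and standardized coefficients.
    Centering every variable absorbs the intercept.
    """
    y_mean = mean(y)
    yc = [v - y_mean for v in y]
    xc = []
    for x in predictors:
        x_mean = mean(x)
        xc.append([v - x_mean for v in x])
    gram = [[sum(p * q for p, q in zip(xi, xj)) for xj in xc] for xi in xc]
    rhs = [sum(p * q for p, q in zip(xi, yc)) for xi in xc]
    b = solve(gram, rhs)
    beta = [bi * pstdev(x) / pstdev(y) for bi, x in zip(b, predictors)]
    return b, beta


# Hypothetical toy ranks without ties, for illustration only.
human = [1, 2, 3, 4, 5, 6]
model_1 = [2, 1, 3, 4, 6, 5]
model_2 = [1, 3, 2, 5, 4, 6]
B, beta = rank_regression(human, [model_1, model_2])
```

With tie-free ranks, B and β coincide; ties in the real data would account for small differences such as 0.270 versus 0.269 for ESA.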
To address the question of the maximum potential of the COALS, ESA, WLM, and W3C3 models
for correlation with human ratings, an oracle analysis was undertaken[1]. The oracle first converts the
output of each model to ranks. Then for each word pair, the oracle selects the output of the model whose
rank most closely matches the rank of the human rating. This procedure generates the best possible
correlation with the human ratings, based on the assumption that the oracle will choose the closest model
output every time. Using this methodology with all four models, the oracle correlation is r(351)=0.93.
Using only the three constituent models, the oracle correlation is r(351)=0.92, which is equivalent to
the previous best reported oracle correlation that used roughly an order of magnitude more data than the
present study[2]. So the maximum potential correlation of the three constituent models matches the
previous best result, with a minor improvement due to including the W3C3 model in the oracle.
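The oracle procedure described above can be sketched as follows. This is a minimal illustration, assuming average ranks for ties and a Pearson correlation computed on the ranks; the function names and the toy scores are hypothetical and do not come from the study.

```python
from statistics import mean


def to_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def oracle_correlation(human, model_outputs):
    """For each word pair, keep the model rank closest to the human rank,
    then correlate the selected ranks with the human ranks."""
    human_ranks = to_ranks(human)
    model_ranks = [to_ranks(scores) for scores in model_outputs]
    chosen = [
        min((ranks[i] for ranks in model_ranks), key=lambda r: abs(r - h))
        for i, h in enumerate(human_ranks)
    ]
    return pearson(chosen, human_ranks)


# Hypothetical toy scores for five word pairs, for illustration only.
human = [1.0, 2.0, 3.0, 4.0, 5.0]
model_a = [0.1, 0.9, 0.3, 0.4, 0.5]
model_b = [0.5, 0.2, 0.3, 0.1, 0.9]
oracle_r = oracle_correlation(human, [model_a, model_b])
```

Because the selection step consults the human ratings, the oracle correlation is an upper bound on what any per-pair choice among these models could achieve, not a model that could be applied to unseen data.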
The preceding analyses provide fairly strong evidence for the reasons behind the W3C3 model’s efficacy.
The W3C3 model has significantly higher correlations than the constituent models on the entire dataset,
- ↑ Agirre, E.; Alfonseca, E.; Hall, K.; Kravalova, J.; Paşca, M.; Soroa, A. A Study on Similarity and Relatedness Using Distributional and WordNet-Based Approaches. In Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL ’09), Association for Computational Linguistics: Stroudsburg, PA, USA, 2009; pp. 19–27.
- ↑ Agirre, E.; Alfonseca, E.; Hall, K.; Kravalova, J.; Paşca, M.; Soroa, A. A Study on Similarity and Relatedness Using Distributional and WordNet-Based Approaches. In Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL ’09), Association for Computational Linguistics: Stroudsburg, PA, USA, 2009; pp. 19–27.