Hi, I'm working on the Chinese corpus downloaded from HistWords.
I read the vectors for 病毒 ("virus") and 电脑 ("computer") and got the following cosine similarities per decade:
('病毒', '电脑')
1950, cosine similarity=0.000
1960, cosine similarity=0.000
1970, cosine similarity=0.000
1980, cosine similarity=0.360
1990, cosine similarity=0.263
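For context, this is roughly how a cosine similarity like those above can be computed. It is a minimal sketch, not the HistWords code: the function name is hypothetical, vectors are assumed to be plain sequences of floats, and returning 0.0 for a zero-norm vector is only an assumption about why the earlier decades show 0.000 (for example, a word stored as an all-zero vector in a decade where it is missing).

```python
import math


def cosine_similarity(u, v):
    """Return the cosine similarity of two equal-length vectors.

    Assumption: if either vector has zero norm (e.g. an all-zero
    placeholder for a missing word), the similarity is reported as 0.0
    instead of dividing by zero.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)
```

If the 0.000 values do come from zero vectors, they would reflect missing data rather than a measured similarity of exactly zero.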
The Spearman correlation between [0, 0, 0, 0.36, 0.26] and [1950, 1960, 1970, 1980, 1990] is 0.78. However, the paper reports the correlation as 0.89 (at the end of Section 3.2).
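For reference, the 0.78 figure can be reproduced with a plain-Python sketch like the one below. It assumes tied values (the three 0.000 entries) receive average ranks, which is the usual convention for Spearman's rho. The function names are illustrative and are not taken from the HistWords code.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


years = [1950, 1960, 1970, 1980, 1990]
similarities = [0.000, 0.000, 0.000, 0.360, 0.263]

rho = spearman(similarities, years)
print(f"{rho:.2f}")  # prints 0.78
```

With average ranks the similarities become [2, 2, 2, 5, 4], so rho = 7 / sqrt(80) ≈ 0.78.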
Is something wrong with my data processing? Thank you for your attention.