Spark Word2vec vector mathematics

Question

1 Answer

JaneShaw · Answer 1 · 2019-08-01T08:02:18+0000

Here is an example in PySpark, which should be straightforward to port to Scala. The key is the use of model.transform, which returns the vector for a given word.

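from pyspark import SparkContext
from pyspark.mllib.feature import Word2Vec

sc = SparkContext()
# Each line of the corpus becomes a list of tokens (one "sentence" per row)
inp = sc.textFile("text8_lines").map(lambda row: row.split(" "))
k = 200  # vector dimensionality
word2vec = Word2Vec().setVectorSize(k)
# Train the model; word vectors are then available via model.transform(word)
model = word2vec.fit(inp)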

k is the dimensionality of the word vectors (the default value is 100). Higher values generally give better vectors but need more memory; the highest I could go on my machine was 200.
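To make the vector mathematics concrete, here is a minimal sketch in plain Python of the usual "king - man + woman" arithmetic. The vectors in toy_vectors are made-up 3-dimensional values chosen only for illustration, and the helper names (add, sub, cosine, nearest) are hypothetical. With a trained model, each vector would instead come from model.transform(word).

```python
import math

# Hypothetical toy vectors, invented for illustration only.
# With a trained PySpark model these would come from model.transform(word).
toy_vectors = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.5, 0.5, 0.5],
}

def add(u, v):
    """Element-wise sum of two equal-length vectors."""
    return [a + b for a, b in zip(u, v)]

def sub(u, v):
    """Element-wise difference of two equal-length vectors."""
    return [a - b for a, b in zip(u, v)]

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest(target, vectors, exclude):
    """Return the word whose vector is most similar to target, skipping excluded words."""
    scores = {w: cosine(target, v) for w, v in vectors.items() if w not in exclude}
    return max(scores, key=scores.get)

# king - man + woman
target = add(sub(toy_vectors["king"], toy_vectors["man"]), toy_vectors["woman"])
best = nearest(target, toy_vectors, exclude={"king", "man", "woman"})
print(best)
```

In PySpark the same idea would look roughly like model.transform("king") - model.transform("man") + model.transform("woman"), assuming the returned vectors support + and - (as DenseVector does). The result can then be passed to model.findSynonyms, which accepts either a word or a vector.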

