I'm trying to run the very simple example for Word2Vec given in the documentation here:
https://spark.apache.org/docs/1.4.1/api/python/_modules/pyspark/ml/feature.html#Word2Vec
from pyspark import SparkContext, SQLContext
from pyspark.mllib.feature import Word2Vec
sqlContext = SQLContext(sc)
sent = ("a b " * 100 + "a c " * 10).split(" ")
doc = sqlContext.createDataFrame([(sent,), (sent,)], ["sentence"])
model = Word2Vec(vectorSize=5, seed=42, inputCol="sentence", outputCol="model").fit(doc)
model.getVectors().show()
model.findSynonyms("a", 2).show()
TypeError Traceback (most recent call last)
<ipython-input-4-e57e9f694961> in <module>()
5 sent = ("a b " * 100 + "a c " * 10).split(" ")
6 doc = sqlContext.createDataFrame([(sent,), (sent,)], ["sentence"])
----> 7 model = Word2Vec(vectorSize=5, seed=42, inputCol="sentence", outputCol="model").fit(doc)
8 model.getVectors().show()
9 model.findSynonyms("a", 2).show()
TypeError: __init__() got an unexpected keyword argument 'vectorSize'
Any idea why this is failing?
1 Answer
You are referring to the documentation for ml but importing from the mllib package. In mllib, Word2Vec doesn't take any parameters in __init__.
Did you intend:
from pyspark.ml.feature import Word2Vec
Output:
+----+--------------------+
|word|              vector|
+----+--------------------+
|   a|[-0.3511952459812...|
|   b|[0.29077222943305...|
|   c|[0.02315592765808...|
+----+--------------------+

+----+-------------------+
|word|         similarity|
+----+-------------------+
|   b|0.29255685145799626|
|   c|-0.5414068302988307|
+----+-------------------+
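To see why the mllib import produces this particular TypeError, here is a minimal sketch that runs without Spark. The two classes are hypothetical stand-ins, not the real pyspark classes: they only mimic the two constructor styles (no-argument __init__ plus setters, as in mllib, versus keyword arguments, as in ml). The class names, the try_construct helper, and the default values are assumptions made for illustration.

```python
# Hypothetical stand-ins, NOT the real pyspark classes. They only mimic
# the two constructor styles described in the answer above.


class MllibStyleWord2Vec:
    """mllib style: __init__ takes no parameters; settings go through setters."""

    def __init__(self):
        self.vectorSize = 100  # illustrative default
        self.seed = None

    def setVectorSize(self, vectorSize):
        self.vectorSize = vectorSize
        return self  # returning self allows chained calls

    def setSeed(self, seed):
        self.seed = seed
        return self


class MlStyleWord2Vec:
    """ml style: settings are accepted as keyword arguments in __init__."""

    def __init__(self, vectorSize=100, seed=None, inputCol=None, outputCol=None):
        self.vectorSize = vectorSize
        self.seed = seed
        self.inputCol = inputCol
        self.outputCol = outputCol


def try_construct(cls, **kwargs):
    """Return (instance, None) on success or (None, error message) on TypeError."""
    try:
        return cls(**kwargs), None
    except TypeError as exc:
        return None, str(exc)


# Same keyword argument, two different outcomes.
_, mllib_error = try_construct(MllibStyleWord2Vec, vectorSize=5)
ml_obj, ml_error = try_construct(MlStyleWord2Vec, vectorSize=5)

print("mllib style:", mllib_error)
print("ml style:   ", "ok, vectorSize =", ml_obj.vectorSize)

# The no-argument style is configured through chained setters instead.
chained = MllibStyleWord2Vec().setVectorSize(5).setSeed(42)
print("setters:    ", chained.vectorSize, chained.seed)
```

The real mllib Word2Vec is configured the same way as the first stand-in, through setters such as setVectorSize and setSeed, and it is fitted on an RDD rather than a DataFrame. That is why the ml documentation example only works with the pyspark.ml.feature import.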