I'm trying to build a DataFrame where one of the columns represents a vector. This is the part of code I'm having trouble with:
tweets = pd.DataFrame(train_tweets)
tweets["LangClass"] = "und"
tweets["LangVec"] = pd.Series[[0,0,0,0,0,0,0,0,0,0]]
train_tweets is an incoming DataFrame with only two columns, and I want to add a third and fourth column, LangClass and LangVec. The values in LangVec will be updated element by element.
I had it working by using a for loop to iterate through the DataFrame and setting each value of LangVec to the desired vector, but that seems to be a very slow approach.
Thanks for any suggestions!
1 solution
#1
I think the best approach is to create a list of tuples (or a list of lists) and then call the DataFrame constructor:
L = []
for x in iterator:
    # compute the two values for the current row
    first_val = some_code_for_count_val
    second_val = some_code_for_count_val
    L.append((first_val, second_val))

df1 = pd.DataFrame(L, columns=['LangClass', 'LangVec'])
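A minimal runnable sketch of the pattern above. The sample train_tweets frame, its column names (id, text), and the rule for filling the values are assumptions for illustration only; here every row simply starts as "und" with a ten-element zero vector, as in the question.

```python
import pandas as pd

# Hypothetical stand-in for the incoming two-column DataFrame.
train_tweets = pd.DataFrame({"id": [1, 2, 3], "text": ["hola", "hello", "bonjour"]})

# Build one (LangClass, LangVec) tuple per row. A fresh list is created on
# every iteration so that the rows do not share the same vector object.
L = []
for _ in range(len(train_tweets)):
    lang_class = "und"      # placeholder value, assumed from the question
    lang_vec = [0] * 10     # placeholder vector, assumed from the question
    L.append((lang_class, lang_vec))

# Reuse the original index so the new frame lines up with train_tweets.
df1 = pd.DataFrame(L, columns=["LangClass", "LangVec"], index=train_tweets.index)
```

Each cell of LangVec holds an ordinary Python list, so the column has object dtype.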
Finally, join it to the original DataFrame:
df = df.join(df1)
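A short sketch of the join step followed by an element-wise update. The sample data is hypothetical; the only thing taken from the question is the pair of column names LangClass and LangVec.

```python
import pandas as pd

# Hypothetical original frame and a matching frame of new columns.
tweets = pd.DataFrame({"id": [1, 2], "text": ["hola", "hello"]})
df1 = pd.DataFrame(
    [("und", [0] * 10) for _ in range(len(tweets))],
    columns=["LangClass", "LangVec"],
    index=tweets.index,
)

# join aligns rows on the index, so df1 must share the index of tweets.
tweets = tweets.join(df1)

# Update a single element of the first row's vector in place.
tweets.at[0, "LangVec"][3] = 1
```

Because join matches rows by index rather than by position, a frame built with a default RangeIndex will only line up if the original frame also has one; passing index=tweets.index, as above, avoids that mismatch.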