Create a new column in a Pandas DataFrame and set all cells to a default array

Time: 2022-09-30 07:07:23

I'm trying to build a DataFrame where one of the columns holds a vector. This is the part of the code I'm having trouble with:


tweets = pd.DataFrame(train_tweets)
tweets["LangClass"] = "und"
tweets["LangVec"] = pd.Series[[0,0,0,0,0,0,0,0,0,0]]

train_tweets is an incoming DataFrame with only two columns, and I want to add a third and fourth column, LangClass and LangVec. The values in LangVec will be updated element by element.


I had it working by using a for loop to iterate through the DataFrame and setting each value of LangVec to the desired vector, but that seems to be a very slow approach.


Thanks for any suggestions!


1 answer



I think the best approach is to create a list of tuples (or a list of lists) and then call the DataFrame constructor:


L = []
for x in iterator:
    # placeholders: compute the two values for this row
    first_val = some_code_for_count_val
    second_val = some_code_for_count_val
    L.append((first_val, second_val))

df1 = pd.DataFrame(L, columns=['LangClass', 'LangVec'])

Finally, join it to the original DataFrame:


df = df.join(df1)
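
For illustration, here is a minimal runnable sketch of this approach. The input columns (id, text) and their values are made up to stand in for train_tweets, and the defaults "und" and a ten-element zero vector are taken from the question. It assumes the original DataFrame has a default integer index, since join aligns rows on the index.

```python
import pandas as pd

# Hypothetical two-column input standing in for train_tweets.
df = pd.DataFrame({"id": [1, 2, 3], "text": ["hola", "hello", "salut"]})

rows = []
for _ in range(len(df)):
    # Build a fresh list for every row so that rows do not share one list object.
    rows.append(("und", [0] * 10))

df1 = pd.DataFrame(rows, columns=["LangClass", "LangVec"])

# join aligns on the index; both frames use the default RangeIndex here.
df = df.join(df1)

# Update a single element of one row's vector in place.
df.at[0, "LangVec"][3] = 1
```

Creating a new list per row matters: if the same list object were reused for every row, changing one row's vector would appear to change all of them.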


