[python] Removing duplicate rows from a CSV table based on one column with Python

Date: 2025-04-08 21:50:09
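import csv

import pandas as pd

# Read every CSV row into a list (fill in the input file path)
rows = []
with open('', 'r', newline='') as f:
    for row in csv.reader(f):
        rows.append(row)

# Built from a plain list, the DataFrame gets integer column labels 0, 1, 2, ...
df = pd.DataFrame(rows)
# Drop rows whose value in column 3 (the fourth column) repeats, keeping the first
df.drop_duplicates(subset=3, inplace=True)
# Skip the index and the integer header so the output keeps the input layout
df.to_csv('', index=False, header=False)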

First read the rows into a list, convert the list to a DataFrame, then drop the duplicates and write the result out.
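The same steps can probably be done with pandas alone, skipping the intermediate list. The sketch below is one possible variant, not the original method: the function name `dedupe_by_column` and the sample data are hypothetical, and it assumes the CSV has no header row, so `header=None` gives the integer column labels that `subset=3` relies on.

```python
import io

import pandas as pd


def dedupe_by_column(src, dst, column=3):
    """Drop rows whose value in `column` repeats, keeping the first one.

    Hypothetical helper: `src` and `dst` may be file paths or file-like objects.
    """
    # header=None: treat the first line as data and label columns 0, 1, 2, ...
    # dtype=str: keep every cell as text so values are written back unchanged
    df = pd.read_csv(src, header=None, dtype=str)
    deduped = df.drop_duplicates(subset=column, keep="first")
    # Leave out the index and the integer header to preserve the input layout
    deduped.to_csv(dst, index=False, header=False)
    return deduped


# Made-up sample data: rows 1 and 3 share the value "k1" in column 3
sample = io.StringIO("a,1,x,k1\nb,2,y,k2\nc,3,z,k1\n")
out = io.StringIO()
result = dedupe_by_column(sample, out)
print(out.getvalue())
```

With real files, the call would pass the input and output paths in place of the `StringIO` objects.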

subset : column label or sequence of labels, optional
    Only consider certain columns for identifying duplicates; by
    default, use all of the columns.
keep : {'first', 'last', False}, default 'first'
    - first : Drop duplicates except for the first occurrence.
    - last : Drop duplicates except for the last occurrence.
    - False : Drop all duplicates.
inplace : boolean, default False
    Whether to drop duplicates in place or to return a copy.
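To see how the `keep` and `inplace` parameters described above behave, here is a small sketch on a made-up DataFrame. The column names `name` and `key` are illustrative assumptions and do not come from the original CSV.

```python
import pandas as pd

# Made-up data: "k1" appears twice in the "key" column (rows a and c)
df = pd.DataFrame({
    "name": ["a", "b", "c", "d"],
    "key": ["k1", "k2", "k1", "k3"],
})

# keep="first": the first row with each key survives
first = df.drop_duplicates(subset="key", keep="first")
# keep="last": the last row with each key survives
last = df.drop_duplicates(subset="key", keep="last")
# keep=False: every row whose key is duplicated is dropped
none_kept = df.drop_duplicates(subset="key", keep=False)

print(first["name"].tolist())      # ['a', 'b', 'd']
print(last["name"].tolist())       # ['b', 'c', 'd']
print(none_kept["name"].tolist())  # ['b', 'd']

# inplace=True modifies the DataFrame itself and returns None
work = df.copy()
returned = work.drop_duplicates(subset="key", inplace=True)
print(returned is None, len(work))  # True 3
```

Without `inplace=True`, the original DataFrame is left untouched and the deduplicated copy has to be assigned to a variable.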