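csvdedupe: a command-line tool for deduplicating CSV files (download)

[File properties]:

File name: csvdedupe: a command-line tool for deduplicating CSV files

File size: 706KB

File format: ZIP

Last updated: 2024-02-20 13:41:49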

cli dedupe record-linkage csv-files entity-resolution

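csvdedupe

Command-line tools for deduplicating CSV files. Part of a cloud service and open-source toolset for de-duplicating and finding fuzzy matches in data.

Two simple commands:

csvdedupe takes a messy input file or STDIN pipe and identifies duplicates.

csvlink takes two CSV files and finds matches between them.

Installation and dependencies

    pip install csvdedupe

Getting started

csvdedupe

csvdedupe takes a messy input file or STDIN pipe and identifies duplicates. To start, pick one of three deduplication strategies: call csvdedupe with arguments, pipe your file UNIX-style, or define a config file.

Provide an input file, field names, and an output file:

    csvdedupe examples/csv_example_messy_input.csv \
      --field_names "Site name" Address Zip Phone \
      --output_file output.csv

Or pipe it, UNIX style:

    cat examples/csv_example_messy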
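To make the idea of "identifying duplicates" through fuzzy matching concrete, here is a minimal Python sketch. It is not how csvdedupe works internally (this page does not describe the tool's matching method); it swaps in the standard library's `difflib.SequenceMatcher` with a fixed similarity threshold. The helper names `normalize`, `row_key`, and `find_duplicate_pairs`, the 0.85 threshold, and the sample rows are all assumptions made up for illustration; only the field names come from the command in the description.

```python
import csv
import io
from difflib import SequenceMatcher
from itertools import combinations


def normalize(value):
    """Lowercase and collapse whitespace so trivial differences do not count."""
    return " ".join(value.lower().split())


def row_key(row, field_names):
    """Join the chosen fields into one comparable string."""
    return " | ".join(normalize(row.get(name) or "") for name in field_names)


def find_duplicate_pairs(rows, field_names, threshold=0.85):
    """Return (i, j, score) for row pairs whose similarity meets the threshold.

    Compares every pair of rows, so it is only suitable for small files.
    The threshold is an arbitrary illustrative value, not a csvdedupe setting.
    """
    keys = [row_key(row, field_names) for row in rows]
    pairs = []
    for i, j in combinations(range(len(rows)), 2):
        score = SequenceMatcher(None, keys[i], keys[j]).ratio()
        if score >= threshold:
            pairs.append((i, j, score))
    return pairs


# Hypothetical sample data; only the column names are taken from the text.
SAMPLE = """Site name,Address,Zip,Phone
Little Stars Daycare,123 N Main St,60601,555-0100
little stars  daycare,123 N. Main St,60601,555-0100
Sunny Days Preschool,9 W Oak Ave,60622,555-0199
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
pairs = find_duplicate_pairs(rows, ["Site name", "Address", "Zip", "Phone"])
for i, j, score in pairs:
    print(f"rows {i} and {j} look alike (similarity {score:.2f})")
```

A fixed threshold on raw string similarity is the simplest possible choice and scales quadratically with the number of rows, which is one reason a dedicated tool is preferable for files the size of those in the examples folder.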


[File preview]:
csvdedupe-master
----csvdedupe()
--------__init__.py(0B)
--------csvlink.py(8KB)
--------csvdedupe.py(7KB)
--------csvhelpers.py(10KB)
----examples()
--------restaurant-2.csv(55KB)
--------CPS_Early_Childhood_Portal_Scrape.csv(135KB)
--------restaurant-1.csv(8KB)
--------config.json.example(483B)
--------Lobbyists_2012_present.csv(4.98MB)
--------IDHS_child_care_provider_list.csv(5KB)
--------csv_example_messy_input.csv(564KB)
--------.gitkeep(0B)
--------Contracts_after_8_2010.csv(901KB)
----.travis.yml(551B)
----requirements-test.txt(13B)
----setup.cfg(28B)
----LICENSE.md(1KB)
----setup.py(894B)
----README.md(12KB)
----docs()
--------conf.py(8KB)
--------index.rst(11KB)
----tests()
--------test_command_line.py(2KB)
----.gitignore(130B)
