Converting file encodings to UTF-8

Date: 2023-01-05 14:22:06

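Background: the project contains code sent over by other departments. Some files are encoded as UTF-8 and others as GBK, and they are scattered across many folders, which makes them tedious to handle. I wanted to convert them all to UTF-8. The code is simple, so here it is.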

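import os


def convert(filename, in_enc="gbk", out_enc="utf-8"):
    """Re-encode a .java file from in_enc to out_enc, in place."""
    if not filename.endswith(".java"):
        return
    try:
        # Read raw bytes so that no implicit decoding happens.
        with open(filename, "rb") as f:
            data = f.read()
        try:
            # Files that already decode as UTF-8 are left untouched;
            # decoding them as GBK would fail or corrupt the text.
            data.decode(out_enc)
            return
        except UnicodeDecodeError:
            pass
        print("Encode Converting (GBK to UTF-8) :", filename)
        text = data.decode(in_enc)
        with open(filename, "wb") as f:
            f.write(text.encode(out_enc))
    except (OSError, UnicodeError) as e:
        # Report which file failed and why instead of hiding the cause.
        print("error:", filename, e)


def explore(directory):
    """Walk directory recursively and try to convert every file found."""
    for root, dirs, files in os.walk(directory):
        for name in files:
            convert(os.path.join(root, name))


def main():
    # Paths to process; '.' is the directory the script is run from.
    for path in ["."]:
        print(path)
        if os.path.isfile(path):
            convert(path)
        elif os.path.isdir(path):
            explore(path)


if __name__ == "__main__":
    main()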
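
The core step in convert is decoding the bytes with the source encoding and then encoding the resulting text as UTF-8. As a rough illustration, here is that step applied to an in-memory byte string instead of a file. The helper name reencode and the sample text are made up for this sketch and are not part of the script above.

```python
def reencode(data: bytes, in_enc: str = "gbk", out_enc: str = "utf-8") -> bytes:
    """Decode bytes from in_enc and encode the resulting text as out_enc."""
    return data.decode(in_enc).encode(out_enc)


# Sample GBK-encoded bytes (hypothetical content, for illustration only).
gbk_bytes = "中文".encode("gbk")

# The same text, now stored as UTF-8 bytes.
utf8_bytes = reencode(gbk_bytes)

# The two byte strings differ, but both represent the same text.
print(gbk_bytes != utf8_bytes)
print(utf8_bytes.decode("utf-8") == gbk_bytes.decode("gbk"))
```

The text itself never changes, only the bytes that store it. This is also why running a GBK decode over a file that is already UTF-8 is risky: the bytes either fail to decode or decode into the wrong characters.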

Before using it, you need a working Python environment. Put this file in the project directory and double-click it to run.
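
Because the script rewrites files in place, it may be worth checking first which files would actually be affected. The following is a minimal sketch of such a dry run, assuming that "not valid UTF-8" is a good enough signal for "probably GBK". The function name find_non_utf8 and its suffix parameter are my own choices for illustration.

```python
import os


def find_non_utf8(directory, suffix=".java"):
    """Return sorted paths under directory, ending in suffix, that are not valid UTF-8."""
    found = []
    for root, dirs, files in os.walk(directory):
        for name in files:
            if not name.endswith(suffix):
                continue
            path = os.path.join(root, name)
            # Read raw bytes and only test whether they decode; nothing is written.
            with open(path, "rb") as f:
                data = f.read()
            try:
                data.decode("utf-8")
            except UnicodeDecodeError:
                found.append(path)
    return sorted(found)
```

Calling find_non_utf8(".") before the conversion lists the candidate files. Calling it again afterwards should return an empty list if every file was converted.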