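By chance, I also ran into the problem of Apache's FTPClient.listFiles() returning no files.
Target server environment: HP minicomputer
Client server environment: Linux jstmsapp2 2.6.32-279.el6.x86_64 #1 SMP Wed Jun 13 18:24:36 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux (the program runs on this server)
Relevant jars: commons-net-1.4.1.jar (commons-net-3.3.jar still has this problem), jakarta-oro-2.0.8.jar
My code is as follows: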
    /**
     * @desc: Fetch files from the target server to the local machine over FTP
     * @author <chengsheng.wang@zznode.com>
     * @since 2015-7-27
     *
     * @param url
     * @param userName
     * @param password
     * @param portnum
     * @param path
     * @param localPath
     * @return boolean
     */
    private boolean downLoadFromFtp(String url, String userName, String password, int portnum, String path, String localPath) {
        // The password is deliberately left out of the log line.
        logger.info("url=" + url + " username=" + userName + " hostpath=" + path + " localpath=" + localPath);
        boolean flag = false;
        FTPClient ftpClient = new FTPClient();
        ftpClient.setControlEncoding("GBK");
        int count = 0; // number of files synchronized
        try {
            ftpClient.connect(url, portnum);
            boolean loginFlag = ftpClient.login(userName, password);
            logger.info("login status: " + loginFlag);
            ftpClient.changeWorkingDirectory(path);
            ftpClient.enterLocalPassiveMode();
            FTPFile[] files = ftpClient.listFiles();
            if (null == files || files.length == 0) {
                logger.info("no file data");
                return flag;
            }
            File tempfile = null;
            File localpathdir = new File(localPath);
            if (!localpathdir.exists()) {
                localpathdir.mkdirs();
            }
            logger.info("total files in host directory: " + files.length);
            for (int i = 0; i < files.length; i++) {
                if (files[i] == null) {
                    continue;
                }
                String fileName = files[i].getName();
                logger.info("file " + i + " name: " + fileName);
                String local = localPath + File.separator + fileName;
                tempfile = new File(local);
                if (tempfile.exists()) {
                    continue; // file already exists locally, so do not download/sync it again
                }
                FileOutputStream fos = new FileOutputStream(tempfile);
                try {
                    ftpClient.setBufferSize(1024);
                    ftpClient.setFileType(FTPClient.BINARY_FILE_TYPE);
                    // FTP paths use "/" regardless of the client OS
                    ftpClient.retrieveFile(path + "/" + fileName, fos);
                } finally {
                    fos.close(); // close the stream even if the transfer fails
                }
                count++;
            }
            flag = true;
        } catch (SocketException e) {
            logger.error("socket exception", e);
        } catch (IOException e) {
            logger.error("IO exception", e);
        } catch (Exception e) {
            logger.error("FTP download exception", e);
        } finally {
            if (null != ftpClient) {
                try {
                    if (ftpClient.isConnected()) {
                        ftpClient.logout();
                        ftpClient.disconnect();
                    }
                } catch (IOException e) {
                    logger.error("exception while closing the connection", e);
                }
            }
            logger.info("synchronized " + count + " files in this run");
        }
        return flag;
    }

When execution reaches FTPClient.listFiles(), it stubbornly comes back empty.
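After a long search online, and inspired by earlier posts on the subject, I suspected the cause was the target server's Chinese locale: it changes the format of each file's modification date, and Apache Commons Net cannot parse that format.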
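To make the suspicion concrete, here is a minimal standalone sketch that does not use Commons Net at all. It assumes the server prints a date such as `7月 27 2015`; that string is made up for illustration, because the exact HP-UX output is not reproduced in this post. A `SimpleDateFormat` built with English month names accepts `Jul 27 2015` and rejects the Chinese form with a `ParseException`.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;

final class LocaleDateDemo {
    /** Returns true if the text parses as "MMM d yyyy" with English month names. */
    static boolean parsesAsEnglish(String text) {
        SimpleDateFormat fmt = new SimpleDateFormat("MMM d yyyy", Locale.ENGLISH);
        fmt.setLenient(false);
        try {
            fmt.parse(text);
            return true;
        } catch (ParseException e) {
            // An English-only parser has no way to read a Chinese month token.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(parsesAsEnglish("Jul 27 2015"));
        // Hypothetical Chinese-locale date, assumed for illustration only.
        System.out.println(parsesAsEnglish("7月 27 2015"));
    }
}
```

If a parser swallows that exception instead of reporting it, the caller only sees entries quietly disappearing, which matches the symptom described here.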
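I downloaded the source of commons-net-1.4.1.jar: http://apache.fayea.com//commons/net/source/commons-net-1.4.1-src.zip
After adding log statements directly to that source for debugging, the reason FTPClient.listFiles() comes back empty became obvious.
The problems in commons-net-1.4.1.jar, one by one: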
parseFTPEntry in UnixFTPEntryParser.java:

    /**
     * Parses a line of a unix (standard) FTP server file listing and converts
     * it into a usable format in the form of an <code> FTPFile </code>
     * instance. If the file listing line doesn't describe a file,
     * <code> null </code> is returned, otherwise a <code> FTPFile </code>
     * instance representing the files in the directory is returned.
     * <p>
     * @param entry A line of text from the file listing
     * @return An FTPFile instance corresponding to the supplied entry
     */
    public FTPFile parseFTPEntry(String entry) {
        FTPFile file = new FTPFile();
        file.setRawListing(entry);
        int type;
        boolean isDevice = false;

        // The regular expression matched here is also a problem: it is hard-coded,
        // and its rules filter out some files because of their last-modified date text.
        if (matches(entry))
        {
            String typeStr = group(1);
            String hardLinkCount = group(15);
            String usr = group(16);
            String grp = group(17);
            String filesize = group(18);
            String datestr = group(19) + " " + group(20);
            String name = group(21);
            String endtoken = group(22);

            try
            {
                // This is where it goes wrong: the locale-specific date format cannot
                // be parsed, so the method returns null and hides the parse exception.
                file.setTimestamp(super.parseTimestamp(datestr));
            }
            catch (ParseException e)
            {
                return null; // this is a parsing failure too.
            }
            // (the rest of the method is omitted from this excerpt)

The problematic regular expression:

    /**
     * this is the regular expression used by this parser.
     *
     * Permissions:
     *    r   the file is readable
     *    w   the file is writable
     *    x   the file is executable
     *    -   the indicated permission is not granted
     *    L   mandatory locking occurs during access (the set-group-ID bit is
     *        on and the group execution bit is off)
     *    s   the set-user-ID or set-group-ID bit is on, and the corresponding
     *        user or group execution bit is also on
     *    S   undefined bit-state (the set-user-ID bit is on and the user
     *        execution bit is off)
     *    t   the 1000 (octal) bit, or sticky bit, is on [see chmod(1)], and
     *        execution is on
     *    T   the 1000 bit is turned on, and execution is off (undefined bit-
     *        state)
     */
    private static final String REGEX =
        "([bcdlfmpSs-])"
        + "(((r|-)(w|-)([xsStTL-]))((r|-)(w|-)([xsStTL-]))((r|-)(w|-)([xsStTL-])))\\+?\\s+"
        + "(\\d+)\\s+"
        + "(\\S+)\\s+"
        + "(?:(\\S+)\\s+)?"
        + "(\\d+)\\s+"
        /*
          numeric or standard format date
        */
        // This line is the problem: some files get filtered out. Then again, the Chinese
        // modification-date format the HP machine uses for some files is baffling.
        + "((?:\\d+[-/]\\d+[-/]\\d+)|(?:\\S+\\s+\\S+))\\s+"
        /*
           year (for non-recent standard format)
           or time (for numeric or recent standard format
        */
        + "(\\d+(?::\\d+)?)\\s+"
        + "(\\S*)(\\s*.*)";

Now that the cause is known, let's discuss solutions.
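Earlier posters point out, succinctly, that you can simply patch these two places.
If you do not care about a file's last modification time at all, the most convenient change is:

    file.setTimestamp(Calendar.getInstance()); // reset the file's last modification time to the current time; there is nothing really wrong with that

The regular expression can be widened in the same spirit to: "((?:\\d+[-/]\\d+[-/]\\d+)|(?:\\S+\\s+\\S+)|(?:\\S+))\\s+"
Rebuild commons-net-1.4.1.jar with these changes, run the program again, and the world is finally at peace.
See: http://www.blogjava.net/wodong/archive/2008/08/21/wodong.html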
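The effect of the widened date alternative can be checked in isolation with `java.util.regex`. The sketch below rebuilds the listing pattern from the pieces quoted in this post and swaps only the date part. Both sample listing lines are invented for illustration; in particular, the single-token date `7月27日` is an assumption about what a Chinese-locale server might print, not captured HP-UX output.

```java
import java.util.regex.Pattern;

final class ListingRegexDemo {
    // Date part as quoted from commons-net 1.4.1.
    static final String ORIGINAL_DATE =
        "((?:\\d+[-/]\\d+[-/]\\d+)|(?:\\S+\\s+\\S+))\\s+";
    // Date part with the extra single-token alternative from the patch.
    static final String PATCHED_DATE =
        "((?:\\d+[-/]\\d+[-/]\\d+)|(?:\\S+\\s+\\S+)|(?:\\S+))\\s+";

    /** Builds the full listing pattern around the given date part. */
    static Pattern listingPattern(String datePart) {
        return Pattern.compile(
            "([bcdlfmpSs-])"
            + "(((r|-)(w|-)([xsStTL-]))((r|-)(w|-)([xsStTL-]))((r|-)(w|-)([xsStTL-])))\\+?\\s+"
            + "(\\d+)\\s+"
            + "(\\S+)\\s+"
            + "(?:(\\S+)\\s+)?"
            + "(\\d+)\\s+"
            + datePart
            + "(\\d+(?::\\d+)?)\\s+"
            + "(\\S*)(\\s*.*)");
    }

    /** True if the whole line matches, mirroring the parser's full-match check. */
    static boolean matches(String datePart, String line) {
        return listingPattern(datePart).matcher(line).matches();
    }

    public static void main(String[] args) {
        // Hypothetical listing lines, assumed for illustration only.
        String english = "-rw-r--r-- 1 root sys 1024 Jul 27 10:30 file.txt";
        String chinese = "-rw-r--r-- 1 root sys 1024 7月27日 10:30 file.txt";
        System.out.println(matches(ORIGINAL_DATE, english));
        System.out.println(matches(ORIGINAL_DATE, chinese));
        System.out.println(matches(PATCHED_DATE, chinese));
    }
}
```

The original date part needs either a numeric date or two whitespace-separated tokens before the time field, so a one-token date leaves nothing for the time group to match. The extra `(?:\\S+)` alternative accepts that one-token case.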
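But is that really a good way to do it? Is it elegant? The code written by the Apache contributors actually leaves us room to work around this bug without patching the library.
Let's step through the code, starting from org.apache.commons.net.ftp.FTPClient.listFiles().
listFiles() ultimately calls: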
    public FTPFile[] listFiles(String pathname)
    throws IOException
    {
        String key = null;
        FTPListParseEngine engine =
            initiateListParsing(key, pathname);
        return engine.getFiles();
    }

Next, we look at initiateListParsing(key, pathname):

    public FTPListParseEngine initiateListParsing(
            String parserKey, String pathname)
    throws IOException
    {
        // We cache the value to avoid creation of a new object every
        // time a file listing is generated.
        if(__entryParser == null) {
            if (null != parserKey) {
                // if a parser key was supplied in the parameters,
                // use that to create the paraser
                __entryParser =
                    __parserFactory.createFileEntryParser(parserKey);

            } else {
                // if no parserKey was supplied, check for a configuration
                // in the params, and if non-null, use that.
                if (null != __configuration) {
                    __entryParser =
                        __parserFactory.createFileEntryParser(__configuration);
                } else {
                    // if a parserKey hasn't been supplied, and a configuration
                    // hasn't been supplied, then autodetect by calling
                    // the SYST command and use that to choose the parser.
                    __entryParser =
                        __parserFactory.createFileEntryParser(getSystemName());
                }
            }
        }

        return initiateListParsing(__entryParser, pathname);
    }

It turns out that __entryParser can be initialized from the __configuration field. By default, however, __configuration is null, so execution reaches:

    __entryParser = __parserFactory.createFileEntryParser(getSystemName()); // creates a parser that does not support the Chinese date format
Now suppose we have already created an FTPClientConfig and the parser is initialized from it. Following the code further:

    public FTPFileEntryParser createFileEntryParser(FTPClientConfig config)
    throws ParserInitializationException
    {
        this.config = config;
        String key = config.getServerSystemKey();
        return createFileEntryParser(key);
    }

Entering createFileEntryParser(key) reveals the final truth:

    public FTPFileEntryParser createFileEntryParser(String key)
    {
        Class parserClass = null;
        FTPFileEntryParser parser = null;
        try
        {
            // Could we use the key to instantiate a custom FTPFileEntryParser?
            // The key is passed in from FTPClientConfig.
            parserClass = Class.forName(key);
            parser = (FTPFileEntryParser) parserClass.newInstance();
        }
        catch (ClassNotFoundException e)
        {
            String ukey = null;
            if (null != key)
            {
                ukey = key.toUpperCase();
            }
            if (ukey.indexOf(FTPClientConfig.SYST_UNIX) >= 0)
            {
                parser = createUnixFTPEntryParser();
            }
            else if (ukey.indexOf(FTPClientConfig.SYST_VMS) >= 0)
            {
                parser = createVMSVersioningFTPEntryParser();
            }
            else if (ukey.indexOf(FTPClientConfig.SYST_NT) >= 0)
            {
                parser = createNTFTPEntryParser();
            }
            else if (ukey.indexOf(FTPClientConfig.SYST_OS2) >= 0)
            {
                parser = createOS2FTPEntryParser();
            }
            else if (ukey.indexOf(FTPClientConfig.SYST_OS400) >= 0)
            {
                parser = createOS400FTPEntryParser();
            }
            else if (ukey.indexOf(FTPClientConfig.SYST_MVS) >= 0)
            {
                parser = createMVSEntryParser();
            }
            else
            {
                throw new ParserInitializationException("Unknown parser type: " + key);
            }
        }
        catch (ClassCastException e)
        {
            throw new ParserInitializationException(parserClass.getName()
                + " does not implement the interface "
                + "org.apache.commons.net.ftp.FTPFileEntryParser.", e);
        }
        catch (Throwable e)
        {
            throw new ParserInitializationException("Error initializing parser", e);
        }

        if (parser instanceof Configurable) {
            ((Configurable)parser).configure(this.config);
        }
        return parser;
    }
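Attentive readers will have noticed that we can set an FTPClientConfig on the FTPClient object and, through the systemKey property of that FTPClientConfig, have a custom FTPFileEntryParser instantiated to parse the file timestamp and the other fields.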
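The lookup order is the part that makes this work: the key is first tried as a class name, and only when no such class exists does the factory fall back to matching a system name. Below is a simplified, standalone model of that order. `EntryParser`, `BuiltInUnixParser` and `CustomDateParser` are hypothetical stand-ins invented for this sketch, not Commons Net types, and the real factory handles more system names and error cases.

```java
final class ParserLookupSketch {
    /** Hypothetical stand-in for FTPFileEntryParser. */
    interface EntryParser {
        String describe();
    }

    /** Hypothetical stand-in for the built-in Unix parser. */
    static class BuiltInUnixParser implements EntryParser {
        public String describe() { return "built-in unix"; }
    }

    /** Hypothetical stand-in for a user-supplied parser. */
    static class CustomDateParser implements EntryParser {
        public String describe() { return "custom"; }
    }

    static EntryParser create(String key) {
        try {
            // Step 1: treat the key as a fully qualified class name.
            return (EntryParser) Class.forName(key)
                    .getDeclaredConstructor().newInstance();
        } catch (ClassNotFoundException e) {
            // Step 2: no such class, so match the key against known system names.
            if (key.toUpperCase().contains("UNIX")) {
                return new BuiltInUnixParser();
            }
            throw new IllegalArgumentException("Unknown parser type: " + key);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Error initializing parser", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(create("UNIX").describe());
        System.out.println(create(CustomDateParser.class.getName()).describe());
    }
}
```

Passing a class name as the key therefore selects the custom class before any of the built-in parsers is considered.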
Next, check whether FTPClientConfig has a constructor that fits:

    /**
     * The main constructor for an FTPClientConfig object
     * @param systemKey key representing system type of the  server being
     * connected to. See {@link #getServerSystemKey() serverSystemKey}
     */
    public FTPClientConfig(String systemKey) {
        this.serverSystemKey = systemKey;
    }

Good, there is exactly such a constructor, so we can get to work.
Create a new UnixFTPEntryParser that extends ConfigurableFTPFileEntryParserImpl, then, before the call to ftpClient.listFiles(), configure the ftpClient object with an FTPClientConfig.
The code:

    ftpClient.changeWorkingDirectory(path);
    ftpClient.enterLocalPassiveMode();
    // Apache does not support the Chinese locale here, so a custom class parses the Chinese date format
    ftpClient.configure(new FTPClientConfig("com.zznode.tnms.ra.c11n.nj.resource.ftp.UnixFTPEntryParser"));
    FTPFile[] files = ftpClient.listFiles();
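That is finally it, and it was tiring. Attached are the custom UnixFTPEntryParser.java and FTPTimestampParserImplExZH.java (the latter handles Chinese dates; readers who do not care about the modification date can do without it):
http://download.csdn.net/detail/wangchsh2008/8939331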
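For readers who cannot fetch the attachment, the following is a rough idea of what Chinese-date handling might look like. It is not the attached FTPTimestampParserImplExZH and does not use any Commons Net class. It assumes a timestamp shaped like `7月27日 10:30`, which is a guess at the format, and it takes the year as a parameter because such a timestamp carries none.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ChineseTimestampSketch {
    // Assumed shape: "<month>月<day>日 <HH>:<mm>", for example "7月27日 10:30".
    private static final Pattern STAMP =
        Pattern.compile("(\\d{1,2})月\\s*(\\d{1,2})日?\\s+(\\d{1,2}):(\\d{2})");

    /** Returns the timestamp in the given year, or null if the text does not fit. */
    static Calendar parse(String text, int year) {
        Matcher m = STAMP.matcher(text.trim());
        if (!m.matches()) {
            return null;
        }
        int month = Integer.parseInt(m.group(1)) - 1; // Calendar months start at 0
        int day = Integer.parseInt(m.group(2));
        int hour = Integer.parseInt(m.group(3));
        int minute = Integer.parseInt(m.group(4));
        return new GregorianCalendar(year, month, day, hour, minute);
    }

    public static void main(String[] args) {
        // Hypothetical input, assumed for illustration only.
        Calendar cal = parse("7月27日 10:30", 2015);
        System.out.println(cal.getTime());
    }
}
```

Returning null for text that does not fit lets a caller choose a fallback, such as the current time, instead of dropping the directory entry.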
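The approach in this post draws on earlier posts by others online. I have verified the solution in real use and am posting it here for other readers' reference.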