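By chance, I also ran into the problem of Apache's FTPClient.listFiles() returning an empty file list.
Target server environment: HP minicomputer
Client server environment: Linux jstmsapp2 2.6.32-279.el6.x86_64 #1 SMP Wed Jun 13 18:24:36 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux (the script runs on this server)
Related jars: commons-net-1.4.1.jar (commons-net-3.3.jar still has this problem), jakarta-oro-2.0.8.jar
My code is as follows: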
When execution reaches FTPClient.listFiles(), it stubbornly returns an empty result.
/**
 * @desc: Fetch files from the target server to the local machine via FTP
 * @author<chengsheng.wang@zznode.com>
 * @since 2015-7-27
 *
 * @param url
 * @param userName
 * @param password
 * @param portnum
 * @param path
 * @param localPath
 * @return boolean
 */
private boolean downLoadFromFtp(String url, String userName, String password, int portnum, String path, String localPath) {
    logger.info("url=" + url + " username=" + userName + " password=" + password + " hostpath=" + path + " localpath=" + localPath);
    boolean flag = false;
    FTPClient ftpClient = new FTPClient();
    ftpClient.setControlEncoding("GBK");
    int count = 0; // number of files synchronized
    try {
        ftpClient.connect(url, portnum);
        boolean loginFlag = ftpClient.login(userName, password);
        logger.info("Login status: " + loginFlag);
        ftpClient.changeWorkingDirectory(path);
        ftpClient.enterLocalPassiveMode();
        FTPFile[] files = ftpClient.listFiles();
        if (null == files || files.length == 0) {
            logger.info("No file data");
            return flag;
        }
        File tempfile = null;
        FileOutputStream fos = null;
        File localpathdir = new File(localPath);
        if (!localpathdir.exists()) {
            localpathdir.mkdirs();
        }
        logger.info("Total files in host directory: " + files.length);
        for (int i = 0; i < files.length; i++) {
            if (files[i] == null) {
                continue;
            }
            String fileName = files[i].getName();
            logger.info("File #" + i + " name: " + fileName);
            String local = localPath + File.separator + fileName;
            tempfile = new File(local);
            if (tempfile.exists()) {
                continue; // the file already exists, so do not download/sync it again
            }
            fos = new FileOutputStream(tempfile);
            ftpClient.setBufferSize(1024);
            ftpClient.setFileType(FTPClient.BINARY_FILE_TYPE);
            ftpClient.retrieveFile(path + File.separator + fileName, fos);
            fos.close();
            count++;
        }
        flag = true;
    } catch (SocketException e) {
        logger.error("Socket exception", e);
    } catch (IOException e) {
        logger.error("IO exception", e);
    } catch (Exception e) {
        logger.error("FTP file download exception", e);
    } finally {
        if (null != ftpClient) {
            try {
                if (ftpClient.isConnected()) {
                    ftpClient.logout();
                    ftpClient.disconnect();
                }
            } catch (IOException e) {
                logger.error("Exception while closing the connection", e);
            }
        }
        logger.info("Files synchronized in this run: " + count);
    }
    return flag;
}
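After a lot of searching online, and inspired by earlier posts, I came to suspect the cause: the target server runs in a Chinese locale, so the file modification dates in its directory listing come in a format that the Apache library cannot parse.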
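To make this hypothesis concrete, here is a minimal sketch that uses only java.text. It assumes that the Unix parser's default pattern for recent files is "MMM d HH:mm" with English month names (that is my recollection of the 1.4.1 source and should be checked against your version), and the Chinese-style date string is a made-up sample, not the real output of the HP server.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;

class LocaleDateParseDemo {
    // Assumed default pattern for recent files; verify against your commons-net version.
    static final String ASSUMED_RECENT_PATTERN = "MMM d HH:mm";

    /** Returns true if the text parses with English month names. */
    static boolean parsesAsEnglish(String dateText) {
        SimpleDateFormat fmt = new SimpleDateFormat(ASSUMED_RECENT_PATTERN, Locale.ENGLISH);
        try {
            fmt.parse(dateText);
            return true;
        } catch (ParseException e) {
            // This is the failure that the library code turns into "return null".
            return false;
        }
    }

    public static void main(String[] args) {
        String english = "Jul 27 10:30";
        // Hypothetical Chinese-locale date: the digit 7, the character U+6708 ("month"), then "27 10:30".
        String chinese = "7\u6708 27 10:30";
        System.out.println("English sample parses: " + parsesAsEnglish(english));
        System.out.println("Chinese sample parses: " + parsesAsEnglish(chinese));
    }
}
```

If the second sample fails to parse, every listing line carrying such a date would be dropped in the same way, which fits the empty result I was seeing.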
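I downloaded the source of commons-net-1.4.1.jar: http://apache.fayea.com//commons/net/source/commons-net-1.4.1-src.zip
After adding log statements directly to the source and debugging, the reason FTPClient.listFiles() returned null suddenly became clear.
Here are the problems in commons-net-1.4.1.jar, one by one: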
The parseFTPEntry method in UnixFTPEntryParser.java, which relies on a problematic regular expression:
/**
 * Parses a line of a unix (standard) FTP server file listing and converts
 * it into a usable format in the form of an <code> FTPFile </code>
 * instance. If the file listing line doesn't describe a file,
 * <code> null </code> is returned, otherwise a <code> FTPFile </code>
 * instance representing the files in the directory is returned.
 * <p>
 * @param entry A line of text from the file listing
 * @return An FTPFile instance corresponding to the supplied entry
 */
public FTPFile parseFTPEntry(String entry) {
    FTPFile file = new FTPFile();
    file.setRawListing(entry);
    int type;
    boolean isDevice = false;
    // The regular expression behind matches() is also a problem: it is hard-coded
    // in the class, and its rules filter out some files because of their
    // last-modified date text.
    if (matches(entry))
    {
        String typeStr = group(1);
        String hardLinkCount = group(15);
        String usr = group(16);
        String grp = group(17);
        String filesize = group(18);
        String datestr = group(19) + " " + group(20);
        String name = group(21);
        String endtoken = group(22);
        try
        {
            // The problem is here: because of the locale, the file date format
            // cannot be parsed, so the method returns null and hides the
            // exception that describes the parse error.
            file.setTimestamp(super.parseTimestamp(datestr));
        }
        catch (ParseException e)
        {
            return null; // this is a parsing failure too.
        }
Here is that regular expression. Now that the cause is known, we can go on to discuss solutions.
/**
 * this is the regular expression used by this parser.
 *
 * Permissions:
 *    r   the file is readable
 *    w   the file is writable
 *    x   the file is executable
 *    -   the indicated permission is not granted
 *    L   mandatory locking occurs during access (the set-group-ID bit is
 *        on and the group execution bit is off)
 *    s   the set-user-ID or set-group-ID bit is on, and the corresponding
 *        user or group execution bit is also on
 *    S   undefined bit-state (the set-user-ID bit is on and the user
 *        execution bit is off)
 *    t   the 1000 (octal) bit, or sticky bit, is on [see chmod(1)], and
 *        execution is on
 *    T   the 1000 bit is turned on, and execution is off (undefined bit-
 *        state)
 */
private static final String REGEX =
    "([bcdlfmpSs-])"
    + "(((r|-)(w|-)([xsStTL-]))((r|-)(w|-)([xsStTL-]))((r|-)(w|-)([xsStTL-])))\\+?\\s+"
    + "(\\d+)\\s+"
    + "(\\S+)\\s+"
    + "(?:(\\S+)\\s+)?"
    + "(\\d+)\\s+"
    /*
      numeric or standard format date
    */
    // This line is the problem: some files get filtered out. Admittedly, the
    // Chinese modification-date format of some files on the HP machine is
    // truly baffling.
    + "((?:\\d+[-/]\\d+[-/]\\d+)|(?:\\S+\\s+\\S+))\\s+"
    /*
       year (for non-recent standard format)
       or time (for numeric or recent standard format
    */
    + "(\\d+(?::\\d+)?)\\s+"
    + "(\\S*)(\\s*.*)";
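Someone online pointed out, succinctly, that you can simply modify these two places.
If you do not care about a file's last modification time at all, the most convenient approach is:
file.setTimestamp(Calendar.getInstance()); // reset the file's last modification time to the current time; there is nothing really wrong with that
The regular expression can be changed in the same spirit to: "((?:\\d+[-/]\\d+[-/]\\d+)|(?:\\S+\\s+\\S+)|(?:\\S+))\\s+"
Rebuild a new commons-net-1.4.1.jar, run the job again, and the world is finally at peace.
Reference: http://www.blogjava.net/wodong/archive/2008/08/21/wodong.html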
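The effect of the regular-expression change can be checked in isolation. The sketch below is not library code: it rebuilds the expression with java.util.regex as a stand-in for jakarta-oro (the two may differ in corner cases), and both listing lines are hypothetical samples of my own, since I am not reproducing the actual listing from the HP server here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ListingRegexDemo {
    // Everything before the date group, copied from the REGEX constant.
    static final String HEAD =
        "([bcdlfmpSs-])"
        + "(((r|-)(w|-)([xsStTL-]))((r|-)(w|-)([xsStTL-]))((r|-)(w|-)([xsStTL-])))\\+?\\s+"
        + "(\\d+)\\s+"
        + "(\\S+)\\s+"
        + "(?:(\\S+)\\s+)?"
        + "(\\d+)\\s+";
    // Original date group: numeric date, or exactly two whitespace-separated tokens.
    static final String ORIGINAL_DATE = "((?:\\d+[-/]\\d+[-/]\\d+)|(?:\\S+\\s+\\S+))\\s+";
    // Patched date group: additionally accepts a single token.
    static final String PATCHED_DATE = "((?:\\d+[-/]\\d+[-/]\\d+)|(?:\\S+\\s+\\S+)|(?:\\S+))\\s+";
    // Year or time, then the file name and the remainder of the line.
    static final String TAIL = "(\\d+(?::\\d+)?)\\s+" + "(\\S*)(\\s*.*)";

    static final Pattern ORIGINAL = Pattern.compile(HEAD + ORIGINAL_DATE + TAIL);
    static final Pattern PATCHED = Pattern.compile(HEAD + PATCHED_DATE + TAIL);

    /** Returns the captured date group (group 19), or null when the whole line does not match. */
    static String dateGroup(Pattern pattern, String line) {
        Matcher m = pattern.matcher(line);
        return m.matches() ? m.group(19) : null;
    }

    public static void main(String[] args) {
        String english = "-rw-r--r--   1 user  group  1024 Jul 27 10:30 file.txt";
        // Hypothetical line whose date is one token: 7, U+6708, 27, U+65E5.
        String chinese = "-rw-r--r--   1 user  group  1024 7\u670827\u65e5 10:30 file.txt";
        System.out.println("original, English line: " + dateGroup(ORIGINAL, english));
        System.out.println("original, Chinese line: " + dateGroup(ORIGINAL, chinese));
        System.out.println("patched,  Chinese line: " + dateGroup(PATCHED, chinese));
    }
}
```

Note that a matching line only gets past the first obstacle. The date text still has to survive parseTimestamp, which is why both places need to be changed.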
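But is this really a good approach? Is it elegant? The Apache contributors actually left room in their code for us to work around this bug.
Let's walk through the code step by step, starting from org.apache.commons.net.ftp.FTPClient.listFiles().
listFiles() eventually calls the method below, which in turn leads us to initiateListParsing(key, pathname):
public FTPFile[] listFiles(String pathname)
throws IOException
{
    String key = null;
    FTPListParseEngine engine =
        initiateListParsing(key, pathname);
    return engine.getFiles();
}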
Looking at initiateListParsing, it turns out that __entryParser can be initialized through __configuration. By default, however, __configuration is null, so execution falls through to the last else branch:
public FTPListParseEngine initiateListParsing(
        String parserKey, String pathname)
throws IOException
{
    // We cache the value to avoid creation of a new object every
    // time a file listing is generated.
    if(__entryParser == null) {
        if (null != parserKey) {
            // if a parser key was supplied in the parameters,
            // use that to create the paraser
            __entryParser =
                __parserFactory.createFileEntryParser(parserKey);
        } else {
            // if no parserKey was supplied, check for a configuration
            // in the params, and if non-null, use that.
            if (null != __configuration) {
                __entryParser =
                    __parserFactory.createFileEntryParser(__configuration);
            } else {
                // if a parserKey hasn't been supplied, and a configuration
                // hasn't been supplied, then autodetect by calling
                // the SYST command and use that to choose the parser.
                __entryParser =
                    __parserFactory.createFileEntryParser(getSystemName());
            }
        }
    }
    return initiateListParsing(__entryParser, pathname);
}
That branch is:
__entryParser = __parserFactory.createFileEntryParser(getSystemName()); // initializes a parser that does not support the Chinese format
Now suppose we have created an FTPClientConfig and use it to initialize the parser. Following the code into createFileEntryParser(key) reveals the final truth:
public FTPFileEntryParser createFileEntryParser(FTPClientConfig config)
throws ParserInitializationException
{
    this.config = config;
    String key = config.getServerSystemKey();
    return createFileEntryParser(key);
}
public FTPFileEntryParser createFileEntryParser(String key)
{
    Class parserClass = null;
    FTPFileEntryParser parser = null;
    try
    {
        // What if we use key to instantiate a custom FTPFileEntryParser?
        // The key is passed in from FTPClientConfig.
        parserClass = Class.forName(key);
        parser = (FTPFileEntryParser) parserClass.newInstance();
    }
    catch (ClassNotFoundException e)
    {
        String ukey = null;
        if (null != key)
        {
            ukey = key.toUpperCase();
        }
        if (ukey.indexOf(FTPClientConfig.SYST_UNIX) >= 0)
        {
            parser = createUnixFTPEntryParser();
        }
        else if (ukey.indexOf(FTPClientConfig.SYST_VMS) >= 0)
        {
            parser = createVMSVersioningFTPEntryParser();
        }
        else if (ukey.indexOf(FTPClientConfig.SYST_NT) >= 0)
        {
            parser = createNTFTPEntryParser();
        }
        else if (ukey.indexOf(FTPClientConfig.SYST_OS2) >= 0)
        {
            parser = createOS2FTPEntryParser();
        }
        else if (ukey.indexOf(FTPClientConfig.SYST_OS400) >= 0)
        {
            parser = createOS400FTPEntryParser();
        }
        else if (ukey.indexOf(FTPClientConfig.SYST_MVS) >= 0)
        {
            parser = createMVSEntryParser();
        }
        else
        {
            throw new ParserInitializationException("Unknown parser type: " + key);
        }
    }
    catch (ClassCastException e)
    {
        throw new ParserInitializationException(parserClass.getName()
            + " does not implement the interface "
            + "org.apache.commons.net.ftp.FTPFileEntryParser.", e);
    }
    catch (Throwable e)
    {
        throw new ParserInitializationException("Error initializing parser", e);
    }
    if (parser instanceof Configurable) {
        ((Configurable)parser).configure(this.config);
    }
    return parser;
}
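Attentive readers will have noticed that we can set an FTPClientConfig on the FTPClient object and, through the systemKey property of FTPClientConfig, instantiate a custom FTPFileEntryParser that takes over the parsing of file times and other information.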
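The lookup order that makes this possible (try the key as a class name first, fall back to keyword matching only when no such class exists) can be modelled in a few lines. This is a simplified sketch of the mechanism, not commons-net code: EntryParser, BuiltInUnixParser, CustomParser and create are all hypothetical names I made up for illustration.

```java
import java.util.Locale;

class ParserLookupDemo {
    /** Hypothetical stand-in for FTPFileEntryParser. */
    interface EntryParser {
        String describe();
    }

    /** Hypothetical stand-in for the built-in Unix parser. */
    static class BuiltInUnixParser implements EntryParser {
        public String describe() { return "built-in unix"; }
    }

    /** Hypothetical custom parser; reflection needs a public no-argument constructor. */
    public static class CustomParser implements EntryParser {
        public CustomParser() {}
        public String describe() { return "custom"; }
    }

    /** Resolves a parser: class name first, system-type keyword second. */
    static EntryParser create(String key) {
        try {
            // If the key names a loadable class, instantiate it reflectively.
            Class<?> parserClass = Class.forName(key);
            return (EntryParser) parserClass.getDeclaredConstructor().newInstance();
        } catch (ClassNotFoundException e) {
            // Otherwise treat the key as a system-type string such as a SYST reply.
            if (key.toUpperCase(Locale.ROOT).contains("UNIX")) {
                return new BuiltInUnixParser();
            }
            throw new IllegalArgumentException("Unknown parser type: " + key);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Error initializing parser", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(create("UNIX Type: L8").describe());
        System.out.println(create(CustomParser.class.getName()).describe());
    }
}
```

The design consequence is that a fully qualified class name on the classpath always wins over auto-detection, so no library source needs to be patched.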
Next, let's see whether FTPClientConfig has a suitable constructor. Good news: there is exactly such a constructor, so we can get to work.
/**
 * The main constructor for an FTPClientConfig object
 * @param systemKey key representing system type of the server being
 * connected to. See {@link #getServerSystemKey() serverSystemKey}
 */
public FTPClientConfig(String systemKey) {
    this.serverSystemKey = systemKey;
}
Create a new UnixFTPEntryParser that extends ConfigurableFTPFileEntryParserImpl, then configure the ftpClient object with an FTPClientConfig before calling ftpClient.listFiles().
The code:
ftpClient.changeWorkingDirectory(path);
ftpClient.enterLocalPassiveMode();
// Apache does not support the Chinese locale here, so a custom class parses the Chinese date format
ftpClient.configure(new FTPClientConfig("com.zznode.tnms.ra.c11n.nj.resource.ftp.UnixFTPEntryParser"));
FTPFile[] files = ftpClient.listFiles();
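That is finally it, and it was tiring. Attached are the custom UnixFTPEntryParser.java and FTPTimestampParserImplExZH.java (the latter handles Chinese dates; readers who do not care about modification dates can do without it):
http://download.csdn.net/detail/wangchsh2008/8939331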
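For readers who cannot fetch the attachment, the following sketch illustrates the general idea of a tolerant timestamp parser. It is not the content of FTPTimestampParserImplExZH.java: the class name, the method name parseOrNow and the Chinese-style pattern are hypothetical, and the real date layout must be taken from your own server's listing. The sketch also ignores the year handling a real parser would need.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Locale;

class LenientTimestampDemo {
    /**
     * Tries each candidate pattern in order and falls back to the current
     * time when none of them fits, instead of failing the whole entry.
     */
    static Calendar parseOrNow(String text) {
        SimpleDateFormat[] candidates = {
            // English layout, e.g. "Jul 27 10:30"
            new SimpleDateFormat("MMM d HH:mm", Locale.ENGLISH),
            // Hypothetical Chinese layout: number, U+6708, number, U+65E5, then the time
            new SimpleDateFormat("M\u6708d\u65e5 HH:mm", Locale.ENGLISH)
        };
        for (SimpleDateFormat format : candidates) {
            try {
                Calendar parsed = Calendar.getInstance();
                parsed.setTime(format.parse(text));
                return parsed;
            } catch (ParseException e) {
                // Not this layout; try the next candidate.
            }
        }
        // Same fallback as the quick patch: use the current time.
        return Calendar.getInstance();
    }

    public static void main(String[] args) {
        Calendar c = parseOrNow("7\u670827\u65e5 10:30");
        System.out.println("month index=" + c.get(Calendar.MONTH)
            + " day=" + c.get(Calendar.DAY_OF_MONTH));
    }
}
```

Falling back to the current time keeps the file in the listing even when its date is unreadable, which is what matters if you only need the file names.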
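The approach in this post draws on earlier posts by others online. The solution has been verified in my own real-world use, and I am posting it here for others' reference.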