I'm working on a .NET web service that will process a text file with a relatively long, multilevel record format. Each record in the file represents a different entity, and each record contains multiple sub-types. (The same record format is currently being processed by a COBOL job, if that gives you a better picture of what we're looking at.) I've created a class structure (a DATA DIVISION, if you will) to hold the input data.
My question is, what best practices have you found for processing large, complex fixed-width files in .NET? My general approach will be to read the entire line into a string and then parse the data from the string into the classes I've created. But I'm not sure whether I'll get better results working with the characters in the string as an array, or with the string itself. I guess that's the specific question, string vs. char[], but I would appreciate any other pointers anyone has.
Thanks.
1 Answer
#1
I would build classes that match the data in the rows, using attributes to describe each field's type, length, and so on. Then I'd use the Microsoft.VisualBasic.FileIO.TextFieldParser object to read the file, with some generic code that configures the parser based on the class, reads the data, and creates an instance of the class (all using reflection).
I use this for reading CSVs, and it's fast, flexible, extensible, generic, and easy to maintain. I also have attributes that let me add generic validation to each field as it's being read.
I'd share my code, but it's the IP of the firm I work for.
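That said, the general shape of the idea can be sketched from scratch. This is not the production code mentioned above, just a minimal illustration under some assumptions: the `FixedWidthAttribute`, `CustomerRecord`, and `FixedWidthReader` names are made up for the example, the layout is a single flat record type (no sub-records), and validation is left out. The only library calls are the real `TextFieldParser` members (`TextFieldType`, `SetFieldWidths`, `ReadFields`, `EndOfData`) plus ordinary reflection.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;
using System.Reflection;
using Microsoft.VisualBasic.FileIO;

// Hypothetical attribute: marks a property as one fixed-width field.
[AttributeUsage(AttributeTargets.Property)]
public sealed class FixedWidthAttribute : Attribute
{
    public int Order { get; }   // position of the field within the record
    public int Length { get; }  // width of the field in characters

    public FixedWidthAttribute(int order, int length)
    {
        Order = order;
        Length = length;
    }
}

// Hypothetical record layout, for illustration only.
public class CustomerRecord
{
    [FixedWidth(0, 6)]  public int Id { get; set; }
    [FixedWidth(1, 20)] public string Name { get; set; }
    [FixedWidth(2, 8)]  public decimal Balance { get; set; }
}

public static class FixedWidthReader
{
    public static IEnumerable<T> Read<T>(TextReader reader) where T : new()
    {
        // Discover the annotated properties once, sorted into field order.
        var fields = typeof(T).GetProperties()
            .Select(p => new { Property = p, Meta = p.GetCustomAttribute<FixedWidthAttribute>() })
            .Where(f => f.Meta != null)
            .OrderBy(f => f.Meta.Order)
            .ToList();

        using (var parser = new TextFieldParser(reader))
        {
            // Program the parser from the class metadata.
            parser.TextFieldType = FieldType.FixedWidth;
            parser.SetFieldWidths(fields.Select(f => f.Meta.Length).ToArray());

            while (!parser.EndOfData)
            {
                // One string per field; whitespace is trimmed by default.
                string[] values = parser.ReadFields();
                var record = new T();
                for (int i = 0; i < fields.Count; i++)
                {
                    PropertyInfo prop = fields[i].Property;
                    object converted = Convert.ChangeType(
                        values[i], prop.PropertyType, CultureInfo.InvariantCulture);
                    prop.SetValue(record, converted);
                }
                yield return record;
            }
        }
    }
}
```

Calling it would look something like `FixedWidthReader.Read<CustomerRecord>(new StringReader(line))`, or with a `StreamReader` over the real file. For a multilevel format you would presumably need one annotated class per record type and some dispatch on a record-type field, which this sketch does not attempt.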