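Utility Class Library Series (4): CsvReader

The fourth utility class: CsvReader

CsvReader is used to read Csv tables.

A Csv file is simply a plain-text file with a fixed layout: one record per line, with the columns of each line separated by an ASCII comma ','.

In game projects, much of the static game data lives in tables. Designers deliver these as Csv tables because they are easy to edit in Excel. The Unity client also tends to read Csv tables: they are plain text, simple in format, and easy to parse.

To make reading even easier, we adopt a convention of our own that gives the first 3 rows of each Csv file a special meaning. The figure below shows an example.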

[Figure: example Csv table]

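Each column of the first row holds the column name.

Each column of the second row holds the column type.

Each column of the third row holds a description of the column (written in Chinese in our tables).

The actual data starts at the fourth row.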

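The supported column types are as follows.

4 basic types:

u: unsigned int

i: int

f: float

s: std::string

Sub-structure: if a column holds a sub-structure, its type cell spells out the sub-structure's field names and field types, separated by ASCII semicolons ';'.

Take StructId;u;StructNum;i;StructFloat;f;StruceName;s from the figure above:

It declares a sub-structure with 4 fields: StructId of type unsigned int, StructNum of type int, StructFloat of type float, and StruceName of type string.

In the data rows, the fields of a sub-structure are separated by the ASCII ampersand '&'.

Take 1&-2&3.4&aaa from the figure above:

It means StructId=1, StructNum=-2, StructFloat=3.4f, StruceName=aaa, i.e. {1, -2, 3.4f, "aaa"}.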

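Type qualifiers:

k: marks the column as a primary key, used as an index.

Up to 3 primary-key columns are supported; taken together, the key columns identify a unique row.

The primary keys do not have to be the first 3 columns; columns in the middle of the table can serve as keys.

Primary keys must be integers, so the column type after k is either i or u.

lst: marks the column as a variable-length array.

Take lst:u from the example above:

It declares the column as an array of unsigned int, i.e. std::vector<unsigned int>.

Take lst:StructId;u;StructNum;i;StructFloat;f;StruceName;s from the example above:

It declares the column as an array of sub-structures.

In the data rows, the elements of an array are separated by ASCII semicolons ';'.

Take 1&-2&3.4&aaa;5&-6&7.8&bbb from the example above:

This sub-structure array has 2 elements: the first is {1, -2, 3.4f, "aaa"} and the second is {5, -6, 7.8f, "bbb"}.

A k or lst qualifier is followed by an ASCII colon ':' and then the concrete column type.

All of the above is our own convention.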
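To make the type-row convention concrete, here is a rough sketch of how a single type cell could be decoded. It is not part of CsvReader (which skips the type row); `ColumnType`, `SplitOn`, `IsBasicType` and `ParseColumnType` are hypothetical names. The sketch also assumes that k and lst are never combined, which the convention above does not state either way.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical descriptor for one cell of the type row.
struct ColumnType
{
	bool isKey;         // cell started with "k:"
	bool isList;        // cell started with "lst:"
	std::string basic;  // "u", "i", "f" or "s"; empty for a sub-structure
	std::vector<std::pair<std::string, std::string> > fields;  // (name, type) pairs of a sub-structure
};

// Splits text on one delimiter character and keeps empty pieces.
static std::vector<std::string> SplitOn(const std::string& text, char delim)
{
	std::vector<std::string> parts;
	std::string::size_type begin = 0;
	for (;;)
	{
		std::string::size_type end = text.find(delim, begin);
		if (end == std::string::npos)
		{
			parts.push_back(text.substr(begin));
			return parts;
		}
		parts.push_back(text.substr(begin, end - begin));
		begin = end + 1;
	}
}

static bool IsBasicType(const std::string& t)
{
	return t == "u" || t == "i" || t == "f" || t == "s";
}

// Returns false when the cell does not follow the convention described above.
bool ParseColumnType(const std::string& text, ColumnType& out)
{
	out.isKey = false;
	out.isList = false;
	out.basic.clear();
	out.fields.clear();

	std::string body = text;
	if (body.compare(0, 2, "k:") == 0)
	{
		out.isKey = true;
		body = body.substr(2);
	}
	else if (body.compare(0, 4, "lst:") == 0)
	{
		out.isList = true;
		body = body.substr(4);
	}

	// Primary keys are restricted to the integer types u and i.
	if (out.isKey && body != "u" && body != "i") return false;

	if (IsBasicType(body))
	{
		out.basic = body;
		return true;
	}

	// Otherwise the cell must be a sub-structure: name;type;name;type;...
	std::vector<std::string> parts = SplitOn(body, ';');
	if (parts.size() % 2 != 0) return false;
	for (std::size_t i = 0; i + 1 < parts.size(); i += 2)
	{
		if (parts[i].empty() || !IsBasicType(parts[i + 1])) return false;
		out.fields.push_back(std::make_pair(parts[i], parts[i + 1]));
	}
	return true;
}
```

A descriptor like this is the kind of information a code generator would need in order to emit the matching class members.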

A table like this might be loaded into memory as the following structures:

class Struct
{
public:
	unsigned int m_StructId;
	int m_StructNum;
	float m_StructFloat;
	std::string m_StruceName;
};

class Test
{
public:
	unsigned int m_Id1;	// test primary key 1
	int m_Id2;	// test primary key 2
	unsigned int m_Id3;	// test primary key 3
	std::string m_Str;	// test string
	std::vector<unsigned int> m_UInts;	// test number list
	float m_Float;	// test number
	Struct m_Struct;	// test sub-structure
	std::vector<Struct> m_StructLists;	// test sub-structure list
};
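Since the three key columns together identify a row, loaded rows could be kept in a map keyed by all three values. The sketch below shows one possible way to do that, assuming C++11. `TestRow`, `RowKey` and `TestIndex` are hypothetical names, and the row type is cut down to the key columns plus one payload column.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <tuple>
#include <utility>

// Hypothetical, reduced row type: the three key columns and one payload column.
struct TestRow
{
	unsigned int id1;
	int id2;
	unsigned int id3;
	std::string str;
};

// Composite key built from the three "k:" columns; std::tuple compares lexicographically.
typedef std::tuple<unsigned int, int, unsigned int> RowKey;

class TestIndex
{
public:
	// Returns false if a row with the same three keys is already stored.
	bool AddRow(const TestRow& row)
	{
		return m_rows.insert(std::make_pair(RowKey(row.id1, row.id2, row.id3), row)).second;
	}

	// Returns nullptr when no row matches all three keys.
	const TestRow* FindRow(unsigned int id1, int id2, unsigned int id3) const
	{
		std::map<RowKey, TestRow>::const_iterator it = m_rows.find(RowKey(id1, id2, id3));
		return it != m_rows.end() ? &it->second : nullptr;
	}

	std::size_t Size() const { return m_rows.size(); }

private:
	std::map<RowKey, TestRow> m_rows;
};
```

Rejecting a duplicate key in `AddRow` also doubles as a cheap data check: two rows with the same key combination usually point to a mistake in the table.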

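The CsvReader in this post is a utility class that reads Csv files following the convention above into objects of the corresponding classes.

The next post will present a code generator that produces the class code for each table automatically.

(Designers change table structures frequently, and every change to a table requires a matching change to its class, so generating the code automatically saves trouble.)

CsvReader provides the following interfaces:

1. ReadLine reads one line into a buffer and splits it into column values on the reserved Csv delimiter, the ASCII comma ','.

File encoding matters here: a UTF-8 file saved on Windows may start with a BOM, i.e. the first 3 bytes of the file are 0xEF, 0xBB, 0xBF, and these must be stripped. The '\r' and '\n' at the end of each line must be stripped as well.

2. CheckLine verifies that the number of columns obtained by splitting a line on commas matches the number of column names in the first row.

A mismatch means the data, most likely a string column, contains an ASCII comma, which is the reserved Csv delimiter.

Once a comma appears inside the data, there is no way to tell which value belongs to which column.

3. Name2Index returns the position of a column given its name; the first column is 0.

4. GetValue reads the value of a single-value (non-array) column.

5. GetValueList reads the values of an array column.

6. LoadFile opens the given file and processes the first 3 rows, recording the column names and the column count.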
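The line clean-up done by ReadLine and the column-count check done by CheckLine can be illustrated with std::string instead of a fixed char buffer. This is only a sketch of the idea, not the implementation listed further down. `StripUtf8Bom`, `TrimLineEnd`, `SplitByComma` and `ColumnCountMatches` are hypothetical names, and the splitter is assumed to keep empty cells.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Removes a leading UTF-8 byte order mark (EF BB BF) if present.
void StripUtf8Bom(std::string& line)
{
	if (line.size() >= 3 &&
		static_cast<unsigned char>(line[0]) == 0xEF &&
		static_cast<unsigned char>(line[1]) == 0xBB &&
		static_cast<unsigned char>(line[2]) == 0xBF)
	{
		line.erase(0, 3);
	}
}

// Removes every trailing '\r' and '\n'.
void TrimLineEnd(std::string& line)
{
	std::string::size_type last = line.find_last_not_of("\r\n");
	if (last == std::string::npos)
	{
		line.clear();
	}
	else
	{
		line.erase(last + 1);
	}
}

// Splits on ',' with no quoting support, so a comma inside a value produces an extra cell.
std::vector<std::string> SplitByComma(const std::string& line)
{
	std::vector<std::string> cells;
	std::string::size_type begin = 0;
	for (;;)
	{
		std::string::size_type end = line.find(',', begin);
		if (end == std::string::npos)
		{
			cells.push_back(line.substr(begin));
			return cells;
		}
		cells.push_back(line.substr(begin, end - begin));
		begin = end + 1;
	}
}

// Same idea as CheckLine: a row is accepted only when it has as many cells as there are column names.
bool ColumnCountMatches(const std::vector<std::string>& names, const std::vector<std::string>& row)
{
	return names.size() == row.size();
}
```

With the column names Id1,Str,Float, the row 1,hello,3.4 passes the check, while 1,hello, world,3.4 splits into four cells and is rejected. This is exactly the situation item 2 describes.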

A sub-structure must overload the assignment operator with a string parameter, for example

XXX& operator = (const std::string &other), which parses a string such as "1&-2&3.4&aaa" from the example above.
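For the Struct class shown earlier, such an operator might look like the sketch below. It assumes C++11, splits on '&' by hand instead of calling StringTool, does no validation of the numeric fields, and leaves missing fields at their defaults. The post does not specify any of these choices.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

class Struct
{
public:
	Struct() : m_StructId(0), m_StructNum(0), m_StructFloat(0.0f) {}

	// Parses text such as "1&-2&3.4&aaa"; fields missing from the text keep their default values.
	Struct& operator = (const std::string& other)
	{
		// Split the text on '&', keeping empty pieces.
		std::vector<std::string> parts;
		std::string::size_type begin = 0;
		for (;;)
		{
			std::string::size_type end = other.find('&', begin);
			if (end == std::string::npos)
			{
				parts.push_back(other.substr(begin));
				break;
			}
			parts.push_back(other.substr(begin, end - begin));
			begin = end + 1;
		}

		// Convert each piece in declaration order.
		m_StructId = parts.size() > 0 ? static_cast<unsigned int>(std::strtoul(parts[0].c_str(), nullptr, 10)) : 0;
		m_StructNum = parts.size() > 1 ? static_cast<int>(std::strtol(parts[1].c_str(), nullptr, 10)) : 0;
		m_StructFloat = parts.size() > 2 ? static_cast<float>(std::strtod(parts[2].c_str(), nullptr)) : 0.0f;
		m_StruceName = parts.size() > 3 ? parts[3] : std::string();
		return *this;
	}

	unsigned int m_StructId;
	int m_StructNum;
	float m_StructFloat;
	std::string m_StruceName;
};
```

Because the operator takes a std::string, the statements `tempT = tempStrs[i];` in GetStructList and `value = m_values[index];` in GetStruct compile for any type that provides it.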

StringTool, introduced in an earlier post, is used here as well, both to split strings and to check whether a string is a valid number.
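StringTool itself is not listed in this post. As a rough idea of what "checking for a valid number" can involve, the sketch below validates and converts in one step. `ParseUInt` and `ParseInt` are hypothetical helpers, not StringTool functions, and the rules they apply (decimal digits only, no surrounding whitespace, value must fit the target type) are assumptions.

```cpp
#include <cerrno>
#include <cstdlib>
#include <limits>
#include <string>

// Hypothetical helper: accepts a non-empty run of decimal digits that fits in unsigned int.
bool ParseUInt(const std::string& text, unsigned int& value)
{
	if (text.empty() || text.find_first_not_of("0123456789") != std::string::npos)
	{
		return false;
	}
	errno = 0;
	unsigned long parsed = std::strtoul(text.c_str(), nullptr, 10);
	if (errno == ERANGE || parsed > std::numeric_limits<unsigned int>::max())
	{
		return false;
	}
	value = static_cast<unsigned int>(parsed);
	return true;
}

// Hypothetical helper: accepts an optional sign followed by decimal digits that fit in int.
bool ParseInt(const std::string& text, int& value)
{
	std::string::size_type start = 0;
	if (!text.empty() && (text[0] == '-' || text[0] == '+'))
	{
		start = 1;
	}
	if (start >= text.size() || text.find_first_not_of("0123456789", start) != std::string::npos)
	{
		return false;
	}
	errno = 0;
	long parsed = std::strtol(text.c_str(), nullptr, 10);
	if (errno == ERANGE ||
		parsed < std::numeric_limits<int>::min() ||
		parsed > std::numeric_limits<int>::max())
	{
		return false;
	}
	value = static_cast<int>(parsed);
	return true;
}
```

Rejecting a leading '-' before calling strtoul matters because strtoul accepts a minus sign and wraps the result instead of reporting an error.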

Here is the code.

CsvReader.h

#ifndef __CsvReader_h__
#define __CsvReader_h__

#include <string>
#include <vector>
#include <map>
#include <fstream>

#include "StringTool.h"

namespace common{
	namespace tool{

		class CsvReader
		{
		public:
			CsvReader();
			~CsvReader();

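			// Open the csv file
			bool OpenFile(const std::string& file_path_name);

			// Open the csv file and process the first 3 rows
			bool LoadFile(const std::string& file_path_name);

			// Read one line (strips the BOM from the first line and the line ending from every line)
			size_t ReadLine(bool firstLine = false);

			// After a line has been read, get the raw content of each column
			const std::vector<std::string>& GetLine();

			// Check that the column count matches the number of column names
			bool CheckLine();

			// Get the column index for a column name; the first column is 0
			size_t Name2Index(const std::string& name);

			// Get the value of one column of the current row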
			bool GetValueList(size_t index, std::vector<std::string>& values, const std::string& split);

			bool GetValue(size_t index, std::string& value);

			bool GetValueList(size_t index, std::vector<unsigned int>& values, const std::string& split);

			bool GetValue(size_t index, unsigned int& value);

			bool GetValueList(size_t index, std::vector<int>& values, const std::string& split);

			bool GetValue(size_t index, int& value);

			bool GetValueList(size_t index, std::vector<float>& values, const std::string& split);

			bool GetValue(size_t index, float& value);

			template <class T>
			bool GetStructList(size_t index, std::vector<T>& values, const std::string& split);

			template <class T>
			bool GetStruct(size_t index, T& value);

		private:
			// File input stream
			std::ifstream m_file;

			// Column name -> column index map
			std::map<std::string, size_t> m_names;
			// Value of each column in the current row
			std::vector<std::string> m_values;
		};

		template <class T>
		bool CsvReader::GetStructList(size_t index, std::vector<T>& values, const std::string& split)
		{
			if (index < m_values.size())
			{
				std::vector<std::string> tempStrs;
				StringTool::SplitStr2List(m_values[index], split, tempStrs);
				for (size_t i = 0; i < tempStrs.size(); i++)
				{
					if (tempStrs[i].length() > 0)
					{
						T tempT;
						tempT = tempStrs[i];
						values.push_back(tempT);
					}
				}
				return true;
			}
			else
			{
				return false;
			}
		}

		template <class T>
		bool CsvReader::GetStruct(size_t index, T& value)
		{
			if (index < m_values.size())
			{
				value = m_values[index];
				return true;
			}
			else
			{
				return false;
			}
		}

	}
}

#endif

CsvReader.cpp

#include "CsvReader.h"

#include <stdlib.h>
#include <string.h>

namespace common{
	namespace tool{

		const unsigned int MaxLineLen = 10240;
		const std::string CsvSplit = ",";

		CsvReader::CsvReader()
		{
			m_names.clear();
			m_values.clear();
		}

		CsvReader::~CsvReader()
		{
			m_names.clear();
			m_values.clear();

			if (m_file)
			{
				m_file.close();
			}
		}

		size_t CsvReader::ReadLine(bool firstLine)
		{
			if (m_file)
			{
				char line[MaxLineLen];
				memset(line, 0x00, sizeof(char)* MaxLineLen);
				m_file.getline(line, MaxLineLen);
				size_t len = strlen(line);

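				// Strip the BOM from the first line
				if (3 <= len)
				{
					if (firstLine)
					{
						if (0xEF == (unsigned char)line[0] &&
							0xBB == (unsigned char)line[1] &&
							0xBF == (unsigned char)line[2])
						{
							// Source and destination overlap, so memmove is needed here, not memcpy
							memmove(line, line + 3, len + 1 - 3);
							len = len - 3;
						}
					}
				}

				// Strip the trailing \r\n from every line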
				while (1 <= len)
				{
					if ('\r' == line[len - 1] || '\n' == line[len - 1])
					{
						line[len - 1] = '\0';
						len = len - 1;
					}
					else
					{
						break;
					}
				}

				if (0 < len)
				{
					m_values.clear();
					StringTool::SplitStr2List(line, CsvSplit, m_values);
				}

				return len;
			}
			else
			{
				return 0;
			}
		}

		const std::vector<std::string>& CsvReader::GetLine()
		{
			return m_values;
		}

		bool CsvReader::CheckLine()
		{
			if (m_values.size() == m_names.size())
			{
				return true;
			}
			else
			{
				return false;
			}
		}

		bool CsvReader::OpenFile(const std::string& file_path_name)
		{
			m_file.open(file_path_name.c_str(), std::ios::in);
			if (m_file)
			{
				return true;
			}
			else
			{
				return false;
			}
		}

		bool CsvReader::LoadFile(const std::string& file_path_name)
		{
			m_file.open(file_path_name.c_str(), std::ios::in);
			if (m_file)
			{
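				// Read the column names
				ReadLine(true);
				// Store the column names, used for field validation and for looking up a column index by name
				for (size_t i = 0; i < m_values.size(); i++)
				{
					m_names[StringTool::UpcaseFirstChar(m_values[i])] = i;
				}

				// Read the column types
				ReadLine();

				// Read the comments
				ReadLine();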

				return true;
			}
			else
			{
				return false;
			}
		}

		size_t CsvReader::Name2Index(const std::string& name)
		{
			std::map<std::string, size_t>::iterator it = m_names.find(name);
			if (it != m_names.end())
			{
				return it->second;
			}
			else
			{
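				// Not found: the largest size_t value, which fails every index < m_values.size() check
				return static_cast<size_t>(-1);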
			}
		}

		bool CsvReader::GetValueList(size_t index, std::vector<std::string>& values, const std::string& split)
		{
			if (index < m_values.size())
			{
				if (0 < m_values[index].length())
				{
					StringTool::SplitStr2List(m_values[index], split, values);
				}

				return true;
			}
			else
			{
				return false;
			}
		}

		bool CsvReader::GetValue(size_t index, std::string& value)
		{
			if (index < m_values.size())
			{
				value = m_values[index];
				return true;
			}
			else
			{
				return false;
			}
		}

		bool CsvReader::GetValueList(size_t index, std::vector<unsigned int>& values, const std::string& split)
		{
			if (index < m_values.size())
			{
				if (0 < m_values[index].length())
				{
					return StringTool::SplitStr2List(m_values[index], split, values);
				}

				return true;
			}
			else
			{
				return false;
			}
		}

		bool CsvReader::GetValue(size_t index, unsigned int& value)
		{
			if (index < m_values.size())
			{
				if (StringTool::IsUInt(m_values[index]))
				{
					// strtoul covers the full unsigned range; atoi does not
					value = static_cast<unsigned int>(strtoul(m_values[index].c_str(), NULL, 10));
					return true;
				}
				else
				{
					return false;
				}
			}
			else
			{
				return false;
			}
		}

		bool CsvReader::GetValueList(size_t index, std::vector<int>& values, const std::string& split)
		{
			if (index < m_values.size())
			{
				if (0 < m_values[index].length())
				{
					return StringTool::SplitStr2List(m_values[index], split, values);
				}

				return true;
			}
			else
			{
				return false;
			}
		}

		bool CsvReader::GetValue(size_t index, int& value)
		{
			if (index < m_values.size())
			{
				if (StringTool::IsInt(m_values[index]))
				{
					value = atoi(m_values[index].c_str());
					return true;
				}
				else
				{
					return false;
				}
			}
			else
			{
				return false;
			}
		}

		bool CsvReader::GetValueList(size_t index, std::vector<float>& values, const std::string& split)
		{
			if (index < m_values.size())
			{
				if (0 < m_values[index].length())
				{
					return StringTool::SplitStr2List(m_values[index], split, values);
				}

				return true;
			}
			else
			{
				return false;
			}
		}

		bool CsvReader::GetValue(size_t index, float& value)
		{
			if (index < m_values.size())
			{
				if (StringTool::IsFloat(m_values[index]))
				{
					value = static_cast<float>(atof(m_values[index].c_str()));
					return true;
				}
				else
				{
					return false;
				}
			}
			else
			{
				return false;
			}
		}

	}
}
