A Mappable CSV Reading Engine

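Many projects store their data in CSV files, so here I would like to present a solution I designed myself. If anything in it looks unreasonable, please point it out.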

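Reading a CSV file is usually tedious: you split the text into rows and columns by the separator (a comma by default), then read the values row by row and field by field in a fixed order. There is nothing wrong with that in itself, but as soon as someone reorders the columns in the file, or adds or removes one, the change ripples through all of the reading code. And when there are many fields, writing that code by hand is truly painful.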
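To make that fragility concrete, here is a minimal sketch of the position-based style. The names NaiveCsvExample, Resource and ParseByPosition are made up for this illustration and are not part of the engine described below.

```csharp
using System;
using System.Globalization;

public static class NaiveCsvExample
{
    // Hypothetical record type used only for this illustration
    public struct Resource
    {
        public int ID;
        public string Path;
        public float Ratio;
    }

    // Parses one line by hard-coded positions: 0 = ID, 1 = Path, 2 = Ratio
    public static Resource ParseByPosition(string line)
    {
        string[] cols = line.Split(',');
        Resource r = new Resource();
        r.ID = int.Parse(cols[0], CultureInfo.InvariantCulture);
        r.Path = cols[1];
        r.Ratio = float.Parse(cols[2], CultureInfo.InvariantCulture);
        return r;
    }
}
```

A line such as "10001,Model/a,1" parses fine. Swap the first two columns in the file and int.Parse throws a FormatException, because the code still assumes the ID lives in column 0.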

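Our project started out with exactly this primitive approach and ran straight into the awkward situation described above. Then it occurred to me: if I could map the structure of a CSV table onto a type, the way it is done for JSON or XML, wouldn't that be far more efficient?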

That led me to the following design:

1. Preparation

First define two attributes: one describes the table as a whole, the other maps individual fields.

/// <summary>
/// CSV column mapping
/// </summary>
[AttributeUsage(AttributeTargets.Property | AttributeTargets.Field)]
public class CSVColumnAttribute : Attribute
{
    /// <summary>
    /// Name of this property/field in the CSV file (defaults to the member name)
    /// </summary>
    public string Key { get; set; }

    /// <summary>
    /// Column index of this property/field in the CSV file (if set, Key is ignored)
    /// </summary>
    public int Column { get; set; }

    /// <summary>
    /// Default value used when the cell is null or fails to parse
    /// (if left unset: -1 for numbers, null for classes, false for bool)
    /// </summary>
    public object DefaultValue { get; set; }

    /// <summary>
    /// Separator used to split the cell when the member is an array ('#' by default)
    /// </summary>
    public char ArraySeparator { get; set; }

    public CSVColumnAttribute()
    {
        Column = -1;
        ArraySeparator = '#';
    }

    public CSVColumnAttribute(string key)
    {
        Key = key;
        Column = -1;
        ArraySeparator = '#';
    }

    public CSVColumnAttribute(int column)
    {
        Column = column;
        ArraySeparator = '#';
    }
}

/// <summary>
/// CSV mapping for a class or struct (avoid structs where possible: a struct is boxed and unboxed during reflection)
/// </summary>
[AttributeUsage(AttributeTargets.Class | AttributeTargets.Struct)]
public class CSVMapperAttribute : Attribute
{
    /// <summary>
    /// Path of the CSV file (without file extension). Base directory is Assets/Resources/
    /// </summary>
    public string Path { get; set; }

    /// <summary>
    /// Row that holds the mapping keys (0 by default)
    /// </summary>
    public int KeyRow { get; set; }

    /// <summary>
    /// Description row (1 by default). It is skipped when decoding. If the file has no description row, assign -1
    /// </summary>
    public int DescRow { get; set; }

    /// <summary>
    /// Separator for CSV parsing (',' by default)
    /// </summary>
    public char Separator { get; set; }

    /// <summary>
    /// Index of the first data row (0 by default). Rows before it are skipped when decoding
    /// </summary>
    public int StartRow { get; set; }

    public CSVMapperAttribute()
    {
        KeyRow = 0;
        DescRow = 1;
        Separator = ',';
    }

    public CSVMapperAttribute(string name)
    {
        Path = name;
        KeyRow = 0;
        DescRow = 1;
        Separator = ',';
    }
}

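The table-level attribute exposes: the path of the CSV file (optional), the row that holds the keys (useful for tables whose headers are not in English), the row that holds descriptions (optional), the separator (a comma ',' by default), and the start row (optional; rows before it are skipped when decoding).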

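The field-level attribute exposes: the key, the column index (set one or the other; if neither is set, the key defaults to the member name), the default value (optional; returned when the field fails to parse), and the array separator (optional, '#' by default; used to split array values).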

CSVMapperAttribute can be applied to classes or structs; CSVColumnAttribute can be applied to properties or fields.
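The lookup priority for a field (explicit column first, then key, then member name) can be sketched on its own. This is a simplified, hypothetical helper, ColumnResolver.Resolve is not a name from the engine, and it returns the resolved column index, or -1 when nothing matches.

```csharp
using System.Collections.Generic;

public static class ColumnResolver
{
    // header holds the cells of the key row; column is -1 when it was not set
    public static int Resolve(int column, string key, string memberName, IList<string> header)
    {
        // 1. An explicit, in-range column index wins
        if (column >= 0 && column < header.Count)
            return column;

        // 2. Otherwise look the key up in the header row
        if (!string.IsNullOrEmpty(key))
        {
            int byKey = header.IndexOf(key);
            if (byKey >= 0)
                return byKey;
        }

        // 3. Finally fall back to the member name (IndexOf returns -1 if it is missing)
        return header.IndexOf(memberName);
    }
}
```

Because the key is resolved against the header row at read time, reordering columns in the file does not require touching the mapped class as long as keys are used rather than column indices.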

2. Reading and parsing

using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using UnityEngine;

public class CSVReaderX
{
    private List<List<string>> _records;

    /// <summary>
    /// Get column count
    /// </summary>
    public int ColumnCount { get; private set; }

    /// <summary>
    /// Get row count
    /// </summary>
    public int RowCount { get; private set; }

    /// <summary>
    /// Get separator
    /// </summary>
    public char Separator { get; private set; }

    private int _keyRow = -1;
    private int _descRow = -1;
    private int _startRow = -1;

    /// <summary>
    /// Decode the loaded CSV records into the mapped type. Call the generic Load first.
    /// </summary>
    /// <typeparam name="T">Type marked with CSVMapperAttribute</typeparam>
    /// <returns>One instance of T per data row</returns>
    public IEnumerable<T> Decode<T>() where T : new()
    {
        // _descRow may legitimately be -1 (no description row), so it is not treated as an error here
        if (_records == null || _keyRow < 0 || _keyRow >= _records.Count || _startRow < 0)
        {
            Debug.LogError(string.Format("Decoding Failed: {0}", typeof (T)));
            yield break;
        }

        // Decode each data row, skipping the key row and the description row
        for (int i = _startRow; i < _records.Count; i++)
        {
            if (i == _keyRow || i == _descRow)
                continue;
            yield return DecodeRow<T>(_records[i], _records[_keyRow]);
        }
    }

    /// <summary>
    /// Decode a single row
    /// </summary>
    /// <typeparam name="T">Mapped type</typeparam>
    /// <param name="fields">Cells of the row being decoded</param>
    /// <param name="keys">Cells of the key (header) row</param>
    /// <returns>A new instance of T filled from the row</returns>
    private T DecodeRow<T>(List<string> fields, List<string> keys) where T : new()
    {
        // Work on a boxed copy so that SetValue also takes effect when T is a struct;
        // for a class this is simply a reference to the new instance
        object boxed = new T();
        IEnumerable<MemberInfo> members =
            typeof (T).GetMembers()
                .Where(m => m.MemberType == MemberTypes.Property || m.MemberType == MemberTypes.Field)
                .Where(m => Attribute.IsDefined(m, typeof (CSVColumnAttribute), false));

        foreach (MemberInfo member in members)
        {
            CSVColumnAttribute attribute =
                (CSVColumnAttribute) member.GetCustomAttributes(typeof (CSVColumnAttribute), false).First();
            string field = GetRawValue(attribute, fields, keys, member.Name);
            // null means the member could not be mapped to any column (already logged):
            // stop decoding this row and return what has been set so far
            if (field == null)
                break;
            SetValue(member, boxed, field, attribute.DefaultValue, attribute.ArraySeparator);
        }
        return (T) boxed;
    }

    /// <summary>
    /// Get the raw cell for a member: by column index if set, otherwise by key, otherwise by member name
    /// </summary>
    /// <param name="attribute">Mapping attribute of the member</param>
    /// <param name="fields">Cells of the row being decoded</param>
    /// <param name="keys">Cells of the key (header) row</param>
    /// <param name="name">Member name, used when neither column nor key matches</param>
    /// <returns>The raw cell text, or null if the member cannot be mapped</returns>
    private string GetRawValue(CSVColumnAttribute attribute, List<string> fields, List<string> keys, string name)
    {
        if (attribute.Column >= 0 && fields.Count > attribute.Column)
        {
            return fields[attribute.Column];
        }
        if (!string.IsNullOrEmpty(attribute.Key) && keys.Contains(attribute.Key))
        {
            return fields[keys.IndexOf(attribute.Key)];
        }
        if (keys.Contains(name))
        {
            return fields[keys.IndexOf(name)];
        }
        Debug.LogError(string.Format("Mapping Error! Column: {0}, Key: {1}, Name:{2}", attribute.Column,
            attribute.Key ?? "NULL", name));
        return null;
    }

    /// <summary>
    /// Parse the raw value and assign it to the property or field
    /// </summary>
    /// <param name="member">Target property or field</param>
    /// <param name="obj">Instance to assign to (boxed if it is a struct)</param>
    /// <param name="value">Raw cell text</param>
    /// <param name="defaultValue">Fallback when parsing fails</param>
    /// <param name="arraySeparator">Separator between array elements</param>
    private void SetValue(MemberInfo member, object obj, string value, object defaultValue, char arraySeparator)
    {
        PropertyInfo property = member as PropertyInfo;
        if (property != null)
        {
            property.SetValue(obj, ParseRawValue(value, property.PropertyType, defaultValue, arraySeparator), null);
            return;
        }

        FieldInfo field = (FieldInfo) member;
        field.SetValue(obj, ParseRawValue(value, field.FieldType, defaultValue, arraySeparator));
    }

    /// <summary>
    /// Parse a string value into the specified type
    /// </summary>
    /// <param name="field">Raw cell text</param>
    /// <param name="type">Target type. For collections only arrays are supported (e.g. int[])</param>
    /// <param name="defaultValue">Fallback value. For arrays, pass the element default (e.g. 0 for int[])</param>
    /// <param name="arraySeparator">Separator between array elements</param>
    /// <returns>The parsed value, or the fallback if parsing fails</returns>
    private object ParseRawValue(string field, Type type, object defaultValue, char arraySeparator)
    {
        // Parse numbers culture-independently so "1.5" reads the same on every machine
        IFormatProvider culture = System.Globalization.CultureInfo.InvariantCulture;
        try
        {
            if (type.IsArray)
            {
                Type elementType = type.GetElementType();
                if (elementType != typeof (string) && elementType != typeof (int) &&
                    elementType != typeof (float) && elementType != typeof (double) &&
                    elementType != typeof (bool))
                {
                    return null;
                }
                // Parse each element recursively and store it in a typed array
                string[] parts = (field ?? string.Empty).Split(arraySeparator);
                Array array = Array.CreateInstance(elementType, parts.Length);
                for (int i = 0; i < parts.Length; i++)
                {
                    array.SetValue(ParseRawValue(parts[i], elementType, defaultValue, arraySeparator), i);
                }
                return array;
            }
            if (type == typeof (string))
            {
                return field;
            }
            if (type == typeof (int))
            {
                return Convert.ToInt32(field, culture);
            }
            if (type == typeof (float))
            {
                return Convert.ToSingle(field, culture);
            }
            if (type == typeof (double))
            {
                return Convert.ToDouble(field, culture);
            }
            if (type == typeof (bool))
            {
                if (field == null)
                {
                    return false;
                }
                field = field.Trim();
                return field.Equals("true", StringComparison.OrdinalIgnoreCase) || field.Equals("1");
            }
        }
        catch (Exception ex)
        {
            // Only parse failures are expected here; anything else is rethrown
            if (!(ex is FormatException || ex is OverflowException))
                throw;
            Debug.LogWarning(string.Format("{0}: {1} -> {2}", ex.Message, field, type));
        }

        return ResolveDefault(type, defaultValue);
    }

    /// <summary>
    /// Pick the fallback for a type: -1 for numbers and false for bool when no default was given
    /// </summary>
    private static object ResolveDefault(Type type, object defaultValue)
    {
        bool isNumber = type == typeof (int) || type == typeof (float) || type == typeof (double);
        if (defaultValue == null)
        {
            if (isNumber)
                defaultValue = -1;
            else if (type == typeof (bool))
                return false;
            else
                return null;
        }
        // Box numbers as the exact target type (e.g. -1 becomes -1f) so later casts and array stores succeed
        if (isNumber && defaultValue.GetType().IsPrimitive)
        {
            return Convert.ChangeType(defaultValue, type, System.Globalization.CultureInfo.InvariantCulture);
        }
        return defaultValue;
    }

    /// <summary>
    /// Load a CSV file into the record list. To decode records into a mapped type, use the generic Load followed by Decode instead.
    /// </summary>
    /// <param name="path">Path relative to Assets/Resources/, without file extension</param>
    /// <param name="separator">Column separator</param>
    public bool Load(string path, char separator = ',')
    {
        // Drop records and counters left over from a previous load
        ClearRecord();
        ColumnCount = 0;
        RowCount = 0;

        if (string.IsNullOrEmpty(path))
        {
            Debug.LogError("CSV path is null or empty");
            return false;
        }

        // Read text
        TextAsset asset = Resources.Load<TextAsset>(path);

        if (asset == null)
        {
            Debug.LogError(string.Format("CSV file not found: {0}", path));
            return false;
        }

        string content = asset.text;
        if (string.IsNullOrEmpty(content))
        {
            Debug.LogError(string.Format("CSV file content empty: {0}", path));
            return false;
        }

        Separator = separator;
        _records = new List<List<string>>();
        string[] lines = content.Split('\n');
        for (int i = 0; i < lines.Length; i++)
        {
            // Skip blank lines
            if (string.IsNullOrEmpty(lines[i].Trim()))
                continue;

            // Plain split: quoted fields that contain the separator are not supported
            List<string> columns = lines[i].Split(separator).Select(s => s.Trim()).ToList();
            // Every row must have the same number of columns
            if (ColumnCount != 0 && columns.Count != ColumnCount)
            {
                Debug.LogError(
                    string.Format("CSV parsing error in {0} at line {1} : columns counts do not match! Separator: '{2}'", path,
                        i + 1, separator));
                ClearRecord();
                return false;
            }
            ColumnCount = columns.Count;
            _records.Add(columns);
        }
        RowCount = _records.Count;

        if (RowCount == 0)
        {
            Debug.LogWarning(string.Format("CSV file parsing failed(empty records): {0}", path));
            return false;
        }

        return true;
    }

    /// <summary>
    /// Load the CSV file described by the CSVMapperAttribute on T
    /// </summary>
    public bool Load<T>()
    {
        ClearRecord();

        // Check mapping
        CSVMapperAttribute mapper =
            Attribute.GetCustomAttribute(typeof (T), typeof (CSVMapperAttribute), false) as CSVMapperAttribute;
        if (mapper == null)
        {
            Debug.LogError(string.Format("CSV mapping not found in type: {0}", typeof (T)));
            return false;
        }

        _keyRow = mapper.KeyRow;
        _descRow = mapper.DescRow;
        _startRow = mapper.StartRow;

        if (!Load(mapper.Path, mapper.Separator))
        {
            return false;
        }

        // The key row must exist before it can be inspected
        if (_keyRow < 0 || _keyRow >= _records.Count)
        {
            Debug.LogError(string.Format("Key row {0} is out of range. Path: {1}", _keyRow, mapper.Path));
            return false;
        }
        if (_records[_keyRow].Any(string.IsNullOrEmpty))
        {
            Debug.LogError(
                string.Format("Encoding Error! No key column found. Make sure target file is in UTF-8 format. Path: {0}",
                    mapper.Path));
            return false;
        }
        return true;
    }

    /// <summary>
    /// Get the string value at the specified row and column. Returns null if no records are loaded or the position does not exist. Row/column start at 0
    /// </summary>
    /// <param name="row">Zero-based row index</param>
    /// <param name="column">Zero-based column index</param>
    public string this[int row, int column]
    {
        get
        {
            if (_records == null || row < 0 || column < 0 ||
                _records.Count <= row || _records[row].Count <= column)
            {
                return null;
            }
            return _records[row][column];
        }
    }

    /// <summary>
    /// Get a converted value at the specified row and column. Returns defaultValue if no records are loaded, the position does not exist, or the conversion fails. Row/column start at 0
    /// </summary>
    /// <typeparam name="T">For collections only arrays are supported (e.g. int[])</typeparam>
    /// <param name="row">Zero-based row index</param>
    /// <param name="column">Zero-based column index</param>
    /// <param name="defaultValue">For arrays, pass the element default (e.g. 0 for int[])</param>
    /// <param name="arraySeparator">Separator between array elements</param>
    public T Read<T>(int row, int column, object defaultValue, char arraySeparator = '#')
    {
        string field = this[row, column];
        if (field == null)
        {
            Debug.LogWarning("Field is null. Make sure csv is loaded and field has content.");
            // For arrays defaultValue is the element default, so it cannot be returned as T
            return defaultValue is T ? (T) defaultValue : default(T);
        }

        object value = ParseRawValue(field, typeof (T), defaultValue, arraySeparator);
        // Guard against a null or mismatched fallback (e.g. null passed as the default for an int)
        return value is T ? (T) value : default(T);
    }

    /// <summary>
    /// Remove all records and reset the counters
    /// </summary>
    public void ClearRecord()
    {
        _records = null;
        ColumnCount = 0;
        RowCount = 0;
    }
}


Does it look complicated? Let's walk through an example.

Add a class that describes the table structure:

[CSVMapper("Configs/Resource")]
public class ResourceData : Data
{
    [CSVColumn(0)] public int ID;
    [CSVColumn(1)] public string Path;
    [CSVColumn(2)] public float Ratio;
    [CSVColumn(3)] public string Desc;
}

Add a method that reads a table based on the structure class:

/// <summary>
/// Get table
/// </summary>
/// <typeparam name="T">Table class to load and decode</typeparam>
/// <returns>The decoded rows, or an empty sequence if loading fails</returns>
private IEnumerable<T> GetTable<T>() where T : Data, new()
{
    CSVReaderX reader = new CSVReaderX();
    if (reader.Load<T>())
    {
        Debug.Log(string.Format("{0} Loaded", typeof (T)));
        return reader.Decode<T>();
    }

    // Return an empty sequence rather than null so callers such as ToDictionary do not throw
    return Enumerable.Empty<T>();
}

Note that ResourceData inherits from Data, and GetTable carries a generic constraint on it, purely to standardize usage; it has no other purpose.

Data is defined as follows:

/// <summary>
/// All table classes must inherit from this; it exists only as a constraint
/// </summary>
public abstract class Data
{
}

The content of Resource.csv is:

资源ID,资源路径,缩放比例,说明
int,string,float,string
10001,Model/a,1,
10002,Model/b,1,
10003,Model/c,1,
10004,Model/d,1,
10005,Model/e,1,
10006,Model/f,1,
10007,Model/g,1,

You can also map by key instead of by column index:

[CSVMapper("Configs/Resource")]
public class ResourceData : Data
{
    [CSVColumn("资源ID")] public int ID;
    [CSVColumn("资源路径")] public string Path;
    [CSVColumn("缩放比例")] public float Ratio;
    [CSVColumn("说明")] public string Desc;
}

The second row (int,string,float,string) carries no data, so it is treated as the Desc (description) row and skipped when decoding.
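Which rows end up being decoded follows from KeyRow, DescRow and StartRow together. Here is a small sketch of that selection, written as a hypothetical RowSelector helper rather than as part of the engine:

```csharp
using System.Collections.Generic;

public static class RowSelector
{
    // Yields the indices of the rows that hold data, given the mapper settings
    public static IEnumerable<int> DataRows(int rowCount, int keyRow, int descRow, int startRow)
    {
        for (int i = startRow < 0 ? 0 : startRow; i < rowCount; i++)
        {
            // The key row and the description row never hold data;
            // a descRow of -1 matches no index, so nothing extra is skipped
            if (i == keyRow || i == descRow)
                continue;
            yield return i;
        }
    }
}
```

For the 9-line Resource.csv above with the defaults (KeyRow 0, DescRow 1, StartRow 0) this yields rows 2 through 8. For a file without a description row, pass -1 as descRow and only the key row is skipped.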

Load the table lazily and store it as a dictionary, and you can then look rows up by key:

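private Dictionary<int, ResourceData> _resourceDict;

public Dictionary<int, ResourceData> ResourceDict
{
    get
    {
        // Load and decode the table on first access only
        return _resourceDict ?? (_resourceDict = GetTable<ResourceData>().ToDictionary(k => k.ID));
    }
}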

// The dictionary is keyed by ID; 10001 is the first row of the sample table
var data = ResourceDict[10001];

That is what automatic loading looks like once the table structure has been mapped.

I also provide an interface for manual parsing:

Manual Load

public bool Load(string path, char separator = ',');

Manual Read

public T Read<T>(int row, int column, object defaultValue, char arraySeparator = '#');

Or use the indexer to get the value as a string and parse it yourself:

CSVReaderX reader = new CSVReaderX();

reader.Load("Path");
int val = reader.Read<int>(0, 0, 0);
// For arrays the default is the element default (0 here), not an array
int[] vals = reader.Read<int[]>(0, 0, 0);
string raw = reader[0, 0];

Note that rows and columns are both counted from 0.

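As for paths: this is a Unity3D project, so the mapped path is relative to Resources and has no file extension, and Load reads the asset with Resources.Load. For projects on other platforms, adapt that part accordingly.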
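For a non-Unity project, the loading step might look roughly like this. It is only a sketch, assuming the file is plain UTF-8 text on disk and contains no quoted fields; PlainCsvLoader and its methods are hypothetical names, and File.ReadAllText stands in for Resources.Load.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class PlainCsvLoader
{
    // Splits raw CSV text into trimmed cells, skipping blank lines
    public static List<List<string>> Parse(string content, char separator = ',')
    {
        return content
            .Split('\n')
            .Where(line => line.Trim().Length > 0)
            // Trim also removes the '\r' left behind by Windows line endings
            .Select(line => line.Split(separator).Select(s => s.Trim()).ToList())
            .ToList();
    }

    // Reads the file from disk instead of going through Unity's Resources folder
    public static List<List<string>> LoadFile(string filePath, char separator = ',')
    {
        return Parse(File.ReadAllText(filePath), separator);
    }
}
```

Unlike the Unity version, the path here is an ordinary file path including its extension, for example "Configs/Resource.csv".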

Collection fields must use a separator other than the comma ('#' by default) and must be declared as arrays:

[CSVMapper("Configs/Skill")]
public class SkillData : Data
{
    [CSVColumn(0)] public int ID;
    [CSVColumn(1)] public int Name;
    [CSVColumn(2)] public int[] SkillIDs;
}
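To show what happens to such a cell, here is a standalone sketch of splitting an array field. ArrayFieldExample.ParseIntArray is a hypothetical helper that handles int elements only, and unlike the engine it returns an empty array for an empty cell; it illustrates the idea and is not the engine's own parsing code.

```csharp
using System.Globalization;
using System.Linq;

public static class ArrayFieldExample
{
    // Turns a cell such as "101#102#103" into an int[];
    // elements that fail to parse become elementDefault
    public static int[] ParseIntArray(string field, char arraySeparator = '#', int elementDefault = -1)
    {
        if (string.IsNullOrEmpty(field))
            return new int[0];

        return field.Split(arraySeparator)
            .Select(part =>
            {
                int value;
                return int.TryParse(part.Trim(), NumberStyles.Integer, CultureInfo.InvariantCulture, out value)
                    ? value
                    : elementDefault;
            })
            .ToArray();
    }
}
```

The array separator has to differ from the column separator because the row is split into columns first; a cell written as 101,102,103 would already have been broken into three columns by then.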

Questions and discussion are welcome.

The source code is on my GitHub:

https://github.com/theoxuan/MTGeek/blob/master/Assets/Scripts/CSVReaderX.cs

Original post: https://www.cnblogs.com/seancheung/p/4184582.html