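Since many projects store their data in CSV files, I'd like to introduce a solution I designed myself. If anything in it seems unreasonable, please point it out.
Reading a CSV file the usual way is tedious: split the text into rows and columns by the separator (a comma by default), then read the data row by row and field by field in a fixed order. That is fine in itself, but as soon as the column order in the CSV changes, or a row is added or removed, the change ripples through all of the reading code. And when there are many fields, writing that code is really painful.
Our project started out with this primitive approach and ran into exactly that awkward situation. Then it occurred to me: if I could map the CSV table structure the way it is done for JSON or XML, wouldn't that be much more efficient?
So I came up with the following design:
1. Preparation
First define two attributes: one describes the table as a whole, the other maps individual fields.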
using System;

/// <summary>
/// CSV column mapping
/// </summary>
[AttributeUsage(AttributeTargets.Property | AttributeTargets.Field)]
public class CSVColumnAttribute : Attribute
{
    /// <summary>
    /// Name of this property/field in the csv file (default is the member name)
    /// </summary>
    public string Key { get; set; }

    /// <summary>
    /// Column of this property/field in the csv file (if Column is assigned, Key is ignored)
    /// </summary>
    public int Column { get; set; }

    /// <summary>
    /// Default value (used when the field is NULL or parsing fails; default: -1 for number values, null for classes, false for bool)
    /// </summary>
    public object DefaultValue { get; set; }

    /// <summary>
    /// Separator for parsing if the member is an array ('#' by default)
    /// </summary>
    public char ArraySeparator { get; set; }

    public CSVColumnAttribute()
    {
        Column = -1;
        ArraySeparator = '#';
    }

    public CSVColumnAttribute(string key)
    {
        Key = key;
        Column = -1;
        ArraySeparator = '#';
    }

    public CSVColumnAttribute(int column)
    {
        Column = column;
        ArraySeparator = '#';
    }
}
/// <summary>
/// CSV mapping class or struct (avoid structs where possible: a struct is boxed and then unboxed during reflection)
/// </summary>
[AttributeUsage(AttributeTargets.Class | AttributeTargets.Struct)]
public class CSVMapperAttribute : Attribute
{
    /// <summary>
    /// Path of the CSV file (without file extension). Base directory is Assets/Resources/
    /// </summary>
    public string Path { get; set; }

    /// <summary>
    /// Mapping key row (0 by default)
    /// </summary>
    public int KeyRow { get; set; }

    /// <summary>
    /// Description row (1 by default; skipped in decoding. If the file has no description row, assign -1)
    /// </summary>
    public int DescRow { get; set; }

    /// <summary>
    /// Separator for csv parsing (',' by default)
    /// </summary>
    public char Separator { get; set; }

    /// <summary>
    /// Starting index of data rows (0 by default)
    /// </summary>
    public int StartRow { get; set; }

    public CSVMapperAttribute()
    {
        KeyRow = 0;
        DescRow = 1;
        Separator = ',';
    }

    public CSVMapperAttribute(string name)
    {
        Path = name;
        KeyRow = 0;
        DescRow = 1;
        Separator = ',';
    }
}
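The table attribute has these properties: the path of the CSV file (optional), the row that holds the keys (useful for non-English tables), the description row (optional), the separator (a comma ',' by default), and the start row (optional; rows before it are skipped when parsing).
The field-mapping attribute has these properties: the key, the column index (set either the key or the column index; if neither is set, the key defaults to the member name), the default value (optional; returned when parsing the field fails), and the array separator (optional, '#' by default, used to split array values).
CSVMapperAttribute can be applied to a class or a struct; CSVColumnAttribute can be applied to a property or a field.
2. Reading and parsing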
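To make the mapping rule concrete, here is a minimal sketch of how a tagged member could be resolved by reflection. It is plain .NET with no Unity dependency. ColumnSketchAttribute, ItemRow and MappingInspector are hypothetical names invented for this illustration; they are simplified stand-ins, not the classes from this post.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical, simplified stand-in for CSVColumnAttribute (Key and Column only).
[AttributeUsage(AttributeTargets.Property | AttributeTargets.Field)]
public class ColumnSketchAttribute : Attribute
{
    public string Key { get; set; }
    public int Column { get; set; }

    public ColumnSketchAttribute() { Column = -1; }
    public ColumnSketchAttribute(string key) { Key = key; Column = -1; }
    public ColumnSketchAttribute(int column) { Column = column; }
}

// Hypothetical table class used only for this demonstration.
public class ItemRow
{
    [ColumnSketch(0)] public int ID;          // mapped by column index
    [ColumnSketch("Name")] public string Name; // mapped by explicit key
    [ColumnSketch] public float Ratio;         // mapped by member name
    public string NotMapped;                   // no attribute, ignored
}

public static class MappingInspector
{
    // Returns one line per tagged member describing how it would be looked up:
    // column index first, then the explicit key, then the member name.
    public static string[] Describe(Type type)
    {
        return type.GetMembers()
            .Where(m => m.MemberType == MemberTypes.Field || m.MemberType == MemberTypes.Property)
            .Where(m => Attribute.IsDefined(m, typeof(ColumnSketchAttribute), false))
            .Select(m =>
            {
                var attr = (ColumnSketchAttribute)
                    Attribute.GetCustomAttribute(m, typeof(ColumnSketchAttribute), false);
                if (attr.Column >= 0)
                    return m.Name + " -> column " + attr.Column;
                return m.Name + " -> key " + (attr.Key ?? m.Name);
            })
            .ToArray();
    }
}
```

Calling MappingInspector.Describe(typeof(ItemRow)) yields one entry for each of ID, Name and Ratio; NotMapped is skipped because it carries no attribute. The order of the entries is not guaranteed by reflection.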
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using UnityEngine;

public class CSVReaderX
{
    private List<List<string>> _records;

    /// <summary>
    /// Get column count
    /// </summary>
    public int ColumnCount { get; private set; }

    /// <summary>
    /// Get row count
    /// </summary>
    public int RowCount { get; private set; }

    /// <summary>
    /// Get separator
    /// </summary>
    public char Separator { get; private set; }

    private int _keyRow = -1;
    private int _descRow = -1;
    private int _startRow = -1;

    /// <summary>
    /// Decode the loaded CSV file to the target mapped type. Call the generic Load first.
    /// </summary>
    public IEnumerable<T> Decode<T>() where T : new()
    {
        // _descRow is not checked here: -1 is the documented value for "no description row"
        if (_records == null || _keyRow < 0 || _keyRow >= _records.Count || _startRow < 0)
        {
            Debug.LogError(string.Format("Decoding Failed: {0}", typeof (T)));
            yield break;
        }

        //Decode each row
        for (int i = _startRow; i < _records.Count; i++)
        {
            if (i == _keyRow || i == _descRow)
                continue;
            yield return DecodeRow<T>(_records[i], _records[_keyRow]);
        }
    }

    /// <summary>
    /// Decode single row
    /// </summary>
    private T DecodeRow<T>(List<string> fields, List<string> keys) where T : new()
    {
        // Box once so that reflection writes also stick when T is a struct;
        // for a class this is simply a reference to the new instance.
        object target = new T();
        IEnumerable<MemberInfo> members =
            typeof (T).GetMembers()
                .Where(m => m.MemberType == MemberTypes.Property || m.MemberType == MemberTypes.Field)
                .Where(m => Attribute.IsDefined(m, typeof (CSVColumnAttribute), false));

        foreach (MemberInfo member in members)
        {
            CSVColumnAttribute attribute =
                member.GetCustomAttributes(typeof (CSVColumnAttribute), false).First() as CSVColumnAttribute;
            string field = GetRawValue(attribute, fields, keys, member.Name);
            // GetRawValue hands back the member.Name instance itself when the mapping fails
            if (ReferenceEquals(field, member.Name))
                return (T) target;
            SetValue(member, target, field, attribute.DefaultValue, attribute.ArraySeparator);
        }
        return (T) target;
    }

    /// <summary>
    /// Get raw value by CSVColumnAttribute or name
    /// </summary>
    private string GetRawValue(CSVColumnAttribute attribute, List<string> fields, List<string> keys, string name)
    {
        if (attribute.Column >= 0 && fields.Count > attribute.Column)
        {
            return fields[attribute.Column];
        }
        if (!string.IsNullOrEmpty(attribute.Key) && keys.Contains(attribute.Key))
        {
            return fields[keys.IndexOf(attribute.Key)];
        }
        if (keys.Contains(name))
        {
            return fields[keys.IndexOf(name)];
        }
        Debug.LogError(string.Format("Mapping Error! Column: {0}, Key: {1}, Name:{2}", attribute.Column,
            attribute.Key ?? "NULL", name));
        return name;
    }

    /// <summary>
    /// Parse and set raw value
    /// </summary>
    private void SetValue(MemberInfo member, object obj, string value, object defaultValue, char arraySeparator)
    {
        if (member.MemberType == MemberTypes.Property)
        {
            (member as PropertyInfo).SetValue(obj,
                ParseRawValue(value, (member as PropertyInfo).PropertyType, defaultValue, arraySeparator),
                null);
        }
        else
        {
            (member as FieldInfo).SetValue(obj,
                ParseRawValue(value, (member as FieldInfo).FieldType, defaultValue, arraySeparator));
        }
    }

    /// <summary>
    /// Parse string value to specified type
    /// </summary>
    /// <param name="type">If type is collection, use array only(e.g. int[])</param>
    /// <param name="defaultValue">If type is collection, use element default(e.g. 0 for int[])</param>
    private object ParseRawValue(string field, Type type, object defaultValue, char arraySeparator)
    {
        try
        {
            if (type.IsArray)
            {
                IEnumerable<object> result =
                    field.Split(arraySeparator)
                        .Select(f => ParseRawValue(f, type.GetElementType(), defaultValue, arraySeparator));
                if (type.GetElementType() == typeof (string))
                {
                    return result.Cast<string>().ToArray();
                }
                if (type.GetElementType() == typeof (int))
                {
                    return result.Cast<int>().ToArray();
                }
                if (type.GetElementType() == typeof (float))
                {
                    return result.Cast<float>().ToArray();
                }
                if (type.GetElementType() == typeof (double))
                {
                    return result.Cast<double>().ToArray();
                }
                if (type.GetElementType() == typeof (bool))
                {
                    return result.Cast<bool>().ToArray();
                }
                return null;
            }
            if (type == typeof (string))
            {
                return field;
            }
            if (type == typeof (int))
            {
                return Convert.ToInt32(field);
            }
            if (type == typeof (float))
            {
                return Convert.ToSingle(field);
            }
            if (type == typeof (double))
            {
                return Convert.ToDouble(field);
            }
            if (type == typeof (bool))
            {
                if (field == null)
                {
                    return false;
                }
                field = field.Trim();
                return field.Equals("true", StringComparison.CurrentCultureIgnoreCase) || field.Equals("1");
            }
        }
        catch (FormatException ex)
        {
            Debug.LogWarning(string.Format("{0}: {1} -> {2}", ex.Message, field, type));

            //In case default value is null but the property/field is not a reference type
            if (defaultValue == null)
            {
                if (type == typeof (int) || type == typeof (float) || type == typeof (double))
                {
                    // Box -1 as the target type so that later casts (e.g. Cast<float>()) succeed
                    defaultValue = Convert.ChangeType(-1, type);
                }
                else if (type == typeof (bool))
                {
                    defaultValue = false;
                }
            }
        }

        return defaultValue;
    }

    /// <summary>
    /// Load CSV into record list. If you need to decode records, use the generic Load followed by Decode instead.
    /// </summary>
    public bool Load(string path, char separator = ',')
    {
        //Dispose records
        ClearRecord();

        if (string.IsNullOrEmpty(path))
        {
            Debug.LogError(string.Format("CSV path not found: {0}", path));
            return false;
        }

        //Read text
        TextAsset asset = Resources.Load<TextAsset>(path);

        if (asset == null)
        {
            Debug.LogError(string.Format("CSV file not found: {0}", path));
            return false;
        }

        string content = asset.text;
        if (string.IsNullOrEmpty(content))
        {
            Debug.LogError(string.Format("CSV file content empty: {0}", path));
            return false;
        }

        Separator = separator;
        _records = new List<List<string>>();
        // Split into lines; Trim() below also removes the '\r' of Windows line endings
        foreach (string row in content.Split('\n').Where(line => !string.IsNullOrEmpty(line.Trim())))
        {
            List<string> columns = row.Split(separator).Select(s => s.Trim()).ToList();
            //Check each row's column count. They must match
            if (ColumnCount != 0 && columns.Count != ColumnCount)
            {
                Debug.LogError(
                    string.Format("CSV parsing error in {0} at row {1} : columns counts do not match! Separator: '{2}'", path,
                        _records.Count, separator));
                return false;
            }
            ColumnCount = columns.Count;
            _records.Add(columns);
        }
        RowCount = _records.Count;

        if (_records == null || !_records.Any())
        {
            Debug.LogWarning(string.Format("CSV file parsing failed(empty records): {0}", path));
            return false;
        }

        return true;
    }

    public bool Load<T>()
    {
        ClearRecord();

        //Check mapping
        if (!Attribute.IsDefined(typeof (T), typeof (CSVMapperAttribute), false))
        {
            Debug.LogError(string.Format("CSV mapping not found in type: {0}", typeof (T)));
            return false;
        }

        CSVMapperAttribute mapper =
            Attribute.GetCustomAttribute(typeof (T), typeof (CSVMapperAttribute), false) as CSVMapperAttribute;
        _keyRow = mapper.KeyRow;
        _descRow = mapper.DescRow;
        _startRow = mapper.StartRow;

        bool result = Load(mapper.Path, mapper.Separator);
        if (result)
        {
            if (_records[_keyRow].Any(string.IsNullOrEmpty))
            {
                Debug.LogError(
                    string.Format("Encoding Error! No key column found. Make sure target file is in UTF-8 format. Path: {0}",
                        mapper.Path));
                return false;
            }
        }
        return result;
    }

    /// <summary>
    /// Get string value at specified row and column. If record empty or position not found, NULL will be returned. Row/Column starts at 0
    /// </summary>
    public string this[int row, int column]
    {
        get
        {
            if (_records == null || _records.Count <= row || _records[row].Count <= column)
            {
                return null;
            }
            return _records[row][column];
        }
    }

    /// <summary>
    /// Get a converted value at specified row and column. If record empty or position not found or conversion failed, defaultValue will be returned. Row/Column starts at 0
    /// </summary>
    /// <typeparam name="T">If T is collection, use array only(e.g. int[])</typeparam>
    /// <param name="defaultValue">If T is collection, use element default(e.g. 0 for int[])</param>
    public T Read<T>(int row, int column, object defaultValue, char arraySeparator = '#')
    {
        string field = this[row, column];
        if (field == null)
        {
            Debug.LogWarning("Field is null. Make sure csv is loaded and field has content.");
            // A null default cannot be unboxed into a value type, so fall back to default(T)
            return typeof (T).IsArray || defaultValue == null ? default(T) : (T) defaultValue;
        }

        return (T) ParseRawValue(field, typeof (T), defaultValue, arraySeparator);
    }

    /// <summary>
    /// Remove all records.
    /// </summary>
    public void ClearRecord()
    {
        _records = null;
        // Reset counts so a reused reader can load a file with a different column count
        ColumnCount = 0;
        RowCount = 0;
    }
}
Looks complicated? Let's walk through an example:
Add a table structure class
[CSVMapper("Configs/Resource")]
public class ResourceData : Data
{
    [CSVColumn(0)] public int ID;
    [CSVColumn(1)] public string Path;
    [CSVColumn(2)] public float Ratio;
    [CSVColumn(3)] public string Desc;
}
Add a method that reads a table based on the structure class
/// <summary>
/// Get table
/// </summary>
private IEnumerable<T> GetTable<T>() where T : Data, new()
{
    CSVReaderX reader = new CSVReaderX();
    if (reader.Load<T>())
    {
        Debug.Log(string.Format("{0} Loaded", typeof (T)));
        return reader.Decode<T>();
    }

    // Return an empty sequence rather than null so callers such as ToDictionary do not throw
    return Enumerable.Empty<T>();
}
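Note that having ResourceData inherit from Data, together with the generic constraint in GetTable, is only there to enforce consistent usage; it has no other purpose.
Data is defined as follows: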
/// <summary>
/// All table classes must inherit this (used as a generic constraint)
/// </summary>
public abstract class Data
{
}
The content of Resource.csv is as follows:
资源ID,资源路径,缩放比例,说明
int,string,float,string
10001,Model/a,1,
10002,Model/b,1,
10003,Model/c,1,
10004,Model/d,1,
10005,Model/e,1,
10006,Model/f,1,
10007,Model/g,1,
You can also map by key directly:
[CSVMapper("Configs/Resource")]
public class ResourceData : Data
{
    [CSVColumn("资源ID")] public int ID;
    [CSVColumn("资源路径")] public string Path;
    [CSVColumn("缩放比例")] public float Ratio;
    [CSVColumn("说明")] public string Desc;
}
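The second row (int,string,float,string) carries no real data, so it is treated as the Desc (description) row.
Load the table lazily and store it as a dictionary, and you can then look rows up by key: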
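The reason key-based mapping survives a column reorder can be sketched in a few lines: the key row is searched for the key, and the resulting index is used to pick the field from the data row. KeyLookupSketch and GetByKey are hypothetical names for this illustration only, written in plain .NET under the assumption that every row has already been split into a list of strings.

```csharp
using System;
using System.Collections.Generic;

public static class KeyLookupSketch
{
    // Finds the position of `key` in the key row and returns the field at
    // the same position in the data row, or null if the key is unknown.
    public static string GetByKey(List<string> keys, List<string> fields, string key)
    {
        int index = keys.IndexOf(key);
        if (index < 0 || index >= fields.Count)
            return null;
        return fields[index];
    }
}
```

With keys { "资源ID", "资源路径" } and fields { "10001", "Model/a" }, looking up "资源路径" gives "Model/a". If both lists are reordered in the same way, the lookup still returns "Model/a", which is exactly what index-based mapping cannot guarantee.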
private Dictionary<int, ResourceData> _resourceDict;

public Dictionary<int, ResourceData> ResourceDict
{
    get
    {
        // Built on first access, then cached
        return _resourceDict ??
               (_resourceDict = GetTable<ResourceData>().ToDictionary(k => k.ID));
    }
}
// The dictionary is keyed by the ID column, so use an ID that exists in the table
var data = ResourceDict[10001];
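The above is what you get from automatic loading once the table structure is mapped.
I also provide an interface for manual parsing:
Manual Load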
public bool Load(string path, char separator = ',');
Manual Read
public T Read<T>(int row, int column, object defaultValue, char arraySeparator = '#');
Or get the string value through the indexer and parse it yourself
CSVReaderX reader = new CSVReaderX();

reader.Load("Path");
int val = reader.Read<int>(0, 0, 0);
int[] vals = reader.Read<int[]>(0, 0, null);
string raw = reader[0, 0];
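Note that rows and columns are both counted from 0.
About the path: since this is a Unity3D project, the mapped path is relative to Resources and has no file extension, and Load uses Resources.Load to read the asset. Projects on other platforms can adapt this accordingly.
Collection fields must use a separator other than the comma ('#' by default), and must be declared as arrays.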
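For a non-Unity project, the loading step could look roughly like the sketch below. It reads the text with File.ReadAllText instead of Resources.Load and splits it the same simple way: by line, then by separator. PlainCsvLoader, Parse and LoadFromFile are hypothetical names, and like the reader in this post the sketch assumes fields never contain the separator or quoted line breaks.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class PlainCsvLoader
{
    // Splits CSV text into rows and trimmed columns, skipping blank lines.
    // Trim() also strips the '\r' left over from Windows line endings.
    public static List<List<string>> Parse(string content, char separator = ',')
    {
        return content
            .Split('\n')
            .Where(line => !string.IsNullOrEmpty(line.Trim()))
            .Select(line => line.Split(separator).Select(s => s.Trim()).ToList())
            .ToList();
    }

    // Reads the file from disk; fullPath is a normal file system path
    // including the extension, unlike a Unity Resources path.
    public static List<List<string>> LoadFromFile(string fullPath, char separator = ',')
    {
        return Parse(File.ReadAllText(fullPath), separator);
    }
}
```

Only the source of the text changes here; the key row, description row and decoding logic could stay as they are.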
[CSVMapper("Configs/Skill")]
public class SkillData : Data
{
    [CSVColumn(0)] public int ID;
    [CSVColumn(1)] public int Name;
    [CSVColumn(2)] public int[] SkillIDs;
}
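As an illustration of how an array column such as SkillIDs can be filled, the sketch below splits one cell on '#' and converts each part to int, falling back to an element default when a part does not parse. ArrayFieldSketch and ParseIntArray are hypothetical names; the sketch uses int.TryParse instead of the exception-based conversion in the reader, and it returns an empty array for an empty cell, which is a simplification of my own.

```csharp
using System;
using System.Linq;

public static class ArrayFieldSketch
{
    // Converts a cell like "101#102#103" into an int array.
    // Parts that are not valid integers become elementDefault.
    public static int[] ParseIntArray(string field, char arraySeparator = '#', int elementDefault = -1)
    {
        if (string.IsNullOrEmpty(field))
            return new int[0];

        return field
            .Split(arraySeparator)
            .Select(part =>
            {
                int value;
                return int.TryParse(part.Trim(), out value) ? value : elementDefault;
            })
            .ToArray();
    }
}
```

Because the cell itself contains no commas, the outer CSV split is unaffected; this is why the array separator has to differ from the column separator.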
Questions and discussion are welcome.
The source code is on my GitHub:
https://github.com/theoxuan/MTGeek/blob/master/Assets/Scripts/CSVReaderX.cs