• (转)谈论 读取Excel丢失内容的问题


    原文出自:http://freeoh.spaces.live.com/blog/cns!57853FFFBF81D887!1185.entry?wa=wsignin1.0

    引用 :读取Excel丢失内容的问题

    通常读取Excel文件的方法有两种,一是通过ADO的OleDb,一是通过Com组件;
    这里要说的是通过OleDb读取时,遇到的内容丢失问题。
    症状:使用Microsoft.Jet.OLEDB.4.0读取数据时,当某一字段内分别含有文本和数字的混合数据时,某一类型的数据会产生丢失。
    读取方式:
    string connString = " Provider=Microsoft.Jet.OLEDB.4.0; Data Source=文件名; Extended Properties='Excel 8.0;'";
    OleDbConnection conn = new OleDbConnection(connString);
    string sql = " SELECT * FROM [Sheet1$] ";
    conn.Open();
    OleDbDataAdapter adapter = new OleDbDataAdapter(sql, conn);
    DataSet ds = new DataSet();
    adapter.Fill(ds, "[Sheet1$]");
    案例1
    Excel数据如下:(一共9行)
    "入库指令单号","SF公司","物料编号","实收数量","包装","批号","生产日期","入库时间","PO#","状态"
    "11-20080808-000001","01","11111","100","fasdfas","fdafad","08/08/2008","2008-08-08","SF001","1"
    "11-20080808-000001","01","11111","100","fasdfas","fdafad","08/08/2008","2008-08-08","SF001","1"
    ......
    "11-20080808-000001","01","11111","100","fasdfas","fdafad","08/08/2008","2008-08-08","SF001","1"
    "11-20080808-000001","01","xxxx","100","fasdfas","fdafad","08/08/2008","2008-08-08","SF001","1"
    读取结果并不理想,第9行的xxxx数据丢失了
    案例2
    Excel数据如下:(一共9行)
    "入库指令单号","SF公司","物料编号","实收数量","包装","批号","生产日期","入库时间","PO#","状态"
    "11-20080808-000001","01","xxxx","100","fasdfas","fdafad","08/08/2008","2008-08-08","SF001","1"
    "11-20080808-000001","01","11111","100","fasdfas","fdafad","08/08/2008","2008-08-08","SF001","1"
    ......
    "11-20080808-000001","01","11111","100","fasdfas","fdafad","08/08/2008","2008-08-08","SF001","1"
    "11-20080808-000001","01","xxxx","100","fasdfas","fdafad","08/08/2008","2008-08-08","SF001","1"
    读取结果也不理想,第1行和第9行的xxxx数据都丢失了

    问题分析:
      产生这种问题的根源与Excel ISAM(Indexed Sequential Access Method,即索引顺序存取方法)驱动程序的限制有关,Excel ISAM 驱动程序通过检查前几行中实际值确定一个 Excel 列的类型,然后选择能够代表其样本中大部分值的数据类型。也即Excel ISAM查找某列前几行(默认情况下是8行),把占多的类型作为其处理类型。例如如果数字占多,那么其它含有字母等文本的数据项就会置空;相反如果文本居多,纯数字的数据项就会被置空。
    现具体分析Extended Properties项中的HDR和IMEX所代表的含义。HDR用来设置是否将Excel表中第一行作为字段名,“YES”代表是,“NO”代表不是即也为数据内容;IMEX是用来告诉驱动程序使用Excel文件的模式,其值有0、1、2三种,分别代表导出、导入、混合模式。当我们设置IMEX=1时将强制混合数据转换为文本,但仅仅这种设置并不可靠,IMEX=1只确保在某列前8行数据至少有一个是文本项的时候才起作用,它只是把查找前8行数据中数据类型占优选择的行为作了略微的改变。例如某列前8行数据全为纯数字,那么它仍然以数字类型作为该列的数据类型,随后行里的含有文本的数据仍然变空。

    解决方案1
    前提,你的Excel包括Header,且Header都是文本,如案例1和案例2
    使用:Provider=Microsoft.Jet.OLEDB.4.0; Data Source=文件名; Extended Properties='Excel 8.0;HDR=NO;IMEX=1'
    IMEX=1将强制混合数据转换为文本,HDR=NO将第一行作为内容,由于第一行Header都是文本,因此所有列的类型都将转换成文本。
    解决方案2
    如果你的Excel不包含Header,且某列的数据又是混合数据,这种情况只能通过Com组件来读取Excel。

    使用OleDb读取Excel的条件总结:
    只有一条,保证Excel中每一列的前8行数据中至少有一个文本。(多适用于有Header的内容)

  • 相关阅读:
    冒泡排序
    MySql
    利用mybatis-generator自动生成代码
    SQL,HQL,CQL,JPQL了解
    mysql中的约束
    详解CRUD?
    什么是数据库ACID?
    建造者模式(Builder Pattern)
    Java中的设计模式
    java中的排序面试题
  • 原文地址:https://www.cnblogs.com/jintan/p/1287936.html
Copyright © 2020-2023  润新知