• [Java/Csv/Regex] Splitting quoted CSV lines with a regular expression to get the row data you want


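    A CSV file is made of text lines whose fields are separated by commas. Because a field may itself contain commas, each field is often also wrapped in double quotes, which produces a file like this: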

    "1","2","3","4","5","6","7","8","9","10","11","12","13","14","15","16","傅宗龙","18","19","20"
    "1","2","3","4","5.55","6","7","8","9","10","朱由检","12","13","14","15","16,666,666","17","袁崇焕","19","20"
    "醉里挑灯看剑,梦回吹角连营","2","3","4","孙传庭","6","7","8","9","10","11","12","13","14","15","16","17","18","19","20"
    ",,,,,,,,,","2","3","4","熊廷弼","6","7","8","9","10","11","12","卢象升","14","15","16","17","18","19","20"

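    Parsing such a file is fairly simple: the split just has to treat the surrounding quotes as part of the delimiter. The code is as follows: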

    import java.io.FileReader;
    import java.io.IOException;
    import java.io.LineNumberReader;
    import java.util.ArrayList;
    import java.util.List;
    
    /**
     * Parses a CSV file and converts its contents into a nested list
     * @author 逆火
     *
     * 2019-11-23 08:51:15
     */
    public class CsvfileParser {
        private List<List<String>> contents;
        
        public CsvfileParser(String filename) throws IOException {
            contents=new ArrayList<List<String>>();
            // try-with-resources closes the reader even if readLine() throws
            try (LineNumberReader fileReader = new LineNumberReader(new FileReader(filename))) {
                String line;
                while ((line = fileReader.readLine()) != null) {
                    System.out.println("Line " + fileReader.getLineNumber() + ": " + line);
                    contents.add(getArrayFromLine(line));
                }
            }
        }
        
        private List<String> getArrayFromLine(String line) {
            List<String> retval=new ArrayList<String>();
            
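            // (^\s*") matches the opening quote of the line; this makes the first array element a zero-length string, so the loop below skips it
            // ("\s*,\s*") matches the "," between two fields
            // ("\s*$) matches the closing quote of the line
            // In a Java string literal the backslashes and quotes must be escaped
            String[] arr=line.split("(^\\s*\")|(\"\\s*,\\s*\")|(\"\\s*$)");
            
            for(int i=1;i<arr.length;i++) {// skip the leading empty string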
                retval.add(arr[i]);
            }
            
            return retval;
        }
        
        public void printContents() {
            for(List<String> ls:contents) {
                System.out.println(String.join("|", ls));
            }
        }
        
        public static void main(String[] args) throws IOException {
            CsvfileParser cp=new CsvfileParser("C:\\Users\\horn1\\Desktop\\sample.csv");
            System.out.println("---------------------------");
            cp.printContents();
        }
    }
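
To see the split regex from getArrayFromLine in isolation, here is a minimal sketch that applies it to a single in-memory line instead of a file. The class name QuotedLineSplitDemo and the method splitQuotedLine are made up for this illustration and are not part of the original parser; the regex is the same one, written with Java string escaping.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of the original CsvfileParser
public class QuotedLineSplitDemo {
    // Three alternatives: opening quote | quote-comma-quote | closing quote
    static final String SEPARATOR = "(^\\s*\")|(\"\\s*,\\s*\")|(\"\\s*$)";

    static List<String> splitQuotedLine(String line) {
        String[] parts = line.split(SEPARATOR);
        // parts[0] is the empty string in front of the opening quote, so drop it
        return new ArrayList<>(Arrays.asList(parts).subList(1, parts.length));
    }

    public static void main(String[] args) {
        String line = "\"1\",\"5.55\",\"16,666,666\"";
        // The commas inside the last field are not preceded by a quote,
        // so they are not treated as separators
        System.out.println(String.join("|", splitQuotedLine(line)));
        // prints 1|5.55|16,666,666
    }
}
```

The point of the design is that a bare comma never matches: only a comma with a quote on both sides counts as a field boundary, which is why values such as 16,666,666 survive intact.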

    The output is:

    Line 1: "1","2","3","4","5","6","7","8","9","10","11","12","13","14","15","16","傅宗龙","18","19","20"
    Line 2: "1","2","3","4","5.55","6","7","8","9","10","朱由检","12","13","14","15","16,666,666","17","袁崇焕","19","20"
    Line 3: "醉里挑灯看剑,梦回吹角连营","2","3","4","孙传庭","6","7","8","9","10","11","12","13","14","15","16","17","18","19","20"
    Line 4: ",,,,,,,,,","2","3","4","熊廷弼","6","7","8","9","10","11","12","卢象升","14","15","16","17","18","19","20"
    ---------------------------
    1|2|3|4|5|6|7|8|9|10|11|12|13|14|15|16|傅宗龙|18|19|20
    1|2|3|4|5.55|6|7|8|9|10|朱由检|12|13|14|15|16,666,666|17|袁崇焕|19|20
    醉里挑灯看剑,梦回吹角连营|2|3|4|孙传庭|6|7|8|9|10|11|12|13|14|15|16|17|18|19|20
    ,,,,,,,,,|2|3|4|熊廷弼|6|7|8|9|10|11|12|卢象升|14|15|16|17|18|19|20
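
One caveat that the sample data does not exercise: String.split with the default limit discards trailing empty strings, so a line that ends in empty fields, such as "1","","", would come back with fewer columns. The sketch below is one possible workaround, assuming every line ends with a closing quote; the names TrailingFieldDemo and splitKeepingEmpty are hypothetical and not from the original post.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical variant, not part of the original CsvfileParser
public class TrailingFieldDemo {
    static final String SEPARATOR = "(^\\s*\")|(\"\\s*,\\s*\")|(\"\\s*$)";

    static List<String> splitKeepingEmpty(String line) {
        // A negative limit tells split() to keep trailing empty strings
        String[] parts = line.split(SEPARATOR, -1);
        if (parts.length < 2) {
            return new ArrayList<>(); // no quoted fields found
        }
        // Drop the empty string before the opening quote and the one after
        // the closing quote (assumes the line really ends with a quote)
        return new ArrayList<>(Arrays.asList(parts).subList(1, parts.length - 1));
    }

    public static void main(String[] args) {
        String line = "\"1\",\"\",\"\"";
        // Default limit: the two trailing empty fields are lost
        System.out.println(line.split(SEPARATOR).length - 1);  // prints 1
        // Negative limit: all three fields are kept
        System.out.println(splitKeepingEmpty(line).size());    // prints 3
    }
}
```

Neither version handles a doubled quote ("") inside a field; for files that contain such escapes a dedicated CSV parser is likely the safer choice.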

    --END-- 2019-11-23 09:14:45

  • Original article: https://www.cnblogs.com/heyang78/p/11915324.html