• 2013年东三省数模A题第而问(Matlab读取xls文件并根据关键字检索)


    题目:

    1. 从这些数据中能否找出某些规律性的东西:如食品产地与食品质量的关系;食品销售地点(即抽检地点)与食品质量的关系;季节因素等等;

      一看到这个题目瞬间就感觉到eggache了,这怎么找啊,跟食品安全有关的东西,这不一大堆嘛,难道这道题要海底捞针?有地方捞针到也行啊,但是一看那一百九十多个几乎可以称之为没有规律的数据,顿时感觉不会再爱了。。

      虽然感觉不会再爱了,但还要做题的不是。其实冷静下来分析一下,看破一个点,这个题就迎刃而解了。这一点就是:从题目所给的表格中,我们能获得什么数据?

    然后我们根据获取的数据进行规律总结就可以了,这样这道题就可以变成了一道数据略复杂的统计题。下面是解题过程:

      由分析数据可得,我们从数据中获得以下关系:

      食品-季度;食品-种类;食品-环节;食品-抽检单位;食品-生产单位;

      得到这五个对应的关系(个人分析角度不同,或许不止这五个关系),得到关系就好说了,挨个统计作图分析就可以了。至此,题意分析结束。

      知道做什么以后就轮到怎么去做了。其中食品-季度;食品-种类;食品-环节;还是比较好说的,数据量不大,直接手动分析就可以了。但是食品-抽检单位;食品-生产单位;就让人蛋疼了,那可是190多个数据文件啊(xls、word and html),手动分析,估计等数模结束了你都未必能分出来。那怎么办呢?好吧,我们是程序猿~~~

    首先我们对于这个两个关系进行分析,其中抽检单位都是在深圳市内,而有一些生产单位是不在深圳室内的。很明显东三省的数模是给深圳市做城市建设的,所以为了减少工作量,对于这两个关系我们选择食品-抽检单位进行统计。

    方法为matlab读取xls文件(其他文件已手动转化为xls文件),并根据关键字检索;

    (1)首先是获取合格和不合格食品的受检单位,代码如下:

     1 clear all;
     2 
     3 Files = dir(fullfile('unqualified','*.xls'));
     4 
     5 str1 = '受检单位名称';
     6 str2 = '被检单位';
     7 str3 = '受检单位';
     8 
     9 str4 = '受检单位详细地址';
    10 str5 = '受检单位地址';
    11 count = 1;
    12 name = {str1,str4};
    13 sign = 0;
    14 sign_address = 0;
    15 
    16 LengthFiles = length(Files);
    17 for l = 1:LengthFiles;
    18     %%读取xls文件内容
    19     [N, T, rawdata] = xlsread(strcat('unqualified/',Files(l).name));
    20     %%获得文件内容大小,为二维矩阵
    21     data_size = size(rawdata);
    22     %%初始化公司名称元胞数组
    23     name_size = size(name);
    24     % 循环遍历xls文件中的数据,相当与二维数组
    25     for row = 1 : data_size(1)
    26         for col = 1 : data_size(2)
    27             %%查找适合的那一列
    28             if( isequal((rawdata{row, col}), (rawdata{row, col})) && ((strcmp(rawdata{row, col}, str1) == 1) || ...
    29                 (strcmp(rawdata{row, col}, str2) == 1) || (strcmp(rawdata{row, col}, str3) == 1) ))
    30                 %%判断是否有受检单位详细地址这一列,如果有则将 sign_address 设为 1
    31                 if(strcmp(rawdata{row, col+1}, str4) == 1 || strcmp(rawdata{row, col+1}, str5) == 1)
    32                     sign_address = 1;
    33                 end
    34                 %得到受检单位名称一列后,遍历获取那一列的数据
    35                 for i = (row + 1) : data_size(1)
    36                     % 查询名称是否已存在
    37                     for j = 2 : name_size(1)
    38                         if(strcmp(rawdata{i, col}, name{j}) == 1)
    39                             %如果存在,则将标记位标记为 1
    40                             sign = 1;
    41                             break;
    42                         end
    43                     end   
    44                     %%如果公司名不存在,则将公司名加入元胞数组中
    45                     if(sign == 0 && isequal((rawdata{i, col}), (rawdata{i, col})))
    46                         count = count + 1;
    47                         name{count,1} = rawdata{i, col};
    48                         name_size = size(name);
    49                         if(sign_address == 1 && isempty(name{count,2}) )
    50                             name{count,2} = rawdata{i, col+1};
    51                         end
    52                     end
    53                     %%如果公司存在但所遍历的这一项有受监单位详细地址这一项,则将受检单位详细地址这一项加入元胞数组中
    54                     if(sign == 1 && isequal((rawdata{i, col}), (rawdata{i, col})) && sign_address == 1 && isempty(name{count,2})) 
    55                         name{count,2} = rawdata{i, col+1};
    56                     end
    57                     
    58                     sign = 0;
    59                     
    60                     %end
    61                 end
    62                 break;
    63             end
    64         end
    65     end
    66     sign_address = 0;
    67 end

    运行后会得到所有公司的一个mat文件,得到公司名后,也不能全手动分地区啊(按地区进行分类,而且不算重复,一共得到了7528了公司)。

    然后我们还是程序猿~~读取上面获得的mat文件,进行关键字检索,怎么做呢?直接上代码~

      1 clear all;
      2 
      3 
      4 %load('D:/数模/A题/data/qualified/matlab.mat');
      5 load('D:/数模/A题/data/unqualified/unqualified.mat');
      6 
      7 strFuTianQu = {'福田区 ','园岭','南园','华富','莲花','福田','沙头','香蜜湖','福保','华强北', '梅林',...
      8     '福田','园岭街道','南园街道','华富街道','莲花街道','福田街道','沙头街道','香蜜湖街道','福保街道','华强北街道', '梅林街道'};
      9 strFuTianQuSize = size(strFuTianQu);
     10 
     11 strLuoHuQu = {'罗湖区','黄贝','东门','南湖','桂园','笋岗','清水河','翠竹','东湖','东晓','莲塘'...
     12     '罗湖','黄贝街道','东门街道','南湖街道','桂园街道','笋岗街道','清水河街道','翠竹街道','东湖街道','东晓街道','莲塘街道'};
     13 strLuoHuQuSize = size(strLuoHuQu);
     14 
     15 strNanShanQu = {'南山区','南头','南山','招商','蛇口','粤海','沙河','西丽','桃源', ...
     16     '南山','南头街道','南山街道','招商街道','蛇口街道','粤海街道','沙河街道','西丽街道','桃源街道'};
     17 strNanShanQuSize = size(strNanShanQu);
     18 
     19 strYanTianQu = {'盐田区','沙头角','梅沙','盐田','海山', ...
     20     '盐田','沙头角街道','梅沙街道','盐田街道','海山街道'};
     21 strYanTianQuSize = size(strYanTianQu);
     22 
     23 strBaoAnQu = {'宝安区','新安','西乡','福永','沙井','松岗','石岩','中心区' , ...
     24     '宝安','新安街道','西乡街道','福永街道','沙井街道','松岗街道','石岩街道','中心区街道'};
     25 strBaoAnQuSize = size(strBaoAnQu);
     26 
     27 strGuangMingQu = {'光明区','公明','光明', '公明街道', '光明街道'};
     28 strGuangMingQuSize = size(strGuangMingQu);
     29 
     30 strLongGuangQu = {'龙岗区','布吉','坂田','南湾','平湖','横岗','龙岗','龙城','坪地', ... 
     31     '龙岗','布吉街道','坂田街道','南湾街道','平湖街道','横岗街道','龙岗街道','龙城街道','坪地街道'};
     32 strLongGuangQuSize = size(strLongGuangQu);
     33 
     34 strPingShanQu = {'坪山区','坪山','坑梓', ...
     35     '坪山','坪山街道','坑梓街道'};
     36 strPingShanQuSize = size(strPingShanQu);
     37 
     38 FuShanQu = {'受检单位名称','受检单位详细地址','编号'};
     39 LuoHuQu = {'受检单位名称','受检单位详细地址','编号'};
     40 NanShanQu = {'受检单位名称','受检单位详细地址','编号'};
     41 YanTianQu = {'受检单位名称','受检单位详细地址','编号'};
     42 BaoAnQu = {'受检单位名称','受检单位详细地址','编号'};
     43 GuangMingQu = {'受检单位名称','受检单位详细地址','编号'};
     44 LongGuangQu = {'受检单位名称','受检单位详细地址','编号'};
     45 PingShanQu = {'受检单位名称','受检单位详细地址','编号'};
     46 Others = {'受检单位名称'};
     47 
     48 countFuShanQu = 1;
     49 countLuoHuQu = 1;
     50 countNanShanQu = 1;
     51 countYanTianQu = 1;
     52 countBaoAnQu = 1;
     53 countGuangMingQu = 1;
     54 countLongGuangQu = 1;
     55 countPingShanQu = 1;
     56 countOthers = 1;
     57 
     58 %%互斥信号量
     59 mutex = 1;
     60 
     61 %%mat文件内的元胞数组名为 name
     62 data_size = size(name);
     63 col = 1;
     64 for row = 2 : data_size(1)
     65     %%福山区
     66     if(mutex == 1)
     67         for i1 = 1 : strFuTianQuSize(2)
     68 
     69             %%用strfind函数判断指定字符串是否存在于改字符串中,并将结果分别存储在out1和out2中
     70             out1 = strfind(name{row, col}, strFuTianQu{1, i1});  
     71             out2 = strfind(name{row, col + 1}, strFuTianQu{1, i1});
     72 
     73             %%如果out1和out2中有一个不为空,则分类
     74             if(~isempty(out1) || ~isempty(out2))
     75                 FuShanQu{countFuShanQu, 1} = name{row, col};
     76                 FuShanQu{countFuShanQu, 2} = name{row, col + 1};
     77                 FuShanQu{countFuShanQu, 3} = countFuShanQu;
     78                 countFuShanQu = countFuShanQu + 1;
     79                 mutex = 0;
     80                 break;
     81             end
     82         end
     83     end
     84     %%罗湖区
     85     if(mutex == 1)
     86         for i1 = 1 : strLuoHuQuSize(2)
     87 
     88             out1 = strfind(name{row, col}, strLuoHuQu{1, i1});  
     89             out2 = strfind(name{row, col + 1}, strLuoHuQu{1, i1});
     90             %%如果out1和out2中有一个不为空,则分类
     91             if(~isempty(out1) || ~isempty(out2))
     92                 LuoHuQu{countLuoHuQu, 1} = name{row, col};
     93                 LuoHuQu{countLuoHuQu, 2} = name{row, col + 1};
     94                 LuoHuQu{countLuoHuQu, 3} = countLuoHuQu;
     95                 countLuoHuQu = countLuoHuQu + 1;
     96                 mutex = 0;
     97                 break;
     98             end
     99         end
    100     end
    101 
    102     if(mutex == 1)
    103         for i1 = 1 : strNanShanQuSize(2)
    104 
    105             out1 = strfind(name{row, col}, strNanShanQu{1, i1});  
    106             out2 = strfind(name{row, col + 1}, strNanShanQu{1, i1});
    107             %%如果out1和out2中有一个不为空,则分类
    108             if(~isempty(out1) || ~isempty(out2))
    109                 NanShanQu{countNanShanQu, 1} = name{row, col};
    110                 NanShanQu{countNanShanQu, 2} = name{row, col + 1};
    111                 NanShanQu{countNanShanQu, 3} = countNanShanQu;
    112                 countNanShanQu = countNanShanQu + 1;
    113                 mutex = 0;
    114                 break;
    115             end
    116         end
    117     end
    118 
    119     if(mutex == 1)
    120         for i1 = 1 : strYanTianQuSize(2)
    121 
    122             out1 = strfind(name{row, col}, strYanTianQu{1, i1});  
    123             out2 = strfind(name{row, col + 1}, strYanTianQu{1, i1});
    124             %%如果out1和out2中有一个不为空,则分类
    125             if(~isempty(out1) || ~isempty(out2))
    126                 YanTianQu{countYanTianQu, 1} = name{row, col};
    127                 YanTianQu{countYanTianQu, 2} = name{row, col + 1};
    128                 YanTianQu{countYanTianQu, 3} = countYanTianQu;
    129                 countYanTianQu = countYanTianQu + 1;
    130                 mutex = 0;
    131                 break;
    132             end
    133         end
    134     end
    135 
    136     if(mutex == 1)
    137         for i1 = 1 : strBaoAnQuSize(2)
    138             
    139             out1 = strfind(name{row, col}, strBaoAnQu{1, i1});  
    140             out2 = strfind(name{row, col + 1}, strBaoAnQu{1, i1});
    141             %%如果out1和out2中有一个不为空,则分类
    142             if(~isempty(out1) || ~isempty(out2))
    143                 BaoAnQu{countBaoAnQu, 1} = name{row, col};
    144                 BaoAnQu{countBaoAnQu, 2} = name{row, col + 1};
    145                 BaoAnQu{countBaoAnQu, 3} = countBaoAnQu;
    146                 countBaoAnQu = countBaoAnQu + 1;
    147                 mutex = 0;
    148                 break;
    149             end
    150         end
    151     end
    152 
    153     if(mutex == 1)
    154         for i1 = 1 : strGuangMingQuSize(2)
    155             
    156             out1 = strfind(name{row, col}, strGuangMingQu{1, i1});  
    157             out2 = strfind(name{row, col + 1}, strGuangMingQu{1, i1});
    158             %%如果out1和out2中有一个不为空,则分类
    159             if(~isempty(out1) || ~isempty(out2))
    160                 GuangMingQu{countGuangMingQu, 1} = name{row, col};
    161                 GuangMingQu{countGuangMingQu, 2} = name{row, col + 1};
    162                 GuangMingQu{countGuangMingQu, 3} = countGuangMingQu;
    163                 countGuangMingQu = countGuangMingQu + 1;
    164                 mutex = 0;
    165                 break;
    166             end
    167         end
    168     end
    169 
    170     if(mutex == 1)
    171         for i1 = 1 : strLongGuangQuSize(2)
    172             
    173             out1 = strfind(name{row, col}, strLongGuangQu{1, i1});  
    174             out2 = strfind(name{row, col + 1}, strLongGuangQu{1, i1});
    175             %%如果out1和out2中有一个不为空,则分类
    176             if(~isempty(out1) || ~isempty(out2))
    177                 LongGuangQu{countLongGuangQu, 1} = name{row, col};
    178                 LongGuangQu{countLongGuangQu, 2} = name{row, col + 1};
    179                 LongGuangQu{countLongGuangQu, 3} = countLongGuangQu;
    180                 countLongGuangQu = countLongGuangQu + 1;
    181                 mutex = 0;
    182                 break;
    183             end
    184         end
    185     end
    186 
    187     if(mutex == 1)
    188         for i1 = 1 : strPingShanQuSize(2)
    189             
    190             out1 = strfind(name{row, col}, strPingShanQu{1, i1});  
    191             out2 = strfind(name{row, col + 1}, strPingShanQu{1, i1});
    192             %%如果out1和out2中有一个不为空,则分类
    193             if(~isempty(out1) || ~isempty(out2))
    194                 PingShanQu{countPingShanQu, 1} = name{row, col};
    195                 PingShanQu{countPingShanQu, 2} = name{row, col + 1};
    196                 PingShanQu{countPingShanQu, 3} = countPingShanQu;
    197                 countPingShanQu = countPingShanQu + 1;
    198                 mutex = 0;
    199                 break;
    200             end
    201         end
    202     end
    203 
    204     if(mutex == 1)
    205         
    206         Others{countOthers, 1} = name{row, col};
    207         Others{countOthers, 2} = name{row, col + 1};
    208         Others{countOthers, 3} = countOthers;
    209         countOthers = countOthers + 1;
    210         mutex = 0;
    211     end
    212     mutex = 1;
    213 end

    原理为根据深圳市各个区名和区内街道名对公司名进行匹配,原理很简单的,就看你能不能想到了~~,至此,第二题结束

    注:由于做数模这个东西确实恶心了,所以这篇博客在思路和代码方面我只是说了个大概,不想费那么多时间去总结那么细了,所以如果您对我的博客感兴趣并有疑问的话,欢迎评论或者发邮件给我(lityangweiguang@gmail.com)

  • 相关阅读:
    云计算、大数据和人工智能简单概述
    Linux 文件管理命令语法、参数、实例全汇总(一)
    C#之继承
    类和结构(二)
    类和结构(一)
    C#基础语法(二)
    C#基础语法(一)
    dotnet体系结构
    九卷读书:刘润商学院学习笔记1
    Linux内存管理
  • 原文地址:https://www.cnblogs.com/LitLeo/p/3089927.html
Copyright © 2020-2023  润新知