• Counting word frequencies in an English text


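    Task: count how often each word occurs in a text, then print the ten most frequent words together with their counts.

    Approach: before writing the program, I decided to implement it in C.

    Program source code: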

    #include <stdio.h>
    #include <string.h>
    // Maximum number of distinct words that can be counted; adjust as needed
    #define MAX_WORD_COUNT 500
    // Structure holding each word and its count
    typedef struct WordCount
    {
     char cWord[20];
     int  iCount;
    }T_WordCount;
     
    int CalcEachWord(const char *pText); // Count the words and print the results
    void LowerText(char *pText); // Convert a word to lowercase
    void SwapItem(T_WordCount *ItemA, T_WordCount * ItemB); // Swap two entries
    void SortWord(T_WordCount *pWordSet); // Sort entries by count
     
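    int main(int argc, char *argv[])
    {
     // Test text; the backslash in the path must be escaped, otherwise "\t" is read as a tab
     FILE *fp = fopen("D:\\text.txt", "r");
     if (fp == NULL)
     {
      return -1;
     }
     // Only the first sizeof(cBuf) - 1 bytes of the file are read
     char cBuf[4096] = {0};
     size_t nRead = fread(cBuf, 1, sizeof(cBuf) - 1, fp);
     cBuf[nRead] = '\0';
     fclose(fp);
     printf("----------------------------------\n");
     printf("The top 10 words are:\n");
     
     CalcEachWord(cBuf);
     return 0;
    }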
     
    int CalcEachWord(const char *cBuf)
    {
     char cTmp[20] = {0};
     int  i   = 0;
     char *pTmp   = cTmp;
     int  iFlag   = 0;
     
     T_WordCount tWordSet[MAX_WORD_COUNT];
     memset(tWordSet, 0, sizeof(tWordSet));
     
     // The terminating '\0' is handled as a delimiter so the last word is counted too
     for (;; ++cBuf)
     {
      if ((*cBuf >= 'A' && *cBuf <= 'Z') || (*cBuf >= 'a' && *cBuf <= 'z'))
      {
       // Letters beyond the 19th are dropped so cTmp cannot overflow
       if (pTmp < cTmp + sizeof(cTmp) - 1)
       {
        *pTmp = *cBuf;
        pTmp++;
       }
      }
      else if (*cBuf == '-')
      {
       // Hyphens are skipped, so "hand-held" is counted as "handheld"
       continue;
      }
      else
      {
       // Any other character ends the current word
       if (strlen(cTmp) > 0)
       {
        LowerText(cTmp);
        iFlag = 0;
        for (i = 0; i < MAX_WORD_COUNT; ++i)
        {
         if (strlen(tWordSet[i].cWord) > 0)
         {
          // Slot in use: increment the count if it holds the same word
          if (strcmp(tWordSet[i].cWord, cTmp) == 0)
          {
           iFlag = 1;
           tWordSet[i].iCount++;
           break;
          }
         }
         else
         {
          // First free slot: store the new word
          strcpy(tWordSet[i].cWord, cTmp);
          tWordSet[i].iCount = 1;
          iFlag = 1;
          break;
         }
        }
        if (!iFlag)
        {
         printf("No more space to save word.\n");
        }
       }
       memset(cTmp, 0, sizeof(cTmp));
       pTmp = cTmp;
       if (*cBuf == '\0')
       {
        break;
       }
      }
     }
     
     // Sort by count, highest first
     SortWord(tWordSet);
     for (i = 0; i < 10; ++i)
     {
      if (strlen(tWordSet[i].cWord) > 0)
      {
       printf("%s:%d\n",tWordSet[i].cWord,tWordSet[i].iCount);
      }
     }
     
     return 0;
    }
     
    void LowerText(char *cBuf)
    {
     char *pTmp = cBuf;
     while (*pTmp != '\0')
     {
      if ((*pTmp >= 'A' && *pTmp <= 'Z'))
      {
       *pTmp += 32 ;
      }
     
      pTmp++;
     }
    }
     
    void SwapItem(T_WordCount *ItemA, T_WordCount * ItemB)
    {
     T_WordCount Tmp;
     memset(&Tmp, 0, sizeof(T_WordCount));
     strcpy(Tmp.cWord, ItemA->cWord);
     Tmp.iCount = ItemA->iCount;
     
     strcpy(ItemA->cWord, ItemB->cWord); ItemA->iCount = ItemB->iCount;
     strcpy(ItemB->cWord, Tmp.cWord); ItemB->iCount = Tmp.iCount;
    }
    // Bubble sort, descending by count
    void SortWord(T_WordCount *pWordSet){
     int i,j;
     for (j = 0; j < MAX_WORD_COUNT - 1; j++)
     {  
      for (i = 0; i < MAX_WORD_COUNT - 1 - j; i++)
      {    
       if (pWordSet[i].iCount < pWordSet[i+1].iCount)    
       {                            
        SwapItem(&pWordSet[i], &pWordSet[i+1]);
       }    
      }
     }
    }
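
    The bubble sort above always walks all MAX_WORD_COUNT slots. As an alternative (not what the original program does), the standard library's qsort can do the ordering. The sketch below assumes a struct with the same layout as T_WordCount; the names WordEntry, CompareByCountDesc and SortByCount are made up for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical mirror of the article's T_WordCount struct. */
typedef struct
{
    char cWord[20];
    int  iCount;
} WordEntry;

/* qsort comparator: higher counts first, ties broken alphabetically. */
static int CompareByCountDesc(const void *a, const void *b)
{
    const WordEntry *x = (const WordEntry *)a;
    const WordEntry *y = (const WordEntry *)b;

    if (x->iCount != y->iCount)
    {
        return (x->iCount < y->iCount) ? 1 : -1;
    }
    return strcmp(x->cWord, y->cWord);
}

/* Sorts the first n entries in place, most frequent first. */
void SortByCount(WordEntry *entries, size_t n)
{
    qsort(entries, n, sizeof(entries[0]), CompareByCountDesc);
}
```

    Calling SortByCount(entries, n) on the filled part of the table and then printing the first ten entries would give the same kind of listing as SortWord. Unused slots have a count of 0, so they end up at the back if the whole table is passed in.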

    Text content:

    With the Games officially announced the beginning of each class followed by admission to the ranks of marching parade performances. At this point march suddenly sounded, one after another a class rank of neat formation, dance props set great strides in coming to the podium. Everyone in their bright clothes, smiling, hold our heads up, demonstrating the unique vitality of youth and vitality. The performances of these classes strengths and weaknesses, and some formation varied, and some clothing brightest, and some neat moves, and some arrangement was new, their graceful dance to attract everyone's attention. Props in the hands of these students can be described as great variety, variety. Some, and the dynamic music, holding a pair of chopsticks beating out the rhythm of sonorous; some flapping the ball like a cheerful bright wizard; some dress swayed, holding a dance fan, to draw a beautiful arc Road, and some Qiyuxuanang , hand-held gun salute in the air emitted by colorful fireworks. Their performance to the entire stint of the many splendours of color games, like spring flowers, summer sunshine, with a cool autumn wind blow against our faces, so that the presence of teachers and students are all touched, and are all delighted. It is understood that the road parade costumes and props  in many of them are the students themselves to select and purchase, and they show the formation and movement are also explored and arrangement of their own. This is totally reflects the student's enthusiasm and longing for the Games, but also fully demonstrated their ability to act independently and strong organizational skills
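
    The sample text contains forms such as "hand-held" and "everyone's". Under the splitting rule used by the program (letters build a word, a hyphen is skipped, any other character ends the word), these are counted as "handheld" and as "everyone" plus "s". The sketch below isolates that rule in a hypothetical helper, NextWord, which is not part of the original program; it uses isalpha and tolower from <ctype.h> instead of manual range checks and assumes outSize is at least 1.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: copies the next word, lowercased, from *ppText into
 * out (capacity outSize, at least 1) and advances *ppText past it.
 * Hyphens are dropped without ending the word; any other non-letter ends it.
 * Letters that do not fit into out are discarded.
 * Returns 1 if a word was found, 0 at the end of the text.
 */
int NextWord(const char **ppText, char *out, size_t outSize)
{
    const char *p = *ppText;
    size_t len = 0;

    /* Skip separators in front of the word (anything that is not a letter). */
    while (*p != '\0' && !isalpha((unsigned char)*p))
    {
        p++;
    }

    /* Collect letters; a hyphen is skipped but does not end the word. */
    while (*p != '\0' && (isalpha((unsigned char)*p) || *p == '-'))
    {
        if (*p != '-' && len + 1 < outSize)
        {
            out[len++] = (char)tolower((unsigned char)*p);
        }
        p++;
    }

    out[len] = '\0';
    *ppText = p;
    return len > 0;
}
```

    Calling NextWord in a loop until it returns 0 walks through the text one word at a time, which makes it easy to check by hand how a particular phrase will be counted.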

    Test results:

  • Original article: https://www.cnblogs.com/hfxdaj/p/3575790.html