• A problem encountered while configuring an Exadata cell node




    Problem description:

    [celladmin@vrh4 ~]$ cellcli
    CellCLI: Release 11.2.3.2.0 - Production on Sat Jun 14 09:11:08 EDT 2014

    Copyright (c) 2007, 2012, Oracle.  All rights reserved.
    Cell Efficiency Ratio: 1

    CellCLI> create celldisk all

    CELL-02559: There is a communication error between MS and CELLSRV.

    CellCLI> alter cell restart services all

    Stopping the RS, CELLSRV, and MS services...
    The SHUTDOWN of services was successful.
    Starting the RS, CELLSRV, and MS services...
    Getting the state of RS services...  running
    Starting CELLSRV services...
    The STARTUP of CELLSRV services was not successful.
    CELL-01547: CELLSRV startup failed due to unknown reasons.

    Starting MS services...
    The STARTUP of MS services was successful.

    CellCLI>


    The RS and MS services came up, but the CELLSRV service would not start.


    Troubleshooting:

    alert.log:


    CELLSRV process id=3403
    CELLSRV cell host name=vrh4.oracle.com
    CELLSRV version=11.2.3.2.0,label=OSS_11.2.3.2.0_LINUX.X64_120713,Fri_Jul_13_12:37:13_PDT_2012
    OS Hugepage status:
       Total/free hugepages available=32/32; hugepage size=2048KB
    OS Stats: Physical memory: 497 MB. Num cores: 1
    CELLSRV configuration parameters:
    version=0.0
    Cellsrv max memory not set. Total physical mem: 497 MB is less than required minimum: 3891 MB.
    celldisk policy config read from /opt/oracle/cell11.2.3.2.0_LINUX.X64_120713/cellsrv/deploy/config/cdpolicy.dat with ver no. 1 and pol no. 0
    Auto Online Feature 1.3
    CellServer MD5 Binary Checksum: cf96327cbbec459c6ac80deaec94d5cd
    Sat Jun 14 09:12:00 2014
    [RS] Started Service MS with pid 3258
    OS Hugepage status:
       Total/free hugepages available=39/39; hugepage size=2048KB
    WARNING: System has fewer hugepages available than needed.
    Cache Allocation: Num 1MB hugepage buffers: 78 Num 1MB non-hugepage buffers: 822
    MS_ALERT HUGEPAGE WARNING 78 822
    ossmmap_map: mmap failed for Mmap memory len: 1624010752 errno: 12  -------------------- mmap could not map the memory (errno 12 is ENOMEM on Linux)
    Physical memory on the system might be low.            --------------------------- the message is unambiguous: there is not enough physical memory
    Sat Jun 14 09:12:05 2014
    Errors in file /opt/oracle/cell11.2.3.2.0_LINUX.X64_120713/log/diag/asm/cell/vrh4/trace/svtrc_3403_0.trc  (incident=65):
    ORA-00600: internal error code, arguments: [Cache: map_failed], [], [], [], [], [], [], [], [], [], [], []
    Incident details in: /opt/oracle/cell11.2.3.2.0_LINUX.X64_120713/log/diag/asm/cell/vrh4/incident/incdir_65/svtrc_3403_0_i65.trc
    Sweep [inc][65]: completed
    CELLSRV error - ORA-600 internal error
    Sat Jun 14 09:12:16 2014
    [RS] monitoring process /opt/oracle/cell11.2.3.2.0_LINUX.X64_120713/cellsrv/bin/cellrsomt (pid: 0) returned with error: 126
    [RS] Monitoring process for service CELLSRV detected a flood of restarts. Disable monitoring process.
    Errors in file /opt/oracle/cell11.2.3.2.0_LINUX.X64_120713/log/diag/asm/cell/vrh4/trace/rstrc_3248_4.trc  (incident=73):
    RS-7445 [CELLSRV monitor disabled] [Detected a flood of restarts] [] [] [] [] [] [] [] [] [] []
    Incident details in: /opt/oracle/cell11.2.3.2.0_LINUX.X64_120713/log/diag/asm/cell/vrh4/incident/incdir_73/rstrc_3248_4_i73.trc
    Sweep [inc][73]: completed
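
To put the failed mapping in perspective, the figures in the alert log can be compared directly. The following is a minimal Python sketch, not part of the original troubleshooting session; the variable names are illustrative and the values are copied from the log lines above. It converts the mmap length to MB and looks up the symbolic name of errno 12.

```python
import errno
import os

# Values copied from the alert log above.
MMAP_LEN_BYTES = 1624010752   # "Mmap memory len"
PHYSICAL_MEM_MB = 497         # "Physical memory: 497 MB"
MMAP_ERRNO = 12               # errno reported by ossmmap_map

# Size of the single mapping CELLSRV asked for, in MB.
mmap_len_mb = MMAP_LEN_BYTES / (1024 * 1024)

# Symbolic name of the errno value on this platform.
errno_name = errno.errorcode[MMAP_ERRNO]

print(f"mmap request: {mmap_len_mb:.1f} MB, physical memory: {PHYSICAL_MEM_MB} MB")
print(f"errno {MMAP_ERRNO} = {errno_name} ({os.strerror(MMAP_ERRNO)})")
print(f"request is {mmap_len_mb / PHYSICAL_MEM_MB:.1f}x the physical memory")
```

The requested mapping is roughly three times the machine's physical memory, which fits the "Physical memory on the system might be low" message that follows the mmap failure.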


    Looking at the CELLSRV trace file for more detail:

    [root@vrh4 trace]# more /opt/oracle/cell11.2.3.2.0_LINUX.X64_120713/log/diag/asm/cell/vrh4/trace/svtrc_3403_0.trc
    Trace file /opt/oracle/cell11.2.3.2.0_LINUX.X64_120713/log/diag/asm/cell/vrh4/trace/svtrc_3403_0.trc
    ORACLE_HOME = /opt/oracle/cell11.2.3.2.0_LINUX.X64_120713
    System name:    Linux
    Node name:      vrh4.oracle.com
    Release:        2.6.18-274.el5
    Version:        #1 SMP Mon Jul 25 13:17:49 EDT 2011
    Machine:        x86_64
    CELL SW Version:        OSS_11.2.3.2.0_LINUX.X64_120713

    *** 2014-06-14 09:11:53.184
    CellDisk Policy configuration:
    1 #version_ossp_cdperf_policy
    0 #uniq_pol_num_ossp_cdperf_policy
    2 #hang_hd_ossp_cdperf_policy
    2 #hang_fd_ossp_cdperf_policy
    2 #slow_abs_hd_ossp_cdperf_policy
    2 #slow_abs_fd_ossp_cdperf_policy
    2 #slow_rltv_hd_ossp_cdperf_policy
    2 #slow_rltv_fd_ossp_cdperf_policy
    2 #slow_lat_hd_ossp_cdperf_policy
    2 #slow_lat_fd_ossp_cdperf_policy
    0 #ioerr_hd_ossp_cdperf_policy
    2 #ioerr_fd_ossp_cdperf_policy
    0 #powercycle_hang_ossp_cdperf_policy
    0 #powercycle_hang_wtfc_ossp_cdperf_policy
    6 #lat_freq_ossp_cdperf_policy
    50 #asm_offline_freq_ossp_cdperf_policy
    30 #dmwg_avgrqsize_tolr_ossp_cdperf_policy
    30 #dmwg_avgnumreads_tolr_ossp_cdperf_policy
    30 #dmwg_avgnumwrites_tolr_ossp_cdperf_policy
    100 #dmwg_avgrqsize_min_ossp_cdperf_policy
    8 #dmwg_avgrqsizefl_min_ossp_cdperf_policy
    10 #dmwg_avgnumreads_min_ossp_cdperf_policy
    10 #dmwg_avgnumwrites_min_ossp_cdperf_policy
    3 #dmwg_lownumreads_ossp_cdperf_policy
    3 #dmwg_lownumwrites_ossp_cdperf_policy
    30 #dmwg_lowlatreads_ossp_cdperf_policy
    30 #dmwg_lowlatwrites_ossp_cdperf_policy
    1 #dmwg_avgqdepreads_min_ossp_cdperf_policy
    5 #dmwg_avgqdepreadsfl_min_ossp_cdperf_policy
    1 #dmwg_avgqdepwrites_min_ossp_cdperf_policy
    5 #dmwg_avgqdepwritesfl_min_ossp_cdperf_policy
    100 #dmwg_avgqdepreads_tolr_ossp_cdperf_policy
    100 #dmwg_avgqdepwrites_tolr_ossp_cdperf_policy
    100 #dmwg_avgqszreads_tolr_ossp_cdperf_policy
    100 #dmwg_avgqszwrites_tolr_ossp_cdperf_policy
    60 #dmwg_same_pct_ossp_cdperf_policy
    3 #conf_hd_max_num_ossp_cdperf_policy
    8 #conf_fd_max_num_ossp_cdperf_policy
    3 #proa_fail_hd_max_num_ossp_cdperf_policy
    8 #proa_fail_fd_max_num_ossp_cdperf_policy
    2 #hung_hd_max_num_reboot_ossp_cdperf_policy
    9 #hung_fd_max_num_reboot_ossp_cdperf_policy
    3 #numtriggers_thld_5hrs_ossp_cdperf_policy
    4 #numtriggers_thld_day_ossp_cdperf_policy
    5 #numtriggers_thld_week_ossp_cdperf_policy
    7 #numtriggers_thld_month_ossp_cdperf_policy
    8 #numtriggers_thld_quart_ossp_cdperf_policy
    6 #ioerr_numthld_near_ossp_cdperf_policy
    10 #ioerr_numnzero_near_ossp_cdperf_policy
    20 #ioerr_numthld_far_ossp_cdperf_policy
    50 #ioerr_numnzero_far_ossp_cdperf_policy
    50 #err_lat_timeout_ossp_cdperf_policy
    6 #err_lat_numthld_near_ossp_cdperf_policy
    10 #err_lat_numnzero_near_ossp_cdperf_policy
    20 #err_lat_numthld_far_ossp_cdperf_policy
    50 #err_lat_numnzero_far_ossp_cdperf_policy
    90000 95000 100 6 10 20 50 10000 300 200 7 10 30 50 20000 500 200 500 200 14 20 14 20 24 40 24 40 #dmg_params_ossp_cdperf_policy[0]
    90000 95000 200 6 10 20 50 30000 300 200 7 10 30 50 60000 500 200 500 200 14 20 14 20 24 40 24 40 #dmg_params_ossp_cdperf_policy[1]
    90000 95000 150 6 10 20 50 24000 300 200 7 10 30 50 48000 500 200 500 200 14 20 14 20 24 40 24 40 #dmg_params_ossp_cdperf_policy[2]
    90000 95000 100 6 10 20 50 15000 300 200 7 10 30 50 30000 500 200 500 200 14 20 14 10 24 40 24 40 #dmg_params_ossp_cdperf_policy[3]
    90000 95000 100 6 10 20 50 6000 300 200 7 10 30 50 12000 500 200 500 200 14 20 14 10 24 40 24 40 #dmg_params_ossp_cdperf_policy[4]
    90000 95000 200 6 10 20 50 15000 300 200 25 40 30 50 20000 2000 1500 2000 1500 20 30 20 30 25 40 25 40 #dmg_params_ossp_cdperf_policy[5]
    90000 95000 300 6 10 20 50 40000 300 200 25 40 30 50 80000 2000 1500 2000 1500 20 30 20 30 25 40 25 40 #dmg_params_ossp_cdperf_policy[6]
    90000 95000 250 6 10 20 50 30000 300 200 25 40 30 50 60000 2000 1500 2000 1500 20 30 20 30 25 40 25 40 #dmg_params_ossp_cdperf_policy[7]
    90000 95000 200 6 10 20 50 25000 300 200 25 40 30 50 40000 2000 1500 2000 1500 20 30 20 30 25 40 25 40 #dmg_params_ossp_cdperf_policy[8]
    90000 95000 200 6 10 20 50 10000 300 200 25 40 30 50 20000 2000 1500 2000 1500 20 30 20 30 25 40 25 40 #dmg_params_ossp_cdperf_policy[9]
    90000 95000 50 6 10 20 50 2000 300 200 20 30 30 50 4000 500 200 500 200 14 20 14 20 24 40 24 40 #dmg_params_ossp_cdperf_policy[10]
    90000 95000 25 6 10 20 50 1000 300 200 7 10 30 50 2000 500 200 500 200 14 20 14 20 24 40 24 40 #dmg_params_ossp_cdperf_policy[11]
    90000 95000 50 6 10 20 50 2000 300 200 7 10 30 50 4000 500 200 500 200 14 20 14 20 24 40 24 40 #dmg_params_ossp_cdperf_policy[12]
    90000 95000 50 6 10 20 50 2000 300 200 7 10 30 50 4000 500 200 500 200 14 20 14 20 24 40 24 40 #dmg_params_ossp_cdperf_policy[13]
    400000 410000 3000 6 10 20 50 50000 1000 800 7 10 30 50 100000 2000 2000 2000 2000 20 30 20 30 25 40 25 40 #dmg_params_ossp_cdperf_policy[14]
    42346 #checksum_ossp_cdperf_policy
    LockPool name:Storage Index Lock Pool type:RWLOCK POOL group:35 numLocks:1024 nextLockIndex:0 totalLockRefs:0 lockArray:0x2accba272660
    2014-06-14 09:11:53.898190*: Opened file /opt/oracle/cell11.2.3.2.0_LINUX.X64_120713/cellsrv/deploy/config/griddisk.owners.dat, version 11.2.2.4.0, descriptor 14
    2014-06-14 09:12:01.801656*: CELLSRV needs 463 hugepages, but there are only 32 available.  ---------------------- the error here is already quite clear
    2014-06-14 09:12:01.838968*: CELLSRV trying to reserve 431 more hugepages.
    2014-06-14 09:12:02.021569*: Successfully allocated 78MB of hugepages for buffers
    Writing message type OSS_PIPE_ERR_FAILED_STARTUP_RESTART to OSS->RS pipe
    DDE: Flood control is not active
    Incident 65 created, dump file: /opt/oracle/cell11.2.3.2.0_LINUX.X64_120713/log/diag/asm/cell/vrh4/incident/incdir_65/svtrc_3403_0_i65.trc
    ORA-00600: internal error code, arguments: [Cache: map_failed], [], [], [], [], [], [], [], [], [], [], []

    2014-06-14 09:12:15.281868*: CELLSRV error - ORA-600 internal error
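
The hugepage numbers in the trace and in alert.log can be cross-checked with a little arithmetic. The sketch below is illustrative only: the constant and function names are made up for this example, and the values come from the log lines quoted in this post.

```python
# Values copied from the trace file and alert.log above.
HUGEPAGE_SIZE_KB = 2048   # "hugepage size=2048KB"
NEEDED_PAGES = 463        # "CELLSRV needs 463 hugepages"
AVAILABLE_PAGES = 32      # "there are only 32 available"
PAGES_AFTER_RESERVE = 39  # "Total/free hugepages available=39/39"


def pages_to_mb(pages, page_kb=HUGEPAGE_SIZE_KB):
    """Convert a hugepage count to megabytes."""
    return pages * page_kb / 1024


shortfall_pages = NEEDED_PAGES - AVAILABLE_PAGES

print(f"needed:    {NEEDED_PAGES} pages = {pages_to_mb(NEEDED_PAGES):.0f} MB")
print(f"available: {AVAILABLE_PAGES} pages = {pages_to_mb(AVAILABLE_PAGES):.0f} MB")
print(f"shortfall: {shortfall_pages} pages = {pages_to_mb(shortfall_pages):.0f} MB")
print(f"after reserve attempt: {PAGES_AFTER_RESERVE} pages = "
      f"{pages_to_mb(PAGES_AFTER_RESERVE):.0f} MB")
```

The shortfall of 431 pages matches the "trying to reserve 431 more hugepages" line, and 39 pages of 2048 KB come to 78 MB, which appears consistent with "Successfully allocated 78MB of hugepages". CELLSRV wanted 926 MB of hugepages on a host with 497 MB of physical memory, so the reservation could never have succeeded.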


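    It looks like the only fix is to give the cell node more memory: the alert log reports 497 MB of physical memory against a required minimum of 3891 MB.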
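
Before starting the cell services again, it may help to confirm that the host now meets the minimum. The following Python sketch is one possible pre-check, under stated assumptions: the 3891 MB threshold is simply the value printed in the alert log for this release (not an official sizing figure), the function names are hypothetical, and the sample input is made up to resemble the 497 MB host in this post.

```python
# Threshold taken from the alert log line:
# "Total physical mem: 497 MB is less than required minimum: 3891 MB."
REQUIRED_MIN_MB = 3891


def mem_total_mb(meminfo_text):
    """Return MemTotal in MB from text in /proc/meminfo format."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            # The line looks like "MemTotal:   508928 kB".
            return int(line.split()[1]) // 1024
    raise ValueError("MemTotal not found")


def has_enough_memory(meminfo_text, required_mb=REQUIRED_MIN_MB):
    """True if the host has at least required_mb of physical memory."""
    return mem_total_mb(meminfo_text) >= required_mb


# Hypothetical sample resembling the undersized host in this post.
SAMPLE_MEMINFO = "MemTotal:         508928 kB\nMemFree:          120000 kB\n"

print(mem_total_mb(SAMPLE_MEMINFO), "MB total; enough:",
      has_enough_memory(SAMPLE_MEMINFO))
```

On a real Linux host the same functions could be fed the contents of /proc/meminfo, for example `has_enough_memory(open("/proc/meminfo").read())`.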

    
  • Original article: https://www.cnblogs.com/mfmdaoyou/p/6718134.html