• SQL Server自动化运维系列——监控性能指标脚本(Power Shell)


    需求描述

    一般在生产环境中,有时候需要自动的检测指标值状态,如果发生异常,需要提前预警的,比如发邮件告知,本篇就介绍如果通过Power shell实现状态值监控

    监控值范围

    根据经验,作为DBA一般需要监控如下系统能行指标

      cpu:
     
        Processor(_Total)% Processor Time
        Processor(_Total)% Privileged Time
     
        SQLServer:SQL StatisticsBatch Requests/sec
        SQLServer:SQL StatisticsSQL Compilations/sec
        SQLServer:SQL StatisticsSQL Re-Compilations/sec
        SystemProcessor Queue Length
        SystemContext Switches/sec
     
      Memory:
     
        MemoryAvailable Bytes
        MemoryPages/sec
        MemoryPage Faults/sec
        MemoryPages Input/sec
        MemoryPages Output/sec
        Process(sqlservr)Private Bytes
        SQLServer:Buffer ManagerBuffer cache hit ratio
        SQLServer:Buffer ManagerPage life expectancy
        SQLServer:Buffer ManagerLazy writes/sec
        SQLServer:Memory ManagerMemory Grants Pending
        SQLServer:Memory ManagerTarget Server Memory (KB)
        SQLServer:Memory ManagerTotal Server Memory (KB)
     
      Disk:
     
        PhysicalDisk(_Total)% Disk Time
        PhysicalDisk(_Total)Current Disk Queue Length
        PhysicalDisk(_Total)Avg. Disk Queue Length
        PhysicalDisk(_Total)Disk Transfers/sec
        PhysicalDisk(_Total)Disk Bytes/sec
        PhysicalDisk(_Total)Avg. Disk sec/Read
        PhysicalDisk(_Total)Avg. Disk sec/Write
     
      SQL Server:
     
        SQLServer:Access MethodsFreeSpace Scans/sec
        SQLServer:Access MethodsFull Scans/sec
        SQLServer:Access MethodsTable Lock Escalations/sec
        SQLServer:Access MethodsWorktables Created/sec
        SQLServer:General StatisticsProcesses blocked
        SQLServer:General StatisticsUser Connections
        SQLServer:LatchesTotal Latch Wait Time (ms)
        SQLServer:Locks(_Total)Lock Timeouts (timeout > 0)/sec
        SQLServer:Locks(_Total)Lock Wait Time (ms)
        SQLServer:Locks(_Total)Number of Deadlocks/sec
        SQLServer:SQL StatisticsBatch Requests/sec
        SQLServer:SQL StatisticsSQL Re-Compilations/sec

    上述指标含义,可以参照我上一篇文章:SQL Server需要监控哪些计数器 

    监控脚本

    $server = "(local)"
    $uid = "sa"
    $db="master"
    $pwd="password"
    $mailprfname = "SendEmail"
    $recipients = "787449667@qq.com"
    $subject = "数据库指标异常了!"
    $computernamexml = "f:computername.xml"
    $alter_cpuxml = "f:alter_cpu.xml"
    function GetServerName($xmlpath)
    {
        $xml = [xml] (Get-Content $xmlpath)
        $return = New-Object Collections.Generic.List[string]
        for($i = 0;$i -lt $xml.computernames.ChildNodes.Count;$i++)
        {
            if ( $xml.computernames.ChildNodes.Count -eq 1)
            {
                $cp = [string]$xml.computernames.computername
            }
            else
            {
                $cp = [string]$xml.computernames.computername[$i]
            }
            $return.Add($cp.Trim())
        }
        $return
    }
    
    function GetAlterCounter($xmlpath)
    {
        $xml = [xml] (Get-Content $xmlpath)
        $return = New-Object Collections.Generic.List[string]
        $list = $xml.counters.Counter
        $list
    }
    
    function CreateAlter($message)
    {
        $SqlConnection = New-Object System.Data.SqlClient.SqlConnection 
        $CnnString ="Server = $server; Database = $db;User Id = $uid; Password = $pwd" 
        $SqlConnection.ConnectionString = $CnnString 
        $CC = $SqlConnection.CreateCommand(); 
        if (-not ($SqlConnection.State -like "Open")) { $SqlConnection.Open() } 
        
        $cc.CommandText=" EXEC msdb..sp_send_dbmail 
                 @profile_name  = '$mailprfname'
                ,@recipients = '$recipients'
                ,@body = '$message'
                ,@subject = '$subject'
    " 
        $cc.ExecuteNonQuery()|out-null 
        $SqlConnection.Close();
    }
    
    $names = GetServerName($computernamexml)
    $pfcounters = GetAlterCounter($alter_cpuxml)
    foreach($cp in $names)
    {
        $p = New-Object Collections.Generic.List[string]
        $report = ""
        foreach ($pfc in $pfcounters)
        {
            $b = ""
            $counter ="\"+$cp+$pfc.get_InnerText().Trim()
            $p.Add($counter)
            
        }
        $count = Get-Counter $p
        for ($i = 0; $i -lt $count.CounterSamples.Count; $i++)
        {
            $v = $count.CounterSamples.Get($i).CookedValue
            $pfc = $pfcounters[$i]
            #$pfc.get_InnerText()
            $b = ""
            $lg = ""
            if($pfc.operator -eq "lt")
            {
                if ($v -ge [double]$pfc.alter)
                    {$b = "alter"
                    $lg = "Greater Than"}
            }
            elseif ($pfc.operator -eq "gt")
            {
                if( $v -le [double]$pfc.alter)
                    {$b = "alter"
                    $lg = "Less Than"}
            }
            if($b -eq "alter")
            {
                $path = "\"+$cp+$pfc.get_InnerText()
                
                $item = "{0}:{1};{2} Threshold:{3}" -f $path,$v.ToString(),$lg,$pfc.alter.Trim()
                $report += $item + "`n"
            }
            
        }
        if($report -ne "")
        {
            #生产警告 参数 计数器,阀值,当前值
            CreateAlter $report
        }
    }

    其中涉及到2个配置文件:computernamexml,alter_cpuxml分别如下:

    <computernames>
            <computername>
                    wuxuelei-pc
            </computername>
    </computernames>
    <Counters>
            <Counter alter = "10" operator = "gt" >Processor(_Total)\% Processor Time</Counter>
            <Counter alter = "10" operator = "gt" >Processor(_Total)\% Privileged Time</Counter>
            <Counter alter = "10" operator = "gt" >SQLServer:SQL StatisticsBatch Requests/sec</Counter>
            <Counter alter = "10" operator = "gt" >SQLServer:SQL StatisticsSQL Compilations/sec</Counter>
            <Counter alter = "10" operator = "gt" >SQLServer:SQL StatisticsSQL Re-Compilations/sec</Counter>
            <Counter alter = "10" operator=  "lt" >SystemProcessor Queue Length</Counter>
            <Counter alter = "10" operator=  "lt" >SystemContext Switches/sec</Counter>
    </Counters>

    其中 alter 就是阀值,如第一条,如果 阀值 > 性能计数器值,就会发出警告。

    其实这种自定义配置的方式,实现了灵活多变的自动化监控标准:

    1、比如可以检测磁盘空间大小

    2、检测运行峰值状态

    3、定时的根据历史运行值,更改生产系统中的阀值大小,也就是所谓的运行基线

    警告实现方式

    1、SQL Agent配置Job方式实现

    2、计划任务

    以上两种配置方式,可以灵活掌握,操作还是蛮简单的,如果不会,可自行google。当然,如果不想干预正常的生产系统,可以添加一个Server专门用来自动化运维检测来用,实现远程监控。

    后续文章中会分析关于Power Shell的远程调用,并且能实现事故当前状态下,自动化截图....自动Send Email......为DBA现场取证第一手材料...方便诊断问题...

    效果图如下

    以上只提供实现方式,如需要内容更新,自己灵活更新。

    脚本下载地址http://files.cnblogs.com/zhijianliutang/DBALter.zip

  • 相关阅读:
    namespace std 定义的位置
    [Struts]学习日记3 在页面中显示条目列表
    [Hibernate]关于ID的一个容易混淆的地方
    [Struts]"Cannot find bean in any scope"之一解
    [Struts]HibernatePlugIn for Struts(转贴)
    日志搬家了!
    [Struts]学习日记2 增加一些验证
    实验室的项目 讨论
    Struts常见异常信息和解决方法
    参加婚礼
  • 原文地址:https://www.cnblogs.com/zhijianliutang/p/4170488.html
Copyright © 2020-2023  润新知