Some Extended Topics on SQL Server 2008 Performance Data Collection (Data Collector)

Data Collector is a feature introduced in SQL Server 2008, and it appears on the top 10 list of things administrators should know. It is largely unchanged in SQL Server 2008 R2.

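What is Data Collector

For this topic, refer directly to Microsoft's official documentation: http://msdn.microsoft.com/zh-CN/library/bb677248.aspx

For detailed steps on configuring Data Collector, see http://www.qudong.com/soft/program/Sql%20Server/jichujiaocheng/20090106/28656.html

This article covers several extended topics related to the feature, all of which I have been asked about more than once.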

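Can data be collected from multiple instances

Many administrators care about this, because a DBA usually manages several instances. Does the data warehouse have to be configured on every one of them?

It should not be necessary. The Data Collector architecture separates the warehouse from the instances being monitored.

In other words, there can be a single data warehouse (MDW: Management Data Warehouse). Collection is configured on each of the target instances, and each one sends its results to this central MDW. DBAs can manage the MDW remotely from a client machine and view the reports there.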
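As a rough sketch of pointing a target instance at a central MDW in T-SQL, the following uses the collector configuration procedures in msdb. The server name `MDWSERVER` and the database name `MDW` are placeholders assumed for illustration. The warehouse database is assumed to exist already on the central instance, for example created with the Configure Management Data Warehouse wizard.

```sql
-- Run on each target instance whose data should go to the central warehouse.
USE msdb;
GO

-- The collector has to be disabled while collector-wide settings are changed.
EXEC dbo.sp_syscollector_disable_collector;

-- Hypothetical names: replace with the instance and database that host your MDW.
EXEC dbo.sp_syscollector_set_warehouse_instance_name @instance_name = N'MDWSERVER';
EXEC dbo.sp_syscollector_set_warehouse_database_name @database_name = N'MDW';

-- Turn the collector back on so collection sets can upload to the new location.
EXEC dbo.sp_syscollector_enable_collector;
GO
```

The account that performs the upload also needs write access to the warehouse database. How that is granted depends on your environment, so this sketch leaves it out.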

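What is the performance impact

Because collection runs directly on every instance being monitored, DBAs often ask whether it will hurt that instance. It does have an impact: performance collection ultimately consists of queries, either against DMVs or against performance counters, and those queries run periodically. Server Activity, for example, collects every 60 seconds by default. By a common rough estimate, if you use only the three default system collection sets and leave all the default collection and upload intervals unchanged, the main effect on the instance is a small extra CPU load of roughly 5%. The data volume is roughly 300 MB per day.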
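To see which collection intervals actually apply on an instance before estimating overhead, a query along these lines against the collector catalog views in msdb may help. It is only a sketch: it reports configuration, not the measured CPU cost.

```sql
USE msdb;
GO

-- One row per collection item: the set it belongs to, whether that set is
-- running, its collection mode and retention, and the item's frequency.
SELECT cs.name                  AS collection_set,
       cs.is_system,
       cs.is_running,
       cs.collection_mode,      -- 0 = cached, 1 = non-cached
       cs.days_until_expiration,
       ci.name                  AS collection_item,
       ci.frequency             AS frequency_seconds
FROM dbo.syscollector_collection_sets  AS cs
JOIN dbo.syscollector_collection_items AS ci
  ON ci.collection_set_id = cs.collection_set_id
ORDER BY cs.collection_set_id, ci.collection_item_id;
```

Rows with `is_system = 1` are the built-in collection sets. A longer `frequency_seconds` or a shorter `days_until_expiration` trades detail for lower load and storage.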

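How to define a custom collection set (Collection Set)

By default the system ships with 3 (SQL Server 2008) or 4 (SQL Server 2008 R2) collection sets.

But what if we need a custom collection set? Here is a sample script.

Note that this script runs in msdb.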
```
Use msdb
go

Declare @collection_set_id_1 int
Declare @collection_set_uid_2 uniqueidentifier
EXEC [dbo].[sp_syscollector_create_collection_set] 
    @name=N'Disk Performance and SQL CPU', 
    @collection_mode=1, 
    @description=N'Collects logical disk performance counters and SQL Process CPU', 
    @target=N'', 
    @logging_level=0, 
    @days_until_expiration=7, 
    @proxy_name=N'', 
    @schedule_name=N'CollectorSchedule_Every_5min', 
    @collection_set_id=@collection_set_id_1 OUTPUT, 
    @collection_set_uid=@collection_set_uid_2 OUTPUT
Select collection_set_id_1=@collection_set_id_1, collection_set_uid_2=@collection_set_uid_2

Declare @collector_type_uid_3 uniqueidentifier
Select @collector_type_uid_3 = collector_type_uid From [dbo].[syscollector_collector_types] Where name = N'Performance Counters Collector Type';
Declare @collection_item_id_4 int
EXEC [dbo].[sp_syscollector_create_collection_item] 
@name=N'Logical Disk Collection and SQL Server CPU', 
@parameters=N'<ns:PerformanceCountersCollector xmlns:ns="DataCollectorType">
    <PerformanceCounters Objects="LogicalDisk" 
        Counters="Avg. Disk Bytes/Read" 
        Instances="*" />
    <PerformanceCounters Objects="LogicalDisk" 
        Counters="Avg. Disk Bytes/Write" 
        Instances="*" />
    <PerformanceCounters Objects="LogicalDisk" 
        Counters="Avg. Disk sec/Read" 
        Instances="*" />
    <PerformanceCounters Objects="LogicalDisk" 
        Counters="Avg. Disk sec/Write" 
        Instances="*" />
    <PerformanceCounters Objects="LogicalDisk" 
        Counters="Disk Read Bytes/sec" 
        Instances="*" />
    <PerformanceCounters Objects="LogicalDisk" 
        Counters="Disk Write Bytes/sec" 
        Instances="*" />
    <PerformanceCounters Objects="Process" 
        Counters="% Privileged Time" 
        Instances="sqlservr" />
    <PerformanceCounters Objects="Process" 
        Counters="% Processor Time" 
        Instances="sqlservr" />
</ns:PerformanceCountersCollector>', 
@collection_item_id=@collection_item_id_4 OUTPUT, 
@frequency=5, 
@collection_set_id=@collection_set_id_1, 
@collector_type_uid=@collector_type_uid_3
Select @collection_item_id_4
go 
```
After the script runs, a new Collection Set named Disk Performance and SQL CPU appears.

You can then start it so that it collects and uploads data.
```
EXEC sp_syscollector_start_collection_set @collection_set_id = <collection_set_id_1>
-- replace <collection_set_id_1> with value from above 
```
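Instead of copying the ID by hand, the collection set can be looked up by the name used when it was created. This is a minimal sketch, assuming the set was created with the name from the sample script and that the data collector itself is already enabled.

```sql
USE msdb;
GO

-- Look up the ID by the name given to sp_syscollector_create_collection_set.
DECLARE @id int;
SELECT @id = collection_set_id
FROM dbo.syscollector_collection_sets
WHERE name = N'Disk Performance and SQL CPU';

-- Start the collection set; from here on it follows its schedule.
EXEC dbo.sp_syscollector_start_collection_set @collection_set_id = @id;

-- Confirm that it is now marked as running.
SELECT collection_set_id, name, is_running
FROM dbo.syscollector_collection_sets
WHERE collection_set_id = @id;
```

To stop it again, `sp_syscollector_stop_collection_set` takes the same `@collection_set_id` argument.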
Finally, run the following script to get the results (note that it runs in the data warehouse database, not in msdb).
```
select spci.path as 'Counter Path', spci.object_name as 'Object Name',
spci.counter_name as 'counter Name', spci.instance_name,
spcv.formatted_value as 'Formatted Value',
spcv.collection_time as 'Collection Time',
csii.instance_name as 'SQL Server Instance' 
from snapshots.performance_counter_values spcv, 
snapshots.performance_counter_instances spci,
msdb.dbo.syscollector_collection_sets_internal scsi,
core.source_info_internal csii,
core.snapshots_internal csi
where spcv.performance_counter_instance_id = spci.performance_counter_id and
scsi.collection_set_uid=csii.collection_set_uid and
csii.source_id = csi.source_id and csi.snapshot_id=spcv.snapshot_id and
scsi.name = 'Disk Performance and SQL CPU'
order by spcv.collection_time desc
```
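Raw samples add up quickly, so a summary is often easier to read. The following sketch aggregates the same tables into one row per instance and counter for the last 24 hours. The 24-hour window is an arbitrary choice for illustration. Like the query above, it joins to msdb on the server hosting the warehouse, so it assumes the collection set is defined on that same server.

```sql
-- Run in the management data warehouse database, not in msdb.
SELECT csii.instance_name        AS [SQL Server Instance],
       spci.path                 AS [Counter Path],
       AVG(spcv.formatted_value) AS [Average Value],
       MAX(spcv.formatted_value) AS [Peak Value],
       COUNT(*)                  AS [Samples]
FROM snapshots.performance_counter_values AS spcv
JOIN snapshots.performance_counter_instances AS spci
  ON spcv.performance_counter_instance_id = spci.performance_counter_id
JOIN core.snapshots_internal AS csi
  ON csi.snapshot_id = spcv.snapshot_id
JOIN core.source_info_internal AS csii
  ON csii.source_id = csi.source_id
JOIN msdb.dbo.syscollector_collection_sets_internal AS scsi
  ON scsi.collection_set_uid = csii.collection_set_uid
WHERE scsi.name = N'Disk Performance and SQL CPU'
  AND spcv.collection_time >= DATEADD(hour, -24, SYSDATETIMEOFFSET())
GROUP BY csii.instance_name, spci.path
ORDER BY csii.instance_name, spci.path;
```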
The query returns one row per collected counter sample, newest first.

I hope this is helpful.
Original article: https://www.cnblogs.com/chenxizhang/p/1986293.html
