• Split string in SQL Server 2005+ CLR vs. TSQL(转载)


    来源:

     http://blogs.lessthandot.com/index.php/DataMgmt/DBProgramming/split-string-in-sql-server-2005-clr-vs-t

    原文内容:

    I'm editing in a link to Adam Machanic's blog on this. In the comments on this topic here you will see there are imperfections found in my methods. Reading Adam's blog shows this in more detail.
    http://sqlblog.com/blogs/adam_machanic/archive/2009/04/26/faster-more-scalable-sqlclr-string-splitting.aspx

    Thanks Adam!

    I wrote this short CLR split function awhile back based on a few other articles I read when 2005 was released. I decided to play with it today and see if I could put it with the SQL split solutions.

    Let's get the basics out of the way on SQL CLR. SQL CLR is only good once it's in memory. The CLR function split basically won over the T-SQL split functions after it was cached. This is a critical variable to consider when thinking CLR vs. T-SQL options on coding. If you are doing heavy manipulation of data and heavy math, CLR will typically help you, but you should be very careful with CLR and memory management. You can run your server resources out and literally stop functionality. I highly recommend reading MrDenny's blog on CLR here. Denny touches on important topics on when to use CLR and why you shouldn't. After that, look into how SQL Server 32bit, 32bit with AWE and 64bit versions handle memory. Each handles memory differently. AWE enalbed instances will probably be the one that will cause you more headaches then the rest. I had severe memory issues a few months ago on a production database server that forced restarts nightly until I fixed the problem. I analyzed the problem and it came to be several factors that caused it and SQL CLR memory was one of those factors. Here is my chance to thank mrdenny and ptheriault again for the assisatnce on that strange problem.

    I went out and google'd "fast split function t-sql". Found a few and tested them against the CLR split method. I found a dozen or so split functions that looked good. I still went with a numbers table one after testing them out next to each other. Here is one of the functions I used. If you have a better one, post it in the comments and I can edit the post.

    1. ALTER FUNCTION [dbo].[Split] (
    2. @List VARCHAR(7998), --The delimited list
    3. @Del CHAR(1) = ',' --The delimiter
    4. )
    5. RETURNS @T TABLE (Item VARCHAR(7998))
    6. AS
    7. BEGIN
    8. DECLARE @WrappedList VARCHAR(8000), @MaxItems INT
    9. SELECT @WrappedList = @Del + @List + @Del, @MaxItems = LEN(@List)
    10.  
    11. INSERT INTO @T (Item)
    12. SELECT SUBSTRING(@WrappedList, Number + 1, CHARINDEX(@Del, @WrappedList, Number + 1) - Number - 1)
    13. FROM dbo.Numbers n
    14. WHERE n.Number <= LEN(@WrappedList) - 1
    15. AND SUBSTRING(@WrappedList, n.Number, 1) = @Del
    16.  
    17. RETURN
    18. END

    Here is my CLR split

    1. using System;
    2. using System.Data;
    3. using System.Collections;
    4. using System.Data.SqlClient;
    5. using System.Data.SqlTypes;
    6. using Microsoft.SqlServer.Server;
    7.  
    8. public partial class UserDefinedFunctions
    9. {
    10.     [SqlFunction(Name = "CLR_Split",
    11.     FillRowMethodName = "FillRow",
    12.     TableDefinition = "id nvarchar(10)")]
    13.  
    14.     public static IEnumerable SqlArray(SqlString str, SqlChars delimiter)
    15.     {
    16.         if (delimiter.Length == 0)
    17.             return new string[1] { str.Value };
    18.         return str.Value.Split(delimiter[0]);
    19.     }
    20.  
    21.     public static void FillRow(object row, out SqlString str)
    22.     {
    23.         str = new SqlString((string)row);
    24.     }
    25. };

    I loaded a text file with a huge amount of delimited data to really get a gauge on time this would take. The string is basically, "data%data%data%data%data" and on. Around 600 indexes. I restarted my local instance of SQL Server 2005 that I did these on to ensure you can see CLR before cache and after.

    So on a fresh restart you can see by checking type MEMORYCLERK_SQLCLR from dm_os_memory_clerks that we are clear

    After execution of the CLR function you can see the differences

    Call these functions as

    1. DECLARE @bigdamnvar VARCHAR(MAX)
    2. SET @bigdamnvar = (SELECT * FROM OPENROWSET ('MSDASQL', 'Driver={Microsoft Text Driver (*.txt; *.csv)};DBQ=C:\', 'SELECT * from Data.txt'))
    3. SELECT * FROM dbo.CLR_Split(@bigdamnvar, '%')

    Below you can see first execution and then how quickly performance picks up on the second cached plan

    and

    1. DECLARE @bigdamnvar VARCHAR(MAX)
    2. SET @bigdamnvar = (SELECT * FROM OPENROWSET ('MSDASQL', 'Driver={Microsoft Text Driver (*.txt; *.csv)};DBQ=C:\', 'SELECT * from Data.txt'))
    3. SELECT * FROM dbo.[Split](@bigdamnvar, '%')

    I executed this a few times to get it in cache as well. The odd increase was the server working on something else. I validated that so ignore it.

    You may notice the bytes recieved from the server are different and SQL CLR is much heavier. That is something to keep in mind. SQL will always be light compared to SQL CLR. That is my experience in using the two side by side.

    The differences are small but if the task you intend to perform is something cached typically, keep CLR in mind. Mostly when it come to complicated tasks that may be left to other languages you have available.

    About the Author

    Ted Krueger has been working in development and database administration for 13+ years. Specialties range from High Availability and Disaster / Recovery setup and testing methods down to custom assembly development for SQL Server Reporting Services. Ted blogs and is also one of the founders of LessThanDot.com technology community. Some of the articles focused on are Backup / Recovery, Security, SSIS and working on SQL Server and using all of the SQL Server features available to create stable and scalable database services. Recent community contributions have been to organize SQL Saturday events in the Chicago-land to Milwaukee-land area for 2010 where he will also be speaking about various SQL Server topics. @onpnt
  • 相关阅读:
    获取mysql数据库中的表信息
    Django Rest Framework --- 分页器组件
    Django Rest Framework --- 频率组件
    Django Rest Framework --- 权限组件
    Django Rest Framework --- 认证组件
    Django Rest Framework --- 视图组件
    序列化组件
    restful规范,cbv源码执行流程,drf之APIView及Respons源码分析
    前后台分离式开发(文件的上传下载),cookie(存放在前台),session(存放在后台)
    ajax,分页器
  • 原文地址:https://www.cnblogs.com/wghao/p/1614176.html
Copyright © 2020-2023  润新知