• Converting Unicode escapes to Chinese characters


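    + (NSString *)replaceUnicode:(NSString *)unicodeStr {

        // Turn \uXXXX into \UXXXX, the escape form the old-style property list parser understands.
        NSString *tempStr1 = [unicodeStr stringByReplacingOccurrencesOfString:@"\\u" withString:@"\\U"];
        // Escape embedded double quotes, then wrap the text in quotes so it parses as one plist string.
        NSString *tempStr2 = [tempStr1 stringByReplacingOccurrencesOfString:@"\"" withString:@"\\\""];
        NSString *tempStr3 = [[@"\"" stringByAppendingString:tempStr2] stringByAppendingString:@"\""];
        NSData *tempData = [tempStr3 dataUsingEncoding:NSUTF8StringEncoding];
        // Deprecated API; +propertyListWithData:options:format:error: is its replacement.
        NSString *returnStr = [NSPropertyListSerialization propertyListFromData:tempData
                                                              mutabilityOption:NSPropertyListImmutable
                                                                        format:NULL
                                                              errorDescription:NULL];

        // Turn the literal characters \r\n left in the text into a real newline.
        return [returnStr stringByReplacingOccurrencesOfString:@"\\r\\n" withString:@"\n"];
    }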
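To show what this conversion amounts to at the byte level, here is a minimal sketch in plain C (which also compiles as Objective-C). It is not the method above: instead of going through the property list parser, it decodes `\uXXXX` escapes by hand and writes UTF-8. The names `utf8_encode`, `parse_hex4`, and `decode_unicode_escapes` are made up for this illustration, and the caller is assumed to supply an output buffer at least as large as the input.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode one code point (up to U+10FFFF) as UTF-8; returns the byte count. */
size_t utf8_encode(uint32_t cp, char *out) {
    if (cp < 0x80) {
        out[0] = (char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (char)(0xC0 | (cp >> 6));
        out[1] = (char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (char)(0xE0 | (cp >> 12));
        out[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (char)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (char)(0xF0 | (cp >> 18));
    out[1] = (char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (char)(0x80 | (cp & 0x3F));
    return 4;
}

/* Parse exactly four hex digits at s; returns -1 if any is not a hex digit. */
long parse_hex4(const char *s) {
    long value = 0;
    for (int i = 0; i < 4; i++) {
        char c = s[i];
        int digit;
        if (c >= '0' && c <= '9') digit = c - '0';
        else if (c >= 'a' && c <= 'f') digit = c - 'a' + 10;
        else if (c >= 'A' && c <= 'F') digit = c - 'A' + 10;
        else return -1;   /* also stops at the terminating '\0' */
        value = value * 16 + digit;
    }
    return value;
}

/* Replace every \uXXXX escape in `in` with its UTF-8 bytes.
   `out` must hold at least strlen(in) + 1 bytes: decoding never grows the text. */
void decode_unicode_escapes(const char *in, char *out) {
    while (*in) {
        long hi;
        if (in[0] == '\\' && in[1] == 'u' && (hi = parse_hex4(in + 2)) >= 0) {
            uint32_t cp = (uint32_t)hi;
            in += 6;
            /* A high surrogate followed by an escaped low surrogate forms one code point. */
            if (cp >= 0xD800 && cp <= 0xDBFF && in[0] == '\\' && in[1] == 'u') {
                long lo = parse_hex4(in + 2);
                if (lo >= 0xDC00 && lo <= 0xDFFF) {
                    cp = 0x10000 + ((cp - 0xD800) << 10) + ((uint32_t)lo - 0xDC00);
                    in += 6;
                }
            }
            out += utf8_encode(cp, out);
        } else {
            *out++ = *in++;   /* ordinary byte: copy through unchanged */
        }
    }
    *out = '\0';
}
```

Given the text `\u4e2d\u56fd`, the sketch writes the six UTF-8 bytes of 中国. Lone surrogates are not rejected here; a stricter version would treat them as errors.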


    Converting between Chinese characters and percent-encoded UTF-8

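    // Both methods are deprecated in newer SDKs in favor of
    // stringByRemovingPercentEncoding and stringByAddingPercentEncodingWithAllowedCharacters:.
    NSString *strA = [@"%E4%B8%AD%E5%9B%BD" stringByReplacingPercentEscapesUsingEncoding:NSUTF8StringEncoding]; // @"中国"
    NSString *strB = [@"中国" stringByAddingPercentEscapesUsingEncoding:NSUTF8StringEncoding]; // @"%E4%B8%AD%E5%9B%BD"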
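The two calls above boil down to a byte-level transformation, which can be sketched in plain C (also valid Objective-C) as follows. This is an illustration, not what Foundation does internally: the function names are hypothetical, and the set of characters left unescaped is assumed to be the RFC 3986 unreserved set, which is narrower than what `stringByAddingPercentEscapesUsingEncoding:` leaves alone.

```c
#include <stddef.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* 1 for the RFC 3986 unreserved characters: A-Z a-z 0-9 - . _ ~ */
int is_unreserved(unsigned char b) {
    return (b >= 'A' && b <= 'Z') || (b >= 'a' && b <= 'z') ||
           (b >= '0' && b <= '9') ||
           b == '-' || b == '.' || b == '_' || b == '~';
}

/* Percent-encode the UTF-8 bytes of `in`.
   `out` must hold at least 3 * strlen(in) + 1 bytes. */
void percent_encode(const char *in, char *out) {
    static const char hex[] = "0123456789ABCDEF";
    for (; *in; in++) {
        unsigned char b = (unsigned char)*in;
        if (is_unreserved(b)) {
            *out++ = (char)b;
        } else {
            *out++ = '%';
            *out++ = hex[b >> 4];
            *out++ = hex[b & 0x0F];
        }
    }
    *out = '\0';
}

/* Decode %XX sequences; malformed sequences are copied through unchanged.
   `out` must hold at least strlen(in) + 1 bytes. */
void percent_decode(const char *in, char *out) {
    while (*in) {
        int hi, lo;
        if (in[0] == '%' && (hi = hex_value(in[1])) >= 0 &&
            (lo = hex_value(in[2])) >= 0) {
            *out++ = (char)(hi * 16 + lo);
            in += 3;
        } else {
            *out++ = *in++;
        }
    }
    *out = '\0';
}
```

Each Chinese character occupies three bytes in UTF-8, which is why 中国 becomes six `%XX` groups.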

    Percent-encoding an NSString as UTF-8

    NSString *strings = @"abc";

    NSLog(@"strings : %@", strings);

    // Declared in CFURL.h:
    // CF_EXPORT
    // CFStringRef CFURLCreateStringByAddingPercentEscapes(CFAllocatorRef allocator, CFStringRef originalString, CFStringRef charactersToLeaveUnescaped, CFStringRef legalURLCharactersToBeEscaped, CFStringEncoding encoding);

    // A Create function returns a +1 reference, so hand ownership to ARC with CFBridgingRelease
    // (a plain __bridge cast would leak the string).
    NSString *encodedValue = CFBridgingRelease(CFURLCreateStringByAddingPercentEscapes(NULL, (__bridge CFStringRef)strings, NULL, CFSTR("!*'();:@&=+$,/?%#[]"), kCFStringEncodingUTF8));

    Converting ISO-8859-1 text (hex numeric character references) to Unicode

    + (NSString *)changeISO88591StringToUnicodeString:(NSString *)iso88591String
    {
        NSMutableString *srcString = [NSMutableString stringWithString:iso88591String];

        // Undo HTML escaping of the ampersand, then strip each "&#x" prefix so only "XXXX;" remains.
        [srcString replaceOccurrencesOfString:@"&amp;" withString:@"&" options:NSLiteralSearch range:NSMakeRange(0, [srcString length])];
        [srcString replaceOccurrencesOfString:@"&#x" withString:@"" options:NSLiteralSearch range:NSMakeRange(0, [srcString length])];

        NSMutableString *desString = [NSMutableString string];

        NSArray *arr = [srcString componentsSeparatedByString:@";"];

        // The last component is whatever follows the final ';' (normally empty), so skip it.
        for (NSUInteger i = 0; i + 1 < [arr count]; i++) {
            NSString *v = [arr objectAtIndex:i];
            // changeHexStringToDecimal: is the author's own helper (not shown here) that parses hex digits.
            int value = [StringUtil changeHexStringToDecimal:v];
            // Append the value as one UTF-16 code unit. Building a 2-byte C string instead
            // breaks whenever the high byte is 0 and depends on byte order.
            unichar ch = (unichar)(value & 0xFFFF);
            [desString appendString:[NSString stringWithCharacters:&ch length:1]];
        }

        return desString;
    }
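The parsing step can also be written without Foundation. Below is a small sketch in plain C (also valid Objective-C) that reads a run of `&#xHHHH;` or `&#DDDD;` references into an array of code points. The names `digit_value` and `parse_char_refs` are hypothetical, and like the method above it assumes the input consists only of references; it stops at the first thing that is not one.

```c
#include <stddef.h>
#include <stdint.h>

/* Value of digit c in the given base (10 or 16), or -1 if c is not a valid digit. */
int digit_value(char c, uint32_t base) {
    int d;
    if (c >= '0' && c <= '9') d = c - '0';
    else if (c >= 'a' && c <= 'f') d = c - 'a' + 10;
    else if (c >= 'A' && c <= 'F') d = c - 'A' + 10;
    else return -1;
    return d < (int)base ? d : -1;
}

/* Parse consecutive &#xHHHH; (hex) or &#DDDD; (decimal) references.
   Writes at most `cap` code points to `out` and returns how many were written.
   Parsing stops at the first text that is not a well-formed reference. */
size_t parse_char_refs(const char *in, uint32_t *out, size_t cap) {
    size_t n = 0;
    while (n < cap && in[0] == '&' && in[1] == '#') {
        const char *p = in + 2;
        uint32_t base = 10;
        if (*p == 'x' || *p == 'X') {
            base = 16;
            p++;
        }
        const char *first_digit = p;
        uint32_t value = 0;
        int d;
        while ((d = digit_value(*p, base)) >= 0) {
            value = value * base + (uint32_t)d;
            if (value > 0x10FFFF) return n;   /* beyond the Unicode range */
            p++;
        }
        if (p == first_digit || *p != ';') break;   /* no digits or missing ';' */
        out[n++] = value;
        in = p + 1;
    }
    return n;
}
```

For the input `&#x4E2D;&#x56FD;` this yields the two code points U+4E2D and U+56FD (中国). Turning code points into bytes of a particular encoding is a separate step.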


    Q: Is there a standard method to package a Unicode character so it fits an 8-Bit ASCII stream?

    A: There are three or four options for making Unicode fit into an 8-bit format.

    a) Use UTF-8. This preserves ASCII, but not Latin-1, because the characters >127 are different from Latin-1. UTF-8 uses the bytes in the ASCII range only for ASCII characters. Therefore, it works well in any environment where ASCII characters have a significance as syntax characters, e.g. file name syntaxes, markup languages, etc., but where all other characters may use arbitrary bytes.
    Example: “Latin Small Letter s with Acute” (015B) would be encoded as two bytes: C5 9B.

    b) Use Java or C style escapes, of the form \uXXXXX or \xXXXXX. This format is not standard for text files, but well defined in the framework of the languages in question, primarily for source files.
    Example: The Polish word “wyjście” with character “Latin Small Letter s with Acute” (015B) in the middle (ś is one character) would look like: “wyj\u015Bcie”.

    c) Use the &#xXXXX; or &#DDDDD; numeric character escapes as in HTML or XML. Again, these are not standard for plain text files, but well defined within the framework of these markup languages.
    Example: “wyjście” would look like “wyj&#x015B;cie”

    d) Use SCSU. This format compresses Unicode into 8-bit format, preserving most of ASCII, but using some of the control codes as commands for the decoder. However, while ASCII text will look like ASCII text after being encoded in SCSU, other characters may occasionally be encoded with the same byte values, making SCSU unsuitable for 8-bit channels that blindly interpret any of the bytes as ASCII characters.
    Example: “<SC2> wyjÛcie” where <SC2> indicates the byte 0x12 and “Û” corresponds to byte 0xDB. [AF] & [KW]


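    As option (c) describes, this is a "non-standard" but widely used practice; you could even call it a knockoff encoding :-)

    So the encoding process is:

    string -> Unicode code points -> &#xXXXX; or &#DDDDD;

    Decoding is simply the reverse.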
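The encoding direction described above can be sketched in plain C (also valid Objective-C) as follows. This is only an illustration under stated assumptions: the input is taken to be well-formed UTF-8, ASCII characters (including `&`) are copied unchanged, and the function names `utf8_decode` and `encode_char_refs` are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Decode one UTF-8 sequence starting at s into *cp.
   Returns its length in bytes, or 0 if the sequence is malformed or truncated. */
size_t utf8_decode(const unsigned char *s, uint32_t *cp) {
    size_t len;
    if (s[0] < 0x80)                { *cp = s[0];        len = 1; }
    else if ((s[0] & 0xE0) == 0xC0) { *cp = s[0] & 0x1F; len = 2; }
    else if ((s[0] & 0xF0) == 0xE0) { *cp = s[0] & 0x0F; len = 3; }
    else if ((s[0] & 0xF8) == 0xF0) { *cp = s[0] & 0x07; len = 4; }
    else return 0;
    for (size_t i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0;   /* not a continuation byte */
        *cp = (*cp << 6) | (s[i] & 0x3F);
    }
    return len;
}

/* Copy `in` to `out`, keeping ASCII as-is and replacing every other
   character with a &#xXXXX; reference.
   Returns 0 on success, -1 on malformed UTF-8 or if `out` (of size `cap`) is too small. */
int encode_char_refs(const char *in, char *out, size_t cap) {
    const unsigned char *s = (const unsigned char *)in;
    size_t used = 0;
    while (*s) {
        uint32_t cp;
        size_t len = utf8_decode(s, &cp);
        if (len == 0) return -1;
        int written;
        if (cp < 0x80) {
            written = snprintf(out + used, cap - used, "%c", (int)cp);
        } else {
            written = snprintf(out + used, cap - used, "&#x%04X;", (unsigned)cp);
        }
        /* snprintf reports the length it needed; reject anything that did not fit. */
        if (written < 0 || (size_t)written >= cap - used) return -1;
        used += (size_t)written;
        s += len;
    }
    if (cap == 0) return -1;
    out[used] = '\0';
    return 0;
}
```

With this sketch the UTF-8 text wyjście comes out as `wyj&#x015B;cie`, matching the FAQ example, and 中国 comes out as `&#x4E2D;&#x56FD;`.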

    http://unicode.org/faq/utf_bom.html#General

  • Original article: https://www.cnblogs.com/nuanshou/p/4398487.html