• HTML Clipboard Format [MSDN documentation]


    This article discusses cutting and pasting fragments of an HTML document. The CF_HTML clipboard format allows a fragment of raw HTML text and its context to be stored on the clipboard as text. An application can then examine the context of the HTML fragment, which consists of all preceding and surrounding tags, and take note of those tags and their attributes. Although it is up to an application to interpret such fragments, some basic guidelines are included here based on MSHTML implementations.

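    This article covers the following topics:

    • Overview of CF_HTML (Description, Context, Fragment)
    • Scenarios (a simple fragment, a table fragment, an ordered-list fragment, and a partially selected region)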

    Overview of CF_HTML

    CF_HTML is an entirely text-based format encoded in UTF-8. It includes a description, a context, and, within the context, the fragment. The registered name of the clipboard format that carries CF_HTML data is HTML Format. Therefore, if you use functions such as RegisterClipboardFormat, pass "HTML Format" as the format name parameter. This section gives an example of clipboard contents in CF_HTML and discusses the three components of the format.

    The following is an example of a clipboard fragment:

     
     
        Version:0.9
        StartHTML:71
        EndHTML:170
        StartFragment:140
        EndFragment:160
        StartSelection:140
        EndSelection:160
        <!DOCTYPE>
        <HTML>
        <HEAD>
        <TITLE>The HTML Clipboard</TITLE>
        <BASE HREF="http://sample/specs"> 
        </HEAD>
        <BODY>
        <!--StartFragment -->
        <P>The Fragment</P>
        <!--EndFragment -->
        </BODY>
        </HTML>
    

    Description

    The description includes the clipboard version number and the offsets that indicate where the context and the fragment start and end. It is a list of entries, each consisting of an ASCII keyword followed by a colon (:) and a value.

    Keyword          Description
    Version          Version number of the clipboard, written as Version:vv. Starting version is 0.9.
    StartHTML        Byte count from the beginning of the clipboard to the start of the context, or -1 if no context.
    EndHTML          Byte count from the beginning of the clipboard to the end of the context, or -1 if no context.
    StartFragment    Byte count from the beginning of the clipboard to the start of the fragment.
    EndFragment      Byte count from the beginning of the clipboard to the end of the fragment.
    StartSelection   Byte count from the beginning of the clipboard to the start of the selection.
    EndSelection     Byte count from the beginning of the clipboard to the end of the selection.

    The StartSelection and EndSelection keywords are optional because sufficient information for basic pasting is included in the fragment description. However, the selection information indicates the exact HTML area the user has selected. This adds more information to the fragment description.
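
    As a rough illustration of the description layout, the following Python sketch reads the keyword:value entries from the start of a CF_HTML payload. The function name parse_description is hypothetical and not part of any clipboard API. The sketch assumes that the description ends at the first line that is not of the form keyword:value, and it keeps only the last value if a keyword is repeated.

```python
def parse_description(data: bytes) -> dict:
    """Read the leading keyword:value entries of a CF_HTML payload (a sketch)."""
    fields = {}
    # The description is ASCII; decode leniently so UTF-8 bytes in the context cannot raise.
    text = data.decode("ascii", errors="replace")
    # splitlines() accepts CR, LF, and CR/LF line endings alike.
    for line in text.splitlines():
        keyword, sep, value = line.strip().partition(":")
        # Assumption: the first line that is not keyword:value starts the context.
        if not sep or not keyword.isalnum():
            break
        value = value.strip()
        if keyword.startswith(("Start", "End")):
            fields[keyword] = int(value)  # byte offsets; -1 means "no context"
        else:
            fields[keyword] = value       # for example Version or SourceURL
    return fields
```

    Because StartSelection and EndSelection are optional, a caller would typically look them up with fields.get("StartSelection") and fall back to the fragment offsets when they are absent.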

    Other information may be added in the description section. For example, multiple pairs of StartFragment/EndFragment could be added to support noncontiguous selection of fragments. Also, Windows Internet Explorer places a SourceURL Property in the description section. This allows handlers of CF_HTML to resolve relative links within a file (such as when CF_HTML text is pasted into a DHTML Edit Control host).

    The only character set supported by the clipboard is Unicode in its UTF-8 encoding. Because the first 128 characters of Unicode are encoded in UTF-8 exactly as they are in ASCII, the description is always ASCII, but the bytes of the context (starting at StartHTML) may use any other characters coded in UTF-8. Line endings in the clipboard format header may be a carriage return (CR), a carriage return/line feed pair (CR/LF), or a line feed (LF).
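
    Since the offsets are byte counts rather than character counts, they have to be measured after UTF-8 encoding. The following Python sketch shows one way to assemble a payload under that rule. The name build_cf_html is hypothetical. Zero-padding the offsets to ten digits is an assumption of this sketch, not something this article requires; it is used here so that the header length is known before the offsets are filled in.

```python
HEADER = (
    "Version:0.9\r\n"
    "StartHTML:{:010d}\r\n"
    "EndHTML:{:010d}\r\n"
    "StartFragment:{:010d}\r\n"
    "EndFragment:{:010d}\r\n"
)
START_MARK = "<!--StartFragment-->"
END_MARK = "<!--EndFragment-->"

def build_cf_html(prefix: str, fragment: str, suffix: str) -> bytes:
    """Assemble a CF_HTML payload; prefix and suffix are the context around the fragment."""
    # Encode first: every offset below is a byte count, not a character count.
    head = (prefix + START_MARK).encode("utf-8")
    body = fragment.encode("utf-8")
    tail = (END_MARK + suffix).encode("utf-8")
    # Assumption: fixed-width numbers make the header length independent of the values.
    start_html = len(HEADER.format(0, 0, 0, 0).encode("ascii"))
    start_fragment = start_html + len(head)
    end_fragment = start_fragment + len(body)
    end_html = end_fragment + len(tail)
    header = HEADER.format(start_html, end_html, start_fragment, end_fragment)
    return header.encode("ascii") + head + body + tail
```

    With a fragment such as "<P>café</P>", EndFragment minus StartFragment is 12 rather than 11, because "é" occupies two bytes in UTF-8.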

    Context

    When working with a complete document, the context is the entire HTML document. Because this discussion is limited to copying and pasting a fragment of a document, the context is the selected fragment and all preceding and surrounding start and end tags. These tags represent all the parent nodes of the fragment, up to the HTML node. The context also contains the complete head element and allows the base and title elements to be included. Sufficient information is included in the fragment for a basic pasting operation. However, if your application requires information concerning the tags surrounding the fragment, you must store the context on the clipboard.

    An application copying a fragment of HTML to the clipboard may choose to create a base element to include in the context so that partial URLs in the fragment can be resolved.
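
    To make the role of the base element concrete, the following Python sketch resolves partial URLs against a base address using the standard urllib.parse.urljoin function. The helper name resolve_partial_urls and the partial URLs are made up for illustration; only the base address comes from the example earlier in this article.

```python
from urllib.parse import urljoin

def resolve_partial_urls(base_href, partial_urls):
    """Resolve each partial URL against the BASE HREF taken from the context."""
    return [urljoin(base_href, url) for url in partial_urls]

# "http://sample/specs" is the BASE HREF from the example clipboard fragment;
# the two partial URLs are hypothetical.
resolved = resolve_partial_urls("http://sample/specs", ["images/logo.gif", "/index.htm"])
# resolved == ["http://sample/images/logo.gif", "http://sample/index.htm"]
```

    A pasting application could apply the same resolution using the SourceURL value from the description when the context carries no base element.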

    Fragment

    The fragment contains valid HTML representing the area the user has selected. This includes the information required for basic pasting of an HTML fragment, as follows:

    • Selected text.
    • Opening tags and attributes of any element that has an end tag within the selected text.
    • End tags that match the included opening tags.

    The fragment should be preceded and followed by the HTML comments <!--StartFragment--> and <!--EndFragment--> (no space allowed between the !-- and the text) to indicate where the fragment starts and ends. So the start and end of the fragment are indicated by these comments as well as by the StartFragment and EndFragment byte counts. Though redundant, this makes it easier to find the start of the fragment (from the byte count) and mark the position of the fragment directly in the HTML tree.
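
    The redundancy between the comments and the byte counts can be illustrated with a short Python sketch that locates the fragment both ways. The function names are hypothetical. The marker search tolerates a space before the closing --> because the example earlier in this article is written that way; that tolerance is a choice made for this sketch.

```python
import re

def read_offset(data: bytes, keyword: str) -> int:
    """Find keyword:number at the start of a line (CR, LF, or CR/LF endings)."""
    pattern = rb"(?:^|[\r\n])" + keyword.encode("ascii") + rb":(-?\d+)"
    match = re.search(pattern, data)
    if match is None:
        raise ValueError(f"missing {keyword} in description")
    return int(match.group(1).decode("ascii"))

def fragment_by_offsets(data: bytes) -> bytes:
    """Slice the fragment using the StartFragment and EndFragment byte counts."""
    return data[read_offset(data, "StartFragment"):read_offset(data, "EndFragment")]

def fragment_by_markers(data: bytes) -> bytes:
    """Slice the fragment using the StartFragment and EndFragment comments."""
    start = data.index(b"<!--StartFragment")
    start = data.index(b"-->", start) + len(b"-->")  # skip to the end of the comment
    end = data.index(b"<!--EndFragment", start)
    return data[start:end]
```

    For a payload whose byte counts point just inside the two comments, both functions return the same bytes; a mismatch suggests that the offsets were computed in characters rather than bytes, or that the payload was edited after the description was written.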

    Scenarios

    The following scenarios describe how the MSHTML HTML editor handles HTML cutting and pasting; other applications may or may not follow these scenarios. The clipboard format described here is intended to allow flexibility for the way an application functions. These scenarios show only well-formed HTML.

    Simple Fragment of HTML

    HTML text:

     
     
    <BODY> This is normal. <B>This is bold.</B> <I><B>This is bold italic.</B></I> 
    <I>This is italic.</I> </BODY>
    

    This appears as:

    This is normal. This is bold. This is bold italic. This is italic.

    The text between the ** is selected and copied to the clipboard:

    This is normal. This is **bold. This is bold italic. This** is italic.

    In this scenario, only the HTML and BODY elements appear in the context, because they are the only elements that enclose the selected fragment. Note that both start tags and end tags are included in the context. The selection itself is delimited by StartSelection and EndSelection. This is what will be on the clipboard (note that this is MSHTML's interpretation).

     
     
     Version:0.9
     StartHTML:71
     EndHTML:160
     StartFragment:130
     EndFragment:150
     StartSelection:130
     EndSelection:150
     <!DOCTYPE ...>
     <HTML> 
     <BODY> 
     <!--StartFragment-->
     <B>bold.</B> <I><B>This is bold italic.</B></I> <I>This</I> 
     <!--EndFragment--> 
     </BODY> 
     </HTML>
    

    Fragment of a Table in HTML

    HTML text:

     
     
     <BODY><TABLE BORDER><TR><TH 
     ROWSPAN=2>Head1</TH><TD>Item 1</TD> 
     <TD>Item 2</TD> <TD>Item 3</TD> 
     <TD>Item 4</TD></TR><TR><TD>Item 
     5</TD> <TD>Item 6</TD> <TD>Item 
     7</TD> <TD>Item 
     8</TD></TR><TR><TH>Head2</TH><TD>Item 
     9</TD> <TD>Item 10</TD> <TD>Item 
     11</TD> <TD>Item 
     12</TD></TR></TABLE></BODY>
    

    This appears as:

    Head1 Item 1 Item 2 Item 3 Item 4
    Item 5 Item 6 Item 7 Item 8
    Head2 Item 9 Item 10 Item 11 Item 12

    The Item 6, Item 7, Item 10, and Item 11 elements of the table are selected as a block and copied to the clipboard. The following is an MSHTML interpretation of what will be on the clipboard.

     
     
     <!DOCTYPE ...>
     <HTML><BODY><TABLE BORDER> 
     <!--StartFragment-->
     <TR><TD>Item 6</TD> 
     <TD>Item 7</TD></TR><TR><TD>Item 
     10</TD> <TD>Item 11</TD></TR>
     <!--EndFragment-->
     </TABLE>
     </BODY></HTML>
    

    Pasting a Fragment of an Ordered List into Plain Text

    HTML text:

     
     
     <BODY><OL TYPE = 1><LI>Item 1<LI>Item 
     2<LI>Item 3<LI>Item 4<LI>Item 5<LI>Item 
     6</OL></BODY>
    

    This appears as:

    1. Item 1
    2. Item 2
    3. Item 3
    4. Item 4
    5. Item 5
    6. Item 6

    The user selects and copies items 3 through 5 to the clipboard. The following HTML is in the clipboard.

     
     
     <!DOCTYPE ...><HTML><BODY><OL TYPE = 1>
     <!--StartFragment-->
     <LI>Item 3<LI>Item 4<LI>Item 5
     <!--EndFragment-->
     </OL></BODY></HTML>
    

    If this fragment is now pasted into an empty document, the following HTML will be created:

     
     
    <BODY><OL TYPE = 1><LI>Item 3<LI>Item 4<LI>Item 5</OL></BODY>
    

    This appears as:

    1. Item 3
    2. Item 4
    3. Item 5

    Pasting a Partially Selected Region

    HTML text:

    <P> MSHTML is a WYSIWYG Editor that supports:</P>
    <UL><LI>Cut<LI>Copy<LI>Paste</UL> <P>This is a great tool!</P>
    

    This appears as:

    MSHTML is a WYSIWYG Editor that supports:

    • Cut
    • Copy
    • Paste

    This is a great tool!

    The user selects from "WYSIWYG" to "Cop". The following HTML is in the clipboard.

     <!DOCTYPE ...><HTML><BODY>
     <!--StartFragment-->
     <P>WYSIWYG Editor that supports:</P>
     <UL><LI>Cut<LI>Cop</UL>
     <!--EndFragment-->
     </BODY></HTML>
    

    The user selects from "opy" to "great". The following HTML is in the clipboard.

     
     
     <!DOCTYPE ...><HTML><BODY>
     <!--StartFragment-->
     <UL><LI>opy<LI>Paste</UL>
     <P>This is a great</P>
     <!--EndFragment-->
     </BODY></HTML>

    Reference:
    http://msdn.microsoft.com/en-us/library/aa767917(v=vs.85).aspx 



  • Original article: https://www.cnblogs.com/08shiyan/p/2258779.html