Serialization is one of the core topics in WCF. Original article: http://msdn.microsoft.com/msdnmag/issues/06/08/Ser...
Windows Communication Foundation has been built from the ground up around the tenets of service orientation. It supports several serialization mechanisms that make it easy to bring existing types forward and provides a simple, interoperable foundation for future service-oriented applications. Windows® Communication Foundation (which is still in beta) embraces XML as a key enabling technology. While you can use it to build services that process XML directly, most developers prefer to leverage serialization mechanisms that automate moving between objects in the Microsoft® .NET Framework and XML Infosets.
There are two general approaches you can take when implementing a Web service. One is to embrace XML and program directly against the messages. This offers a high degree of flexibility, especially when tackling tough challenges like versioning, where core XML technologies like XPath, XSLT, and XQuery are indispensable. But many developers may find this technique tedious and overwhelming. The other approach is to predefine a mapping between .NET and XML, and then rely on automated serialization mechanisms. This hides the various XML details in order to simplify the developer experience. Windows Communication Foundation supports both approaches, with equal depth.
Internally, Windows Communication Foundation represents all messages with the Message class, which is found in System.ServiceModel.Channels. The Message class models a SOAP message, commonly referred to as a SOAP envelope, which has a header section and a body that carries the payload. The Message class provides an interface for interacting with the header section and body section using either System.Xml classes or type-based serialization.
When working at the message level, you can explicitly choose which technique to use. However, the most common way to make use of serialization in Windows Communication Foundation is to author service contracts in terms of serializable types, as shown here:
[ServiceContract]
public interface IEchoService
{
    [OperationContract]
    Person EchoPerson(Person person);
}
By annotating the .NET interface with [ServiceContract], you indicate that the .NET type definition will also serve as a service contract (think of it as the .NET counterpart to a Web Services Description Language, or WSDL, definition). Annotating the method signature with [OperationContract] indicates that you want the method to be included in the service contract. At run time, Windows Communication Foundation automatically maps the method signature to a pair of messages behind the scenes, each containing a Person in the SOAP body. It then uses a serializer to map the Person object into the message. (For complete control over what goes where in the SOAP envelope, you can use [MessageContract] types in the signature.)
Windows Communication Foundation supports three serializers: XmlSerializer, DataContractSerializer, and NetDataContractSerializer. Each of these comes with different mapping algorithms and customization techniques. (See Figure 1 for a comparison.) Nevertheless, each performs the same fundamental task—mapping between .NET objects and XML Infosets.
DataContractSerializer is the default and is always used unless specified otherwise. You can choose a different serializer by annotating the service contract with an attribute, as shown here:
[XmlSerializerFormat]
[ServiceContract]
public interface IEchoService
{
    [DataContractFormat]
    [OperationContract]
    Person EchoPerson(Person person);

    [OperationContract]
    Address EchoAddress(Address address);

    [OperationContract]
    Phone EchoPhone(Phone phone);
}
In this example, [XmlSerializerFormat] specifies XmlSerializer as the default serializer for all methods on the contract. However, I've overridden the serializer on EchoPerson by annotating the method with [DataContractFormat]. There isn't an attribute for NetDataContractSerializer, but you can write a custom attribute to apply it in the same way as the others if needed. The serializer is considered part of the service contract because it directly impacts your code.
Windows Communication Foundation also lets you specify the encoding to be used. Whereas serialization defines how .NET objects map to XML Infosets, the encoding defines how the XML Infoset is written out to a stream of bytes. Windows Communication Foundation currently supports the following encodings: text, binary, and Message Transmission Optimization Mechanism (MTOM). However, more encodings, including your own custom encodings, could be added down the road.
If you use each of the three encodings to write out the same XML Infoset, you'll get three very different byte streams, but they'll all represent the same logical data. The encoding isn't considered part of the service contract, but rather a configuration detail since it doesn't impact your code—you control the encoding by configuring the endpoint's binding.
The separation of serialization from encoding makes it possible to build your applications on a consistent data model (the XML Infoset) while providing flexibility in representation (the encoding). This is a key feature when applying Windows Communication Foundation to a variety of real-world scenarios. If you care about interoperability, you can choose to use the text encoding. When performance is more of a concern, you can choose to use the binary encoding. In either case, the encoding choice is decoupled from the serialization mechanism.
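To make the split between serialization and encoding concrete, here is a minimal sketch that serializes one object with one serializer and writes the resulting XML Infoset through two different XmlDictionaryWriter encodings. The Person type, the EncodingDemo class, and its Write helper are illustrative assumptions, not code from the article, and MTOM is left out to keep the sketch short.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Text;
using System.Xml;

// Hypothetical data contract used only for this illustration.
[DataContract]
public class Person
{
    [DataMember] public string Name;
    [DataMember] public double Age;
}

public static class EncodingDemo
{
    // Serializes p with DataContractSerializer; the writer factory alone
    // decides how the Infoset is turned into bytes.
    public static byte[] Write(Person p, Func<Stream, XmlDictionaryWriter> createWriter)
    {
        var serializer = new DataContractSerializer(typeof(Person));
        using (var stream = new MemoryStream())
        {
            using (XmlDictionaryWriter writer = createWriter(stream))
            {
                serializer.WriteObject(writer, p);
            } // disposing the writer flushes it
            return stream.ToArray();
        }
    }

    public static void Main()
    {
        var p = new Person { Name = "Bob", Age = 34 };

        byte[] text = Write(p, s => XmlDictionaryWriter.CreateTextWriter(s, Encoding.UTF8));
        byte[] binary = Write(p, s => XmlDictionaryWriter.CreateBinaryWriter(s));

        Console.WriteLine(Encoding.UTF8.GetString(text));
        Console.WriteLine("text: {0} bytes, binary: {1} bytes", text.Length, binary.Length);
    }
}
```

The serializer code is identical in both calls; only the writer changes, which mirrors how a binding swaps the encoding without touching the contract.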
At first, Windows Communication Foundation serialization might remind you of ASMX. Once you start to peel off the layers, however, you'll see an amazing world of new possibilities.
When Windows Communication Foundation creates a message to send at run time, it chooses the serializer based on the service contract and the encoding based on the binding configuration. Let's look at how this is accomplished. Figure 2 shows how to create and write a message containing a Person object.
The last parameter to CreateMessage specifies the serializer to use when creating the message. The serializer controls how the Person object is added to the message body as an XML Infoset. I then create a message writer and pass it to WriteMessage. The writer determines which encoding will be used to write the message out to a byte stream. The new XmlDictionaryReader and XmlDictionaryWriter classes allow you to create readers and writers based on any of the supported encodings (text, binary, or MTOM).
For the example shown in Figure 3, I used a binary XmlDictionaryWriter. If you open the generated file in Visual Studio®, you'll see that it's binary; you won't find the angle brackets or XML tags you're used to seeing in traditional text encoding.
Figure 3 Binary Encoding
The code in Figure 4 shows how to accomplish the reverse: deserializing a Person from the binary file on disk. The reader supplied to CreateMessage determines which encoding to use when reading bytes from disk; in this case a binary XmlDictionaryReader. Then the call to GetBody<Person> indicates that I want to use DataContractSerializer to deserialize the XML Infoset back into a Person object.
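The write and read steps described above can be sketched roughly as follows. This is not the article's figure code: the Person type, the file name person.bin, the action URI, the 64 KB header limit, and the choice of MessageVersion.Soap11 are all assumptions made for illustration. It requires a reference to System.ServiceModel.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.ServiceModel.Channels;
using System.Xml;

// Hypothetical data contract used only for this illustration.
[DataContract]
public class Person
{
    [DataMember] public string Name;
    [DataMember] public double Age;
}

public static class MessageRoundTrip
{
    // The serializer argument controls how the body becomes an XML Infoset;
    // the binary writer controls how that Infoset becomes bytes.
    public static void Write(Person p, string path)
    {
        Message message = Message.CreateMessage(
            MessageVersion.Soap11,
            "urn:example:echo-person",            // assumed action URI
            p,
            new DataContractSerializer(typeof(Person)));

        using (FileStream stream = File.Create(path))
        using (XmlDictionaryWriter writer = XmlDictionaryWriter.CreateBinaryWriter(stream))
        {
            message.WriteMessage(writer);
        }
    }

    // The binary reader decodes the bytes; GetBody<T> deserializes the body
    // with DataContractSerializer.
    public static Person Read(string path)
    {
        using (FileStream stream = File.OpenRead(path))
        using (XmlDictionaryReader reader =
            XmlDictionaryReader.CreateBinaryReader(stream, XmlDictionaryReaderQuotas.Max))
        {
            Message message = Message.CreateMessage(reader, 64 * 1024, MessageVersion.Soap11);
            return message.GetBody<Person>();
        }
    }

    public static void Main()
    {
        Write(new Person { Name = "Bob", Age = 34 }, "person.bin");
        Person copy = Read("person.bin");
        Console.WriteLine("{0}, {1}", copy.Name, copy.Age);
    }
}
```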
In cases where you prefer to avoid serialization, the Message class lets you work directly with the XML Infoset. Instead of calling GetBody<T> to deserialize the body into a .NET object, you can call GetReaderAtBodyContents and use the returned XmlReader to process the body (see Figure 5). By working directly with the XML Infoset, you can make use of your favorite XML processing technique. In Figure 5, I simply advance to the name and age elements and print their values to the console.
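As a rough sketch of reading the body without deserializing it, the code below pulls two element values straight from the reader returned by GetReaderAtBodyContents. The Person type, the namespace URI, the action URI, and the BodyReaderDemo class are assumptions for illustration. The element order (Age before Name) follows the alphabetical default of the data contract mapping. It requires a reference to System.ServiceModel.

```csharp
using System;
using System.Runtime.Serialization;
using System.ServiceModel.Channels;
using System.Xml;

// Hypothetical data contract; members serialize alphabetically (Age, Name).
[DataContract(Namespace = BodyReaderDemo.Ns)]
public class Person
{
    [DataMember] public string Name;
    [DataMember] public double Age;
}

public static class BodyReaderDemo
{
    public const string Ns = "http://example.org/person";   // assumed namespace

    // Walks the body with the XML reader instead of calling GetBody<T>.
    public static string Describe(Message message)
    {
        using (XmlDictionaryReader reader = message.GetReaderAtBodyContents())
        {
            reader.ReadStartElement("Person", Ns);            // step into <Person>
            double age = reader.ReadElementContentAsDouble("Age", Ns);
            string name = reader.ReadElementContentAsString("Name", Ns);
            return name + " is " + age;
        }
    }

    public static void Main()
    {
        Message message = Message.CreateMessage(
            MessageVersion.Soap11,
            "urn:example:echo-person",                        // assumed action URI
            new Person { Name = "Bob", Age = 34 },
            new DataContractSerializer(typeof(Person)));

        Console.WriteLine(Describe(message));
    }
}
```

The typed ReadElementContentAs methods leave the reader positioned after each end element, so the elements are consumed one after another with no extra Read calls in between.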
Defining Serialization Mapping
Although you can access the XML Infoset directly, a great deal of engineering effort has gone into making the serialization concepts easy and approachable throughout the programming model. The main thing you must focus on when using serialization is how the mapping works between your .NET objects and XML Infosets.
When you use .NET types in the context of serialization, there are two contracts you must consider. There's the local .NET contract, which defines the data structure along with corresponding behavior (you'll have constructors, properties, and other helpers that make the type easier to use). And then there's the external data contract, which identifies precisely what data to serialize and how to do this. Windows Communication Foundation refers to this as the data contract of a .NET type.
Since a data contract ultimately defines the structure of an XML Infoset, it's natural to represent the data contract using an XML Schema definition. In fact, XML Schema is always used to share data contracts with non-Windows Communication Foundation applications. However, data contracts are also defined in .NET type definitions. Each serializer comes with a default algorithm that defines most of the mapping details, but includes attributes that let you customize the mapping. This information is stored along with the .NET type as metadata which serializers can access at run time to determine specific mapping details.
Windows Communication Foundation provides a new command-line tool named svcutil.exe for moving between these different data contract representations (you can also use xsd.exe when working with XmlSerializer types). If you pass an XML Schema definition to svcutil.exe, it will automatically produce the corresponding .NET serializable types annotated with all the right attributes. If you pass a .NET assembly to svcutil.exe, it will automatically generate the corresponding XML Schema definitions for all serializable types (use the /dataContractOnly switch).
The result is great flexibility. You're free to define new data contracts in either XML Schema or .NET code and you can easily convert to the other. If you're working in a situation where the XML Schema definitions already exist and you need to support them, you can simply start with svcutil.exe. If you're defining new contracts and need to move quickly, start by writing class definitions.
As most of you know, XmlSerializer is the serializer currently used in ASMX (found in System.Xml.Serialization). The fact that Windows Communication Foundation supports XmlSerializer is good news for anyone who has made significant .NET Web services investments over the years and plans to eventually migrate them forward. You can easily use your XmlSerializer-based types in new service contracts by applying [XmlSerializerFormat] to the service contract as shown earlier.
Let's review the basics of how XmlSerializer works. First, you instantiate XmlSerializer and specify the type you intend to serialize. Then you call the Serialize and Deserialize methods to move between instances of the .NET type and the corresponding XML Infoset. XmlSerializer defines a default data contract mapping algorithm for moving between these representations.
XmlSerializer can operate on any public type without any special attributes. With XmlSerializer, a type's public data interface maps directly to what goes in the XML Infoset. Hence, XmlSerializer automatically includes all public read/write fields and properties in the mapping, and it ignores anything that's private, protected, or otherwise non-public. The .NET class name maps to the root element while the public field and property names map to local element names. The element order is the same as the order of the members in the class, with fields grouped first followed by properties. XmlSerializer does not use XML namespaces by default.
Figure 6 shows a simple class definition. Since this type is public, it can be serialized using XmlSerializer without any special attributes. But only the public read/write fields and properties will be included during the mapping, which includes the sensitiveData and spouse fields, as well as the Name and Age properties.
Figure 7 shows how to serialize a Person with XmlSerializer. Notice the call to Serialize also accepts an XmlWriter, which means I can specify any XmlDictionaryWriter implementation and choose the desired encoding. When this code executes, it produces the following person.xml document:
<Person>
  <sensitiveData>secret</sensitiveData>
  <spouse>
    <sensitiveData>secret</sensitiveData>
    <Name>Jane</Name>
    <Age>33</Age>
  </spouse>
  <Name>Bob</Name>
  <Age>34</Age>
</Person>
Note that the name of the type (Person) became the name of the root element while the field and property names became the local element names. Also note that the private fields backing Name and Age were not included directly; their values appear only through the public properties, whose get methods are called. Fields are grouped first, followed by properties (each group in the order in which its members were defined in the class). The elements are not namespace qualified.
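The default XmlSerializer mapping described above can be tried with a sketch like the following. The Person class is a reconstruction based on the members the text mentions, not the article's figure code, and the XmlSerializerDemo helper names are assumptions. By default the serializer also emits xsi and xsd namespace declarations on the root element.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Public type with no serialization attributes: public fields and
// read/write properties are mapped, private fields are not.
public class Person
{
    private string name;
    private double age;
    public string sensitiveData = "secret";
    public Person spouse;

    public Person() { }   // XmlSerializer needs a public parameterless constructor
    public Person(string name, double age) { this.name = name; this.age = age; }

    public string Name { get { return name; } set { name = value; } }
    public double Age { get { return age; } set { age = value; } }
}

public static class XmlSerializerDemo
{
    public static string ToXml(Person p)
    {
        var serializer = new XmlSerializer(typeof(Person));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, p);
            return writer.ToString();
        }
    }

    public static Person FromXml(string xml)
    {
        var serializer = new XmlSerializer(typeof(Person));
        using (var reader = new StringReader(xml))
        {
            // Calls the default constructor, then the property setters.
            return (Person)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        var bob = new Person("Bob", 34);
        bob.spouse = new Person("Jane", 33);

        string xml = ToXml(bob);
        Console.WriteLine(xml);

        Person copy = FromXml(xml);
        Console.WriteLine(copy.spouse.Name);
    }
}
```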
In this case, the data contract defining the actual mapping can be published as an XML Schema definition using the xsd.exe tool from the .NET Framework. The schema shown in Figure 8 accurately describes the XML instance that I produced with the serialization code shown in Figure 7 and can be shared with other parties. You can supply this same schema to xsd.exe /classes and it will generate a class definition similar to the one we started with, and it will have an equivalent data contract.
Figure 9 illustrates how to deserialize a Person object from person.xml. The serializer knows how to read the XML according to the data contract. During deserialization, the default constructor is called (providing an opportunity to do any necessary initialization) and the set methods are called for properties.
I didn't have to do anything special to define this two-way mapping; it was automatically derived from the type's public data interface. XmlSerializer provides a suite of customization attributes (also found in System.Xml.Serialization) that you can use to influence the mapping for a particular type. Figure 10 shows how to use some of these attributes. I've made several changes to the default mapping in this example. I specified an XML namespace to qualify the root element, modified the order of the elements, mapped the Age property to an attribute instead of an element, and told the serializer to ignore the sensitiveData field. Now, when an object of this type is serialized you'll get the following XML:
<Person Age="34" xmlns="http://example.org/person">
  <Name>Bob</Name>
  <spouse Age="33">
    <Name>Jane</Name>
  </spouse>
</Person>
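One way to express the customizations just described (a root namespace, explicit element order, Age as an attribute, and an ignored field) is sketched below. The attribute choices are an assumption about how such a mapping could be written, not the article's figure code, and the CustomXmlDemo class is hypothetical.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

[XmlRoot(Namespace = "http://example.org/person")]
public class Person
{
    [XmlIgnore]                       // opt out: never written to the XML
    public string sensitiveData = "secret";

    [XmlElement(Order = 2)]           // written after Name
    public Person spouse;

    [XmlElement(Order = 1)]
    public string Name { get; set; }

    [XmlAttribute]                    // attribute instead of child element
    public double Age { get; set; }
}

public static class CustomXmlDemo
{
    public static string ToXml(Person p)
    {
        var serializer = new XmlSerializer(typeof(Person));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, p);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        var bob = new Person { Name = "Bob", Age = 34 };
        bob.spouse = new Person { Name = "Jane", Age = 33 };
        Console.WriteLine(ToXml(bob));
    }
}
```

When Order is set on one element member it has to be set on every element member of the type, which is why both Name and spouse carry it here.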
If I generate the XML Schema again using xsd.exe, it will reflect all of these customizations. In addition, you can completely override the default mapping by implementing IXmlSerializable on the type, in which case you're defining your own mapping algorithm with custom reader/writer code.
There are a few things that make XmlSerializer unique. First, the data contract isn't explicit; it's implicitly derived from the public data interface, which may give you more than you're bargaining for (like the sensitiveData field). You have to use opt-out techniques (such as [XmlIgnore]) to address those situations. Second, XmlSerializer gives you a great deal of flexibility when it comes to XML Schema. We used an attribute instead of an element, for example. There are several more sophisticated XML Schema concepts you can apply when using XmlSerializer, but using these more advanced features can often lead to interoperability problems across frameworks.
Working with DataContractSerializer
When the Windows Communication Foundation architects contemplated the type of serializer that would make the most sense, they decided that it should use a very explicit data contract model ("boundaries are explicit") and constrain developers to a subset of XML Schema to improve interoperability results. Since XmlSerializer didn't provide either of these characteristics, the architects had to provide a new serializer. That's where DataContractSerializer comes in.
DataContractSerializer has been designed from the ground up for Windows Communication Foundation. It benefits from the lessons learned from working with XmlSerializer, and should be just as easy to use. The class is found in the System.Runtime.Serialization namespace because the implementation is quite similar to the IFormatter approach used in .NET remoting. In fact, this is meant to replace that mechanism moving forward.
The mechanics for using DataContractSerializer are similar to those for XmlSerializer. First, you instantiate DataContractSerializer and specify the type you intend to serialize. Then you call the WriteObject and ReadObject methods to move between instances of the .NET type and the corresponding XML Infoset. DataContractSerializer defines a default mapping algorithm for moving between these representations.
DataContractSerializer can operate on any .NET type annotated with either the [DataContract] or [Serializable] attributes. It supports backwards compatibility with [Serializable] types to simplify moving .NET remoting code to Windows Communication Foundation. The [Serializable] mechanism was designed to serialize the entire object by value so you can reconstitute the same object on the other side of a .NET remoting call.
In this sample, I use the same Person class as before, but without the XmlSerializer customization attributes, and I annotate the class with [Serializable]:
[Serializable]
public class Person
{
    private string name;
    private double age;
    private string sensitiveData;
    public Person spouse;
    ... // constructors and properties
}
The default mapping for [Serializable] is different from the one used with XmlSerializer. Here, all fields are included in the mapping, whether public or private, and properties are never included. The .NET class name maps to the root element, while the field names map to local element names. The elements are ordered alphabetically in the data contract and a namespace is derived from the .NET namespace in use.
The code in Figure 11 shows how to serialize a Person object with DataContractSerializer. The generated XML will now look a bit different given the differences in the [Serializable] mapping:
<Person xmlns="http://schemas.datacontract.org/2004/07/DataContractSamples"
        xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
  <age>34</age>
  <name>Bob</name>
  <sensitiveData>secret</sensitiveData>
  <spouse>
    <age>33</age>
    <name>Jane</name>
    <sensitiveData>secret</sensitiveData>
    <spouse i:nil="true"/>
  </spouse>
</Person>
Notice that only fields are included, the elements are ordered alphabetically, and the root element is automatically qualified with an XML namespace (this namespace is a combination of "http://schemas.datacontract.org/2004/07/" and the .NET namespace containing the type, which in this case was "DataContractSamples"). You can produce the XML Schema that describes this format by using svcutil.exe /dconly. Figure 12 shows the code used for deserializing a Person object.
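A minimal sketch of serializing and deserializing a [Serializable] type with DataContractSerializer might look like this. It is not the article's figure code; the SerializableDemo class and its helper names are assumptions, and the Person members follow the listing shown earlier.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Text;

namespace DataContractSamples
{
    [Serializable]
    public class Person
    {
        private string name;
        private double age;
        private string sensitiveData = "secret";
        public Person spouse;

        public Person(string name, double age) { this.name = name; this.age = age; }

        // Properties are not part of the [Serializable] mapping.
        public string Name { get { return name; } }
        public double Age { get { return age; } }
    }

    public static class SerializableDemo
    {
        public static string ToXml(Person p)
        {
            var serializer = new DataContractSerializer(typeof(Person));
            using (var stream = new MemoryStream())
            {
                serializer.WriteObject(stream, p);
                return Encoding.UTF8.GetString(stream.ToArray());
            }
        }

        public static Person FromXml(string xml)
        {
            var serializer = new DataContractSerializer(typeof(Person));
            using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(xml)))
            {
                return (Person)serializer.ReadObject(stream);
            }
        }

        public static void Main()
        {
            var bob = new Person("Bob", 34);
            bob.spouse = new Person("Jane", 33);

            string xml = ToXml(bob);
            Console.WriteLine(xml);
            Console.WriteLine(FromXml(xml).spouse.Name);
        }
    }
}
```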
Another difference from XmlSerializer is that constructors are not called during deserialization. However, if you need to perform initialization, DataContractSerializer supports the [OnDeserializing], [OnDeserialized], [OnSerializing], and [OnSerialized] callbacks that were also supported on the Binary/SoapFormatter classes.
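The following sketch illustrates the point about constructors and callbacks with a [DataContract] type: the constructor is skipped during deserialization, and an [OnDeserialized] method gives you a place to initialize state instead. The ConstructorRan and CallbackRan fields and the CallbackDemo class are hypothetical names added purely to make the behavior observable.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;

[DataContract]
public class Person
{
    [DataMember] public string Name;

    // Not data members; used only to observe what ran.
    public bool ConstructorRan;
    public bool CallbackRan;

    public Person() { ConstructorRan = true; }

    [OnDeserialized]
    private void AfterDeserialized(StreamingContext context)
    {
        CallbackRan = true;   // initialization hook for deserialized instances
    }
}

public static class CallbackDemo
{
    public static Person RoundTrip(Person p)
    {
        var serializer = new DataContractSerializer(typeof(Person));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, p);
            stream.Position = 0;
            return (Person)serializer.ReadObject(stream);
        }
    }

    public static void Main()
    {
        Person copy = RoundTrip(new Person { Name = "Bob" });
        Console.WriteLine("constructor ran: {0}, callback ran: {1}",
            copy.ConstructorRan, copy.CallbackRan);
    }
}
```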
The only customization you can make to the default [Serializable] mapping is to exclude a field from the data contract using the [NonSerialized] attribute, as shown here:
[Serializable]
public class Person
{
    private string name;
    private double age;
    [NonSerialized] private string sensitiveData;
    public Person spouse;
    ... // constructors and properties
}
However, you can completely override the mapping algorithm by implementing ISerializable on the type. In this case, DataContractSerializer will call your code to perform the mapping, giving you complete control.
DataContractSerializer also supports a more explicit mapping mechanism through the [DataContract], [DataMember], and [EnumMember] attributes. When using this approach, you annotate the type with [DataContract] to make it serializable and then you annotate any fields or properties you wish to map with [DataMember]. Similarly, you annotate the enum values you wish to map with [EnumMember]. In this scenario, you are explicitly defining the data contract—nothing maps by default.
The code that is shown in Figure 13 presents a Person class that uses the [DataContract] mapping mechanism. When I serialize an instance of this type using the same code shown in the previous example, I get the following XML:
<Person xmlns="http://schemas.datacontract.org/2004/07/DataContractSamples"
        xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
  <Age>34</Age>
  <name>Bob</name>
  <spouse>
    <Age>33</Age>
    <name>Jane</name>
    <spouse i:nil="true"/>
  </spouse>
</Person>
The mapping details are identical to [Serializable]. The only difference is how I indicated what should be included in the data contract. With [Serializable], all fields become part of the data contract (unless they are marked with [NonSerialized]). With [DataContract], only members marked with [DataMember] are included. Note that if a type has both [DataContract] and [Serializable] attributes on it, it will use the [DataContract] mapping.
The [DataContract] mapping is more customizable than that offered by [Serializable], but it's also more constrained than XmlSerializer. The code in Figure 14 shows how to use a few of the customization properties found on [DataContract] and [DataMember]. In this example, I've customized the XML namespace, some of the element names, and the element order. Setting IsRequired=true causes the serializer to verify that the element is present during deserialization (and it shows up as minOccurs="1" in the schema). Here's what the resulting XML looks like:
<Person xmlns="http://example.org/person"
        xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
  <Name>Bob</Name>
  <Age>34</Age>
  <Spouse>
    <Name>Jane</Name>
    <Age>33</Age>
    <Spouse i:nil="true"/>
  </Spouse>
</Person>
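A customized [DataContract] mapping along these lines could be sketched as follows. The member layout and the DataContractDemo helper are assumptions made for illustration rather than the article's figure code.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Text;

[DataContract(Namespace = "http://example.org/person")]
public class Person
{
    [DataMember(Name = "Name", Order = 1, IsRequired = true)]
    private string name;

    [DataMember(Name = "Age", Order = 2)]
    private double age;

    // No [DataMember]: excluded from the data contract.
    private string sensitiveData = "secret";

    [DataMember(Name = "Spouse", Order = 3)]
    public Person spouse;

    public Person(string name, double age) { this.name = name; this.age = age; }

    public string Name { get { return name; } }
    public double Age { get { return age; } }
}

public static class DataContractDemo
{
    public static string ToXml(Person p)
    {
        var serializer = new DataContractSerializer(typeof(Person));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, p);
            return Encoding.UTF8.GetString(stream.ToArray());
        }
    }

    public static Person FromXml(string xml)
    {
        var serializer = new DataContractSerializer(typeof(Person));
        using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(xml)))
        {
            // Throws SerializationException if the required Name element is missing.
            return (Person)serializer.ReadObject(stream);
        }
    }

    public static void Main()
    {
        var bob = new Person("Bob", 34);
        bob.spouse = new Person("Jane", 33);
        Console.WriteLine(ToXml(bob));
    }
}
```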
More sophisticated schema customizations are not possible via attribution. You cannot, for example, change elements into attributes or change from sequence to choice compositors. If you want complete control over the XML mapping, you can avoid the [DataContract] attributes altogether and implement IXmlSerializable or ISerializable (the former takes precedence over the latter). This approach, however, requires more work.
When you are using DataContractSerializer, you can only work with XML Schema definitions that meet the [DataContract] mapping constraints. If you pass a more complex XML Schema to svcutil.exe, it will warn you when it doesn't meet the [DataContract] constraints and encourage you to use xsd.exe /XmlSerializer.
Despite its constraints, [DataMember] makes it possible to implement some interesting data versioning heuristics via both the Order parameter and the IsRequired parameter. (Basically, doing this allows you to safely add new fields in future versions of the contract.) You can also implement IExtensibleDataObject on your data contracts to enable the serializer to round-trip unrecognized data during serialization, for instance when you need to pass a new version of a message to an old implementation.
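The round-tripping idea can be sketched with two versions of the same contract. In the shipped .NET Framework the interface is named IExtensibleDataObject; the PersonV1 and PersonV2 types and the VersioningDemo helpers below are hypothetical and exist only to show an older version carrying a newer member through untouched.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;

// Older version: knows only Name, but keeps unknown data in ExtensionData.
[DataContract(Name = "Person", Namespace = "http://example.org/person")]
public class PersonV1 : IExtensibleDataObject
{
    [DataMember(Order = 1)] public string Name;
    public ExtensionDataObject ExtensionData { get; set; }
}

// Newer version: adds Age with a higher Order so it comes after Name.
[DataContract(Name = "Person", Namespace = "http://example.org/person")]
public class PersonV2
{
    [DataMember(Order = 1)] public string Name;
    [DataMember(Order = 2)] public double Age;
}

public static class VersioningDemo
{
    public static byte[] Write<T>(T value)
    {
        var serializer = new DataContractSerializer(typeof(T));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, value);
            return stream.ToArray();
        }
    }

    public static T Read<T>(byte[] data)
    {
        var serializer = new DataContractSerializer(typeof(T));
        using (var stream = new MemoryStream(data))
        {
            return (T)serializer.ReadObject(stream);
        }
    }

    public static void Main()
    {
        byte[] fromNew = Write(new PersonV2 { Name = "Bob", Age = 34 });

        // The old implementation reads and rewrites the message.
        PersonV1 old = Read<PersonV1>(fromNew);
        byte[] fromOld = Write(old);

        // The new implementation still sees Age.
        PersonV2 back = Read<PersonV2>(fromOld);
        Console.WriteLine("{0}, {1}", back.Name, back.Age);
    }
}
```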
You always have to provide DataContractSerializer with type information before it can serialize/deserialize instances. This is different from .NET remoting, where type information is identified at run time and serialized into the message in order to provide type fidelity across the wire. When you need support for this type of scenario, you should use NetDataContractSerializer.
Working with NetDataContractSerializer
The major difference between DataContractSerializer and NetDataContractSerializer is that the latter serializes .NET type information into the XML. NetDataContractSerializer supports [Serializable] and [DataContract], the same mapping algorithms, and the same customization attributes. However, when you use NetDataContractSerializer, you no longer have to supply type information ahead of time, as illustrated here:
NetDataContractSerializer serializer =
    new NetDataContractSerializer(); // no type specified
serializer.WriteObject(writer, p);
The generated XML contains the .NET type information in attributes placed on the root element (z:Type and z:Assembly) as illustrated here:
<Person z:Id="1" z:Type="DataContractSamples.Person"
        z:Assembly="DataContractSamples, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null"
        xmlns="http://example.org/person"
        xmlns:i="http://www.w3.org/2001/XMLSchema-instance"
        xmlns:z="http://schemas.microsoft.com/2003/10/Serialization/">
  <Name z:Id="2">Bob</Name>
  <Age>34</Age>
  <Spouse z:Id="3">
    <Name z:Id="4">Jane</Name>
    <Age>33</Age>
    <Spouse i:nil="true"/>
  </Spouse>
</Person>
Since the .NET type information is found in the XML, you don't need to specify the .NET type when deserializing either.
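A short round-trip sketch shows both halves of this: neither the write nor the read is given a type. The Person type and the NetDataContractDemo class are assumptions for illustration, and NetDataContractSerializer is available in the .NET Framework, so the sketch assumes that runtime.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;

// Hypothetical data contract used only for this illustration.
[DataContract(Namespace = "http://example.org/person")]
public class Person
{
    [DataMember] public string Name;
    [DataMember] public double Age;
}

public static class NetDataContractDemo
{
    // Works on any serializable graph; the type travels inside the XML.
    public static object RoundTrip(object graph)
    {
        var serializer = new NetDataContractSerializer();   // no type specified
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, graph);
            stream.Position = 0;
            return serializer.ReadObject(stream);            // type resolved from the XML
        }
    }

    public static void Main()
    {
        object copy = RoundTrip(new Person { Name = "Bob", Age = 34 });
        Console.WriteLine(copy.GetType().FullName);
    }
}
```

Because the assembly and type names are embedded in the message, both sides need the same .NET types available, which is the trade-off the article describes as type fidelity.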
Microsoft has provided this ability specifically to address situations where you need type fidelity across the wire. But for the most part, Windows Communication Foundation encourages developers to embrace the more explicit DataContractSerializer approach. This is why you can't enable NetDataContractSerializer on a service contract without writing your own behavior/attribute.
Advanced Serialization Concepts
When using DataContractSerializer or NetDataContractSerializer, there are a few concepts to consider. First, when serializing types that use inheritance, make sure all types in the hierarchy are serializable (either via [DataContract] or [Serializable]). If they aren't, you'll get exceptions at run time. Likewise, you must ensure that any contained types are also serializable.
If you find yourself stuck in a situation where you cannot make some of the types serializable, you can use what is known as a serialization surrogate. This is basically a serializable type that can take the place of the non-serializable type without requiring you to modify the original class definition. You do this by implementing IDataContractSurrogate and handing the implementation to the serializer. The serializer will call your surrogate during the serialization process, at which point you can replace the non-serializable object with a different object.
The Windows Communication Foundation serializers also provide a way to specify known type substitutions that you want the serializers to recognize at run time. This lets you type a serializable field as some base type and actually serialize different derived types at run time. This is done by annotating base types with the [KnownType] attribute (similar to how [XmlInclude] works in XmlSerializer). There is also [ServiceKnownType] for use on service contract definitions.
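As a small sketch of known type substitution, the base type below declares a derived type with [KnownType], so a field typed as the base can carry the derived type at run time. The Employee type, its Company member, and the KnownTypeDemo class are hypothetical.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;

[DataContract]
[KnownType(typeof(Employee))]      // lets Employee appear wherever Person is expected
public class Person
{
    [DataMember] public string Name;
    [DataMember] public Person Spouse;
}

[DataContract]
public class Employee : Person
{
    [DataMember] public string Company;
}

public static class KnownTypeDemo
{
    public static Person RoundTrip(Person p)
    {
        var serializer = new DataContractSerializer(typeof(Person));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, p);
            stream.Position = 0;
            return (Person)serializer.ReadObject(stream);
        }
    }

    public static void Main()
    {
        var bob = new Person { Name = "Bob" };
        bob.Spouse = new Employee { Name = "Jane", Company = "Contoso" };

        Person copy = RoundTrip(bob);
        Console.WriteLine(copy.Spouse.GetType().Name);
    }
}
```

Without the [KnownType] declaration, serializing the Employee through a Person-typed field fails with a SerializationException.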
These new serializers can also maintain object references and deal with issues like circular references (see the constructor with a preserveObjectReferences flag). This is a nice improvement over XmlSerializer, which choked on such object graphs. I can, for example, serialize a circular spouse reference in the previous examples. I've included such an example in the downloadable sample code.
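A circular spouse reference can be round-tripped with a sketch like the one below. It is not the article's downloadable sample: instead of the constructor flag mentioned above, it sets the same option through DataContractSerializerSettings, which is available in later versions of the framework, and the CircularDemo class is a hypothetical name.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;

[DataContract]
public class Person
{
    [DataMember] public string Name;
    [DataMember] public Person Spouse;
}

public static class CircularDemo
{
    public static Person RoundTrip(Person p)
    {
        // Object references are written as ids instead of repeated content.
        var settings = new DataContractSerializerSettings { PreserveObjectReferences = true };
        var serializer = new DataContractSerializer(typeof(Person), settings);
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, p);
            stream.Position = 0;
            return (Person)serializer.ReadObject(stream);
        }
    }

    public static void Main()
    {
        var bob = new Person { Name = "Bob" };
        var jane = new Person { Name = "Jane" };
        bob.Spouse = jane;
        jane.Spouse = bob;      // circular reference

        Person copy = RoundTrip(bob);
        Console.WriteLine(ReferenceEquals(copy, copy.Spouse.Spouse));
    }
}
```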
Windows Communication Foundation is built on a consistent data model (XML Infoset) and flexible representations (encodings). When working with it you can choose to work with the XML directly or you can use the built-in serialization techniques to automate mappings. You'll rarely need to program against the serializer classes directly since you can simply annotate your service contracts to choose a serializer. It's important, however, to understand how each serializer works.
It's also important to understand which serializer is best for a given scenario. You should use DataContractSerializer whenever possible in Windows Communication Foundation. Its constrained mapping improves interoperability when starting with code. NetDataContractSerializer should only be used when you absolutely need type fidelity across the wire. Alternatively, you should use XmlSerializer when you need backwards compatibility with ASMX types, you need more flexibility in the XML representations, or you're starting with existing schemas that don't meet the [DataContract] mapping constraints.