Introduction
Back when I started using XML documents in C++, parsing and getting
data out of the document was not what I would have called friendly, nor
was generating a new document from scratch. Around that time, I also
started playing with this new language called C#. At some point, I was
doing XML parsing with C# and discovered the XML serialization features (XmlSerializer).
Needless to say, this was a great answer to my problems because I was
often working with in-memory object relationships and would spend quite a
bit of code translating between the two.
Over time, I found myself writing a set of utility functions to load and save XML documents to and from different targets (files, strings, memory streams, etc.). One day it occurred to me that generics would make this much easier than copying, pasting, and modifying the same methods from one class to the next.
Background on XML Serialization
This article is not going to explain in detail how XML serialization works (that is a topic for possible future articles). I will, however, introduce a very simple Hello World class that is used throughout the examples. The basic XML document looks like:
    <?xml version="1.0"?>
    <Hello>
      <Message>Hello World</Message>
    </Hello>
The C# class we use for serialization looks like this:
    namespace BackgroundCode
    {
        public class Hello
        {
            public string Message { get; set; }
        }
    }
This is pretty straightforward and, in fact, looks just like a normal C# object. Here is where the magic comes in: to read in the XML document from above, you could use the following code:
    public static Hello ReadDocument(string fileName)
    {
        XmlSerializer xs = new XmlSerializer(typeof(Hello));
        using (Stream stream = File.OpenRead(fileName))
        {
            return xs.Deserialize(stream) as Hello;
        }
    }
Now, you may notice that there are a number of places where things can fail here. Anyone calling this function should expect exceptions from File.OpenRead and xs.Deserialize.
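As a rough illustration of what a caller might do about those exceptions, here is a minimal sketch. The HelloReader class and its TryReadDocument method are hypothetical names that are not part of the article's code, and swallowing the exceptions is only one possible policy; many callers will prefer to log or rethrow.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

public class Hello
{
    public string Message { get; set; }
}

public static class HelloReader
{
    // Hypothetical wrapper: returns false instead of throwing when the
    // file cannot be opened or does not contain a valid Hello document.
    public static bool TryReadDocument(string fileName, out Hello result)
    {
        result = null;
        try
        {
            XmlSerializer xs = new XmlSerializer(typeof(Hello));
            using (Stream stream = File.OpenRead(fileName))
            {
                result = xs.Deserialize(stream) as Hello;
                return result != null;
            }
        }
        catch (IOException)
        {
            // Includes FileNotFoundException and DirectoryNotFoundException.
            return false;
        }
        catch (UnauthorizedAccessException)
        {
            // The file exists but cannot be opened for reading.
            return false;
        }
        catch (InvalidOperationException)
        {
            // XmlSerializer reports malformed or unexpected XML this way.
            return false;
        }
    }
}
```

A caller would then write something like `Hello h; if (HelloReader.TryReadDocument(fname, out h)) { ... }`.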
You can also write a similar function for writing out to a file:
    public static void WriteDocument(Hello xml, string fileName)
    {
        XmlSerializer xs = new XmlSerializer(typeof(Hello));
        using (Stream stream = File.OpenWrite(fileName))
        {
            xs.Serialize(stream, xml);
        }
    }
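One detail worth knowing about the write function: File.OpenWrite opens an existing file without truncating it, so writing a shorter document over a longer one can leave stale bytes at the end of the file. The sketch below is a variation rather than the article's code. It swaps in File.Create, which truncates first, and the HelloWriter class name is made up for the example.

```csharp
using System.IO;
using System.Xml.Serialization;

public class Hello
{
    public string Message { get; set; }
}

public static class HelloWriter
{
    // Variation on WriteDocument that replaces any existing file content.
    public static void WriteDocument(Hello xml, string fileName)
    {
        XmlSerializer xs = new XmlSerializer(typeof(Hello));
        // File.Create truncates an existing file before writing.
        using (Stream stream = File.Create(fileName))
        {
            xs.Serialize(stream, xml);
        }
    }
}
```

If you always write to fresh files, the difference never shows up; it only matters when a file is saved more than once.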
Iteration 1
The actual work of reading and writing a document seems pretty easy, but as a proponent of DRY (Don't Repeat Yourself), I think the most logical place for this functionality to live is with the data definition class itself. My updated class looks like this:
    using System.IO;
    using System.Xml.Serialization;

    namespace Iteration1
    {
        public class Hello
        {
            public string Message { get; set; }

            public static Hello ReadDocument(string fileName)
            {
                XmlSerializer xs = new XmlSerializer(typeof(Hello));
                using (Stream stream = File.OpenRead(fileName))
                {
                    return xs.Deserialize(stream) as Hello;
                }
            }

            public void WriteDocument(string fileName)
            {
                XmlSerializer xs = new XmlSerializer(typeof(Hello));
                using (Stream stream = File.OpenWrite(fileName))
                {
                    xs.Serialize(stream, this);
                }
            }
        }
    }
There are some minor changes from the background code, and a few things to note. First, ReadDocument is a static function, which allows a nice one-line call to load your document:
    Hello read = Hello.ReadDocument(fname);
Next, WriteDocument now takes only a file name and not an instance of a Hello object, because we can use the keyword "this" to write out the document. A call to save would look something like this:
    Hello data = new Hello { Message = "Hello World" };
    data.WriteDocument(fname);
Now, for the longest time, this is where I would end my work. After all, when I needed to do this for a new class, I could just copy and paste these functions and modify them. Now, let me show you the fallacy of that statement. Say, we have a new class (this one is a goodbye world class). So, we start off with something like:
    namespace Iteration1
    {
        public class Goodbye
        {
            public string Reason { get; set; }
        }
    }
Now, we copy in our utility functions and modify them:
    using System.IO;
    using System.Xml.Serialization;

    namespace Iteration1
    {
        public class Goodbye
        {
            public string Reason { get; set; }

            public static Goodbye ReadDocument(string fileName)
            {
                XmlSerializer xs = new XmlSerializer(typeof(Goodbye));
                using (Stream stream = File.OpenRead(fileName))
                {
                    return xs.Deserialize(stream) as Goodbye;
                }
            }

            public void WriteDocument(string fileName)
            {
                XmlSerializer xs = new XmlSerializer(typeof(Hello));
                using (Stream stream = File.OpenWrite(fileName))
                {
                    xs.Serialize(stream, this);
                }
            }
        }
    }
Do you see the error? WriteDocument still builds its XmlSerializer with typeof(Hello) instead of typeof(Goodbye). It won't cause a compile error, and as long as all you do is read from files, you may not notice it for a long time, until you finally go to do a save and get an exception like this one:
"There was an error generating the XML document."
Iteration 2
Refactoring is the word here, and now we introduce generics to help us. As a logical extension of the previous example, let's look at the next iteration of this solution with our first utility class. First of all, we revert Hello.cs to its initial implementation, a plain class with only the Message property.
Now, we introduce the utility class:
    using System.IO;
    using System.Xml.Serialization;

    namespace Iteration2
    {
        public sealed class XMLUtility
        {
            public static T ReadDocument<T>(string fileName)
            {
                XmlSerializer xs = new XmlSerializer(typeof(T));
                using (Stream stream = File.OpenRead(fileName))
                {
                    return (T)xs.Deserialize(stream);
                }
            }

            public static void WriteDocument<T>(T xmlObject, string fileName)
            {
                XmlSerializer xs = new XmlSerializer(typeof(T));
                using (Stream stream = File.OpenWrite(fileName))
                {
                    xs.Serialize(stream, xmlObject);
                }
            }
        }
    }
You can now read from a file using a statement like:
    Hello read = XMLUtility.ReadDocument<Hello>(fname);
You can write out to a file using a statement like:
    Hello data = new Hello { Message = "Hello World" };
    XMLUtility.WriteDocument<Hello>(data, fname);
Iteration 3
While the previous solution (with a little augmentation) is a good step forward (and possibly the correct solution in a number of scenarios), I think there is still more we can do with it. As opposed to using a class with generic functions, we introduce a generic class:
    using System.IO;
    using System.Xml.Serialization;

    namespace Iteration3
    {
        public sealed class XMLUtility<T>
        {
            public static T ReadDocument(string fileName)
            {
                XmlSerializer xs = new XmlSerializer(typeof(T));
                using (Stream stream = File.OpenRead(fileName))
                {
                    return (T)xs.Deserialize(stream);
                }
            }

            public static void WriteDocument(T xmlObject, string fileName)
            {
                XmlSerializer xs = new XmlSerializer(typeof(T));
                using (Stream stream = File.OpenWrite(fileName))
                {
                    xs.Serialize(stream, xmlObject);
                }
            }
        }
    }
Now honestly, this doesn't make things much better yet, since the calls still look like:
    Hello data = new Hello { Message = "Hello World" };
    XMLUtility<Hello>.WriteDocument(data, fname);
and:
    Hello read = XMLUtility<Hello>.ReadDocument(fname);
Iteration 4
Now that we have abstracted out to a generic class, let's introduce some inheritance to the mix. First, the utility class:
    using System;
    using System.IO;
    using System.Xml.Serialization;

    namespace Iteration4
    {
        public abstract class XMLSupport<T>
        {
            public static T ReadDocument(string fileName)
            {
                XmlSerializer xs = new XmlSerializer(typeof(T));
                using (Stream stream = File.OpenRead(fileName))
                {
                    return (T)xs.Deserialize(stream);
                }
            }

            public void WriteDocument(string fileName)
            {
                XmlSerializer xs = new XmlSerializer(typeof(T));
                using (Stream stream = File.OpenWrite(fileName))
                {
                    xs.Serialize(stream, this);
                }
            }
        }
    }
Now, let's take a look at our data class:
    namespace Iteration4
    {
        public class Hello : XMLSupport<Hello>
        {
            public string Message { get; set; }
        }
    }
So this makes a call to read look like:
    Hello read = Hello.ReadDocument(fname);
And the call to write look like:
    Hello data = new Hello { Message = "Hello World" };
    data.WriteDocument(fname);
This is considerably easier to read and use than the previous iterations.
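One possible refinement, which is not in the code above, is to constrain the type parameter so that T must itself derive from XMLSupport<T>. The sketch below shows the idea. It rules out declarations such as XMLSupport<string>, although it still cannot stop a class from naming some other XMLSupport subclass as its T, so treat it as a guard rail rather than a guarantee.

```csharp
using System.IO;
using System.Xml.Serialization;

// Same shape as the Iteration 4 class, plus an assumed generic constraint.
public abstract class XMLSupport<T> where T : XMLSupport<T>
{
    public static T ReadDocument(string fileName)
    {
        XmlSerializer xs = new XmlSerializer(typeof(T));
        using (Stream stream = File.OpenRead(fileName))
        {
            return (T)xs.Deserialize(stream);
        }
    }

    public void WriteDocument(string fileName)
    {
        // typeof(T) matches the runtime type of "this" as long as each
        // subclass passes itself as T.
        XmlSerializer xs = new XmlSerializer(typeof(T));
        using (Stream stream = File.OpenWrite(fileName))
        {
            xs.Serialize(stream, this);
        }
    }
}

public class Hello : XMLSupport<Hello>
{
    public string Message { get; set; }
}
```

The calling code stays exactly as shown above: Hello.ReadDocument(fname) to load and data.WriteDocument(fname) to save.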
Other Functionality
The final version of XMLSupport adds new pieces of functionality, and also entails some name changes. Because there are times when you want to read and write to strings and generic streams (as opposed to files), there are now three read and three write operations:
Read | Write
FromXmlString | ToXmlString
FromStream | ToStream
FromFile | ToFile
For more advanced scenarios, two events have been added: BeforeSave and AfterLoad. They are generic event handlers (EventHandler<EventArgs>) to which you can attach your own handlers.
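The article names these six operations and two events without listing their code, so the following is only a sketch of how such a class could be put together. The method signatures, the generic constraint, the use of File.Create, and the exact points at which the events fire are all assumptions rather than the library's actual source. The small Hello class is included only to make the example complete.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Sketch only: signatures and event timing are assumptions.
public abstract class XMLSupport<T> where T : XMLSupport<T>
{
    public event EventHandler<EventArgs> BeforeSave;
    public event EventHandler<EventArgs> AfterLoad;

    public static T FromStream(Stream stream)
    {
        XmlSerializer xs = new XmlSerializer(typeof(T));
        return Loaded((T)xs.Deserialize(stream));
    }

    public static T FromXmlString(string xml)
    {
        XmlSerializer xs = new XmlSerializer(typeof(T));
        using (StringReader reader = new StringReader(xml))
        {
            return Loaded((T)xs.Deserialize(reader));
        }
    }

    public static T FromFile(string fileName)
    {
        using (Stream stream = File.OpenRead(fileName))
        {
            return FromStream(stream);
        }
    }

    public void ToStream(Stream stream)
    {
        RaiseBeforeSave();
        new XmlSerializer(typeof(T)).Serialize(stream, this);
    }

    public string ToXmlString()
    {
        RaiseBeforeSave();
        using (StringWriter writer = new StringWriter())
        {
            new XmlSerializer(typeof(T)).Serialize(writer, this);
            return writer.ToString();
        }
    }

    public void ToFile(string fileName)
    {
        // File.Create truncates an existing file before writing.
        using (Stream stream = File.Create(fileName))
        {
            ToStream(stream);
        }
    }

    // Raises AfterLoad on a freshly deserialized instance and returns it.
    private static T Loaded(T result)
    {
        XMLSupport<T> loaded = result;
        if (loaded != null && loaded.AfterLoad != null)
        {
            loaded.AfterLoad(loaded, EventArgs.Empty);
        }
        return result;
    }

    private void RaiseBeforeSave()
    {
        EventHandler<EventArgs> handler = BeforeSave;
        if (handler != null)
        {
            handler(this, EventArgs.Empty);
        }
    }
}

// Minimal data class for illustration.
public class Hello : XMLSupport<Hello>
{
    public string Message { get; set; }
}
```

Because AfterLoad is raised on the newly created object, a handler only sees it if it was attached in the class constructor, which runs during deserialization.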
Using the Code
Making use of this library is pretty simple. Take a look at the included test code, and you can see a Hello class:
    public class Hello : XMLSupport<Hello>
    {
        public Hello()
        {
            BeforeSave += BeforeSaveEvent;
            AfterLoad += AfterLoadEvent;
        }

        public string Message { get; set; }

        [XmlIgnore]
        public bool beforeSaveCalled = false;

        [XmlIgnore]
        public bool afterLoadCalled = false;

        protected void BeforeSaveEvent(object sender, EventArgs args)
        {
            beforeSaveCalled = true;
        }

        protected void AfterLoadEvent(object sender, EventArgs args)
        {
            afterLoadCalled = true;
        }
    }
This class uses the BeforeSave and AfterLoad events for testing purposes, but it also serves as an example of how to hook them.
Here is a simple example of reading in from a string using the above class:
    Hello ret = Hello.FromXmlString("<Hello><Message>Hello World</Message></Hello>");
The functions, for the most part, work just like iteration 4 described above, with just slightly different method names.
Conclusion
The Microsoft XmlSerializer is an extremely useful utility when working with XML files. With generics and a little inheritance in C#, adding load and save support to a class becomes a very simple operation, and the XMLSupport library packages that up for you.