We want to store xml message in a database for batch processing, logging and retention. For some time now Sql Server offers the xml data type for columns. This way we could index the xml, validate it with an xsd and write a blogpost about the data type
Image courtesy of Stuart Miles / FreeDigitalPhotos.net
TLDR
After some testing we decided not to use the xml data type but nvarchar(max).
Pros
Microsoft has written a msdn article about when to consider the xml data type.
We can index the xml and query a group of messages with the same xml elements.
With xsd validation the columns would only accept xml that we can process.
Cons
Entity Framework recognizes the xml column as a string. There are ways to get EF to play ball, but we’re only interested in the out-of-the-box solution. This would mean conversion to and from string on every database action.
While putting some xml documents into the column (using EF) we got errors. When changing the encoding in the xml header between UTF8 and UTF16 the database would complain.
All our processing is done in dotNET code. The xsd validation in Sql Server would be an extra guard for inserts from other applications or manual scripts. This means some extra code in our program to validate the xml (again) after reading.
Conclusion
We decided to go for the nvarchar(max) data type and do the xml stuff in dotNET. This was the best solution for our problem, you might come to a different conclusion.