What is the best way to compare XML files for equality?

I'm using .NET 2.0, and a recent code change has invalidated my previous Assert.AreEqual call (which compared two strings of XML). Only one element of the XML is actually different in the new codebase, so my hope is that a comparison of all the other elements will give me the result I want. The comparison needs to be done programmatically, since it's part of a unit test.

At first, I was considering using a couple instances of XmlDocument. But then I found this: http://drowningintechnicaldebt.com/blogs/scottroycraft/archive/2007/05/06/comparing-xml-files.aspx

It looks like it might work, but I was interested in Stack Overflow feedback in case there's a better way.

I'd like to avoid adding another dependency for this if at all possible.

Similar questions
  • Is there an XML asserts for NUnit?
  • How would you compare two XML Documents?

It really depends on what you want to check as "differences".

Right now, we're using Microsoft XmlDiff: http://msdn.microsoft.com/en-us/library/aa302294.aspx

Visually comparing two XML files lets you locate exactly where an edit was made. Editing an XML template feels fine until an error occurs and you are in a hurry to find the location of the change. Making a side-by-side comparison by hand is tedious, and keeping track of every change consumes a lot of time.

You might find it's less fragile to parse the XML into an XmlDocument and base your Assert calls on XPath queries. Here are some helper assertion methods that I use frequently. Each one takes an XPathNavigator, which you can obtain by calling CreateNavigator() on the XmlDocument or on any node retrieved from the document. An example of usage would be:

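     // Requires: using System.Xml; using System.Xml.XPath; using NUnit.Framework;
     // XmlDocument has no constructor that takes a file name, so load explicitly.
     XmlDocument doc = new XmlDocument();
     doc.Load( "Testdoc.xml" );
     XPathNavigator nav = doc.CreateNavigator();
     AssertNodeValue( nav, "/root/foo", "foo_val" );
     AssertNodeCount( nav, "/root/bar", 6 );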

    private static void AssertNodeValue(XPathNavigator nav,
                                         string xpath, string expected_val)
    {
        XPathNavigator node = nav.SelectSingleNode(xpath, nav);
        Assert.IsNotNull(node, "Node '{0}' not found", xpath);
        Assert.AreEqual( expected_val, node.Value );
    }

    private static void AssertNodeExists(XPathNavigator nav,
                                         string xpath)
    {
        XPathNavigator node = nav.SelectSingleNode(xpath, nav);
        Assert.IsNotNull(node, "Node '{0}' not found", xpath);
    }

    private static void AssertNodeDoesNotExist(XPathNavigator nav,
                                         string xpath)
    {
        XPathNavigator node = nav.SelectSingleNode(xpath, nav);
        Assert.IsNull(node, "Node '{0}' found when it should not exist", xpath);
    }

    private static void AssertNodeCount(XPathNavigator nav, string xpath, int count)
    {
        XPathNodeIterator nodes = nav.Select( xpath, nav );
        Assert.That( nodes.Count, Is.EqualTo( count ) );
    }

I need to compare two huge XML files. I can't read them line by line and compare them using I/O, so I need to parse them and compare each object. Another thing: after generating the JAXB POJO classes, I can see that they have no equals methods. Is there a way to generate equals methods for them?

A simple string comparison of two XML strings does not always work. Why?

For example, both

<MyElement></MyElement> and <MyElement/> are equal from an XML standpoint.

There are algorithms for making XML always look the same; they are called canonicalization algorithms. .NET has support for canonicalization.
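As a rough illustration of the canonicalization approach, the sketch below runs both documents through `XmlDsigC14NTransform` (from `System.Security.Cryptography.Xml`, which needs a reference to System.Security.dll) and returns the canonical text. The `XmlCanonicalizer` class and `Canonicalize` method are hypothetical names chosen for this example, and it assumes the documents are small enough to load into memory.

```csharp
using System;
using System.IO;
using System.Security.Cryptography.Xml; // requires a reference to System.Security.dll
using System.Xml;

// Hypothetical helper, not part of the framework.
public static class XmlCanonicalizer
{
    // Returns the Canonical XML (C14N) text for the given XML string.
    public static string Canonicalize(string xml)
    {
        // PreserveWhitespace defaults to false, so indentation between
        // elements is dropped while loading.
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        XmlDsigC14NTransform transform = new XmlDsigC14NTransform();
        transform.LoadInput(doc);

        // The transform exposes its canonical output as a UTF-8 stream.
        using (Stream output = (Stream)transform.GetOutput(typeof(Stream)))
        using (StreamReader reader = new StreamReader(output))
        {
            return reader.ReadToEnd();
        }
    }
}
```

In a test this could be used as `Assert.AreEqual(XmlCanonicalizer.Canonicalize(expected), XmlCanonicalizer.Canonicalize(actual));`, so that differences such as empty-element syntax or attribute order no longer fail the comparison.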

    // Reads two XML files into two strings
    String s1 = readFile("orders1.xml");
    String s2 = readFile("orders.xml");
    // Loads options saved in a property file
    Options.loadOptions("options");
    // Compares two Strings representing XML entities
    System.out.println( ExamXML.compareXMLString( s1, s2 ) );

I wrote a small library with asserts for serialization, source.

Sample:

[Test]
public void Foo()
{
   ...
   XmlAssert.Equal(expected, actual, XmlAssertOptions.IgnoreDeclaration | XmlAssertOptions.IgnoreNamespaces);
}


Because the contents of an XML file can be formatted differently and still be considered the same (from a DOM point of view), when you test for equality you need to decide what the measure of that equality is. For example: is formatting ignored? Is metadata ignored? Is positioning important? There are lots of edge cases.

Generally you would create a class that defines your equality rules and use it for your comparisons. If your comparison class implements the IEqualityComparer and/or IEqualityComparer<T> interfaces, it can also be used as the equality test in a number of built-in framework collections and methods. And of course you can write as many comparers as you need, one for each way your requirements measure equality.

e.g.

Enumerable.Contains
Enumerable.SequenceEqual
The constructor of a Dictionary, etc.
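A minimal sketch of such a comparer class follows. The equality rule here is an assumption made up for illustration (two documents are equal when their root elements serialize to the same markup), and `XmlMarkupComparer` is a hypothetical name; a real rule set would encode whatever your tests consider significant.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Hypothetical comparer: documents are "equal" when their root elements
// have identical OuterXml. The XML declaration is ignored because only the
// root element is serialized.
public sealed class XmlMarkupComparer : IEqualityComparer<XmlDocument>
{
    public bool Equals(XmlDocument x, XmlDocument y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return Markup(x) == Markup(y);
    }

    // Must agree with Equals: equal documents produce equal hash codes.
    public int GetHashCode(XmlDocument doc)
    {
        return doc == null ? 0 : Markup(doc).GetHashCode();
    }

    private static string Markup(XmlDocument doc)
    {
        XmlElement root = doc.DocumentElement;
        return root == null ? string.Empty : root.OuterXml;
    }
}
```

An instance can then be handed to a collection, for example `new Dictionary<XmlDocument, string>(new XmlMarkupComparer())`. Swapping in a different rule means writing another comparer, while the calling code stays the same.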

I believe käµfm³d's suggestion about textual comparison is OK for such files. However, I'd write a utility that compares element (text nodes) values. This should be pretty straightforward. We went this way (some years ago) when we had to compare pretty big and rather complex xml files.
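One way such a utility might look is sketched below. It assumes both documents have the same element structure, compares only the text of leaf elements, and lets the caller skip elements by name (for instance, the one element that is expected to differ). Attributes and mixed content are not compared. `XmlValueDiff` and `FindDifferences` are hypothetical names for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Hypothetical utility: walks two documents in parallel and reports the
// paths of leaf elements whose text differs.
public static class XmlValueDiff
{
    public static List<string> FindDifferences(
        XmlDocument expected, XmlDocument actual, params string[] ignoredNames)
    {
        List<string> differences = new List<string>();
        Compare(expected.DocumentElement, actual.DocumentElement,
                "", ignoredNames, differences);
        return differences;
    }

    private static void Compare(XmlElement a, XmlElement b, string parentPath,
                                string[] ignoredNames, List<string> differences)
    {
        string path = parentPath + "/" + a.Name;

        // Skip elements (and their subtrees) the caller asked to ignore.
        if (Array.IndexOf(ignoredNames, a.Name) >= 0) return;

        if (a.Name != b.Name)
        {
            differences.Add(path + " (name mismatch)");
            return;
        }

        List<XmlElement> childrenA = ChildElements(a);
        List<XmlElement> childrenB = ChildElements(b);
        if (childrenA.Count != childrenB.Count)
        {
            differences.Add(path + " (child count mismatch)");
            return;
        }

        // Leaf element: compare the text value.
        if (childrenA.Count == 0)
        {
            if (a.InnerText != b.InnerText) differences.Add(path);
            return;
        }

        for (int i = 0; i < childrenA.Count; i++)
        {
            Compare(childrenA[i], childrenB[i], path, ignoredNames, differences);
        }
    }

    private static List<XmlElement> ChildElements(XmlElement parent)
    {
        List<XmlElement> result = new List<XmlElement>();
        foreach (XmlNode node in parent.ChildNodes)
        {
            XmlElement element = node as XmlElement;
            if (element != null) result.Add(element);
        }
        return result;
    }
}
```

In a unit test, asserting that the returned list is empty gives a failure message that names the offending elements instead of dumping two large XML strings.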



Regarding the Compare plugin: it is the reason I started using NP++. It's free, it's fast, and it helped me learn coding from examples, so I've used it at home and at work. But right now I'm looking for something that could easily compare my 4700 old files with the new ones, with the ability to compare the files from 2 folders (or with the same name

Comments
  • Since this question was first asked, a duplicate was raised with a better answer: stackoverflow.com/a/2924439/361842: Use Linq: XNode.DeepEquals(doc1, doc2)
  • I came across this link via the blog post I linked to in the question. I was hoping to avoid adding another dependency, if at all possible.
  • Then I guess you have to decide what a "difference" is for you, and develop your own algorithm with XmlDocuments, XPathNavigators, etc. But I think you are looking for a "business difference", not an "XML difference".
  • In this case, I know the structure between the two documents will be the same (no missing or extra elements). It's just whether the values of elements are the same or not.
  • Thanks Jeremy, this looks like a good solution. I'll try it out.
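
For reference, the XNode.DeepEquals suggestion from the first comment could be wrapped roughly as follows. This relies on LINQ to XML (System.Xml.Linq), which arrived in .NET 3.5, so it does not apply to the .NET 2.0 setup described in the question. `XmlDeepCompare` and `AreEqual` are hypothetical names.

```csharp
using System;
using System.Xml.Linq; // LINQ to XML, .NET 3.5 or later

// Hypothetical wrapper around XNode.DeepEquals.
public static class XmlDeepCompare
{
    public static bool AreEqual(string expectedXml, string actualXml)
    {
        // With the default load options, insignificant whitespace
        // (indentation between elements) is not preserved.
        XDocument expected = XDocument.Parse(expectedXml);
        XDocument actual = XDocument.Parse(actualXml);
        return XNode.DeepEquals(expected, actual);
    }
}
```

A test could then call `Assert.IsTrue(XmlDeepCompare.AreEqual(expected, actual));`. Note that this reports only whether the documents match, not where they differ.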