Wednesday, 26 March 2014

LINQ to XML

Recently I spent a lot of time working with a SOAP web service in C#. While doing this I was looking for a simple way to parse the XML returned by the service, and this is where LINQ came in.

LINQ (Language-Integrated Query) was introduced in .NET Framework 3.5 and provides a unified query syntax for working with data from different sources as objects. As well as XML, LINQ can also be used to work with SQL databases and in-memory collections.

Take this example XML:
<soap:envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <soap:body>
    <getcollectionbyuprnanddateresponse xmlns="http://webservices.example.com/">
      <errorcode>0</errorcode>
      <errordescription>Success</errordescription>
      <successflag>true</successflag>
      <collections>
        <collection>
          <service>Data</service>
          <day>Monday</day>
        </collection>
        <collection>
          <service>Data</service>
          <day>Tuesday</day>
        </collection>
        <collection>
          <service>Data</service>
          <day>Wednesday</day>
        </collection>
      </collections>
    </getcollectionbyuprnanddateresponse>
  </soap:body>
</soap:envelope>
Looking at this example XML returned from the web service, we can see that it contains a collections node with multiple collection child nodes.

Let's say we want to parse this XML so that we can work with the collection data it returns. This is where LINQ to XML can make life really simple. Before LINQ, we could have loaded the XML into an XmlDocument and run a foreach loop over each node, checking for the nodes we want.
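
For comparison, here is a rough sketch of what that pre-LINQ approach might look like. The class and method names (PreLinqExample, ReadCollections) are made up for illustration, and the element names are assumed to match the example XML above exactly:

```csharp
using System.Collections.Generic;
using System.Xml;

public static class PreLinqExample
{
    // Hypothetical helper: returns one "service on day" string per <collection> node.
    public static List<string> ReadCollections(string xml)
    {
        const string ns = "http://webservices.example.com/";
        var results = new List<string>();

        var doc = new XmlDocument();
        doc.LoadXml(xml);

        // Find every <collection> element in the web service's namespace.
        foreach (XmlNode collection in doc.GetElementsByTagName("collection", ns))
        {
            string service = "";
            string day = "";

            // Check each child node by hand for the ones we want.
            foreach (XmlNode child in collection.ChildNodes)
            {
                if (child.LocalName == "service") service = child.InnerText;
                else if (child.LocalName == "day") day = child.InnerText;
            }

            results.Add(service + " on " + day);
        }

        return results;
    }
}
```

Calling PreLinqExample.ReadCollections(xmlString) works, but every extra field means another branch inside the inner loop.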

Fortunately, LINQ saves us from doing this. First we need to create an XDocument from our XML. Here I read the response into a string and parse it:
   
   // A Stream has no ReadToEnd method, so wrap it in a StreamReader (System.IO)
   var reader = new StreamReader(response.GetResponseStream());
   string xmlString = reader.ReadToEnd();
   XDocument xmlDoc = XDocument.Parse(xmlString);
This can also be done by loading the response stream from my HttpWebResponse directly:
   
   var stream = response.GetResponseStream();
   XDocument xmlDoc = XDocument.Load(stream);
Now that we have our XDocument, we are almost ready to query it with LINQ. First, however, we need one crucial piece for our LINQ query to work, an XNamespace:
   XNamespace ns = "http://webservices.example.com/";
We need this namespace because of the layout of our XML:
   <getcollectionbyuprnanddateresponse xmlns="http://webservices.example.com/">
As you can see, this node declares its own default XML namespace, which its child elements inherit, so unless we include the namespace in our LINQ query, no results will be found. Now that we have all our objects created, we are ready to query them with LINQ. Let's say we want to select the service and day values from the first collection node in the XML. To start, we select all the collection descendants in our XDocument, then project the values of each one into an anonymous object:
// Element names are case-sensitive and must match the XML exactly
var query = xmlDoc.Descendants(ns + "collection")
                  .Select(x => new
                  {
                      Service = x.Element(ns + "service").Value,
                      Day = x.Element(ns + "day").Value
                  });
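
As a quick check of why the namespace matters, the sketch below counts the collection elements for whichever namespace it is given. NamespaceCheck and CountCollections are hypothetical names used only for this illustration:

```csharp
using System.Linq;
using System.Xml.Linq;

public static class NamespaceCheck
{
    // Hypothetical helper: counts <collection> elements in the given namespace.
    public static int CountCollections(string xml, XNamespace ns)
    {
        // ns + "collection" builds the fully qualified XName to search for.
        return XDocument.Parse(xml)
                        .Descendants(ns + "collection")
                        .Count();
    }
}
```

Passing XNamespace.None (which is what a bare "collection" string amounts to) should find nothing in the example XML, because every element under the response node inherits the http://webservices.example.com/ namespace. Passing that namespace instead should find all three collection nodes.
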
Finally, we call FirstOrDefault on this query so that only the first collection is returned. If no collections are found, it returns null:
// Element names are case-sensitive and must match the XML exactly
var query = xmlDoc.Descendants(ns + "collection")
                  .Select(x => new
                  {
                      Service = x.Element(ns + "service").Value,
                      Day = x.Element(ns + "day").Value
                  }).FirstOrDefault();
Now that our data is stored in the query variable, we can access it via query.Day or query.Service (check query for null first, since FirstOrDefault returns null when nothing matches). This is one of many ways to query data with LINQ, which can become a powerful asset when working with lots of data in C# or VB.
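
Putting the steps together, a self-contained version might look something like the sketch below. CollectionInfo and CollectionParser are hypothetical names, introduced only because an anonymous type cannot easily be returned from a method. The sketch also swaps .Value for an explicit string cast, which is a different technique from the snippets above: the cast yields null for a missing element rather than throwing.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical type, named here only so the result can be returned from a method.
public class CollectionInfo
{
    public string Service { get; set; }
    public string Day { get; set; }
}

public static class CollectionParser
{
    private static readonly XNamespace Ns = "http://webservices.example.com/";

    // Projects every <collection> element into a CollectionInfo.
    public static IEnumerable<CollectionInfo> ParseAll(string xml)
    {
        return XDocument.Parse(xml)
            .Descendants(Ns + "collection")
            .Select(x => new CollectionInfo
            {
                // (string)element is null when the element is missing,
                // whereas element.Value would throw a NullReferenceException.
                Service = (string)x.Element(Ns + "service"),
                Day = (string)x.Element(Ns + "day")
            });
    }

    // Returns the first collection, or null when the response contains none.
    public static CollectionInfo ParseFirst(string xml)
    {
        return ParseAll(xml).FirstOrDefault();
    }
}
```

A caller would then write something like var first = CollectionParser.ParseFirst(xmlString); and test first for null before reading first.Service or first.Day.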

You can read more about LINQ on the Microsoft Developer Network (MSDN).