Pulled Apart - Part XII: Parsing feeds (ATOM & RSS) in .NET

Submitted by Robert MacLean on Mon, 08/30/2010 - 13:07


Note: This is part of a series; you can find the rest of the parts in the series index.

I’ve mentioned that a podcatcher is really just two things put together: a download manager and a feed parser. Feed parsing is not the easiest thing to build. Just look at my attempt many years ago to build a Delphi RSS parser called SimpleRSS: it works well, but there are many edge cases that can kill it.

The key things that trip you up when writing a parser are:

  • RSS and ATOM – There are two major formats for feeds, RSS and ATOM, and they are very different.
  • Versioning – RSS and ATOM both come in a number of versions, each of which needs completely different parsing.
  • Errors – A feed is just XML, so it is easy to produce a broken one, and there are a lot of feeds that do not validate.

With that in mind I am really happy that the .NET Framework (since 3.5) includes its own parser for feeds: SyndicationFeed.

SyndicationFeed

System.ServiceModel.Syndication.SyndicationFeed supports both ATOM (version 1.0) and RSS (version 2.0). To use it you need to add a reference to System.ServiceModel.dll (on .NET 3.5 the class lives in System.ServiceModel.Web.dll instead). It handles both parsing and creating feeds, although I don’t care about the creation side in Pull. To parse a feed you pass an XmlReader to the Load method and it takes care of the rest.

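// Needs: using System.ServiceModel.Syndication; using System.Xml;
// podcastUrl is the address of the feed; XmlReader.Create fetches it.
using (XmlReader reader = XmlReader.Create(podcastUrl))
{
    return SyndicationFeed.Load(reader);
}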
That really is as complex as this gets :)
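
To tie this back to the error problem in the list above, here is a minimal sketch of how the loaded feed might be used in a podcatcher. It is not the code Pull uses. The class name FeedSketch and the helpers TryLoad and GetEnclosureUris are made up for illustration. The sketch assumes that a feed SyndicationFeed cannot read (malformed XML, or a format other than RSS 2.0 and ATOM 1.0) surfaces as an XmlException, and that podcast files show up as links whose relationship type is "enclosure". It loads from a string rather than a URL so it can be tried without a network connection.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.ServiceModel.Syndication;
using System.Xml;

public static class FeedSketch
{
    // Hypothetical helper: returns the parsed feed, or null when the
    // XML is malformed or is not a format SyndicationFeed can read.
    public static SyndicationFeed TryLoad(string xml)
    {
        try
        {
            using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
            {
                return SyndicationFeed.Load(reader);
            }
        }
        catch (XmlException)
        {
            // Assumption: bad feeds are reported as XmlException.
            return null;
        }
    }

    // Hypothetical helper: collects the address of every enclosure
    // (the downloadable media file) across all items in the feed.
    public static List<Uri> GetEnclosureUris(SyndicationFeed feed)
    {
        return feed.Items
            .SelectMany(item => item.Links)
            .Where(link => link.RelationshipType == "enclosure")
            .Select(link => link.Uri)
            .ToList();
    }
}
```

The parser hands back items and links in the same shape for RSS and ATOM, so GetEnclosureUris does not need to know which format the feed was in. A caller would check TryLoad for null before asking for the enclosures, and pass the resulting addresses on to the download manager.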