
News Flash: Select Distinct Operations in XSLT are Slow

The following selects distinct SubjectId nodes in the entire set: `/descendant::SubjectId[not(preceding::SubjectId = .)]`. The catch is that this operation is very slow, even for sets with fewer than 500 element nodes off the root node. The likely reason is that the `preceding::` test is evaluated again for every SubjectId candidate, so the work grows roughly with the square of the node count. My workaround was to build a ‘custom’ set with XmlTextWriter, deliberately appending lookup “data islands” to the main set. I can see this workaround being a problem in a solution dominated by ‘hard’ schemas: schemas that are locked by organizational politics. Fortunately my code is not running for president.

Related Resources

“[How can I write an XPath expression that behaves like the SQL SELECT DISTINCT statement?](http://msdn.microsoft.com/msdnmag/issues/02/01/xml/)” Scroll down to this user question in “[Object Graphs, XPath, String Comparisons, and More](http://msdn.microsoft.com/msdnmag/issues/02/01/xml/)” at MSDN under “The XML Files.” For the year 2002, the answer to this question should have come with a performance-hit caveat.
“[Grouping is a common problem in XSLT stylesheets.](http://www.jenitennison.com/xslt/grouping/index.html)” This page refers to the famous Muenchian Method and legally lifts verbatim from Appendix B of [*XSLT and XPath: A Guide to XML Transformations*](http://www.amazon.com/exec/obidos/ISBN=0130404462thekintespacec00A/ "Buy this book at Amazon.com"). I’m still trying to figure this one out…
“[XSLT key() Function](http://www.w3schools.com/xsl/func_key.asp)” Explains the `key()` function used in the Muenchian Method.
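
For comparison with the `preceding::` filter, here is a minimal sketch of what a Muenchian-style distinct selection might look like for SubjectId nodes. It is not the code from any of the pages above. It assumes XSLT 1.0, assumes that each SubjectId holds its value as text, and uses a key name (`subject-by-value`) that is purely hypothetical.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Index every SubjectId element by its string value.
       The key name is an assumption made for this sketch. -->
  <xsl:key name="subject-by-value" match="SubjectId" use="."/>

  <xsl:template match="/">
    <!-- Keep a SubjectId only when it is the first node, in document
         order, that the key returns for its own value. -->
    <xsl:for-each select="/descendant::SubjectId[generate-id() = generate-id(key('subject-by-value', .)[1])]">
      <xsl:value-of select="."/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

The idea is that the processor can build the key index once and then answer each `key()` call with a lookup, rather than rescanning all preceding nodes for every candidate. Whether that turns out faster in practice depends on the processor, so it is worth timing against the original expression on real data.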

rasx()