|
|
Rank: Devotee
Joined: 10/30/2007 Posts: 85 Location: Israel
|
Hi all,
Recently I came across a new XML parser which offers much better performance, both in parsing speed and in memory consumption. They claim it is 3-5 times better than MSDOM (the MSXML parser), the one Umbraco uses by default. It also supports XPath natively, which is what Umbraco uses most.
The parser is called VTD-XML: http://vtd-xml.sourceforge.net/. Performance comparisons can be found at http://vtd-xml.blogspot.com/2008/02/latest-benchmark-results.html. Read their spec carefully to understand how they do it.
I highly recommend considering updating the core so Umbraco uses it instead of MSXML (this would let large Umbraco installations grow even larger with much lower performance and memory penalties). They have uploaded some C# code along with the Java/C samples. I'm not too familiar with the actual Umbraco code, but I am willing to help should any help be needed.
An open source CMS using an open source XML parser, both the best in their field. What else could one ask for?
Itamar.
|
|
 Rank: Addict
Joined: 7/19/2006 Posts: 649 Location: Preston, UK
|
Itamar,
I also came across this about 2 weeks ago and showed it to Ryan (of Umbraco search tools fame); he was going to speak to Niels about it.
Regards
Ismail
Level 2 certified. If it aint broke dont fix.
|
|
Rank: Devotee
Joined: 10/30/2007 Posts: 85 Location: Israel
|
Great!
Let me know should you need any help with it.
Itamar.
|
|
 Rank: Aficionado
Joined: 8/28/2007 Posts: 129 Location: Bavaria
|
That would be Great! :thumbup:
I Love umbraco
|
|
Rank: Enthusiast
Joined: 11/1/2007 Posts: 32
|
I would prefer to be able to choose the engine from web.config. If I have debugged and released a site, I would like to know that the core engine stays the same. If even the slightest detail differs in output or functionality, it may break existing stuff.
|
|
Rank: Devotee
Joined: 10/30/2007 Posts: 85 Location: Israel
|
Here's what I was thinking about doing:
1. Create an IUmbracoXmlDocument interface, and implement XmlDocument and VTDXmlDocument upon it.
2. Make umbraco.content XmlContent() and XmlContentInternal() return IUmbracoXmlDocument, and by default return an XmlDocument instance. Make _xmlContent an IUmbracoXmlDocument instead of an XmlDocument.
3. Have a web.config entry to optionally have Umbraco return VTDXmlDocument.
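A minimal sketch of the three steps above, under stated assumptions: only the name IUmbracoXmlDocument comes from this post. The interface members, the DomUmbracoXmlDocument wrapper, the factory, and the "umbracoXmlParser" appSettings key are all hypothetical and are not existing Umbraco code. Since System.Xml.XmlDocument cannot itself implement a new interface, the default implementation is shown as a thin wrapper around it, and the VTD-XML side is left as a stub. It assumes a .NET Framework project with a reference to System.Configuration.

```csharp
using System;
using System.Configuration; // needs a reference to the System.Configuration assembly
using System.Xml;

// Name taken from the proposal; the members are assumptions for illustration.
public interface IUmbracoXmlDocument
{
    void LoadXml(string xml);

    // Returns the text of the first node matching the XPath, or null if none matches.
    string SelectSingleValue(string xpath);
}

// Hypothetical default implementation: delegates to System.Xml.XmlDocument.
public class DomUmbracoXmlDocument : IUmbracoXmlDocument
{
    private readonly XmlDocument _doc = new XmlDocument();

    public void LoadXml(string xml)
    {
        _doc.LoadXml(xml);
    }

    public string SelectSingleValue(string xpath)
    {
        XmlNode node = _doc.SelectSingleNode(xpath);
        return node == null ? null : node.InnerText;
    }
}

public static class UmbracoXmlDocumentFactory
{
    // Hypothetical appSettings key; not an existing Umbraco setting.
    public const string ParserKey = "umbracoXmlParser";

    // Reads the parser choice from web.config / app.config.
    public static IUmbracoXmlDocument Create()
    {
        return Create(ConfigurationManager.AppSettings[ParserKey]);
    }

    // A missing or unknown value falls back to the XmlDocument-backed default.
    public static IUmbracoXmlDocument Create(string parserName)
    {
        if (string.Equals(parserName, "vtd", StringComparison.OrdinalIgnoreCase))
        {
            // A VTD-XML-backed IUmbracoXmlDocument would be returned here.
            throw new NotSupportedException("VTD-XML implementation not written yet.");
        }
        return new DomUmbracoXmlDocument();
    }
}

public static class Program
{
    public static void Main()
    {
        IUmbracoXmlDocument doc = UmbracoXmlDocumentFactory.Create();
        doc.LoadXml("<root><node id=\"1\"><name>Home</name></node></root>");
        Console.WriteLine(doc.SelectSingleValue("/root/node[@id='1']/name"));
    }
}
```

With no entry in appSettings the factory returns the XmlDocument-backed wrapper, so existing sites would keep their current behaviour unless they opt in.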
What do you think?
Any core members care to comment on this?
Itamar.
|
|
Rank: Aficionado
Joined: 10/2/2007 Posts: 165 Location: Czech Republic
|
|
|
Rank: Devotee
Joined: 10/30/2007 Posts: 85 Location: Israel
|
All I'm proposing is to add one optional XML parser that will do much better than MSXML. I'm not sure why a Provider is needed for that. As far as I can tell, all that is required is a core code hack to make it use an interface, so we can switch between XmlDocument and VTDXmlDocument with one simple configuration setting.
I'm really interested in hearing the core team's thoughts on this. I don't mind doing the dirty work, but first I want to make sure I have support and someone to cry to if I get into trouble :)
I honestly think this can boost Umbraco's performance through the roof, and is worth the time. Plus I'm not sure it is that complicated to do.
Itamar.
|
|
 Rank: Addict
Joined: 3/17/2008 Posts: 953 Location: Nyborg, Denmark
|
I'd love to see a working prototype. If it just works, I think it's a great idea, even though it would need thorough testing on multiple cultures, etc.
Jeeeez, did I really start this :-)
|
|
Rank: Devotee
Joined: 10/30/2007 Posts: 85 Location: Israel
|
Being totally new to the inner workings, can you point me to all the places where XmlDocument or the internal caching is referenced?
Also, which source version should I base the prototype on, with 3.1 so close to release?
Itamar.
|
|
 Rank: Addict
Joined: 3/17/2008 Posts: 953 Location: Nyborg, Denmark
|
Take the latest from cp - it's pretty stable now. It's probably the /content.cs and /macro.cs classes in the presentation project that are most interesting.
Jeeeez, did I really start this :-)
|
|
Rank: Devotee
Joined: 10/30/2007 Posts: 85 Location: Israel
|
OK, I will be working on the umbraco branch (not the 3.1 one). I will be in touch.
BTW - forum email notifications are not working for some reason...
|
|
Rank: Enthusiast
Joined: 7/25/2006 Posts: 13
|
Wow, it's great to see someone try this. My instinct would be to stick to nice clean interfaces rather than the provider pattern, which always seems to leak abstractions in practice. We could even get some duck-typing of document nodes for neat templating syntax in there if we can intercept the implementation.
|
|
 Rank: Addict
Joined: 7/19/2006 Posts: 649 Location: Preston, UK
|
Ryan or Itamar,
Probably a dumb-ass question. What would need to be done about all the XPath calls, e.g. selectnode etc.? Would those be wrapped with interfaces as well?
Regards
Ismail
Level 2 certified. If it aint broke dont fix.
|
|
Rank: Devotee
Joined: 10/30/2007 Posts: 85 Location: Israel
|
Ismail,
I'm not sure I understand what you're asking. In the first phase I will just replace XmlDocument with a custom class, implementing IUmbracoXmlDocument, that wraps XmlDocument, so the core will not be tied to XmlDocument directly. Only then will I go and work on VTD-XML. I'm not terribly familiar with the source code yet, so I'm still not sure how hard that would be... I foresee some trouble with getting XSLT transformation to work; I hope that won't be too bad.
Also, btw, commented on your Lucene post.
Itamar.
|
|
 Rank: Addict
Joined: 7/19/2006 Posts: 649 Location: Preston, UK
|
Itamar,
Many thanks for the Lucene tip. I think I need to read a bit more about VTD in order to get a clear picture of how it will all fit together. I am excited by VTD and, looking at your CLucene credentials, reckon you will pull it off!
Regards
Ismail
Level 2 certified. If it aint broke dont fix.
|
|
Rank: Devotee
Joined: 10/30/2007 Posts: 85 Location: Israel
|
Put off what exactly (or did you mean kick off)?!
It will take me some time to dig into the Umbraco core code, what with taking a vacation next week and the other stuff going on, but I would love to hear any further insights on this matter.
Itamar.
|
|