Like all technologies, XML has its strong and weak points. It is useful in many situations. However it is often used in places where it is not appropriate. The performance cost of XML mis-use or over-use can be enormous as shown in test results I recently reviewed.
Test Parameters
The test was a comparison of data retrieval times by relational and xml methods. The data consisted of approximately 100,000 orders and 1.2 million order line items. In the relational model they were stored as an Orders table and an OrderDetail table. XML was tested separately in element format and in attribute format. Both relational and xml data had similar indexes.
Below is a table summarizing the results. A cold execution means the query was executed after the buffers were flushed and no data was in cache. A hot execution was done after the data was in cache:
Data Format | Cold Exec (ms} | Hot Exec (ms) | Storage |
XML Element | 59373 | 37054 | 397 |
XML Attribute | 3190 | 31 | 407 |
Relational Tables | 250 | 1 | 214 |
As you can see there is a HUGE performance degradation using element-style XML. It is on the order of 37,000% slower when hot execution times are compared with relational data retrieval. Attribute style XML fares better but it is still many times less efficient than data retrieval from relational tables.
As well as having performance issues, both forms of xml take almost twice the disk space of relational data as can be seen the Storage column of the table above.
XML has many legitimate uses in a database environment but manipulation of large datasets is not one of them.