Tuesday, December 27, 2011

HTML5 is not XML, Time To Get Over It

I feel like Captain Malcolm Reynolds, who fought for the resistance against the Alliance, failed, and now lives in the shadow of the all-powerful Alliance.

I love XML ( ... I know there are 12-step programs for people like me). Well-structured, schema-verified XML is so easy to parse. I applauded the W3C's efforts to make the next generation of HTML be XHTML 2.0, with all its clean syntax.
But the fight for XML championed by the W3C has failed. The WHATWG (read: the Alliance) has won over the W3C (read: the Browncoats), and HTML5 is not XML. It's time to get over it.
What does this mean?
1. Well, we shouldn't close void tags (those that can't have nested elements) like "<br />"; it should be "<br>". (Some people I respect, like Estelle Weyl, disagree.) The HTML syntax still tolerates the trailing slash on void elements, but it has no effect, and reading the WHATWG docs, I sense they really don't want us to close void tags.

The W3C's documentation, in section 8.1.2 (Elements), defines the void elements to be

area, base, br, col, command, embed, hr, img, 
input, keygen, link, meta, param, source, track, wbr

2. Attributes do not have to have a value
<input type='checkbox' checked='checked' ...>
This looked a little silly anyway. Now you can have an attribute without a value.
<input type='checkbox' checked ...>

3. Attribute values usually do not have to be quoted (quotes are still needed when the value is empty or contains whitespace or any of " ' = < > `)
<input type=checkbox ...>

4. We can't process HTML5 pages with an XML parser. Really, even in the old days you could only count on parsing your own pages, and any included files from Google or your ad server would probably break your XML.

In its FAQ, the WHATWG recommends not running an XML parser on pages directly, but using an HTML-to-XML parser instead.
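
To make points 1 through 4 concrete, here is a minimal sketch using only Python's standard library. It is my own illustration, not something from the WHATWG FAQ: the helper names is_well_formed_xml, html_tags and TagCollector are made up for this example, and html.parser is a lenient tokenizer rather than a full HTML5 parser. It feeds the same kinds of snippets to an XML parser and to an HTML parser and reports what each makes of them.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Hypothetical helper: records each start tag and its attributes."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; a valueless
        # attribute such as "checked" arrives with value None.
        self.tags.append((tag, attrs))


def is_well_formed_xml(markup):
    """Return True if the XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


def html_tags(markup):
    """Return the (tag, attrs) pairs the HTML parser finds in the markup."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


snippets = [
    "<p>one<br />two</p>",              # XHTML style: closed void tag
    "<p>one<br>two</p>",                # HTML5 style: unclosed void tag
    "<input type='checkbox' checked>",  # attribute without a value
    "<input type=checkbox>",            # unquoted attribute value
]

for snippet in snippets:
    print(snippet)
    print("  accepted as XML:", is_well_formed_xml(snippet))
    print("  HTML parser sees:", html_tags(snippet))
```

Only the first snippet gets past the XML parser; the HTML parser takes all four in stride, which is the whole point of item 4.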

We fought the Alliance and they won. Come on, all you XML-loving Browncoats like me: it's time to move on.

1 comment:

rking said...

Cool. Good to have it explained this way.

I was playing with the validator, and got some very ugly stuff to validate as HTML5. Now I understand why.