Tiffany B. Brown

HTML is not XML

The problem above was caused by three things working in concert.

  1. An HTML page was served with a Content-Type: application/xhtml+xml response header. Here’s what the server response headers look like:
    HTTP/1.1 200 OK
    Cache-Control: no-cache
    Pragma: no-cache
    Content-Length: 4001
    Content-Type: application/xhtml+xml; charset=utf-8
    Expires: -1
    Server: Microsoft-IIS/7.0
    Set-Cookie: ASP.NET_SessionId=x0zeoofiwnk2lmz5q4pcso45; path=/; HttpOnly
    X-AspNet-Version: 2.0.50727
    X-Powered-By: ASP.NET
    Date: Wed, 09 Mar 2011 23:17:34 GMT

    Opera then parsed it according to X(HT)ML rules instead of HTML rules. (Note: this seems to be a problem with IIS servers.)

  2. An empty element was not self-closed. That <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> tag should be <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />. Remember your space-slashes, people!
  3. A missing DOCTYPE means that even if errors 1 and 2 were fixed, Opera would still parse the document as XML instead of as XHTML. Opera users (and Firefox users) would get that lovely “This document had no style information” screen.
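To see why item 2 matters under XML rules, here is a minimal sketch in Python. It uses the standard library's XML parser as a stand-in for a browser's XML parser, which is an assumption on my part: a real browser does more than this, but the well-formedness requirement is the same. The names UNCLOSED, SELF_CLOSED, and is_well_formed are made up for illustration.

```python
import xml.etree.ElementTree as ET

# A tiny page whose <meta> element is left open, HTML-style.
UNCLOSED = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><head>'
    '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
    '<title>Test</title></head><body><p>Hello</p></body></html>'
)

# The same page with the space-slash added to the <meta> element.
SELF_CLOSED = UNCLOSED.replace('charset=utf-8">', 'charset=utf-8" />')


def is_well_formed(markup):
    """Return True if an XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        # The open <meta> is never closed, so </head> is a mismatched tag.
        return False
    return True


if __name__ == "__main__":
    print("unclosed meta:", is_well_formed(UNCLOSED))
    print("self-closed meta:", is_well_formed(SELF_CLOSED))
```

An HTML parser would accept both versions. An XML parser stops at the first well-formedness error, which is why a single missing slash can take down the whole page.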

While many browsers ignore mismatches between the document’s markup and the server response header, Opera does not. You also can’t be sure that future versions of other browsers will keep ignoring them as more browsers adopt HTML5 parsing rules.*

In the current version of the HTML specification, a document served as text/html is supposed to be parsed according to HTML rules. A document served with an application/xhtml+xml MIME type must be parsed according to the stricter markup rules of X(HT)ML.

Let me emphasize the above point: server MIME types, not DOCTYPEs, determine which parsing rules the browser applies.**
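The rule can be sketched in a few lines of Python. This is only an illustration of the decision described above, not how any particular browser implements it, and parsing_mode is a hypothetical name. Notice that the function never looks at the document body, so a DOCTYPE has no say in the outcome.

```python
def parsing_mode(content_type):
    """Pick parsing rules from a Content-Type header value alone.

    Only the two MIME types discussed in this post are handled;
    anything else is reported as "other".
    """
    # Drop parameters such as "; charset=utf-8" and normalize case.
    mime = content_type.split(";", 1)[0].strip().lower()
    if mime == "text/html":
        return "html"
    if mime == "application/xhtml+xml":
        return "xml"
    return "other"


if __name__ == "__main__":
    # The header value from the server response shown earlier.
    print(parsing_mode("application/xhtml+xml; charset=utf-8"))
```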

So what does this mean? It means you need to be a better coder. Make sure that what your server says agrees with what your document delivers. If you can’t configure the server, make sure you code your documents according to the MIME type your server says they are. Use a DOCTYPE. Understand the differences between XHTML and HTML. And, as always, validate your code.
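If you want a quick sanity check before a real validator gets involved, something like the following rough sketch could help. It assumes you already have the Content-Type header value and the page source as strings. The function name check_response and the wording of its messages are invented for this example. It is no substitute for validating your code.

```python
import re
import xml.etree.ElementTree as ET


def check_response(content_type, body):
    """Return a list of problems found; an empty list means none were spotted."""
    problems = []
    mime = content_type.split(";", 1)[0].strip().lower()

    # Look for a DOCTYPE anywhere in the source, in any letter case.
    if not re.search(r"<!doctype\s", body, re.IGNORECASE):
        problems.append("no DOCTYPE found")

    # A page served as XHTML has to survive an XML parser.
    if mime == "application/xhtml+xml":
        try:
            ET.fromstring(body)
        except ET.ParseError as err:
            problems.append(f"served as XHTML but not well-formed XML: {err}")

    return problems


if __name__ == "__main__":
    page = "<html><head><meta charset='utf-8'></head><body></body></html>"
    for problem in check_response("application/xhtml+xml; charset=utf-8", page):
        print(problem)
```

One caveat: the standard library parser does not know HTML's named entities, so a page that uses something like &nbsp; will be reported as not well-formed by this sketch.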

* This is a bad name for it, but I can’t think of a better one.

** That’s how it’s supposed to work at least.
