Readers familiar with the original morons.org will remember a slow site, inconsistent pages, vast amounts of clutter, and incompatibility with many popular browsers, most notably Netscape 4.x. Those familiar with the back-end code (pretty much just Nick) will remember code organized much like boiled angel-hair pasta, written in PHP with the best of intentions. Sadly, even with the best of intentions, two years of code evolution in a language like PHP can result in a mess of catastrophic proportions.
It became clear as morons.org was reaching its 1000th article that the site was in serious need of an overhaul. The site was growing up, reaching more readers than ever, and needed a cleaner, more professional, more consistent look. The code needed a rewrite into something that had hope of being maintained. A new platform was in order.
And so, in spite of harsh criticism from many of my peers, I made the choice to convert the site to Java, and as more time goes by, I'm more convinced that this was the right decision. Many old-school Unix hacks don't like Java because it's an object-oriented language, and it has a reputation for being slow and consuming a lot of resources. In addition to that, most people got their first impression of Java from crashy, slow applets. Most of the reasons not to use it, however, have gone away over the last 5 years, and I'm convinced it is now a stable, mature platform upon which to build web applications (not applets though). Recent developments with JVM optimizations and JIT compilers have done wonders for speed, and resource usage isn't as big a deal anymore, as it is common for even home PCs to have half a gig of RAM these days.
One of the earlier hurdles to development was the JVM being more or less unavailable for my platform of choice, FreeBSD. Fortunately, many of those issues have been worked out, and there is a fairly stable 1.3.1 JDK now. Unfortunately, the HotSpot JIT compiler isn't functional yet, but OpenJIT has been a reasonable replacement, with performance gains over the purely interpreted JVM.
Along with the developments in the language itself, Java servlet containers have come a long way. After experimenting with Tomcat, Orion, and Resin, I chose Resin for its performance and its compliance with the parts of the Servlet spec that I needed. Resin's license is compatible with my efforts: it's free for development and non-profit use. Further, Resin's XSLT classes do faster transformation than other packages. More on that in a bit.
PostgreSQL would be the database of choice. After years of using MySQL and never once liking it, I finally had an opportunity to choose something else for the new platform. PostgreSQL is fast enough, and more importantly, it's stable and reliable.
After our former web host caused our site to be unavailable one too many times, I decided I'd had it with shared hosting services. They all suck. I came to the conclusion that it was time to colocate. This meant buying a 1U PC with some customization for redundancy. I found what I was looking for from eRacks.com: a 1.5 GHz Athlon box with a gig of RAM and two 40-gig drives. A FreeBSD install was customized to put swap on the secondary drive and make redundant writes for the web partition onto both drives. That way, should one drive fail, all that needs to be done to restore the system is to swap to the other drive. I went the simple route and used ccd for this task; vinum was way too convoluted for my taste.
With everything in place, development could begin. That's always the hard part, and there's always one major decision to be made: how will the code be structured? I wanted the site to be compatible with almost every browser out there, but I didn't want people using Mozilla to suffer because someone else would be using a cell phone. I wanted the back-end code to be robust, and more importantly, I wanted to keep functional logic separated from display logic. My initial plan was to code custom tag libraries, but I quickly changed my mind. I still might use them later, but in the interest of delivering the site before I retire, they'll wait for now. Besides, what does it really buy you to say <foo/> versus <%= foo.bar %>?
Now comes the really interesting part. HTML sucks. It just does. Everything about HTML is crap. If you don't believe me, try writing one document that uses anything other than the most primitive markup and getting it to display as you intended on more than one browser. See if you can view it on WebTV, Netscape 3, and Lynx. I hate coding HTML because I hate unnecessary pain in my ass. Fortunately, I discovered XSLT.
Now when I say "XML" there will be a certain percentage of people who will make stupid groaning noises or accuse me of "leveraging synergy" or other such bullshit. I'm not interested in any of the hype behind XML. It is what it is, and that's not very much. It's just a handy way of expressing structured data in a consistent fashion. If you don't like that, tough. I don't care.
So, XML. XSLT stands for "Extensible Stylesheet Language Transformations" and basically it's nothing more than a way of describing how to translate from one type of XML to another. The description itself is done in XML (no surprise there). I found that rather than describing the structure of a page in the traditional HTML-ish way with generic entities like "table" and "p", it worked much better for me to describe my own entities on the page and leave HTML to handle the very simple text markup. This means I have tags like <page> and <navigation>. My labelled forms are done with my <lform> tag. Then comes the cool part.
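To make the idea concrete, here is a minimal sketch of that kind of transformation. It is not the site's actual code: it uses the standard JAXP classes in javax.xml.transform rather than Resin's XSLT classes, and the <link> and <content> tags, the stylesheet, and the class name PageTransformDemo are all made up for illustration. Only <page> and <navigation> come from the description above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class PageTransformDemo {
    // Hypothetical page description: the page is described by its entities,
    // and plain HTML handles the simple text markup inside <content>.
    static final String PAGE_XML =
          "<page title='Example'>"
        + "<navigation><link href='/'>Home</link></navigation>"
        + "<content><p>Hello, <b>world</b>.</p></content>"
        + "</page>";

    // Hypothetical stylesheet for a browser that needs tables for layout.
    static final String TABLE_SHEET =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='html' indent='no'/>"
        + "<xsl:template match='page'>"
        + "<html><head><title><xsl:value-of select='@title'/></title></head>"
        + "<body><table><tr>"
        + "<td><xsl:apply-templates select='navigation'/></td>"
        + "<td><xsl:apply-templates select='content'/></td>"
        + "</tr></table></body></html>"
        + "</xsl:template>"
        + "<xsl:template match='link'>"
        + "<a href='{@href}'><xsl:value-of select='.'/></a>"
        + "</xsl:template>"
        + "<xsl:template match='content'>"
        + "<xsl:copy-of select='node()'/>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the given stylesheet to the given XML and returns the result.
    static String render(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render(PAGE_XML, TABLE_SHEET));
    }
}
```

The page XML never mentions tables; only the stylesheet does. Swapping in a different stylesheet would yield different HTML from the same page description.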
I've distilled browsers down into 7 basic types: (1) those that support full DOM and CSS (only one does to date: Mozilla/Netscape 6); (2) those that have good CSS support but still need tables for layout; (3) those that have poor CSS support and need tables for layout; (4) those with no CSS support at all, but still have tables; (5) those which have no CSS support and no tables; (6) those with limited CSS support and tables, but with a constrained screen horizontally (WebTV); and (7) those with very small screens but limited CSS support and tables. It's possible to categorize browsers by checking the User-Agent string, and making a somewhat pessimistic guess if the User-Agent string isn't available. Then, based upon the category, all that has to be done to send out the right kind of HTML for the browser is to point at an appropriate XSLT sheet; the pages generating the code do not have to change at all.
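A rough sketch of that categorization might look like the following. The seven categories are taken in the order listed above, but everything else is an assumption made for illustration: the enum name, the stylesheet file names, the substring rules, and the choice of "no CSS, but tables" as the pessimistic default are not the rules the site actually uses.

```java
import java.util.Locale;

enum BrowserType {
    // Stylesheet file names are hypothetical placeholders.
    FULL_DOM_CSS(1, "full.xsl"),
    GOOD_CSS_TABLES(2, "good-css.xsl"),
    POOR_CSS_TABLES(3, "poor-css.xsl"),
    NO_CSS_TABLES(4, "tables-only.xsl"),
    NO_CSS_NO_TABLES(5, "text-only.xsl"),
    WEBTV(6, "webtv.xsl"),
    SMALL_SCREEN(7, "small-screen.xsl");

    private final int type;
    private final String stylesheet;

    BrowserType(int type, String stylesheet) {
        this.type = type;
        this.stylesheet = stylesheet;
    }

    int type() { return type; }

    String stylesheet() { return stylesheet; }

    // Guesses a category from the User-Agent header. The substring checks
    // are illustrative assumptions; order matters, because WebTV and MSIE
    // both announce themselves as "Mozilla/...".
    static BrowserType classify(String userAgent) {
        if (userAgent == null || userAgent.trim().isEmpty()) {
            return NO_CSS_TABLES; // assumed pessimistic default
        }
        String ua = userAgent.toLowerCase(Locale.ROOT);
        if (ua.contains("webtv")) return WEBTV;
        if (ua.contains("lynx")) return NO_CSS_NO_TABLES;
        if (ua.contains("up.browser") || ua.contains("windows ce")) return SMALL_SCREEN;
        if (ua.contains("gecko")) return FULL_DOM_CSS;
        if (ua.contains("msie 5") || ua.contains("msie 6") || ua.contains("opera")) {
            return GOOD_CSS_TABLES;
        }
        if (ua.startsWith("mozilla/4")) return POOR_CSS_TABLES; // e.g. Netscape 4.x
        if (ua.startsWith("mozilla/3")) return NO_CSS_TABLES;   // e.g. Netscape 3
        return NO_CSS_TABLES; // unknown browser: same assumed default
    }
}
```

In a servlet or JSP, the header would come from request.getHeader("User-Agent"), which returns null when the browser doesn't send one. The result of classify(...).stylesheet() then picks the sheet, and nothing in the page that generates the XML has to know about it.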
As development progressed further, I found that I could develop yet another level of abstraction. So far, there were objects in Java classes, then JSP operating on those objects to produce XML. I found, however, that it is often desirable to be able to plug small JSP components into larger JSP pages in the same way as objects are plugged into small JSP components. This modular component design allows a message board or article list to be plugged into any page; gone are the former ties between primary page functionality and allowed page content. This type of modularity owes a great deal to an underlying object design, which keeps the individual objects encapsulated in their own problem spaces. When things are too intertwined, separating a message board from an article becomes impossible.
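The real components are JSP, but the shape of the idea can be sketched in plain Java. This is only an analogue under stated assumptions: the Fragment interface, the class names, and the <articlelist> and <messageboard> tags are all hypothetical, and stand in for small JSP components that each emit their own XML.

```java
import java.util.ArrayList;
import java.util.List;

final class ComponentDemo {
    // Anything that can render itself as an XML fragment (hypothetical).
    interface Fragment {
        String toXml();
    }

    // Escapes the characters that are unsafe in XML text and attributes.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    static final class ArticleList implements Fragment {
        private final List<String> titles;

        ArticleList(List<String> titles) { this.titles = titles; }

        public String toXml() {
            StringBuilder sb = new StringBuilder("<articlelist>");
            for (String title : titles) {
                sb.append("<article>").append(escape(title)).append("</article>");
            }
            return sb.append("</articlelist>").toString();
        }
    }

    static final class MessageBoard implements Fragment {
        private final int posts;

        MessageBoard(int posts) { this.posts = posts; }

        public String toXml() {
            return "<messageboard posts=\"" + posts + "\"/>";
        }
    }

    // A page knows nothing about its parts beyond the Fragment interface,
    // so any component can be plugged into any page.
    static final class Page implements Fragment {
        private final String title;
        private final List<Fragment> parts = new ArrayList<>();

        Page(String title) { this.title = title; }

        Page add(Fragment part) {
            parts.add(part);
            return this;
        }

        public String toXml() {
            StringBuilder sb = new StringBuilder();
            sb.append("<page title=\"").append(escape(title)).append("\">");
            for (Fragment part : parts) {
                sb.append(part.toXml());
            }
            return sb.append("</page>").toString();
        }
    }

    public static void main(String[] args) {
        List<String> titles = new ArrayList<>();
        titles.add("First story");
        Page front = new Page("Front").add(new ArticleList(titles)).add(new MessageBoard(3));
        System.out.println(front.toXml());
    }
}
```

The page never asks what kind of component it is holding, which is the point: a message board can sit under an article, on the front page, or anywhere else without either side changing.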
The end result of all of this development is a news site with a high degree of consistency, broad range of accessibility, and huge performance gains over the former PHP site. The new morons.org runs on what is effectively a highly modular news appliance.
Want in on my technology for your own site? Contact me.