An Interview with Brett McLaughlin

by Lori Houston
05/01/2000

Brett McLaughlin worried that his presentation at the O'Reilly Conference on Java [March 2000] would be a dud. He thought his session's vague title, "Uncoupling Applications: Modular Application Architecture with J2EE," didn't adequately convey the importance of his message. And on top of that, he had to compete with a Sun Microsystems presentation on J2EE (Java 2 Enterprise Edition) next door.

But to his pleasant surprise, McLaughlin ended up with a standing-room-only audience. And the frustrations developers expressed in his session confirmed his thinking: Developers are tired of application components so tightly coupled they can't be used anywhere else. Developers want to be able to uncouple their applications from the data. And they want tools to enable their software components to talk to each other through common contracts and still handle a wide variety of different implementations.

XML affords all of this and more, according to McLaughlin, who is the author of O'Reilly's upcoming book Java and XML. As a longtime Java developer now working extensively with XML, he felt compelled to write this book. O'Reilly talked with McLaughlin about XML's rising popularity, XML as the link to fulfilling Java's "write once, run anywhere" promise, and his new book.

XML's Development and Significance

Houston:
There's a lot of talk and excitement around XML. What exactly is XML, and why is it so significant?

McLaughlin:
XML is the Extensible Markup Language, and it's been around in a much different fashion for twenty to twenty-five years. XML is actually an offshoot of SGML (Standard Generalized Markup Language), used for publishing and markup. The term "XML" was coined around 1996 and gained more widespread attention when the World Wide Web Consortium (W3C) accepted XML in late 1998 as a formal recommendation. That's when people began to realize its merit.

The excitement around XML stems from one of its most important features: It doesn't say a whole lot about itself. By contrast, HTML (which is another markup language) has a very specific set of tools--tags and attributes--that are recognized and processed only one way. XML says, "We're not intelligent enough to think of every possible way you use your data, so we'll let you use it in whatever way makes sense." You have to let your document users know your tag meanings, but XML provides for this by defining DTDs (document type definitions) to give tags concrete definitions, attributes, and all kinds of parameters.
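To make the DTD idea concrete, here is a minimal sketch in Java. The `greeting` element, its `region` attribute, and the class name `DtdDemo` are made up for illustration, and the JAXP parser API used here ships with current JDKs; it is an assumption about tooling, not something named in the interview. The DTD declares the vocabulary, and a validating parser rejects a document that ignores it.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

final class DtdDemo {
    // Hypothetical vocabulary: a <greeting> element that must carry a region attribute.
    static final String DTD =
        "<!DOCTYPE greeting ["
      + "<!ELEMENT greeting (#PCDATA)>"
      + "<!ATTLIST greeting region CDATA #REQUIRED>"
      + "]>";

    // Returns true when the document body conforms to the DTD above.
    static boolean conforms(String body) {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // turn on DTD validation
        try {
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Validity errors are only reported, not thrown, unless a handler rethrows them.
            builder.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) throws SAXException {
                    throw e;
                }
            });
            builder.parse(new InputSource(new StringReader(DTD + body)));
            return true;
        } catch (SAXException e) {
            return false; // malformed or invalid against the DTD
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("parser setup or input failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(conforms("<greeting region='Texas'>Y'all</greeting>"));
        System.out.println(conforms("<greeting>Y'all</greeting>"));
    }
}
```

The second document leaves out the required attribute, so the sketch reports it as nonconforming.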

Houston:
Why is that so important?

McLaughlin:
This adds true portability. Even with Java, which is supposedly a portable language, data still has to be represented. Year after year, project after project, Java developers like me wind up creating their own little proprietary formats to represent a particular project. When the very next project has different requirements, data must be represented differently. That's why languages like Perl are so useful; they're based on parsing text and handling different text formats efficiently. Even though we say Java is "write once, run anywhere," really it isn't. You can compile your byte code and then move it around, but you have to make sure anyone using your application can support that.

XML lets you work a little smarter by defining a standard way for any data to be defined, without defining the particular semantics. What to call attributes and elements is left up to you to define for your particular project's need. XML creates portable code and portable data, allowing for that third-party instance of business-to-business, e-commerce, and e-business--all those e's being promised right now. For the first time we really have complete application portability, something we've all been claiming for years. We're finally actually getting to a standard way to exchange data and write code.

Houston:
If XML has been around for such a long time, why is it only now being recognized as the key to true portability?

McLaughlin:
SGML had all the foundational work done, but everything still focused more on the "M" in SGML, the markup, or perhaps the L, the language, by requiring established meanings for all the languages' words. Everybody forgot to consider variables in usage. I'm from Texas, and we say, "Y'all." You won't find that in any dictionary. It means something to people in Texas but if I go up to Boston and I say "Y'all," they'll look at me with a puzzled expression and say "What? Isn't a 'yawl' some kind of a boat?"

Somebody finally realized people are going to come up with their own slang and meanings, so why not provide a format for that? And in 1996, the Extensible Markup Language (a name James Clark suggested) was developed to accomplish this. XML finally caught on because of its ability to accommodate many disparate creations and to find shortcuts without breaking specifications.

Houston:
What is your background with Java and XML?

McLaughlin:
I started developing with Java before the release of 1.0 in 1995. Before that I worked in Pascal, C, and other languages. I have always found myself in a very Web-centric environment because I came into this field right when the Web was taking off in the early '90s. When Java came along I started doing more enterprise-type applications, mainly server-side, J2EE programming (before anybody coined that term). When XML hit and became a spec, it became a significant focus of my work, and it has been for the past year and a half.

But I'm certainly not from the XML school of thought. I'm really a Java developer at heart who sees XML as something useful. That's an important distinction. One of the reasons I've gotten into Java and XML--and one reason I think this book, Java and XML, is very different from many other books on XML--is that it's from a Java point of view.

XML's Current and Potential Uses

Houston:
How is XML currently being used?

McLaughlin:
One of the biggest uses right now, probably because it's easiest for people to get their minds around, is for presentations without coupling data. With technology like JSP (Java Server Pages)--which everybody is really big on right now--your presentation and your data are mixed. This is also true of HTML. You may have a 500-line HTML or JSP page, but only 50 lines of actual data. The rest is font, style, and table tags to produce a nice page in some browser.

The problem with this is twofold. First, there is the longstanding problem of handling changes. The marketing people want to change a logo, for example, so everything has to be red instead of blue. Every developer has dealt with this. Maybe you can do a global search-and-replace. But if some data had the word "blue" in it, now you've got this wonderfully formatted page with incorrect data because everywhere it said "blue" it now says "red." People have gotten sick and tired of that. They want to be able to separate their data from their presentation. They want fifty lines of data in one file and the HTML or markup in another file, and they want to merge those when necessary.

The second, more prevalent problem lately is the advent of the Wireless Markup Language for phones, Palm Pilots, and handheld devices with Internet connectivity and pure Java browsers. Suddenly developers can no longer assume HTML clients. There may be clients that support only a subset of HTML. The knee-jerk response is to code to the lowest common denominator so all devices have nifty-looking displays. But company Web pages look horrible with only ten tags to work with. The reaction at the other end of the spectrum is to build completely different sites for different clients, incurring huge maintenance overhead.

I can write an XML document using nothing but data with my made-up elements and attribute names. I can then create an XSL (Extensible Stylesheet Language) stylesheet, which is another offshoot or specification of XML that follows the same rules as the core XML language. XSL allows you to specify markup specifics and then provides a pattern-matching approach with instructions. There's also an XSLT processor to run the document and stylesheet together to produce output.

This means I can create an XSL stylesheet for an HTML client, one for my wireless markup language client, and one for my palm client. And I can go even further. Perhaps I want to use some fancy DHTML for Internet Explorer 5. I can use a different stylesheet for that one versus the Netscape stylesheet, which doesn't support DHTML as much. I can end up with an array of different presentations, all of which can be applied to the same underlying document.
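The one-document, many-stylesheets idea can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the `catalog` data, both stylesheets, and the class name `StylesheetDemo` are made up, and the `javax.xml.transform` API used here is the one bundled with current JDKs rather than any processor named in the interview. The same data is run through two stylesheets, one for an HTML client and one for a plain-text client.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class StylesheetDemo {
    // Hypothetical data document: made-up element names, no presentation markup.
    static final String DATA =
        "<catalog><item><name>Widget</name><price>9.95</price></item></catalog>";

    // Hypothetical stylesheet for an HTML client.
    static final String HTML_STYLE =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='html' indent='no'/>"
      + "<xsl:template match='/catalog'>"
      + "<ul><xsl:apply-templates select='item'/></ul>"
      + "</xsl:template>"
      + "<xsl:template match='item'><li><xsl:value-of select='name'/></li></xsl:template>"
      + "</xsl:stylesheet>";

    // Hypothetical stylesheet for a text-only client.
    static final String TEXT_STYLE =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='item'>"
      + "<xsl:value-of select='name'/>: <xsl:value-of select='price'/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Runs one document through one stylesheet and returns the result as a string.
    static String transform(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(DATA, HTML_STYLE)); // presentation for a browser
        System.out.println(transform(DATA, TEXT_STYLE)); // presentation for a text client
    }
}
```

Adding another kind of client means adding another stylesheet constant; the data document does not change.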

Houston:
What are some other potential uses for XML?

McLaughlin:
This is where it gets exciting. The first use is getting past the antiquated notion of what a client is, that a client has to be some user. For example, in a large enterprise application, rather than thinking of an application as having clients, you could approach your application as a database with several clients. One client is an Enterprise Java Beans container because it consumes data from the database. Another client of the EJB layer is a servlet that consumes data from the EJB container. That servlet may generate an XML page for another client to actually view. And finally, some other program acts as yet another client sucking data from that program.

When we get out of the mode of thinking of people as clients and everything else as the application, we start building these very loose contracts between components. If I want to expose my EJB container and business logic to an entirely different company, that's fine. The XML data I'm sending back and forth is application neutral. Perhaps the client doesn't want to use all of the data, or they want the data to conform to some filter. In XML, the developer doesn't have to require any presentation-specific markup. Clients are not forced to deal with the developer's application paradigm.

Within the next year, people will really begin to understand that it's all just data that they can manipulate differently using XML. Right now we're still writing XML like it's HTML or some presentation model. We're still thinking linearly. But you can avoid locking yourself into a presentation model in XML. You can group all your tags at the top of the file. The stylesheet can deal with those tags at the beginning of the output for parsing document information, regardless of presentation specifics. In an XML-centric world, we can model our thoughts toward the best representation of data so that any program can use that data.

Houston:
What's an example of the implications of this?

McLaughlin:
In the last chapter of Java and XML, the O'Reilly editors let me stand on my soapbox a little about where I think XML is headed. I focus on a technology called XML Schema, which is, among other things, another way to represent the strengths of XML and what is allowable. XML Schema is an analog to a Java interface for data. When you develop an interface in Java, you define the methods. But if you want to defer how those methods are implemented, you can allow for different ways to implement them, as long as it conforms to the interface. XML Schema does the same thing for data. It doesn't care what the text is, but it must conform to certain rules you establish.

Ask anyone doing enterprise application development, and I'd wager they're spending ten to twenty-five percent of their time coding validation. For example, when a user inputs a taken or invalid user name or forgets to type in their password, a form comes back and alerts the user. It may come back several times declining subsequent tries and suggest available user names or passwords, most of which usually don't make much sense to the user. Someone has to code all this logic, and it's a real hassle. This is a fundamental necessity, yet it's just like the proprietary data format. Developers go from company to company re-coding validation into business logic with very tight coupling. A simple change like requiring eight-character user names can cause a huge impact to a very simple application because business logic is embedded into the code.


A companion article by Brett McLaughlin, Java and XML: Interested Parties Apply Here, looks at why XML has not been so easy to use from Java and tells how recent Java and XML offerings are changing things, making XML usage from Java available to all who are interested.

I believe a close marriage between things like XML Schema and the Java Virtual Machine--the actual Java interpreter--is possible. Instead of having to code this validation explicitly, you define your Java program and interface. Then you also define a separate XML Schema to make available to that interface. Now you've got very portable code, and changing something like the length of a user name requires simply going into the Schema rather than recompiling code. The XML Schema model is very close to how Java is modeled. It's very object-oriented. There are even things in the XML Schema that allow you to do inheritance and equivalency so you can extend elements for new attributes without having to redefine a new type.
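The user-name example can be sketched with the schema validation API in current JDKs (`javax.xml.validation`). This is an assumption-laden illustration, not the tight JVM integration described above: the `userName` element and the class name `UserNameRule` are hypothetical, and the schema uses the namespace of the later W3C Recommendation. The point it shows is that the eight-character rule lives in the schema text, so changing the rule means editing the schema rather than the Java logic.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

final class UserNameRule {
    // Hypothetical schema: a userName must be at least eight characters long.
    static final String SCHEMA =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='userName'>"
      + "<xs:simpleType><xs:restriction base='xs:string'>"
      + "<xs:minLength value='8'/>"
      + "</xs:restriction></xs:simpleType>"
      + "</xs:element>"
      + "</xs:schema>";

    // Returns true when the XML document satisfies the schema above.
    static boolean isValid(String xml) {
        Schema schema;
        try {
            schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(SCHEMA)));
        } catch (SAXException e) {
            throw new IllegalStateException("the schema itself is broken", e);
        }
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the document broke a rule in the schema
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<userName>brettmcl</userName>"));
        System.out.println(isValid("<userName>brett</userName>"));
    }
}
```

In a real application the schema would more likely be loaded from a file or URL, so the rule could change without recompiling anything.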

XML's Specs, Standards, and APIs

Houston:
Are there standards governing XML and its extensions?

McLaughlin:
Yes, Java and XML covers two main standards issues. But first I want to emphasize that the moment we say we want to do portable data with XML, we've made an implicit commitment to public standards. When I say I want to use XML to communicate with Hewlett-Packard, I've implicitly agreed for HP and me to have a standard among ourselves. And people are saying they want to use XML to talk to anybody. They don't want to have to bet on any one technology right now; they want portability across all boundaries. That changes the playing field.

When developers first started using Java, they were writing a wealth of non-standard extensions for Java. Due to mounting problems with this, developers had to start shipping all those extensions with their code. Then JDBC (Java Database Connectivity) came out to standardize Java for databases. The same thing has happened with servlets and EJB. If developers want to do distributed computing, they need a standard to make their code portable. As soon as somebody realizes their need is a common one, they should work to develop a common standard to keep data portable.

XML is undergoing this same sort of philosophy. Developers want to be able to handle presentations, so XSL was developed as the Extensible Stylesheet Language. Other things are cropping up like XQL, the XML Query Language, which handles database access in XML by defining a standard way to represent SQL queries and return results. XML is changing at an incredible pace. The XML 1.0 spec was finalized in late 1998, and here in early 2000 the number of extensions is starting to get almost humorous: XML, XSL, XSLT, XPath, XLink, XPointer, and XSP--all of which have become important specifications in a year and a half's time.

Houston:
Is there an official XML community developing these specifications?

McLaughlin:
The biggest body involved with XML currently is the World Wide Web Consortium (W3C), which will come out with the pure XML specs, including XSL and XML Schema. Other XML tools have made their way into W3C, like XQL, developed and hosted at the University of North Carolina's metalab project. But the XML specs are coming mainly out of W3C working groups.

Most XML APIs (Application Programming Interfaces) are coming out of the XML community. The two most popular XML APIs for Java are SAX, the Simple API for XML, and DOM, the Document Object Model, which is actually used for lots of other things. David Megginson and others from the XML-DEV list, one of the largest XML communities, hammered out SAX. The W3C formalized the DOM.
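For readers who have not used either API, the difference in style can be sketched as follows, using the SAX and DOM implementations bundled with current JDKs. The `order` document and the class name `ParserComparison` are made up for illustration. SAX pushes events to callbacks as the parser reads; DOM builds the whole tree in memory and lets the program query it afterward.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class ParserComparison {
    // Hypothetical document used by both approaches.
    static final String XML = "<order><item>book</item><item>pen</item></order>";

    // SAX: the parser calls back as it meets each start tag, text run, and end tag.
    static List<String> itemsViaSax(String xml) {
        List<String> items = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder text; // non-null only while inside an <item>

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                if ("item".equals(qName)) text = new StringBuilder();
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (text != null) text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                if ("item".equals(qName)) {
                    items.add(text.toString());
                    text = null;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("SAX parse failed", e);
        }
        return items;
    }

    // DOM: the parser builds the full tree first, then the program walks it.
    static List<String> itemsViaDom(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName("item");
            List<String> items = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                items.add(nodes.item(i).getTextContent());
            }
            return items;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("DOM parse failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(itemsViaSax(XML));
        System.out.println(itemsViaDom(XML));
    }
}
```

Both methods return the same list; the trade-off is that SAX never holds the whole document in memory, while DOM allows random access to it.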

Java and XML: The Book

Houston:
Do you express any preference in your book regarding DOM or SAX?

McLaughlin:
It's very apparent that the people who wrote these APIs had an XML base of knowledge and then ported it to Javaland. They also ported all the same APIs to C, Perl, and all these different languages. That's really nice cross-platform compatibility, but it's not very Java-centric.

The XML-DEV mailing list came up with SAX. It's great for XML-based specifications, but for Java APIs, I'm not convinced either SAX or DOM is a good solution. They require learning new constructs, when Java already has perfectly good alternatives. One of the things I'm proudest of in Java and XML is introducing JDOM, a completely new, pure Java interface. Jason Hunter, a Java developer who is using many of the new Java-XML tools (and whose Java Servlet Programming book is the best there is), worked with me on developing this. JDOM provides for developing in XML in a Java-centric world without having to learn new constructs. JDOM also approaches XML data without any intention of porting it to C.

Chapter 8 in the book covers all the JDOM classes, and there is a complete appendix with an API reference. It's also available at http://www.jdom.org. It's a rather revolutionary approach, but I believe people do not like dealing with some of the weird things required to talk to the DOM. I go a step further by taking every example for the rest of the book and demonstrating how to rewrite these using JDOM. I am also putting my money where my mouth is and actually using a beta form of JDOM in production already.

The JDOM API will be open source, and by the time the book is released, there may be another completely native implementation not built on DOM or SAX. The book's appendix includes JDOM in the API reference along with SAX and DOM. I think JDOM is tremendously helpful for Java developers learning XML because it doesn't force them to be XML gurus. They can learn about XML through a Java worldview in which data is portable.

Houston:
So what is the audience for your book?

McLaughlin:
The book is probably best for those who know Java at least on an introductory level and have written code. They shouldn't stumble over javac or classes and interfaces. But the book assumes no XML knowledge at all. The reader doesn't even have to know what XML is; I explain it in the first chapter.

When we sent the book out for technical review, we gave the draft to two people with no XML knowledge at all--and one of them had never written a servlet or anything but some stand-alone Java code. In their feedback, they indicated no trouble. Both of them understood all of it. And this is not a "Java and XML in a Nutshell" level book! Not only does it teach fundamentals and how to do it, but it also tells you why something is important. Even for someone familiar with SAX and DOM who has been doing XML in production for years, this is still a great book, because the last six chapters cover topics they would have had to figure out on their own.

Houston:
What are some other important topics covered in your book?

McLaughlin:
There's a "Creating XML" chapter that goes step-by-step through creating a complete XML file, teaching all the constructs and elements, and assuming no knowledge of XML. The book covers how to parse that with Java and how to create XML and XSL stylesheets. There's also the latest coverage of SAX 2.0, which is now feature-frozen, and DOM Level 2, and some things coming in its next version. A business-to-business chapter discusses how to use many exciting technologies, such as RSS (Rich Site Summary), a means of transforming data and passing it around like channels in browsers. And Chapter 9 focuses on Web-publishing frameworks, showing how to use the Apache Cocoon project.

The book approaches all these topics as an application developer by showing how to download and use existing, standards-based tools, then building on top of this to create really good applications instead of mediocre applications built from scratch. That's an approach I haven't seen in other XML books, particularly those with a Java focus. Java and XML is for anyone who is doing enterprise application development, using Apache Cocoon, or who wants to know XML-RPC or how to write Perl applications that talk XML and spit it back out to Java. It definitely covers a wide breadth of subjects.

Houston:
I understand yours is the only book to cover XML-RPC?

McLaughlin:
Yes, and XML-RPC has been a hot topic. As evidence of this, one of our technical reviewers said, "I don't see any value in this at all," and another reviewer said, "This is the best thing I've ever found out about XML." To me this illustrates that XML is useful in so many different situations that we needed to cover as much as we could in the book.

XML-RPC is important because it provides an alternative to RMI, remote method invocation, in Java. RMI is a very useful but heavy-duty means of accessing objects on remote servers and performing foreign functions remotely. XML-RPC is easier to use, making it a good entry point for distributed systems. The book teaches the basic concepts. XML-RPC allows talking XML with a server that can't talk RMI but still needs to handle calls. XML-RPC lets you send your data and requests in XML, and the server then executes them. It also tells you what happened, and speaks XML back to you whether you're using Perl, CORBA, C, or anything else.
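To show what "sending your request in XML" looks like on the wire, here is a minimal sketch that builds an XML-RPC request body by hand. The element names (`methodCall`, `methodName`, `params`, `param`, `value`, `string`) come from the XML-RPC specification; the method name `catalog.lookup`, its argument, and the class name `XmlRpcRequest` are hypothetical. The sketch only builds the payload and handles string parameters; a real client would POST it over HTTP and would normally use an XML-RPC library instead.

```java
final class XmlRpcRequest {
    // Builds an XML-RPC <methodCall> document whose parameters are all strings.
    static String build(String methodName, String... stringParams) {
        StringBuilder xml = new StringBuilder("<?xml version=\"1.0\"?>");
        xml.append("<methodCall><methodName>")
           .append(escape(methodName))
           .append("</methodName><params>");
        for (String param : stringParams) {
            xml.append("<param><value><string>")
               .append(escape(param))
               .append("</string></value></param>");
        }
        return xml.append("</params></methodCall>").toString();
    }

    // Escapes the characters that are not allowed as-is in XML text content.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        // Hypothetical remote procedure and argument.
        System.out.println(build("catalog.lookup", "Java & XML"));
    }
}
```

Because the request is plain XML text, any language that can produce and parse such a document can act as client or server.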

Covering XML-RPC is intrinsically important due to a kind of rebirth. RPC was popular ten years ago and died out because everyone was writing proprietary data formats. RPC provides a way to communicate across the network, but it never had a good way to represent data uniformly. That's what XML does. XML-RPC also forces developers into a new paradigm.

Houston:
Do you expect to write more books about XML because of its rapidly moving development?

McLaughlin:
I was working on another Java book for O'Reilly on enterprise applications when they asked me to do Java and XML because this is such an important topic. The next step for me is to finish that first book, and I'm personally very excited about that. Really it's the next step once someone has read Java and XML and wants to tackle a huge distributed application running on eighteen different servers. The enterprise book is a more advanced approach for someone who understands EJB and servlets. It walks through building a complete enterprise application and addresses such questions as: How do I build loose contracts? How do I push out XML to the client? How do I build a services architecture in which I can request that things happen remotely? It covers using XML-RPC for services and XML for validation, among other things.

I've tried to write both these books with the real world in mind. The cool thing is that I didn't have to make up any examples for the book. Everything is based on valid ideas that I've already put in production. All the code in both of these books--the examples, the JDOM and other interfaces--is going to be made available to the public as open source and maintained on a Web site. I'm committed to this stuff being usable. I want these to be the kinds of books that don't just sit on your bookshelf but actually lie open on your desk. [Editor's Note: O'Reilly doesn't yet have a title for Brett's new book, but we plan to release it in the winter of 2000.]


O'Reilly & Associates will release Java and XML in June 2000.


Brett McLaughlin works as an Enterprise Java consultant at Metro Information Services, and specializes in distributed systems architecture. He is author of the upcoming Java and XML (O'Reilly). He is involved in technologies such as Java servlets, Enterprise JavaBeans, XML, and business-to-business applications. He is an active developer on the Apache Cocoon project, EJBoss EJB server, and a co-founder of the Apache Turbine project.