Java is not just a tool for creating animated icons on web pages anymore. It's a full-fledged programming language that is often the first choice for many coders today. The language's clean structure, automated memory management, and webcentric design make it one of the fastest development tools around. Elliotte Rusty Harold's latest book, Java I/O, is a close look at how to move data in and out of programs written in Java. We asked him to sit down for a few minutes and discuss what Sun is doing right and wrong with Java's I/O.
- Wayner:
- Which Java programmers are going to be interested in Java I/O?
- Harold:
- All of them. Almost every non-trivial program eventually makes use of I/O for one purpose or another. While textbook examples often limit themselves to command line arguments and System.out.println(), real world programs and real world programmers need to write files, open sockets, encrypt data, talk to serial port devices, communicate with databases and do a whole hell of a lot more. Using well-designed I/O as opposed to System.out.println() is one of the distinguishing factors between amateur and professional programmers.
- Wayner:
- Many people know Java as a tool for making applets on the Net. It's really a fully functional programming language. Do you think that some of the tools are better than C++'s?
- Harold:
- There's no question that Java's I/O tools are far more sophisticated, powerful, and easy to use than C and C++'s. I/O in C, C++, and most other languages is hamstrung by the assumption that what's being read or written is a 1970s era dumb terminal, or at least something very much like it. Java is the first major language to throw out this assumption. The designers of Java recognized that reading and writing files, network connections, and serial port devices was a lot more important than toy programs from CS101 that read a number from the command line and squared it.
Unfortunately, precisely because Java I/O is structured so differently from I/O in the languages most of us grew up with, many programmers don't realize how simple and powerful it really is. The questions I hear from my students and on various newsgroups and mailing lists tell me that most people aren't asking the right questions. For instance, the single most frequently asked question is how to read a number from the console. While the answer they want is fairly simple (attach a BufferedReader to System.in, call readLine(), and pass the string returned to Integer.parseInt()), the fact is they probably shouldn't be doing this in the first place.
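The console recipe mentioned above (a BufferedReader attached to System.in, a call to readLine(), and Integer.parseInt() on the result) might look roughly like the sketch below. It is an illustration only, not code from the book: the class name ReadNumber and the helper readInt() are hypothetical, and accepting any Reader rather than System.in directly is an assumption made so the same method works on files and sockets too.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;

// Hypothetical demo class; the name is not from the interview or the book.
class ReadNumber {

    // Reads one line of text from any character source and parses it as an int.
    static int readInt(Reader source) throws IOException {
        BufferedReader in = new BufferedReader(source);
        String line = in.readLine();
        if (line == null) {
            throw new IOException("end of input before a number was read");
        }
        return Integer.parseInt(line.trim());
    }

    public static void main(String[] args) {
        try {
            // System.in is a byte stream; InputStreamReader turns it into characters.
            int n = readInt(new InputStreamReader(System.in));
            System.out.println(n + " squared is " + (n * n));
        } catch (IOException | NumberFormatException e) {
            System.err.println("Could not read a number: " + e.getMessage());
        }
    }
}
```

Because readInt() takes a Reader, the same call works unchanged on a StringReader or a FileReader, which fits the point that the console is only one of many possible sources.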
Although it's the students who keep asking for some equivalent of C's scanf(), I blame the professors for not teaching them better. You see this in a lot of introductory Java books that start off by introducing a class library that recreates scanf() or readln(). What this tells me is either:
- The authors don't really understand Java, particularly Java I/O
or
- The authors are too lazy to rewrite the same tired, old Pascal exercises they've been using for the last twenty years.
In 1999, user interaction should take place through a GUI, not the console. We need to introduce GUI design and programming at a much earlier point in the typical CS curriculum. Right now I'm teaching a graduate level Intro to Java course, and probably no more than 10% of the class has any significant experience of GUI programming.
Of course, just because user interaction should take place through a GUI doesn't mean that traditional I/O isn't important anymore. But once you remove console access as the driving force behind I/O, you can design a much cleaner I/O interface that really supports files, sockets, and other uses. And indeed that's exactly what Java's done.
- Wayner:
- What about applet developers who are targeting web distribution?
- Harold:
- Java's security model severely restricts the I/O functions an applet running in a Web browser can perform. The new Java 1.2 security model is not supported by major browsers and is thus of theoretical interest only. Furthermore, most users wisely won't grant extra permissions to applets just to make Web developers' lives easier. Consequently, the main I/O a typical applet will perform involves network socket connections back to the host from which it came. Object serialization and RMI are particularly useful for this purpose.
- Wayner:
- When Microsoft started to form its own splinter group from Java, it attacked things like Sun's version of RMI. What do you think of the model?
- Harold:
- Microsoft attacked RMI because it competes directly with Microsoft's own Windows-dependent DCOM. And it created ActiveX to try to compete with Java as a means of embedding active content in Web pages. But neither DCOM nor ActiveX has gone anywhere in the Web development arena, and Microsoft's abandoned them there for all intents and purposes. However, the fact remains that RMI and its underlying object serialization scheme are horribly slow. Furthermore, RMI is Java-dependent. Most large, real-world, non-applet projects I'm familiar with have chosen to go with CORBA instead.
- Wayner:
- Do you think there's much hope for the NC model of downloading software as Java applets? The new 1.2 model (oops 2.0) at least creates the possibility.
- Harold:
- When it happens, it won't be on anything we recognize as a PC. A video game console or set top box is a much more feasible platform for this model.
- Wayner:
- Has Sun put in enough "multiculturalism" with Unicode and its other features? That is, has Sun done enough to make cross-platform apps work across borders?
- Harold:
- Java's I/O classes have been fully internationalized since Java 1.1. The main problem is that programmers don't know this because they're trying to force Java I/O to fit into the model of a non-internationalized language like C or Pascal, and it doesn't. Java cleanly separates the reading of raw bytes of data from reading characters of text. It also separates the formatting of numbers as strings and conversion of strings to numbers from console I/O. Once you understand how the different languages perform the logically separate functions, and how all the layers connect together, then performing operations that would be unmanageably complex in other languages becomes almost trivial.
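The separation described above can be sketched in a few lines. This is only an illustration under stated assumptions: the class name LayersDemo and the helpers decode() and parseNumber() are hypothetical, and the byte values and locale were picked for the example. An InputStream supplies raw bytes, an InputStreamReader decodes them into characters using a named encoding, and java.text.NumberFormat converts text to numbers without touching any stream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

// Hypothetical demo class; names are illustrative only.
class LayersDemo {

    // Bytes to characters: the encoding is chosen here, not in the byte source.
    static String decode(InputStream rawBytes, String encoding) throws IOException {
        Reader chars = new InputStreamReader(rawBytes, encoding);
        StringBuilder text = new StringBuilder();
        int c;
        while ((c = chars.read()) != -1) {
            text.append((char) c);
        }
        return text.toString();
    }

    // Text to number: locale-aware and independent of any I/O class.
    static double parseNumber(String text, Locale locale) throws ParseException {
        return NumberFormat.getInstance(locale).parse(text).doubleValue();
    }

    public static void main(String[] args) throws Exception {
        byte[] bytes = {(byte) 0xC3, (byte) 0xA9}; // the UTF-8 encoding of U+00E9

        // The same two bytes are one character in UTF-8 but two in ISO-8859-1.
        System.out.println(decode(new ByteArrayInputStream(bytes), "UTF-8").length());      // 1
        System.out.println(decode(new ByteArrayInputStream(bytes), "ISO-8859-1").length()); // 2

        // German locale: the comma is the decimal separator.
        System.out.println(parseNumber("1,5", Locale.GERMANY)); // 1.5
    }
}
```

Swapping the encoding name or the Locale changes the interpretation without changing the byte source, which is the layering the answer refers to.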
- Wayner:
- How could Sun enhance/extend the I/O classes?
- Harold:
- There's no support for little endian data, or other byte orders like VAX floating point numbers. I do, however, show readers how to write stream classes that understand these formats themselves. These classes can be connected to the standard stream classes in a straight-forward fashion.
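As a rough idea of what such a stream class could look like, here is a minimal sketch of a filter stream that reads little-endian 32-bit integers. It is not the code from the book: the class name LittleEndianInput is hypothetical, and only readInt() is shown. It plugs into the standard classes by extending java.io.FilterInputStream, so it can wrap any InputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical class name; a sketch of one possible design, not the book's code.
class LittleEndianInput extends FilterInputStream {

    LittleEndianInput(InputStream in) {
        super(in);
    }

    // Reads four bytes, least significant byte first, and assembles an int.
    int readInt() throws IOException {
        int b0 = in.read();
        int b1 = in.read();
        int b2 = in.read();
        int b3 = in.read();
        // read() returns -1 at end of stream, which makes the OR negative.
        if ((b0 | b1 | b2 | b3) < 0) {
            throw new EOFException("stream ended inside a 4-byte integer");
        }
        return (b3 << 24) | (b2 << 16) | (b1 << 8) | b0;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {0x78, 0x56, 0x34, 0x12};
        LittleEndianInput le = new LittleEndianInput(new ByteArrayInputStream(data));
        System.out.println(Integer.toHexString(le.readInt())); // 12345678
    }
}
```

The same wrapper could sit on top of a FileInputStream or a socket's input stream, since it only relies on the underlying stream's read() method.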
- Wayner:
- Do you think that there's a need for more translators?
- Harold:
- There are a few obvious needs like support for the new Latin-0 character set that includes Euro support. Furthermore, Unicode 3.0 will be out in a few months and that will require some minor, behind-the-scenes adjustments that are unlikely to affect most people's code. And IBM claims that Sun messed up the EBCDIC translators. (Personally I blame IBM for that one. If they'd ever bothered to standardize and document EBCDIC in the first place, it wouldn't be the mess it is today.) Other than that, Sun's done a damn good job of supporting the most common character sets.
- Wayner:
- Many people complain about differences between Java implementations. Have you noticed any significant problems? Any places to concentrate upon?
- Harold:
- The biggest issue for I/O is the java.io.File class. Although it's gotten better in Java 1.2, it still shows its Unix roots. It works well on Unix, OK on Windows, and fails miserably on the Mac. There are simply too many assumptions about what a file is and what a file name looks like that don't apply outside the Unix environment. There are also some hidden issues. For example, Sun assumes that two separate characters that never appear in file names are available to be used as file separator and path separator in the class path. On the Mac that isn't true. There's only one character that can't appear in a file name. This has required some nasty hacks from Apple's Java porting team.
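The two separator characters mentioned above are exposed as File.separator and File.pathSeparator. The sketch below shows one way to avoid hard-coding either of them; the class name Separators and its helper methods are hypothetical and exist only for illustration. On Unix the two values are typically "/" and ":", and on Windows "\" and ";".

```java
import java.io.File;
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
class Separators {

    // Builds a path from a directory and a file name using the platform's file separator.
    static String join(String dir, String name) {
        return new File(dir, name).getPath();
    }

    // Splits a class-path-style string on the platform's path separator.
    static String[] splitPath(String path) {
        return path.split(Pattern.quote(File.pathSeparator));
    }

    public static void main(String[] args) {
        System.out.println("file separator: " + File.separator);
        System.out.println("path separator: " + File.pathSeparator);

        String classPath = join("lib", "a.jar") + File.pathSeparator + join("lib", "b.jar");
        for (String entry : splitPath(classPath)) {
            System.out.println(entry);
        }
    }
}
```

Code written this way still assumes that the two separators are distinct characters that never occur inside a file name, which is exactly the assumption the answer says does not hold on the Mac.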
- Wayner:
- One last question: do you think Sun has done a good enough job writing Java to fit the lowest common denominator among operating system features? Or were they UNIX centric in other places as well?
- Harold:
- The AWT is a cross-platform disaster. Writing a class library that makes porting GUIs across Windows, the Mac, and Motif easy is extremely difficult. But Sun really didn't even try. Things have gotten better with Swing and Java 1.2, but Java's still crippled by decisions made by Unix programmers years ago who only had a vague picture of how Windows and the Mac worked.
The networking classes like java.net.Socket and java.net.ServerSocket are also quite Unix-centric. However, in that case both Windows and the Mac have been slowly moving to a Unix-like model for their native networking for several years. Consequently, the Unixisms in networking aren't quite so obvious.


