Foreword

I am pleased and flattered that my former Ph.D. student, Doug Lyon, has asked me to contribute a foreword for his new book. As I tell my students, you can't ever divorce your thesis advisor.

This foreword raises some questions and suggests some answers. The book tells you how to manipulate images, in the most ambitious computer environment ever devised, to find answers for yourself.

Where do digital images come from, and where do they go? What can be done with them? Where can one find the best tools for image processing and analysis? How does one know if one has done it right?

Clearly, the World Wide Web is the most plentiful source of images since cave drawings. There are millions of satellite photos, micrographs, maps, sketches, facsimiles of books and articles, pictures of objects for sale and of objects to be admired. One convenient taxonomy divides this bottomless database into

(1) graytone and color pictures of real objects and scenes (photos),

(2) contrived pictures (graphics, virtual reality, scientific visualization),

(3) scanned engineering drawings, maps, schematic diagrams (line drawings), and

(4) images of printed text (text images).

Each of these categories requires a somewhat different approach. This book concentrates on the first two types.

Not only are more images posted daily, but also the same images are massaged and recirculated. Digitized pictures are immortal! But the Web is not a closed ecosystem: it interfaces with all of science, art and society (not to mention business). These are the ultimate source and destination of this visual cornucopia. The book benefits from the author’s having his feet firmly planted in all these worlds.

Image processing transforms one picture into another, while image analysis extracts "information" from a picture. The selection of topics is based on the consensus that has emerged over the last four decades about the most useful building blocks shared by image applications of both kinds.

To begin with, considerable effort may be necessary just to compensate for the effects of camera or scanner distortions. Calibration techniques for correcting sensor/transducer anomalies are based on digitized test charts with known properties. Image displays should also be calibrated. Unlike some of the more theory-oriented texts, this book shows examples of useful test charts.

The most elementary operations change the value of each pixel (picture element) depending on the distribution of pixel values in the entire picture or in some neighborhood. For example, filters attempt to suppress insignificant detail and preserve significant features. Edge detectors look for significant transitions, which may be combined by boundary following. Thresholding and segmentation seek to partition the image into relatively homogeneous regions. Texture analysis delineates deterministic (brick) or statistical (grass) regularities. Thinning, skeletonizing and distance transforms attempt to find economical representations of dominant shape features. Color coordinate transformations, a special strength of the author, link objective and subjective aspects of color. Geometric operations allow registering (overlaying) one picture of an object onto another picture of the same, or a similar, object.

Following the success of frequency-domain analysis in communications and signal processing (also fertile grounds for Java, as lambently discussed in Doug's earlier book), two-dimensional integral transforms with sinusoidal, rectangular or wavelet kernels bring spatial frequency techniques into play. Mathematical morphology is, in some sense, the discrete analog of convolution in linear systems theory. Another important topic is that of compression. Over the years, specialized lossy and lossless methods have been developed for high-contrast (fax) documents, for gray-scale and halftones, for color, and for video.

In the sixties and seventies, image software was written mainly in assembly languages or in FORTRAN, and a typical image size was 256x256 pixels (except for satellite pictures). In the eighties dawned the era of C, extended eventually by the advent of C++. As computer storage expanded, images grew to 1024x1024 pixels and then kept on growing. For a wallet-sized snapshot-quality picture, or an A-sized page of laser-printer quality print, 4096x4096 pixels are necessary. Digital video processing (as opposed to display) still requires a hefty installation.

Java, both language and environment, is likely to be the next universal paradigm. This book eloquently argues its merits, but is not blind to its shortcomings. The expansion of toolkits and libraries is gradually raising the level of abstraction at which the image programmer works, putting more complex projects within the reach of a single individual. At the same time, the portability of the Java language facilitates large-scale collaboration.

One difficult question that remains is how to know if you have done it right. Quantitative evaluations are hard to come by in this field, and assessments by panels of experts are expensive and confusing. It is always a good idea to compare the results of alternative methods on the same data: consistency is not only the virtue of small minds. The effort invested in collecting a large set of test images is well spent. But the separate components of an image-based system cannot be fully evaluated in isolation. At our current level of understanding, there is no real alternative to building a complete system and testing whether it fulfills its objectives.

Perhaps in another book (Pattern Recognition in Java?), the author will show us how to classify pixels, features, objects, and entire pictures. Applications include modeling and statistical characterization of images, indexing and content-based retrieval. Also just over the horizon is Computer Vision in Java for robots, telemedicine and industrial inspection.

The narrative that follows is written with verve and gusto. It draws deeply on the author's not-always-painless experiences in programming, picture processing, computer art, and teaching computer skills. The program snippets are elegant and translucent. The image processing techniques presented are copiously illustrated with examples based on the sensuous Mandrill photograph.

This may not be the very best book that will ever be written on image processing in Java, but it is the first book. It takes bold vision and fast footwork to be first. First books on a subject are often far more influential than more elaborate later works that follow well-traveled (and muddied) trails. We expect that this book will play a significant role in the current paradigm shift. Enjoy it!

George Nagy

Professor of Computer Engineering

Rensselaer Polytechnic Institute

October 1998, Uppsala, Sweden