Web Forms and CGIs: Making Web Pages Interactive
by Chris Savage and Linda Lee
Network Notes #19, December 1995
Introduction

The World Wide Web (WWW) continues to sustain unparalleled growth in the volume of information published and in its total number of users. The reasons for its growth are complex. End users are drawn to the WWW because it integrates multimedia data types such as text, images, sound and, to a lesser extent, video. Information producers find the WWW appealing because of its immense potential audience and its technological capabilities for publishing. They view the WWW as a synthesis of publishing media, combining the sophisticated layout of print documents and the diversity of multimedia formats with the mass-market distribution of radio and television broadcasting. Yet the WWW is clearly more than a broadcasting technology. It is interactive, permitting bi-directional communication between information publishers and end users. For this reason, it is really both a mass-market broadcasting technology and a communication technology. This issue of Network Notes addresses the interactive aspect of the WWW: forms embedded in WWW documents gather information, and accompanying executable programs, known as Common Gateway Interface (CGI) scripts, process queries and respond to end-user requests.
Defining HTML forms

Forms are interactive, dynamic documents that permit users to enter data, check off preferences, select options, ask questions and provide comments. Printed forms are simple to design and use; blank space is provided for users to add comments or check off options. However, the authors of printed forms cannot control how users complete them, except by encouraging a type of response with clearly written instructions. In this respect, electronic forms are different. Electronic forms created in the Hypertext Markup Language (HTML) can compel users to complete the forms in a particular way by refusing alternative submissions. Options can be limited to one from a selection of many or extended to include multiple choices. Space can be reserved for comments to be entered and file-size limits can be easily defined. Compared to their print counterparts, HTML forms have more powerful functionality, can be simpler to use when they are well designed, and are more responsive to user needs because they return immediate results. Best of all, most of the functionality can be gained by using simple HTML tagging. This means that little labour is required to create powerful, responsive and sophisticated forms in HTML for use on the WWW.
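
The kinds of control described above can be illustrated with a short sketch. The form below is hypothetical: its field names, the ACTION address and the 40-character limit are assumptions made for illustration only. Radio buttons limit the user to one choice from many, checkboxes allow multiple choices, a TEXTAREA reserves space for comments, and MAXLENGTH caps the length of a text entry. The markup is held in a Python string, and a small parser lists the controls it finds.

```python
from html.parser import HTMLParser

# Hypothetical registration form. The field names, the ACTION address and
# the MAXLENGTH value are illustrative assumptions, not taken from the article.
FORM_HTML = """
<FORM METHOD="POST" ACTION="/cgi-bin/register">
  Name: <INPUT TYPE="text" NAME="name" MAXLENGTH="40">
  <INPUT TYPE="radio" NAME="status" VALUE="student"> Student
  <INPUT TYPE="radio" NAME="status" VALUE="staff"> Staff
  <INPUT TYPE="checkbox" NAME="topic" VALUE="forms"> Forms
  <INPUT TYPE="checkbox" NAME="topic" VALUE="cgi"> CGI
  <TEXTAREA NAME="comments" ROWS="4" COLS="40"></TEXTAREA>
  <INPUT TYPE="submit" VALUE="Register">
</FORM>
"""


class ControlLister(HTMLParser):
    """Collect (tag, type, name) for every input control in a form."""

    def __init__(self):
        super().__init__()
        self.controls = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag in ("input", "textarea", "select"):
            attributes = dict(attrs)
            self.controls.append(
                (tag, attributes.get("type", ""), attributes.get("name", ""))
            )


def list_controls(html_text):
    """Return the controls found in html_text, in document order."""
    parser = ControlLister()
    parser.feed(html_text)
    parser.close()
    return parser.controls


if __name__ == "__main__":
    for control in list_controls(FORM_HTML):
        print(control)
```

Radio buttons that share a NAME form a single one-of-many group, while checkboxes that share a NAME may each be selected independently.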
Forms: What good are they?

A popular use of forms is to enable users to search a collection of documents. Search forms can offer simple full-text searches for occurrences of words, or elaborate multi-field searches using Boolean operators, limiters, and wildcard truncation. Other popular uses of forms are soliciting reference questions and comments, conducting surveys, placing orders, registering for services, subscribing to mailing lists, and reporting faults and errors. In libraries, forms can also be used to request acquisitions and interlibrary loans, to check circulation status and borrower information, and to act as an interface to the OPAC.

The cycle of a form action -- from simple document transfer to CGI script

The most common action on the WWW is a simple document transfer:
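
The contrast between a simple document transfer and a form submission can be sketched as the request text a browser might send in each case. This is a minimal illustration, assuming the form uses the GET method so that its contents travel in the URL. The paths and field names are hypothetical.

```python
from urllib.parse import urlencode


def document_request(path):
    """Request a browser sends to fetch a plain document."""
    return f"GET {path} HTTP/1.0\r\n\r\n"


def form_request(script_path, fields):
    """Request for a form submitted with GET: the form's name/value
    pairs are encoded and appended to the script's URL after a '?'."""
    return f"GET {script_path}?{urlencode(fields)} HTTP/1.0\r\n\r\n"


if __name__ == "__main__":
    # Hypothetical document path and CGI script path.
    print(document_request("/notes/issue19.html"))
    print(form_request("/cgi-bin/register", {"NAME": "Ralph"}))
```

In the first case the server simply returns the named file. In the second, the part after the question mark is handed to a CGI script for processing.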
To each form its CGI script

It is critical to note that without a CGI script an HTML form is useless. A WWW server using HTTP alone cannot interpret the contents of a form, so a CGI script is required to supplement the limited capabilities of HTTP. Moreover, each CGI script is created for a particular form, so the matching script must be used. This is because the form contains variable and value fields, for example, the variable "NAME:" and the value "Ralph". The CGI script collects the variables and their matching values, then processes the data in a predetermined way, such as registering the user "Ralph" in a database. Depending on the purpose of the CGI script, the server may or may not return a result to the browser. In most cases, a message is sent to notify the user that the form was received and processed. Using the given example, the CGI script may be designed to collect the data, write each field into a database of subscribers, create a new HTML document with the value of the form's variable "NAME:" inserted into a string of text, and then send the document back to the user. For instance, after updating the database, the CGI script may generate a document called "register.html" that the browser displays as "Hello Ralph, you are now registered in our records."
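
The registration example can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a definitive script: the form is assumed to arrive as URL-encoded text in the QUERY_STRING environment variable (the GET method), the variable is assumed to be named NAME, and an ordinary Python list stands in for the subscriber database. The function name is hypothetical.

```python
import os
from html import escape
from urllib.parse import parse_qs


def handle_registration(form_data, subscribers):
    """Process one submitted form and return the HTML reply.

    form_data   -- URL-encoded name/value pairs, e.g. "NAME=Ralph"
    subscribers -- a list standing in for the subscriber database
    """
    fields = parse_qs(form_data)            # {"NAME": ["Ralph"]}
    name = fields.get("NAME", [""])[0].strip()
    if not name:
        return "<HTML><BODY>Please enter a name.</BODY></HTML>"
    subscribers.append(name)                # "register" the user
    # Escape the value before inserting it into the reply document.
    return ("<HTML><BODY>Hello %s, you are now registered in our records."
            "</BODY></HTML>" % escape(name))


if __name__ == "__main__":
    database = []
    body = handle_registration(os.environ.get("QUERY_STRING", ""), database)
    # A CGI script replies on standard output: header, blank line, document.
    print("Content-Type: text/html")
    print()
    print(body)
```

Because the script looks for the variable NAME specifically, it only works with a form that supplies that variable, which is why each form needs its matching script.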
Creating forms and CGI scripts

Creating HTML forms is relatively simple; developing CGI scripts, however, can be a laborious endeavour. The most difficult part of creating an HTML form is selecting its content. If the form already exists in printed format, content selection is significantly simplified. For this reason, it is wise to adapt well-conceived, test-proven printed forms to the electronic format whenever possible. Because of the distinct properties of the two media, there will be some differences in their utility and success but, in most cases, printed forms function well in electronic format.

Once the content of a form is decided, the actual HTML coding is reasonably straightforward. Numerous HTML instruction books and documents on the WWW assist with the HTML markup of forms. Before the form is marked up in HTML, however, the accompanying CGI script should be sketched out. The form and CGI script are interdependent: the CGI script expects to receive values associated with the form's variable names and, in turn, executes an appropriate command. Therefore, the form's design will affect the CGI script and vice versa.

Writing the CGI script is the more complex task. Designing CGI scripts crosses over into the domain of computer programming and brings with it the responsibilities that accompany every programming venture: testing, debugging and finding security holes. If the CGI developer is not a skilled programmer, it is advisable to borrow test-proven CGI scripts from public archive sites rather than develop a custom CGI script. Many standard forms, such as simple search forms, requests for feedback, reference questions and online registration, are widely distributed with matching CGI scripts. Frequently, these form/CGI script pairs can be implemented with little or no editing, sparing developers the frustration of developing a custom HTML form and matching CGI script.
Yet there are some advantages to developing custom CGI scripts. The most significant is that they can be designed to query or input data directly into an existing database, using the established fields and formulae. Custom CGI scripts can also be written to comply with site-specific security procedures and to use system resources more efficiently.

The CGI specification permits a script to be written in any programming language, provided the host system can execute it. CGI scripts can be written in compiled languages such as C, C++, Pascal, Visual Basic and Fortran, or in interpreted languages such as Perl, AWK, sed, TCL, DOS batch or Unix Bourne shell. The first decision a CGI developer will make is whether to use a compiled or an interpreted programming language. How do these differ? Source code written in a compiled language must be fully compiled, or converted, into the platform's native machine code before it can be run. Once compiled, two versions of the same program exist: the source code, which cannot be run by itself, and the compiled program, which the operating system can execute directly. Compiled programs are closely tied to the platform; thus, the same source code must be compiled separately for each type of platform. Scripts written in interpreted languages, however, can be used across multiple platforms, as long as the computer has an interpreter to convert the script line by line into native instructions. Unlike a compiled program, an interpreted script is never saved as an executable program. It is interpreted each time it is run. This extra translation step causes interpreted scripts to execute more slowly than compiled programs. Balanced against this drawback, interpreted languages are generally simpler to learn than compiled languages, and interpreted scripts are easier to debug than compiled programs. Also, a script can be stopped at any time, edited and retested with little effort.

For these reasons, the current trend in CGI programming is to use interpreted languages; of these, the most popular is Perl. Freely available, Perl runs in the Unix environment (still the dominant operating system for WWW servers), but has also been ported to Windows NT, DOS, VMS and Mac environments. It handles strings of text particularly well and borrows many of the strengths of C and Bourne shell. Several books and documents on the WWW are devoted to teaching Perl, and the newsgroup comp.lang.perl is available for discussing Perl-related issues.
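
The reason the language does not matter is that the server and the script communicate only through environment variables and standard output. The sketch below imitates, very roughly, what a server does for a GET request: it sets the standard CGI variables REQUEST_METHOD and QUERY_STRING, runs the program, and splits the output into header and document. The stand-in CGI program is written in Python purely to keep the example self-contained, and the names run_cgi and CGI_PROGRAM are hypothetical. A compiled C program or a Perl script would be launched the same way.

```python
import os
import subprocess
import sys

# Stand-in CGI program (an assumption for this sketch). Any executable that
# reads QUERY_STRING and writes a header, a blank line and a document to
# standard output could take its place.
CGI_PROGRAM = [
    sys.executable, "-c",
    "import os; print('Content-Type: text/plain'); print(); "
    "print('query was: ' + os.environ['QUERY_STRING'])",
]


def run_cgi(command, query_string):
    """Run a CGI program roughly as a web server would for a GET request."""
    env = dict(os.environ)
    env["REQUEST_METHOD"] = "GET"
    env["QUERY_STRING"] = query_string
    result = subprocess.run(command, env=env, capture_output=True,
                            text=True, check=True)
    # The first blank line separates the header from the document.
    headers, _, body = result.stdout.partition("\n\n")
    return headers, body


if __name__ == "__main__":
    headers, body = run_cgi(CGI_PROGRAM, "NAME=Ralph")
    print(headers)
    print(body)
```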
Words of caution

There is some concern about the number of WWW browsers that cannot display and process HTML forms. The most popular and advanced browsers, such as Netscape and Mosaic, support forms, but older versions and some text-based browsers do not. Just as publishers cannot survive without knowing the needs and capabilities of their audiences, WWW developers need to know the technological profiles of their end users. If most of the target audience uses old text-based browsers over slow connections, alternative strategies need to be developed to ensure that these users receive comparable levels of service. One solution is to provide e-mail addresses to which users can send comments, rather than relying on a comments form processed by a CGI script.

Other concerns regarding the use of CGIs are system resource allocation (scripts use the server's CPU, so running too many CGIs at once can decrease the server's performance), threats to system security, data encryption and user authentication. Consequently, CGI scripts need to be thoroughly evaluated and tested before they are implemented, and policies for acceptable uses should be established.

Providing forms for users to search a collection of documents is becoming commonplace. Forms can assist users in locating specific information; however, this should not diminish the importance of developing a logical, browsable web structure. People have different information searching techniques. While many prefer to use indexes and search forms, others would rather browse the entire collection. Using forms to query a search engine enhances subject access, but it does not satisfy every information searching behaviour. Forms and CGI scripts should be considered a convenient rather than an exclusive pathway to finding information on the Web.
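
On the security concern raised above, one common precaution is to check every value a form supplies before the script uses it. The sketch below is one possible approach, not a complete defence. The rule it enforces (names of 1 to 40 letters, spaces, hyphens or apostrophes) is an assumption made for illustration, and the function names are hypothetical.

```python
import re
from html import escape

# Assumed policy for this sketch: a name starts with a letter and contains
# at most 40 letters, spaces, hyphens or apostrophes in total.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z '\-]{0,39}")


def clean_name(raw):
    """Return the trimmed name if it passes the policy, otherwise None."""
    candidate = raw.strip()
    return candidate if NAME_PATTERN.fullmatch(candidate) else None


def greeting(raw):
    """Build the reply text, refusing any value that fails the check."""
    name = clean_name(raw)
    if name is None:
        return "Sorry, that name could not be accepted."
    # Escape even accepted values before placing them in an HTML page.
    return "Hello %s, you are now registered in our records." % escape(name)
```

Accepting only values that match a known pattern is generally safer than trying to list every dangerous character to reject.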
Related sources for forms, CGI programming, and Perl

There is much current information about forms and CGI programming, both on the Internet and in bookstores. The following are some suggested starting points:

Forms and CGI

HTML & CGI Unleashed (1995) by John December, Mark Ginsburg, and other contributors. Indianapolis, IN: Sams.Net Publishing. 830 pages + CD-ROM, $61.95 CAN.
This book provides in-depth coverage of HTML markup and CGI programming with a special emphasis on Perl.

The Common Gateway Interface -- http://hoohoo.ncsa.uiuc.edu/cgi/overview.html
NCSA's documentation on CGI. It includes an overview of fill-out forms.

CGI tutorial -- http://agora.leeds.ac.uk/nik/Cgi/start.html
Excellent introduction and tutorial on CGI. Companion to the Perl tutorial. (See below)

Perl

Learning Perl (November 1993) by Randal L. Schwartz, Foreword by Larry Wall. Sebastopol, CA: O'Reilly & Associates, Inc. 274 pages, $24.95 US.
Good introductory, hands-on tutorial designed to help the reader write useful Perl scripts as quickly as possible.

Teach Yourself Perl in 21 Days (November 1994) by David Till. Indianapolis, IN: Sams Publishing. 700 pages, $29.99 US.
This tutorial and reference book starts with the Perl basics and progresses to advanced features. Comprehensive in scope, it is directed to an audience who need not have had previous programming experience.

Introduction to Perl -- http://www.khoros.unm.edu/staff/neilb/perl/introduction/home.html

Perl tutorial -- http://agora.leeds.ac.uk/nik/Perl/start.html
Excellent Perl tutorial that focuses on CGI script creation. Companion to the CGI tutorial. (See above)

CGI Archives

perlWWW -- http://www.oac.uci.edu/indiv/ehood/perlWWW/
An index of Perl programs and libraries related to the World Wide Web.

Perl scripts area -- http://www.metronet.com/perlinfo/scripts/
Indexed collection of Perl scripts.

C library for CGI programming -- http://sunsite.unc.edu/boutell/cgic/cgic.html