Digital Texts / Digital Codes

        Remember our "reading vs. seeing" exercise with the toothpaste packages?  Look at this Web page; don't read it.  What do you see?  How is it produced before your eyes?  You have been trained to accept as natural that when you apply the point of a ballpoint pen to a piece of wood-pulp paper, it will (if properly charged with ink) create a line of predictable width for as long as you press down.  You also have learned to form an alphabet of characters with that "stylus," so as to create words and sentences and paragraphs according to our culture's format for visually representing speech.  The stylus/paper technology has been stable for about 800 years, despite occasional innovations in the manufacture of the substrate (what is "paper" made of?) and the stylus (reeds, quills, mechanical "fountains," ballpoints, gel inks).  You also know how to type, though few of you probably have ever operated a "typewriter," the original technology that put the power of the printing press at the fingers of an individual writer.  So how does the Internet, or your word processing or email program, record, store, and reproduce the characters you type and read, like these?  Trust me: it has very little relationship to the typewriter's simple, mechanical structure and direct-to-paper transfer of inked type.  As you read this page, you are running many layers of computer code which, invisibly to you, translate my keystrokes into storable data and retranslate them into readable characters.

Primitive Machine-Level Codes That Talk Directly to the Micro-Processor:

Machine Code: the first-generation programming language, which addresses the machine directly in binary numbers, usually written out in hexadecimal notation for human readers (1950s; later produced by way of assembler codes).

Assembler Code: the second-generation programming language, a set of mnemonic abbreviations learned by programmers so that they did not have to write in machine code.  A program called an assembler translates the mnemonics into machine code, which then addresses the machine directly.  The example contains lines beginning with a semi-colon followed by English language text.  The semi-colon tells the assembler to ignore the line because it contains notes for the human programmers who have to maintain the program; without the semi-colon, the English text would be mistaken for instructions and the program would "crash" (fail to execute its instructions).
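To make the relationship between mnemonics, comments, and machine code concrete, here is a minimal sketch in Python.  It is a toy, not a real assembler: the mnemonics (LOAD, ADD, STORE, HALT) and their opcode numbers are invented for illustration and do not belong to any actual processor.  It shows lines beginning with a semi-colon being skipped, and each remaining instruction being turned into numbers that are printed in both binary and hexadecimal.

```python
# Toy "assembler": the mnemonics and opcode numbers below are invented for
# illustration and do not belong to any real processor.
OPCODES = {"LOAD": 0x01, "ADD": 0x02, "STORE": 0x03, "HALT": 0xFF}


def assemble(source):
    """Translate mnemonic lines into a list of numeric byte values."""
    program = []
    for line in source.splitlines():
        line = line.strip()
        # Blank lines and lines starting with ';' are for human readers only.
        if not line or line.startswith(";"):
            continue
        parts = line.split()
        program.append(OPCODES[parts[0]])
        # Operands are written as plain decimal numbers in this toy language.
        program.extend(int(p) for p in parts[1:])
    return program


SOURCE = """
; add two numbers and save the result
LOAD 7
ADD 5
STORE 12
HALT
"""

machine_code = assemble(SOURCE)
for byte in machine_code:
    # Show each byte as the machine stores it (binary) and as programmers
    # usually write it (hexadecimal).
    print(f"{byte:08b}  {byte:02X}")
```

If you delete the semi-colon from the comment line, the sketch stops with an error because "add" is not one of its known mnemonics, which is a small-scale version of the failure described above.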

Higher Text Programming Codes That Tell the Micro-Processor How to Store, Display, and Print Text:

ASCII Code:  Created for Teletype machines that repeated typed news stories from a central location to newsrooms around the world, ASCII characters told these "slave" electronic typewriters what to type, as well as when to indent, skip a line, or ring a bell to signal an important news story.  (At 33 seconds, you will hear one bell to signal an ordinary presidential news conference summary, and at 2:25, a "two-bell" story is announced telling the world that the U.S. soccer team had been eliminated by Ghana from the World Cup.)  United Press International [UPI] rated stories by their bell count: "Bulletin" or "Urgent" stories got five bells; the Kennedy assassination and FDR's death were "Flash" stories, fifteen bells.  ASCII, the American Standard Code for Information Interchange, was devised in 1960-63 by a committee of the American Standards Association (ASA, the forerunner of today's American National Standards Institute, or ANSI), based on well-established teletype codes that had been working since the 1950s.  Because computers can only perform mathematical operations like addition, subtraction, etc., or logical operations that can be represented mathematically (Boolean AND/OR/NOT sorting), all text you see on a computer screen first was a number in machine code.  That number referred to a character in a font table that was, itself, represented by numbers telling the computer what shape to draw on the screen and where to put it.  A capital "A," for instance, is "01000001" (the number 65 written in binary), and a small "a" is "01100001" (the number 97).  When you tell MS-Word to save a file as "Text Only," you are saving only the ASCII characters without other formatting.
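You can check the "A" and "a" codes for yourself.  The short Python sketch below uses only built-in functions; the helper name ascii_bits is my own invention for this example.  It prints each character's number and its eight-digit binary form, and then does the same for the "bell" control character that rang on the Teletype machines.  (Strictly speaking ASCII uses seven bits, so the leading zero is padding to fill out a byte.)

```python
def ascii_bits(ch):
    """Return the 8-digit binary string for one ASCII character."""
    code = ord(ch)  # the character's number in the code table
    if code > 127:
        raise ValueError("not an ASCII character")
    return format(code, "08b")


# Printable characters: capital and small letters have different numbers.
for ch in "Aa":
    print(ch, ord(ch), ascii_bits(ch))

# A control character: "\a" is the bell, number 7 in the ASCII table.
BELL = "\a"
print("bell", ord(BELL), ascii_bits(BELL))

# Going the other way: numbers back into readable text.
print(bytes([72, 105]).decode("ascii"))
```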

Waterloo Script:  A pre-WYSIWYG (What-You-See[on the screen]-Is-What-You-Get[when you print the document]) word processing system for mainframe computers.  The user had to master at least a basic set of Script mnemonic codes in order to get the document to print out legibly, and those codes would only be activated when the document was sent ("spooled") to the mainframe computer's line printer, a single high-speed device that served the entire community's printing needs.  Printing delays often were measured in hours.  Script reversed the "comment" convention (see Assembler above): when the machine saw a period, later a colon, in the left margin, it would assume it was reading a machine instruction code (e.g., ".pp;" for paragraph), and anything following the semi-colon, or on a line that did not begin with a period, was treated as plain text.  In time, Script became the basis for GML or Generalized Markup Language, the ancestor of HTML (below) and XML.  GML instructions started with a colon in the left margin to distinguish them from Script instructions, but otherwise their logic was very similar to Script's.  See the University of Waterloo SCRIPT features and system requirements page (1990).
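The rule just described (a period in the left margin means "instruction," everything else means "text") is simple enough to sketch in a few lines of Python.  This is only an illustration of the logic, not a reconstruction of Waterloo Script itself: it recognizes just the ".pp;" paragraph code mentioned above, and the function names are hypothetical.

```python
def classify(line):
    """Split one input line into (control_word, text)."""
    if line.startswith("."):
        # Control line: the control word runs up to the first semi-colon;
        # anything after the semi-colon is ordinary text.
        control, _, text = line[1:].partition(";")
        return control.strip(), text.strip()
    # No period in the left margin: the whole line is plain text.
    return None, line.strip()


def format_document(source):
    """Join text into paragraphs, starting a new one at each 'pp' control."""
    paragraphs = [[]]
    for line in source.splitlines():
        control, text = classify(line)
        if control == "pp" and paragraphs[-1]:
            paragraphs.append([])
        if text:
            paragraphs[-1].append(text)
    return [" ".join(p) for p in paragraphs if p]


SAMPLE = """.pp;This is the first paragraph,
continued on a second input line.
.pp;This is the second paragraph."""

for paragraph in format_document(SAMPLE):
    print(paragraph)
    print()
```

Notice that the writer never sees the paragraph breaks while typing; they appear only when the codes are processed, which is what "pre-WYSIWYG" means in practice.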

HTML:  Hyper-Text Markup Language, the standard language used to create web pages.  It is a descendant of Waterloo Script via "SGML" (Standard Generalized Markup Language), the first attempt to standardize all digital document formatting.  If you go to Internet Explorer while viewing this page and click on the "View" menu at the top, then click on "View Source," you will see the source code behind the text that is displayed in WYSIWYG on your screen and your printout.  To see it in Mozilla Firefox, go to Tools-->Web Developer-->View Source.  All of that should remain invisible to you when you read digital text, just as the type compositor's assembly of a string of lead type units on a composing stick, or the printer's pulling the tympan down upon the paper and plate, is invisible to the reader of a printed book, and just as the scribe's individual pen strokes and preparation of a calf skin to become parchment are invisible to the reader of a manuscript.  Early versions of MS-Word would show users its markup codes as well, but the program has been WYSIWYG for so long that Microsoft no longer considers this necessary for ordinary users.
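To see how a browser pulls apart the visible words from the invisible markup, here is a minimal Python sketch using the standard library's html.parser module.  The tiny PAGE string and the class name TagAndTextCollector are made up for this example; a real page's source is far longer, but the principle is the same.

```python
from html.parser import HTMLParser


class TagAndTextCollector(HTMLParser):
    """Record markup tags and readable text in two separate lists."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        # Called for every opening tag such as <p> or <b>.
        self.tags.append(tag)

    def handle_data(self, data):
        # Called for the text between tags; skip whitespace-only runs.
        if data.strip():
            self.text.append(data.strip())


# A made-up page, much shorter than any real one.
PAGE = (
    "<html><body><h1>Digital Texts</h1>"
    "<p>Look at this page, <b>don't</b> read it.</p></body></html>"
)

collector = TagAndTextCollector()
collector.feed(PAGE)
collector.close()

print("Markup the reader never sees:", collector.tags)
print("Text the reader sees:", collector.text)
```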

XML:  Extensible Markup Language (ca. 1998), the current descendant of SGML and Script, is the newest standard digital code for creating Web-based documents and other artifacts, including those which contain video and audio.  XML "tags" look very like those of HTML, but they are language-independent (they can carry non-Roman characters, for example), and they can interrelate information of the same type in many documents.
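Both properties can be seen in a short Python sketch that uses the standard library's xml.etree.ElementTree module.  The catalog, book, and title tag names are invented for this example (XML lets authors define their own), and the three titles are simply familiar works written in three different scripts.

```python
import xml.etree.ElementTree as ET

# A made-up XML document; the tag names are this example's own invention.
CATALOG = """<catalog>
  <book lang="en"><title>Beowulf</title></book>
  <book lang="el"><title>Ὀδύσσεια</title></book>
  <book lang="ja"><title>源氏物語</title></book>
</catalog>"""

root = ET.fromstring(CATALOG)

# Because every record uses the same <title> tag, one query collects the
# same kind of information from all of them, whatever script it is in.
titles = {book.get("lang"): book.findtext("title") for book in root.findall("book")}

for lang, title in titles.items():
    print(lang, title)
```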

MARC (a specialized code for running library catalogs): You can see the MARC code for any library catalog search result by clicking on "MARC Display" among the menu buttons at the top.  Cataloging librarians tend to find the MARC view easier to read, because it's the form of document they write!

        If you want to learn more about code and coders (those engineers who write code), set aside a few hours to read Paul Ford's long-form article, "What Is Code?," Businessweek, June 11, 2015, available online.  Because of its use of graphics to explain coding concepts, you cannot print it out for reading, even though it is the length of a print novella or long short story.  Ford attempts to explain code to business people who work with coders but who are increasingly unable to understand what their colleagues are doing and saying.  The very existence of the article signals an impending shift in commercial software development.  Coders need business types to help them make money from their inventions, and business types need coders to invent products they can sell, but when the two can no longer speak the same language, something big may be about to happen.  Ford can introduce you to much more powerful coding languages than the few mentioned above, such as "C," "C++," and "Python."  HTML and XML are really more about controlling how text displays in Web-based applications; robust programming languages are what actually run the world you live in, from your cell phone and laptop to airplanes and elevators and not a few vending machines on campus.