One aspect of the computer industry seems to have eluded both IBM and the US Department of Justice as they argue about the termination of the 1956 Consent Decree: there is a clear connection between the tabulating equipment market in which IBM had exercised market control when the government sued in 1952 and the computer industry today. That connection is embodied, by accident or design, in the provisions of the 1956 settlement that refer to the EDPM (electronic data processing machine) industry. The link between the past and the present is found in the underlying numeric codes used to represent information. There is not one such code in the computer industry; there are two. One of them was used by tabulating equipment in the 1950s and today, in a somewhat refined form, is used to represent data in IBM’s mainframes and AS/400s. That coding scheme is called EBCDIC, for Extended Binary Coded Decimal Interchange Code. The other way of representing letters and other symbols is called ASCII, the American Standard Code for Information Interchange.
Tabulating
ASCII is used in personal computers, all mid-range computers except the AS/400 and all mainframes except IBM System/390s. IBM’s EBCDIC computers are direct descendants of the tabulating equipment on which the company’s market power was first built. And it is in the EBCDIC segment of the industry that IBM still has great power, despite its nearly 40 years of operation under the constraints of an anti-trust settlement, the 1956 Consent Decree. In the distinct ASCII realm, which sprang from the communications business and not the tabulating equipment industry, IBM remains only one of many participants; it has never had a majority market share, let alone a monopoly.
The two numeric coding methods used to represent numbers, letters, punctuation marks and other elements – principally special information that governs the interpretation of data by computing machinery – are dissimilar. Computers that use one scheme sort and alphabetise lists differently from computers that use the other, and they differ in the way they perform some mathematical calculations. Computers can translate one coding scheme into another, but this does not blur the distinction between the codes. The translation must be explicit, and anyone inspecting the operation of a computer that must translate code can pinpoint precisely where the transition occurs.
The separation between the two methods of representing data is not a mere 40 years old, like the 1956 Consent Decree. It has roots going back into the 18th century. And it will persist into the 21st. The two coding schemes have evolved during the past two centuries, sometimes acquiring new or modified names. But both EBCDIC and ASCII have remained fundamentally unchanged since the 1960s. Only 128 of 256 possible characters are rigidly defined in ASCII and the rest of the standardisation is mainly by convention, such as the widespread adoption of certain graphics characters used to draw forms.
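The sorting difference and the explicit translation step described above can be illustrated with a short sketch. It assumes Python's built-in `cp037` codec, which implements one common EBCDIC variant (US/Canada); other EBCDIC code pages differ in a few symbols. The word list and the helper name `ebcdic_to_ascii` are hypothetical, chosen only for illustration.

```python
# A minimal sketch, assuming the cp037 codec as a stand-in for "EBCDIC".
words = ["apple", "Zebra", "42nd"]

# Sort by the raw byte values each code assigns to the characters.
ascii_sorted = sorted(words, key=lambda s: s.encode("ascii"))
ebcdic_sorted = sorted(words, key=lambda s: s.encode("cp037"))

# ASCII orders digits, then upper case, then lower case.
# EBCDIC orders lower case, then upper case, then digits.
print(ascii_sorted)
print(ebcdic_sorted)


def ebcdic_to_ascii(data: bytes) -> bytes:
    """Translate EBCDIC (cp037) bytes to ASCII bytes (hypothetical helper).

    The conversion is an explicit, visible step: decode from one code,
    then encode into the other.
    """
    return data.decode("cp037").encode("ascii")


# The five bytes below spell HELLO in cp037.
print(ebcdic_to_ascii(b"\xc8\xc5\xd3\xd3\xd6"))
```

The same list comes out in a different order depending on which code supplies the byte values, and the translation happens at one identifiable call rather than implicitly.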
Both ASCII and EBCDIC have minor variations (a few symbols that differ from locale to locale) so computers can serve people in Scandinavia, for instance, as well as those in the United States. There are not only mathematical differences but also cultural and historical distinctions between the codes, dissimilarities that stem from their origins.
ASCII is the child of codes used in communications. A direct parent of today’s 8-bit ASCII code (of the form XXXXXXXX where each X is a 0 or 1) is the 7-bit code used in the TWX teleprinter network. That code was preceded by a 5-bit code used in older Teletype networks. The 5-bit code is called Baudot Code, after the Frenchman Jean Maurice Emile Baudot. In 1872, Baudot invented a time-division scheme that vastly increased the capacity of telegraph networks. His name also gave us the word baud, a measure of the communications capacity of a transmission medium. Baudot’s work was built on that of Thomas Edison, William Cooke and Charles Wheatstone, among many others.
By Hesh Wiener
And they all owe inspiration to Claude Chappe and George Murray, who developed semaphore visual telegraph systems in the late 18th century. At the beginning of the 20th century, when tabulating machines were first finding use in the business world, telegraphy was well established as a worldwide means of communication. So it was hardly surprising that, decades later, at the dawn of the computer age, some inventors adopted inexpensive, dependable Teletype machines using ASCII code (or its immediate forerunner) for use as consoles. Subsequently, the Teletype gave way to CRT terminals and printers that were developed with only ASCII code in mind. And what was once telegraphy is now called electronic mail.
EBCDIC is descended from the data representation scheme developed by Herman Hollerith for use with punch card tabulating machines. Hollerith proved the value of tabulating equipment by saving the US government millions of dollars and years of work during the processing of the 1890 census. In 1896, Hollerith founded the Tabulating Machine Company. In 1924, following a number of mergers, Hollerith’s firm was given a broader vision and new name: International Business Machines Corporation. IBM eventually shed its meat scale business and got out of the time clock market. But it kept its Hollerith code, adapting it to changes in technology. And IBM never lost sight of its mission, expressed as the title of a 1934 book of Thomas J Watson’s essays: Men-Minutes-Money.
Hollerith’s cards were in part inspired by punch cards that carried weaving instructions for looms invented by Joseph Marie Jacquard. Jacquard’s loom was developed in 1790, during the French Revolution, but the fearful inventor kept its existence secret until 1801. The connection between the cards and calculating equipment was first made by Charles Babbage, who designed punch card reading technology into his Analytical Engine.
EBCDIC did not take its present form until the development of the IBM System/360, which used 8-bit bytes for characters but could also work with decimal digits packed two to a byte to save storage space. The 360’s hardware design and EBCDIC code were developed as part of IBM’s overall systems architecture, which included a compatible range of processors, peripherals and software.
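The idea of packing two decimal digits into each byte can be sketched as follows. This is an illustration, not IBM's implementation: it assumes the usual packed decimal layout, in which each 4-bit half of a byte holds one digit and the final half-byte holds a sign code (hexadecimal C for plus, D for minus). The function names `pack_decimal` and `unpack_decimal` are hypothetical.

```python
def pack_decimal(value: int) -> bytes:
    """Pack an integer into packed decimal: two digits per byte, sign last."""
    sign = 0xC if value >= 0 else 0xD  # assumed sign codes: C = plus, D = minus
    nibbles = [int(d) for d in str(abs(value))] + [sign]
    if len(nibbles) % 2:
        nibbles.insert(0, 0)  # pad with a leading zero to fill whole bytes
    pairs = zip(nibbles[0::2], nibbles[1::2])
    return bytes((hi << 4) | lo for hi, lo in pairs)


def unpack_decimal(data: bytes) -> int:
    """Reverse pack_decimal: split bytes into half-bytes and read the sign."""
    nibbles = []
    for b in data:
        nibbles.extend((b >> 4, b & 0x0F))
    *digits, sign = nibbles
    value = int("".join(str(d) for d in digits))
    return -value if sign in (0xB, 0xD) else value


packed = pack_decimal(1995)
print(packed.hex())            # four digits and a sign fit in three bytes
print(unpack_decimal(packed))
```

Stored as characters, the four digits of 1995 would occupy four bytes; in this packed form the digits and the sign together occupy three, and the saving grows with longer numbers.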
Traditional
(The 360 actually had a hardware feature that switched its central processor to ASCII mode for certain arithmetic operations, an internal switch called Bit 12 of the Program Status Word. The presence of the switch enabled the System/360 to satisfy any requirements for ASCII support that might have been invoked by the US Government, which has a penchant for compatibility standards expressed, for example, in its demand that computers understand Cobol. But there was never any support in IBM’s OS/360 systems software for using that feature. And, it turned out, commercial customers did not seem to want it.)
As IBM supplemented – and eventually replaced – punch card input and output with Selectric printing terminals and CRT terminals, it stayed with EBCDIC. IBM even carried the 80-column width of its punch card into the interactive age by defining the standard CRT screen to have the same capacity, 80 characters, on each line. Just about all character-based visual display screens, including those of ASCII personal computers running MS-DOS, retain the capability to display data in an 80-column format. The punch card’s visible legacy is finally disappearing as graphical user interfaces, such as Windows, replace old-style character displays.
Not all IBM computers use EBCDIC coding these days, but all IBM mainframes are based on EBCDIC codes. And so are all AS/400s. By contrast, IBM’s personal computers and RS/6000s use ASCII codes. IBM could have developed the personal computer as an EBCDIC system if it wished, but, presumably for marketing reasons, it did not. However, in the case of the RS/6000, IBM was obliged to build computers that used ASCII codes so it could compete in the Unix world, which depends on ASCII coding as a foundation for the interchange of data and program source code.
To permit ASCII computers, such as personal computers, to interact with the computers IBM has resolutely kept in the EBCDIC culture – the culture of tabulating equipment – IBM has equipped its System/390 and AS/400 systems with technology to translate every single byte of text that is passed across the code boundary. And, by every indication, it will maintain this tradition forever.
Copyright (C) 1995 Technology News Ltd.