Chapter 3

The Dynamic World of Computer Technology

 

Overview of Computer Development.            Throughout the sixty-plus year time span from the advent of electronic computation just before World War II to the beginning of the Twenty-First Century, the capability and sophistication of computer technology have not only grown by several orders of magnitude, but the physical size of the individual units comprising this technological marvel has also decreased commensurately.  What once filled a large room and required kilowatts to operate now fits within a wristwatch, consumes milliwatts, and is much more capable.  Furthermore, applications have expanded from simple arithmetic functions to extremely complex pattern recognition, well beyond the initial visions of the original developers.  What was once science fiction in information processing is now engineering reality.  We are now approaching another watershed of computer revolution – the threshold of man-made sentience.  The thrust into nanoelectronics is one element that will bring about this revolution.  But to put this into proper perspective, let us take a short glimpse into the history of computer development as it morphed from a power-hungry behemoth of limited capability to a miniature encephalon, and review the highlights of some of the more remarkable events that permanently altered its course.

 

History:  The characteristics of the first computer are lost in the mists of antiquity.  The spiritual welfare of early tribal groups depended upon how well they could track the seasons and accurately predict the times to hold their rituals such that they might invoke the spirits to ensure necessities such as a good harvest.  The task of tracking the seasons likely fell to the tribal shaman, who through notching sticks or placing marks on stones or walls, evolved the first counting system.  As the tribes and their needs expanded, a rudimentary form of commerce with neighboring groups began to develop.  While the motivation to develop some form of computational capability may have emerged from the need for tribal ceremonies, it was undoubtedly amplified as ancient tribes and city-states began to establish commerce between each other.  As trade increased throughout the ancient world, the need arose to record the flow of goods.  The clay tablets of the Hittites reported the commercial transactions between their empire and that of Egypt.  Half way around the world and several millennia later, the knotted strings of the Inca quipu documented the commercial transactions throughout their empire.  But merely writing down the details of the transpired commerce was not sufficient.  There also needed to be some means of computing the values of these exchanges and of determining the accumulations and the deficits.  Although the early humans in all likelihood applied fingers and toes as their first computational devices, the spark of the first computer likely evolved from a primitive merchant’s use of notched sticks and pebble piles to tally traded quantities exceeding the number of fingers and toes.  But as cultures grew larger and more civilized and as the flow of goods became more sophisticated, something better was needed – some form of computer.

 

            There is little doubt that the abacus, which traces its history for over 5000 years, deserves the honor as having emerged as the world’s first computer.  Although its true origin is buried in antiquity, some credit the Chinese with its invention.  But its birth could well have been in Asia Minor.  The origin of the word “abacus” itself is derived from the ancient Greek word abax, which in turn is thought to have been borrowed from an even older Semitic word. [1]  The abacus, the simple sliding-bead accounting system, was man’s first attempt to “automate” the counting process.  However, it is not a mechanical calculator but more of a memory aid for use in more complex arithmetic.  By placing beads in rows representing units, tens, hundreds, and so forth, an ancient merchant could readily execute basic addition and subtraction.  This concept was so successful that this primitive ancestor of the electronic computer survived as a business tool in the Orient well into modern times.

 

The first actual mechanical adding machine was not realized until the mid 1600s when a wheel calculator was invented in France to support tax collection.  In 1642 Blaise Pascal, a French mathematician and scientist, designed and assembled the first mechanical calculator.  Based on the principle of multiple sets of ten-tooth gears, a proper sequencing of the wheels would produce a cumulative sum.  While it represented substantial improvement over hand calculations, it was very complicated and costly.  As a result, its further development never progressed.

 

Little additional progress in computer improvement came about until 1822 when Charles Babbage, an English professor of mathematics, suggested a steam-powered machine to tabulate polynomial functions by the method of finite differences, which he called a “difference engine.”  His work on this concept was followed a decade later with the idea of a general purpose computer he referred to as an “analytical engine.”  This machine was the first to consider the use of punch cards as a primitive control program for storing instructions, a concept he borrowed from the Jacquard loom that used punched cards to control the weave pattern.

 

            If one were to identify a beginning of the computer age, a “time zero” against which to reference the start of the tremendous growth witnessed in computing technology, the year 1834, when Babbage completed the first drawings and introduced the design for his analytical engine, would be most appropriate. [2] His was a revolutionary concept, as it essentially outlined the basic components and characteristics of the early modern general-purpose computers.  His punch-card-instructed machine would allow instructions to be executed in a predetermined order, rather than in numerical sequence.  However, the machine was very complicated – especially for its time, as it was made up of over 50,000 parts.  Remarkably similar in logical configuration to modern electronic computers, the analytical engine comprised five major segments:  a “controller,” storage, a processor “mill,” an input device, and an output device.  Not only did the punch cards contain the operating instructions to control the analytical engine, but additional cards also provided a rudimentary storage memory of 1000 numbers, each of 50 digits.  Furthermore, he had designed his system for essentially infinite storage by outputting data to other punched cards that could be read in again later.  The analytical engine also introduced the concept of a central processor in the embodiment of a mill that accepted the input quantities and enabled the processing of instructions in any order.  In addition, he had also designed an output device to generate printed results.

 

            In 1889 the American Herman Hollerith, who was seeking a methodology that could accelerate the computation of the U.S. census, independently struck upon the punch-card idea from watching a train conductor punch tickets. [3]  Conventional methods were requiring as long as seven years to complete a census count, and with an expanding population the census bureau was desperate for a more streamlined approach.  Rather than use the perforated cards to instruct a computing machine as Babbage had done, Hollerith instead came upon the notion of storing the basic information on the cards as encoded perforations.  He would then feed these cards into a machine that would compile the results mechanically.  He invented a code in which a single punch represented a number and combinations of two punches represented a letter.  With this, he was able to store 80 input variables on a single card.  The 1890 census was completed in six weeks with Hollerith’s revolutionary approach.  In 1896 with his refined punch-card reader he founded a company called the Tabulating Machine Company, which after a merger in 1924 became known as International Business Machines – IBM.  Punch-card processors were used by IBM well past the middle of the Twentieth Century.

 

            The next computer revolution occurred immediately prior to the outbreak of World War II when Professor John V. Atanasoff of what is now Iowa State University and his graduate student Clifford Berry questioned the clumsy bulkiness of the mechanical computers of their day. [4]  As a means of eliminating this morass of machinery, they suggested a computer composed primarily of electric components.  Furthermore, they envisioned that the electrical circuitry could utilize Boolean algebra, a binary system of mathematics introduced in the mid-Nineteenth Century by George Boole.  The elegance of this concept was that not only could mathematical relationships be stated as a simple true or false through Boolean algebra, but these statements could also be represented by switching an electrical circuit on or off.  In 1939 Atanasoff and Berry introduced the world’s first semi-electronic digital computer, the ABC.
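The link between true/false statements and on/off circuits can be illustrated with a short sketch.  The example below is not taken from the ABC design; it is a generic "half adder," a standard textbook circuit, written here in Python with an illustrative function name.  It adds two one-bit values using only Boolean operations, where True stands for a closed ("on") switch and False for an open ("off") one.

```python
def half_adder(a: bool, b: bool) -> tuple[bool, bool]:
    """Add two one-bit values using only Boolean logic.

    Returns (sum_bit, carry_bit).  The sum bit is the exclusive OR of the
    inputs and the carry bit is their AND.
    """
    sum_bit = a != b      # XOR: true when exactly one input is true
    carry_bit = a and b   # AND: true only when both inputs are true
    return sum_bit, carry_bit


# Print the full truth table, showing each switch as 1 (on) or 0 (off).
for a in (False, True):
    for b in (False, True):
        s, c = half_adder(a, b)
        print(int(a), "+", int(b), "-> carry", int(c), "sum", int(s))
```

Chaining such one-bit adders, with each carry fed into the next stage, is one way arithmetic on longer binary numbers can be built from nothing but switches.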

 

Post World War II:  The Second World War, followed by the long-enduring Cold War, precipitated the computer revolution as we know it.  Governments recognized the strategic importance of computers very early during this period and poured money into computer development to hasten progress.  During the war years, machine computation emerged as an invaluable aid for such tasks as code breaking, the development of ballistic charts, and the design of aircraft and missiles where results were needed very quickly.  The decades following the war saw such fundamental and revolutionary changes to this technology that its impact has permanently altered the lives of everyone worldwide.  These advances, and their rapid acceptance and assimilation by the general populace, are directly applicable to the theme of this thesis – especially the accelerated trend toward miniaturization.

 

            The period from the war years to around the mid-to-late 1950s defines the era of the “first-generation” electronic computers.  During this stage of computer development, the binary number system had been introduced into computing to facilitate electronic switching.  The binary system is a number system to the base 2, as contrasted with the familiar number system to the base 10.  For example, the number 11.625 in base 10 converts to 1011.101 in base 2.  The applicability of the binary system to electrical switching can be seen immediately when one considers that “1” represents the switch being closed or “on,” while the “0” represents the switch being open or “off.”  Hence, any number or sequence of numbers can be represented by a series of open or closed electronic gates, seen as strings of “1s” and “0s.”  In addition, these first-generation computers were commanded by machine language, which comprised various binary-coded instructions.  Machine language, however, was rather unwieldy, resulting in programming difficulties as well as speed and flexibility limitations.
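The base-10 to base-2 conversion quoted above can be checked with a few lines of code.  The following is a minimal sketch in Python; the function name to_binary and the limit on fractional digits are illustrative choices, not part of any historical machine.  The whole part is converted by repeated division by 2 (done here by Python's built-in bin), and the fractional part by repeated multiplication by 2.

```python
def to_binary(value: float, max_frac_bits: int = 16) -> str:
    """Convert a non-negative number to a base-2 digit string."""
    whole = int(value)
    frac = value - whole

    # Whole part: bin() returns e.g. '0b1011'; drop the '0b' prefix.
    whole_digits = bin(whole)[2:]

    # Fractional part: each doubling shifts the next binary digit
    # to the left of the point, where it can be read off.
    frac_digits = []
    while frac > 0 and len(frac_digits) < max_frac_bits:
        frac *= 2
        bit = int(frac)
        frac_digits.append(str(bit))
        frac -= bit

    if frac_digits:
        return whole_digits + "." + "".join(frac_digits)
    return whole_digits


print(to_binary(11.625))  # 1011.101
```

Reading the result digit by digit, 1011.101 is 8 + 2 + 1 + 1/2 + 1/8 = 11.625, each "1" corresponding to a closed switch and each "0" to an open one.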

 

            But in 1936, prior to the outbreak of the Second World War, Alan Turing, a British mathematician, introduced a concept that was to become the paradigm for programmable computers.  In his paper entitled, “On Computable Numbers,” [5] he suggested a hypothetical device capable of executing logical operations, which he dubbed the “Turing Machine.”  The Turing Machine would be capable of reading and writing symbols on a paper tape.  This matching of a subsequent action with an instruction list of possible states was a prelude to modern computer programming.
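The idea of matching each action to a table of states can be sketched in a few lines.  The Python example below is a toy illustration, not Turing's own formulation: the rule table, the state names "start" and "halt," and the blank symbol "_" are all assumptions made for the example.  The machine reads the symbol under its head, looks up the rule for its current state and that symbol, writes a symbol, moves one cell, and changes state.

```python
def run_turing_machine(tape, rules, state="start", blank="_", max_steps=1000):
    """Run a one-tape machine until it reaches the 'halt' state."""
    cells = dict(enumerate(tape))  # tape position -> symbol
    head = 0
    for _ in range(max_steps):
        if state == "halt":
            break
        symbol = cells.get(head, blank)
        # Each rule maps (state, symbol read) to
        # (symbol to write, direction to move, next state).
        write, move, state = rules[(state, symbol)]
        cells[head] = write
        head += 1 if move == "R" else -1
    return "".join(cells[i] for i in sorted(cells)).strip(blank)


# Hypothetical rule table: invert every bit, halt at the first blank cell.
INVERT = {
    ("start", "0"): ("1", "R", "start"),
    ("start", "1"): ("0", "R", "start"),
    ("start", "_"): ("_", "R", "halt"),
}

print(run_turing_machine("1011", INVERT))  # 0100
```

Changing the behavior of the machine requires only a different rule table, not different hardware, which is the sense in which the concept anticipates programming.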

 

As war threatened Europe, the United States was barely recovering from the Great Depression and had little interest in world affairs.  But as events evolved and it became more obvious that U.S. involvement was inevitable, the Army Ordnance Department at the Aberdeen Proving Ground in Maryland began an effort to improve the accuracy of the large guns being tested there. [6]  Trajectory tables were seriously needed that portrayed a projectile’s range as a function of the inclination of the gun barrel, coupled with other parameters such as wind speed and direction, atmospheric pressure, temperature, and humidity.  These tables were required for all types of guns and projectiles and for all combinations of operating conditions.  The equations of motion for these calculations were three-dimensional, second-order differential equations, the solutions of which were generally accomplished by hand after hours of calculations.  This task was mitigated somewhat by a mechanical analog machine, called a Bush differential analyzer after its inventor Vannevar Bush of MIT.  Unfortunately, this machine was prone to frequent and costly failures – often occurring toward the end of a lengthy calculation.  Lieutenant P. N. Gillon, who was in charge of the ballistic computations at Aberdeen, knew that the University of Pennsylvania’s Moore School of Electrical Engineering owned a Bush differential analyzer that was faster and larger than the one at Aberdeen.  He contracted with the university for its exclusive use to produce trajectory tables.

 

When the contract was signed in 1942, the talents of John W. Mauchly and J. Presper Eckert were introduced to this problem.  Approximately a year earlier in 1941, Mauchly had visited Atanasoff at Iowa State University and came away with the seed of an idea for an all-electronic computer to perform numerical calculation – all with no mechanical moving parts.  Mauchly teamed with Eckert and together they drafted a design for such a system. [7]  Based on this design by Mauchly and Eckert, the Army Ordnance Department signed a contract on June 5, 1943 with the Moore School of Electrical Engineering to produce the first Electronic Numerical Integrator And Computer – the ENIAC.  The world’s first all-electronic computer was born. 

 

            The ENIAC was a gigantic system occupying over 1800 square feet of floor space of primary and peripheral equipment comprising thirty separate units plus power supply and forced-air cooling.  It weighed over 30 tons and consumed nearly 200 kilowatts of electrical power, causing brownouts in Philadelphia when the computer was turned on.  Its electronic components included 1500 relays and hundreds of thousands of capacitors, resistors, and inductors.  Furthermore, it utilized 17,468 vacuum tubes – nearly nine times more than any predecessor.  Yet with all of its mass and power the ENIAC was hardly considered programmable.  In order to carry out any operation, separate units of the system were plugged together to approximate a “program,” and computations would occur as instructions were routed through this specific configuration.  However, each new problem required that these connections be redone, a task that would require technicians weeks of set-up and checkout time to modify.  Individual cables had to be connected into plug-and-socket receptacles, and three thousand function-table switches had to be set.  Failures among its nearly 18,000 vacuum tubes were so pervasive that a separate development effort was initiated to improve their reliability.  But when the ENIAC was set up to perform its primary function – that of calculating the differential equations of motion for ballistics – its processing speed was unparalleled.  A skilled person with a desk calculator required approximately twenty hours to calculate the ballistics for a sixty-second projectile trajectory.  The Bush analog differential analyzer could compute the same result in fifteen minutes.  ENIAC, however, had the solution in thirty seconds.  This was revolutionary.

 

The ENIAC was very efficient at executing operations for which it was designed and remained in operation until October 2, 1955.  With all of its ponderous mass and ravenous appetite for power, the ENIAC represents the beginning of electronic computation, the fountainhead of today’s four-pound laptop possessing several orders of magnitude greater computing capability for only two percent of the power.

 

            In 1945, the Hungarian-born John von Neumann, a member of the ENIAC team who was frustrated by ENIAC’s unwieldiness, introduced a concept of computer architecture that was to become the paradigm of computer systems as we know them.  He suggested that a computer, comprising a simple but fixed structure with properly programmed control, should be capable of executing any command without the need for modifying hardware.  His idea was the stored-program technique.  He suggested a machine instruction, referred to as a “conditional control transfer,” that allowed for interruption and reinstatement of the program anywhere in the computing sequence (philosophically similar to that suggested by Babbage).  He proposed that storing instruction programs in the same memory unit as data would enable instructions to be modified arithmetically in the same manner as data, thus rendering both essentially the same.  A central processing unit would extract both data and instructions from memory, operate on them, and then return them to memory.  The result was much faster, more efficient programming and computing that enabled instructions to be written in the form of subroutines that did not require reprogramming for each new problem.  Lengthy computations could be worked piece by piece with their intermediate results temporarily stored in the memory, to be called later for assembly into the final result.

 

            Von Neumann’s ideas were first applied to the next generation of the ENIAC, a system called the Electronic Discrete Variable Automatic Computer, or EDVAC. [8]  The EDVAC was a dramatic improvement over the ENIAC, both in physical size and in capability.  Not only did it possess an increased internal memory and an internally stored execution program, but it was also designed to operate completely with the binary number system.

 

            But the first computer to take the most effective advantage of these advances, developed by Remington Rand Corporation, became commercially available in 1951.  Called the Universal Automatic Computer, or UNIVAC, it was capable of manipulating both alphabetic and numeric data with equal facility.  Because of its simplified programming and operational architecture, the UNIVAC was a commercial success and is credited with having launched the computer industry’s first generation.

 

            The next watershed in electronics development occurred in 1948 with the invention of the transistor.  As the transistor became more refined, it began to replace the more voluminous and power-hungry vacuum tubes in every electronic device – drastically reducing their size and power requirements.  The first all-transistorized computer introduced in 1959 initiated the second computer generation, which was to endure until the mid 1960s. [9]  Transistorized computers not only incorporated the advances inherited from the first generation, but they were also faster and more reliable than any of their predecessors.  The demand stimulated by the nuclear laboratories for computers capable of handling enormous amounts of data induced companies such as IBM and Sperry-Rand to apply transistor technology in the development of early supercomputers.  Programming also became more user-friendly with the replacement of machine language with assembly language, and finally with FORTRAN and COBOL – the last spawning the massive software industry.  These innovations, coupled with the addition of peripherals like printers and magnetic tape storage, rendered computers sufficiently attractive and cost-effective that they rapidly found their way into businesses and universities in addition to government.

 

            But the next true revolution in the development of electronics, a quantum leap even greater than the transistor toward miniaturization, came on September 12, 1958, when Jack St. Clair Kilby demonstrated the first microchip integrated circuit. [10]  After having recently joined Texas Instruments, he observed that the advent of the transistor had stimulated the designs of increasingly more complex electronic circuits comprising literally thousands of discrete parts.  However, all of these components had to be individually connected, involving the expensive and time-consuming process of hand-soldering thousands of bits of wires.  A reliable, cost-effective production process was sorely needed.  Prior to the time when Kilby came aboard, the Army had funded Texas Instruments through the Micro-Module program to investigate the concept of producing all-electronic components of uniform size and shape with built-in wiring such that they could be snapped together, thus eliminating any further need for wiring and soldering.  But Kilby was cool to this idea.  Instead, he observed that the critical components were the semiconductors and that the passive devices, such as resistors and capacitors, could all be made of the same material as the active devices.  Furthermore, if only a single material were necessary, then the components could be fabricated in place as an interconnected complete circuit.  His 1958 world-changing chip was a modest germanium sliver glued to a glass slide.  It comprised a transistor and a few other devices necessary for a complete circuit.  The entire assembly was about the size of a paperclip.  But when connected to an oscilloscope, it heralded the entrance of a new age of electronics and spawned a whole new industry that has since grown to over $1.1 trillion.  The integrated circuit launched the third generation of computers.  
It was during this generation, in 1965, that Gordon Moore made his famous observation that the number of components that could be placed on an integrated circuit had doubled every year since the technology was invented.

 

            But the third generation was short-lived, for electronic developments were beginning to accelerate as manufacturers recognized the enormous advantages and realities of miniaturization.  The introduction of the first microprocessor by Intel in 1971 ushered in the fourth computer generation.  This first specialized integrated circuit contained its own logic unit and could process four bits of data simultaneously.  Shortly thereafter large-scale integrated circuits were fitting hundreds of components onto a single chip.  But one of the most significant impacts of the microprocessor was its influence on lowering the cost of computers.  With the introduction of very large-scale integrated circuits, or VLSI, microprocessors produced by CMOS technology and capable of packing millions of components onto a single chip, computers emerged from the exclusive domain of government and large corporations – the only organizations that could truly afford their enormous cost and incessant maintenance.  In 1971 when Intel introduced its 4004 microprocessor that located the central processing unit, memory, and input and output controls all onto a single chip, the age of personal computers was born and the rest is history.

 

            This thesis focuses on a computer of the fifth generation and beyond.  With the specter of Moore's Law approaching its limit with CMOS processing in the foreseeable future, research into alternative solutions is well underway in response to meeting the ever-accelerating demand for better computing capability.  System-on-a-chip technology and wireless computer interfacing are becoming routine.  Alternative microdevices are already being explored and finding applications.  We are seeing single-electron-tunneling devices improve high-density, low-power memories.  Resonant-tunneling diodes are being considered for digital-to-analog converters where they have the capability of much higher speeds (typically 10 to 100 GHz) than does conventional CMOS. [11]  Researchers have recently reported successful advances in DNA computing and quantum computing, and we will briefly look at these two revolutionary fields. 

 

DNA computing is based on the premise that there are similarities between mathematical operations and biological reactions, and these similarities can be exploited to perform calculations.  A tremendous advantage of this approach is that computations performed by a biological-like reaction are capable of overcoming the limitations in parallelism and interconnection characteristic of a digital computer.  Through self-assembly of a DNA string in the correct sequence from massively parallel "batches," it can solve very complex combinatorial problems.  First introduced by Leonard Adleman of the University of Southern California in November of 1994, DNA computing has since become a major growth field in the computational sciences. [12]  Adleman first applied DNA computing to the Hamiltonian Path problem, a close relative of the classic "traveling salesman" problem, wherein a salesman is to find a route through a given number of cities that begins and ends at specified cities while visiting each city only once.  Adleman chose seven cities and encoded each with two four-letter names, where each name represented a sequence of the four DNA nucleotides (adenine, guanine, cytosine, and thymine).  A similar encoding was assigned to the routing between the cities.  When the coded DNA is mixed and subsequently measured through electrophoretic processes, all strands without the correct number of nucleotides and without the total number of city codes are discarded.  Remaining strands beginning and ending with the correct city, and containing all of the cities with their correct city codes, represent the solution.  Sequencing the strand reveals the answer.
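The generate-and-discard strategy described above can be mimicked in software, although a conventional computer must examine the candidates one after another rather than all at once.  The Python sketch below is only an analogy: the four-city map and the function name are hypothetical and are not Adleman's seven-city instance.  Every ordering of the cities plays the role of a DNA strand, and the filters discard orderings that start or end at the wrong city or that use a connection that does not exist.

```python
from itertools import permutations


def hamiltonian_paths(cities, roads, start, end):
    """Return every ordering that visits each city exactly once,
    begins at `start`, ends at `end`, and follows only existing roads."""
    surviving = []
    for order in permutations(cities):          # "mix": form all candidates
        if order[0] != start or order[-1] != end:
            continue                            # wrong first or last city
        steps = zip(order, order[1:])
        if all(step in roads for step in steps):
            surviving.append(order)             # every hop is a real road
    return surviving


# Hypothetical map: one-way roads between four cities.
CITIES = ["A", "B", "C", "D"]
ROADS = {("A", "B"), ("B", "C"), ("C", "D"), ("A", "C"), ("B", "D")}

print(hamiltonian_paths(CITIES, ROADS, "A", "D"))  # [('A', 'B', 'C', 'D')]
```

The number of orderings grows factorially with the number of cities, which is why performing the "mixing" step in massively parallel chemistry is attractive for this class of problem.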

 

Its potential speed, memory, and energy efficiency over that of the conventional digital computer are the remarkable advantages that DNA computing offers for this class of problem.  First, the speed in arriving at a solution demonstrates the power of performing many calculations simultaneously.  A DNA computer holds the promise of executing over 10^14 million instructions per second, or 10^14 MIPS - a million times faster than the 100 million MIPS of the human brain!  Furthermore, a DNA computer is extremely energy efficient.  Adleman's solution was obtained at an energy-expenditure rate of 10^19 operations per Joule.  When compared with the CMOS energy demand of 10^9 operations per Joule for digital computers, the DNA computer is potentially 10^10 times more energy efficient.  And data storage with DNA can be accomplished at the molecular level with a density on the order of one bit per cubic nanometer.  That same one bit of information requires a volume of 10^12 cubic nanometers when stored by conventional magnetic storage used for digital computers.
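The ratios quoted above follow directly from the figures given, as the short Python check below shows.  It assumes the quantities are powers of ten (10 to the 14th MIPS, 10 to the 19th and 10 to the 9th operations per Joule, and 10 to the 12th cubic nanometers per bit); the variable names are illustrative only.

```python
# Figures quoted in the text, read as powers of ten (an assumption about
# the intended exponents).
dna_mips = 10**14               # DNA computer, million instructions/second
brain_mips = 100 * 10**6        # "100 million MIPS" for the human brain
dna_ops_per_joule = 10**19
cmos_ops_per_joule = 10**9
dna_nm3_per_bit = 1             # one bit per cubic nanometer
magnetic_nm3_per_bit = 10**12

speed_ratio = dna_mips // brain_mips
energy_ratio = dna_ops_per_joule // cmos_ops_per_joule
density_ratio = magnetic_nm3_per_bit // dna_nm3_per_bit

print(speed_ratio)    # 1000000, i.e. a million times faster
print(energy_ratio)   # 10000000000, i.e. 10 to the 10th
print(density_ratio)  # 1000000000000, i.e. 10 to the 12th
```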

 

            A quantum computer takes advantage of the strange behavior of quantum physics, and research in this area is still in an embryonic state.  A quantum computer parallels the philosophy of a classical digital computer in approach. [13]  Where the fundamental unit of information for the digital computer is a bit, for a quantum computer it is a quantum bit or qubit.  But unlike the binary bit, which at any instant holds either a 1 or a 0, the qubit is richer in nature because of the radical difference of the laws of quantum physics from the classical world.  A qubit can exist in the 0 and 1 states corresponding to those of the classical bit as well as in a superposition of these states.  It can be a 0 or 1, or it can exist simultaneously as both.  While counterintuitive, this behavior is dictated by the characteristics inherent in quantum physics that are absent in the macroscopic world of classical physics.  This feature enables a quantum computer to perform a computation on many different numbers at once and then execute an "interference" with those results to arrive at a single solution.
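Superposition and interference can be illustrated numerically for a single qubit.  The Python sketch below is a classical simulation under simplifying assumptions: the qubit is represented by two real amplitudes (a general qubit uses complex amplitudes), and the function names are illustrative.  The Hadamard operation is a standard single-qubit gate that turns a definite 0 into an equal superposition of 0 and 1; applying it a second time makes the two contributions to the 1 state cancel, returning a definite 0.

```python
import math


def hadamard(state):
    """Apply the Hadamard gate to a qubit given as (amp_0, amp_1)."""
    amp_0, amp_1 = state
    s = 1 / math.sqrt(2)
    return (s * (amp_0 + amp_1), s * (amp_0 - amp_1))


def probabilities(state):
    """Probability of measuring 0 and of measuring 1."""
    amp_0, amp_1 = state
    return (abs(amp_0) ** 2, abs(amp_1) ** 2)


zero = (1.0, 0.0)                 # a definite 0, like a classical bit
superposed = hadamard(zero)       # equal mixture of 0 and 1
restored = hadamard(superposed)   # interference cancels the 1 component

print(probabilities(superposed))  # both close to 0.5
print(probabilities(restored))    # close to (1.0, 0.0)
```

Simulating n qubits this way requires tracking 2 to the power n amplitudes, which hints at why a physical quantum computer could outpace a classical one on suitable problems.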

 

While the idea of such a computer was first proposed as early as the 1970s, it was not until 1994, when Peter Shor described his algorithm for factoring very large numbers, that quantum computing emerged from academic curiosity to a potential reality. [14]  Factoring very large numbers is an important problem in cryptography in that many security codes rely on its difficulty.  Shor's algorithm was designed specifically for a quantum computer.  Quantum computers have tremendous promise as computational devices and one day no doubt will surpass the classical digital computer.  But in the interim, much research remains on error correction and quantum hardware before such systems enter the practical world.  For the time being, quantum computing is still considered a special discipline within theoretical physics.  However, quantum computing may ultimately prove very important to the subject of this thesis.  As we will see in Chapter 5, there is a compelling hypothesis that some form of quantum "computing" may play a pivotal role in consciousness.  While it may not be computing in the same sense that we understand it, there is a strong argument that certain intracellular protein structures may have the capacity to function as qubits.  The tubulin proteins of a neuron's microtubule structure seem to possess these qualities.

 

The Digital Computer.  From its inception the computer has been likened to the human brain, and the notion of developing a “thinking machine” with comparable cognitive powers has inspired computer scientists since the first electronic calculation.  The period shortly following World War II saw the emergence of two architectural philosophies that addressed the question of developing a computer that would exhibit intelligent behavior.  The first concept – the subject of this chapter thus far – was the well-known digital computer.  The other architecture is the neural computer, or the artificial neural network that will be treated later.  As discussed above, the digital computer captures knowledge as a set of symbols, and these symbols are manipulated in conformance with a collection of pre-determined algorithms that are typically introduced into the system in the form of software.  The digital computer is based on the operating principle most familiar to everyone.  The early programmers recognized that the electronic bits stored in memory could also assume any arbitrary meaning, and hence observed that these same computers were also capable of manipulating these symbols.

 

Figure 3.1.  Schematic diagram of “Von Neumann” computer.
            The digital computer as we know it, the unit that sits on virtually everyone’s office desk in today’s working world, is fundamentally a closed system.  By this we mean that its functions and operations are essentially independent of its surrounding environment.  It does not of its own volition accommodate changes to its environment.  Occasionally referred to as the “Von Neumann machine,” the digital computer in principle functions in a fairly straightforward manner.  In concept, it comprises primarily a memory and a central processing unit, or CPU, as shown schematically in Figure 3.1.  Input data and operating instructions, the latter called algorithms, are stored in the memory.  Operations, calculations, and processing in general take place in the CPU.  When a specific operation is desired, the CPU activates the relevant stored algorithm.  The CPU extracts these stored instructions along with the appropriate supporting data from the memory.  These instructions are then executed within the CPU, and the resulting new data is sent back to memory where it is stored.  The process is then repeated until the desired computational objective is met.
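The cycle just described, in which the CPU fetches an instruction from memory, executes it, and returns the result to memory, can be sketched as a toy simulation.  The Python example below is an illustration only: the four-instruction set (LOAD, ADD, STORE, HALT), the single accumulator, and the memory layout are assumptions made for the example and do not describe any particular machine.  Note that the instructions and the data share one memory, which is the defining feature of the stored-program design.

```python
def run(memory):
    """Execute the program held in `memory` until a HALT instruction."""
    accumulator = 0   # working register inside the CPU
    counter = 0       # address of the next instruction to fetch
    while True:
        operation, address = memory[counter]   # fetch from memory
        counter += 1
        if operation == "LOAD":                # copy data into the CPU
            accumulator = memory[address]
        elif operation == "ADD":               # operate on the data
            accumulator += memory[address]
        elif operation == "STORE":             # return the result to memory
            memory[address] = accumulator
        elif operation == "HALT":
            return memory
        else:
            raise ValueError("unknown operation: " + operation)


# One memory holds both the program (addresses 0-3) and the data (4-6).
memory = [
    ("LOAD", 4),
    ("ADD", 5),
    ("STORE", 6),
    ("HALT", 0),
    2,   # address 4: first operand
    3,   # address 5: second operand
    0,   # address 6: result is written here
]

print(run(memory)[6])  # 5
```

Solving a different problem means loading different contents into memory; the loop that stands in for the CPU does not change.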

 

It is interesting to note that the symbol-manipulation methodology has been adopted as the basic paradigm within the field of Artificial Intelligence throughout the last several decades of its development, even though intelligence itself is so integrally dependent upon adaptation to the environment.  Yet the digital computer cannot adapt to environmental variations beyond those that can be anticipated by its algorithms.  In other words, the flexibility of a digital computer is limited by how well its programmers anticipated the probable responses demanded of it and developed the algorithms accordingly.  Variations beyond that will not elicit the correct response.  A truly “intelligent” system will need something more, and that “something more” is a computer capable of massive parallel processing using environmental cues as its primary source of programming – a neural computer patterned after the brain.

 

The Neuron.  The symbol manipulations, component configurations and interfaces, and serial processing of the digital computer are a long way from approximating the architecture and neural functions of the mammalian brain.  The numerous nodes that approximate neuronal behavior represent the dominant feature of neural-network architecture.  But before we discuss the various attributes of the neural computer, a short description of the actual biological unit itself would help shed light on what the artificial system is trying to emulate. 

 

The aggregate of the mammalian central nervous system is composed of literally millions of interacting behavioral nodes called neurons.  The 1.5 kg of convoluted, dough-like matter that makes up the human brain is a collection of around 100 billion of these neurons joined by hundreds of trillions of interconnections, an effective accumulation of thousands of miles of cabling.  The functionality of each neuron, like that of an individual ant in an ant colony, can be reasonably approximated by a small number of rules. [15]  But the totality of these simple rules when applied over millions of neurons leads to a much more complex and rich behavior than could ever be predicted from the individual neurons.

 

Figure 3.2.  Depiction of a typical neuron.
            A typical neuron is a cell with numerous short root-like fibers spraying out from it called dendrites and a long single fiber with a brush-like tip called an axon.  (Figure 3.2.)  The axon, shielded within a myelin sheath, carries impulses away from the cell body, or soma, while the dendrites accept incoming impulses from other neurons and carry the signals toward the soma.  Signal interchanges between neurons are accomplished at the synapses, located primarily at the dendrite terminals.  Each signal pulse initiates the release of a neurotransmitter chemical that travels across the synaptic cleft from the axon side and is received at the post-synaptic receptor site on the dendrite.  The neurotransmitter then binds to the receptor molecular site where it initiates a change in the electrical charge, or voltage potential, in the dendritic membrane.  This change in electrical charge, called the post-synaptic potential, either increases or decreases the polarization of the post-synaptic membrane either to inhibit the generation of pulses in the recipient neuron for the case of increased potential, or to excite the generation of pulses when potential is decreased.  In other words, the neuron integrates the impulses it receives from its dendrites through its synapses and either generates an action potential or does not.  Each post-synaptic-potential pulse travels along its dendrite and spreads over the cell body until it eventually reaches the base of the axon, a region called the axon-hillock.  The receiving neuron integrates the effects of thousands of such pulses received at the axon-hillock from its dendritic tree over a given time.  When the integrated potential exceeds a predetermined threshold, the receiving cell then fires, generating an electrical impulse that travels down its axon to initiate this same sequence of events with neurons to which it is connected. 
The number of direct connections, or synapses, any given neuron makes with other neurons is referred to as fanout, and the fanout of typical central-nervous-system neurons ranges from 1000 to 10,000. [16]  According to Kuffler, [17] a single neuron may receive input from as many as 80,000 others.

 

            One aspect of the neuron that is particularly relevant to this thesis is its distinctively small size, especially its dendrites and synapses.  The feature sizes near the terminus of neuronal dendrites vary from a nominal 1000 to 200 nanometers. [18]  While still larger than that required of nanosystems, they are clearly approaching sizes where at least one dimension of a component is within 100 nanometers.  The synaptic gap formed between dendrites, however, is on the order of 15 nanometers, which definitely falls within this regime.  In any case, these sizes of dendritic termini and synaptic gaps are sufficiently small that surface effects begin to prevail over bulk material effects, and quantum influences begin to emerge and possibly even dominate – all of which are characteristic traits of nanotechnology.  When we look upon these features as they emerge in the inanimate world, we often overlook the possibility that these identical effects may account for our own behavior.

 

The Neural Computer.  Nature chose the neural computer as its preferred information processor and has devoted millions of years of research and development toward its perfection.  Nature’s motivation was to equip species with the means of rapidly assessing environmental changes and adapting their behaviors to new challenges as a means of assuring their survival and the propagation of their species.  It was essential that this processor be capable of nearly instantaneous solutions to very complex problems, even if the solution were not exact.  After all, its owner’s life depended upon it!  Furthermore, it was very important that any behavioral modifications, or “reprogramming,” be remembered so that subsequent executions could be performed per the latest instructions – all without the benefit of software.  The several millions of years of adaptations and growth of this information processor have produced a system so complex that only in recent history have we even begun to understand it.  But as researchers have probed its marvels, new insights have emerged into Nature’s information-processing paradigm, suggesting favorable alternatives to the digital computer for certain classes of problems.

 

From the period of the Second World War a research thrust in information-processing architecture based in principle on that of the mammalian brain has been pursued in parallel with development of the digital computer.  The philosophy of this new computer was to learn from its environment through a training process, rather than its having to depend upon its functions being preprogrammed algorithmically.  As early as 1943 McCulloch and Pitts observed that finite logical relationships could be expressed with combinations of neuron models, for which they had proposed a simple mathematical derivation. [19]  McCulloch and Pitts proposed that a neuron could be approximated as a logic gate or switch that could be designed to function within the principles of Boolean algebra.  By combining these individual gates into multiple parallel sets, they further reasoned that these new “neural” networks could be made to perform any combination of logical functions.  Theirs was the first rudimentary introduction to the neural computer, or the artificial neural network as it is more commonly called, comprising a computer architecture designed to simulate biological neurologies.  These computers are often referred to merely as neural networks.  Although early progress in this new processing concept was overshadowed by that of the digital computer, interest in neural computers grew very strong during the 1960s, but then diminished until around the mid 1980s when renewed interest motivated a revival.
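The idea of a neuron acting as a logic gate can be made concrete with a short sketch.  It is a simplification under stated assumptions: the function names are hypothetical, and inhibition is represented here by a negative weight rather than by the absolute veto used in the original McCulloch and Pitts formulation.

```python
def mp_neuron(inputs, weights, threshold):
    """Fire (1) if the weighted sum of binary inputs reaches the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

# Single threshold units acting as Boolean gates (weights chosen by hand).
def AND(a, b):
    return mp_neuron([a, b], [1, 1], 2)

def OR(a, b):
    return mp_neuron([a, b], [1, 1], 1)

def NOT(a):
    return mp_neuron([a], [-1], 0)

def XOR(a, b):
    # A combination of gates: (a OR b) AND NOT (a AND b).
    return AND(OR(a, b), NOT(AND(a, b)))

for a in (0, 1):
    for b in (0, 1):
        print(a, b, AND(a, b), OR(a, b), XOR(a, b))
```

Here the weights and thresholds are fixed by the designer; nothing is learned, which is the main difference from the trainable networks discussed later in this section.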

 

Although much is still unknown about the brain “wiring” and how it processes and retains information, enough has been learned of the neurological infrastructure to enable the design of an artificial processing system capable of representing at least a simple biological nervous system.  The architecture of neural networks simulates nerve cells or neurons connected together to form a massively parallel system network.  Artificial neurons take their cue from Nature’s design and attempt to approximate these same functions.  Typically, neural networks are organized in layers comprised of numerous interconnected nodes, where each node is a simulated neuron represented by a system of multiple inputs to a processing element followed by a single output.  A sketch of a typical but simplified artificial neuron is given in Figure 3.3.  If we apply the biological analogy, the neuron cell is represented by an electronic chip comprising the threshold logic required to receive and integrate input signals and to decide when the accumulated signal has reached the desired voltage threshold to release an output pulse.  The threshold logic chip is denoted by the Greek letter sigma (Σ) in Figure 3.3.  Its logic pattern behaves very much like a sigmoid, a mathematical function like that depicted in the insert.


Figure 3.3.  An artificial neuron.  The insert depicts the threshold activation potential of the sigmoid voltage function.

            The dendrites are represented by the multiple input lines at the left of the logic chip in the figure.  The signal arriving along each input line is modified by a weighting factor, an electronic chip capable of being adapted to a particular training pattern that may be desired.  This modified electronic charge is analogous to the post-synaptic potential seen in the dendritic membrane of a biological neuron.  The weighted signals from all of the input lines enter the threshold logic chip where they are summed to produce an activation potential for the artificial neuron.  This activation potential continues to build until it reaches some predetermined threshold voltage (shown as θ in Figure 3.3), at which time it sends out a pulse along its axon to the next set of neurons.
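The weighted-sum-and-threshold behavior just described can be sketched in a few lines.  This is only an illustration in software of what the text describes in hardware: the function and parameter names are hypothetical, the logistic function stands in for the sigmoid of the figure, and treating an output of 0.5 or more as "firing" is an assumption made for the example.

```python
import math

def sigmoid(x):
    """Smooth threshold function rising from 0 to 1."""
    return 1.0 / (1.0 + math.exp(-x))

def neuron_output(inputs, weights, threshold):
    """Weight each input, sum the results, and compare against the threshold."""
    activation = sum(w * x for w, x in zip(weights, inputs))
    return sigmoid(activation - threshold)

def fires(inputs, weights, threshold):
    """Assumed convention: the neuron 'fires' when its output is at least 0.5."""
    return neuron_output(inputs, weights, threshold) >= 0.5

weak = neuron_output([1.0, 0.0], [0.5, 0.5], 1.0)     # activation below threshold
strong = neuron_output([1.0, 1.0], [2.0, 2.0], 1.0)   # activation above threshold
print(weak < 0.5, strong > 0.5)   # prints True True
```

Adapting the network to a training pattern amounts to changing the entries of `weights` (and possibly `threshold`), which is what the training methods described later do.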

 

The application of the neural computer to problem solving is quite different from that of the digital computer. [20] In the first place, the neural computer learns by experience – there is no complex set of algorithms involved.  Instead, it learns of the problem it is to address by receiving an input of some known variables with the intent of inferring some information about a desired unknown state as output variables.  To accomplish this, some relationship – no matter how tenuous – must exist between the known input and unknown output parameters.  Typically, the exact nature of this relationship is unknown and usually not readily predictable.  But it must be there.  Furthermore, it is not necessary that all the information about the problem at hand be known, and in fact it can be very noisy.  Neural networks on their own learn the relationship between the input and output information through training.  They in essence evolve their own algorithms.

 

Although different classes of networks use many different detailed training methods, all fall under two basic training philosophies – supervised and unsupervised.  Supervised training is the most common methodology, of which back propagation (discussed later) is the best known example.  For supervised training, the investigator assembles a set of training data that comprises examples of typical inputs along with corresponding known outputs relevant to the problem at hand.  This information is then fed into the network, and the network infers the relationship between them.  The training process takes the known input and output parameters and adjusts the various weights and thresholds of the network’s many neurons until the prediction errors over the training set have been minimized (see Figure 3.3).  Once the network has been properly trained, having discovered on its own the correct relationship between the input and output variables, the neural computer can then be used with different input data to produce a completely new output not known ahead of time.  One could, for example, train a neural computer to predict next week’s Dow-Jones stock prices by exposing the network to prices and indices taken from the past few months’ historical data.  Barring other outside market influences, the resulting output would show the expected trends.

 

Unsupervised networks, on the other hand, are trained with an input data set only.  This approach is especially useful if one is seeking patterns in the structure of the input data.  Clusters of data can be recognized and related to each other, thus enabling a better understanding of their behavior.

 

The neural systems of McCulloch and Pitts were only two-layer networks wherein information was introduced through the first layer and dispersed from the second layer.  But in the late 1950s, the Cornell University psychologist Frank Rosenblatt introduced a structural variation that he called the “perceptron.” [21, 22]  His was the first approach at attempting to emulate biological neurons, and through this he introduced a unique architecture comprising processing units capable of adapting interconnecting “weights” and transmitting signals.  Perceptron “learning” was based on modifying the weights of the non-active lines in response to an error – a form of “punishment” for guessing wrong.  Typically, the learning philosophy for neural nets is to change the weight value such that the system is pulled toward some defined goal.  The perceptron pushed itself away from a non-goal.  While Rosenblatt’s interest was primarily in modeling the brain to support his research into understanding cognitive processes, other researchers saw the perceptron as a tool for broader applications.  Eager to apply this newfound capability to other problems of the day, they picked up the gauntlet and championed the attributes of this new medium.  But the constraints of embryonic computing facilities with their limited breadth severely impacted development of viable applications, and the potential of the perceptron was oversold.  In 1969 Marvin Minsky and Seymour Papert wrote a book critical of the perceptron’s inherent limitations, which delivered a major setback to neural networks and curtailed research in the U.S. in this area for nearly two decades. [23]  They showed that the perceptron, the most sophisticated neural network of its time, was unable to solve a fairly straightforward nonlinear classification problem, which was readily accomplished by digital computers at their state of development in the late 1960s.
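Both the error-driven learning and the limitation can be seen in a small sketch.  It uses the textbook error-correction rule for a single threshold unit, which is an assumption on our part and not necessarily Rosenblatt’s exact procedure; the function name, learning rate, and epoch count are likewise hypothetical.  The exclusive-OR function is the example usually cited for the kind of nonlinear classification a single-layer perceptron cannot perform.

```python
def train_perceptron(samples, epochs=25, rate=1.0):
    """Error-correction rule for one threshold unit with two inputs.

    Returns (weights, bias, number of mistakes made in the final epoch).
    """
    w = [0.0, 0.0]
    b = 0.0
    errors = 0
    for _ in range(epochs):
        errors = 0
        for (x1, x2), target in samples:
            out = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            err = target - out            # zero when the guess was right
            if err != 0:                  # weights change only after a wrong guess
                w[0] += rate * err * x1
                w[1] += rate * err * x2
                b += rate * err
                errors += 1
    return w, b, errors

and_data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
xor_data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

_, _, and_errors = train_perceptron(and_data)
_, _, xor_errors = train_perceptron(xor_data)
print("AND mistakes in last epoch:", and_errors)
print("XOR mistakes in last epoch:", xor_errors)
```

The AND function is linearly separable, so the mistakes fall to zero after a few passes.  No single straight line separates the XOR cases, so the unit keeps making mistakes however long it is trained.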

 

Neural-computer research may have been down, but it was not out.  Again taking biology as a cue, investigators who continued to follow the promise of neural processors chose to avoid the pattern classification problems presented by the perceptron and turned instead to obtaining a better understanding of how memories are formed in the brain through formation of distributed pathways.  Given the premise that the strengthening or weakening of the synaptic connections between groups of neurons would generate interactive memory, they set out to produce neural networks that would approximate this function. [24, 25]

 

But neural-network research experienced its first real renaissance, one that pulled it from the post-1969 malaise, when in the mid 1980s two researchers from the University of California at San Diego made two fundamental modifications to the basic perceptron.  David Rumelhart and James McClelland first introduced a third layer, or hidden layer, of neurons between the input and output layers. [26]  In this architecture, called the multilayer perceptron, each neuron is connected to every neuron in the adjacent layers, but connections are not allowed across more than one layer or within any given layer.  Furthermore, multiple hidden layers are also admissible.  Only the input and output layers are in direct contact with the environment.  An example of a neural network with a hidden layer is shown in Figure 3.4.  Next, they introduced the concept of “back propagation” of errors to enable the network to learn more effectively.  Although this method was similar to that proposed by Paul Werbos in 1974, [27] the capabilities of the back propagation methodology were never actually established until investigated by the UC San Diego team.

Figure 3.4.  Depiction of simple typical neural network showing inputs, outputs, and one hidden layer.  Neural networks can be very complex and involve multiple hidden layers, as seen by the insert.

 

The elegance of the back propagation network lies in how the network learns or is trained.  Although many more sophisticated methods have been used for some time, the back-propagation technique was a pioneering methodology and deserves a short discussion.  A back-propagation network is trained with examples for which a pattern of activities for both the input and output are known.  If, for example, the network is trying to recognize the behavior of free-market oil prices, it initially guesses a result.  The guessed output is then matched with the known answer of a training set, and the difference is determined between the desired and actual output – defined as the prediction error.  With this completed, the error is now propagated backward through the network to the input and is minimized by adjusting the various connection strengths, or weights, between the neurons (see Figure 3.3).  While the weights can be any positive or negative value, usually randomly chosen small numbers are picked initially.  This process is iterated until a global-error minimum is found.  The network is considered to have been trained when this global-error minimum is reached and can now be applied to general problems, which in our example is predicting free-market oil prices.  The learning rule applied to back propagation first estimates and then uses the error contributed within each pathway through the network and modifies its response via predetermined weight-change rules.  The weight-change rules for the layers closer to the input layer include all of the weights of the more rearward layers (i.e., closest to the output layer).  This strategy is based on the premise that the back-layer weightings have a much higher influence on the magnitude of the error than do the input layers.  However, the spreading of this weight-change process over several layers often results in a slow-learning procedure, sometimes requiring a very large number of iterations.  
Why this is so becomes more obvious if we consider the search for a global minimum in a more pictorial sense by envisioning a three-dimensional “error landscape,” often referred to as a “fitness landscape.”  This depiction is analogous to a segment of a mountain range where the highest peak is the global maximum, and the lowest valley is the global minimum being sought.  The learning algorithm of the back-propagation neural network searches for the lowest feature on the landscape.  This procedure is in essence a “gradient descent” wherein it attempts to determine the steepest descent on the landscape surface as a means of driving to a minimum as quickly as possible.  However, the search for the global-error minimum is an inherently difficult optimization process because the landscape is a complicated function of the many network-connection weights.  If for instance in our oil-price example there were thirty weights requiring optimization, then the error surface would be a thirty-dimensional space – a highly irregular surface.  Not only might the training process be slow, but there is also the possibility that the process may become trapped in one of the many local minima and not reach the global minimum at all.  This pitfall is typically avoided by mitigation techniques such as assigning smaller perturbations to the connection weights, adding more nodes to the hidden layers, and/or intentionally introducing noise.  While the introduction of noise may be counterintuitive, it has the net effect of making the local minima look information poor and encourages the network to seek a steeper, more information-rich valley.
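A compact sketch may help tie the error propagation and the gradient descent together.  It makes several assumptions that are not in the text: a single hidden layer of three sigmoid neurons, a squared-error measure, whole-training-set (batch) updates, and the exclusive-OR truth table as training data in place of the oil-price example.  All names and parameter values are hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, w_hid, w_out):
    """Return (hidden activations, network output) for one input vector."""
    xb = list(x) + [1.0]                       # constant 1.0 acts as a bias input
    hidden = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w_hid]
    out = sigmoid(sum(w * v for w, v in zip(w_out, hidden + [1.0])))
    return hidden, out

def mean_error(data, w_hid, w_out):
    return sum((t - forward(x, w_hid, w_out)[1]) ** 2 for x, t in data) / len(data)

def train(data, n_hidden=3, rate=0.5, epochs=2000, seed=0):
    rng = random.Random(seed)
    n_in = len(data[0][0])
    # Small random starting weights, as the text suggests.
    w_hid = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    history = [mean_error(data, w_hid, w_out)]
    for _ in range(epochs):
        g_hid = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
        g_out = [0.0] * (n_hidden + 1)
        for x, t in data:
            hidden, out = forward(x, w_hid, w_out)
            d_out = (out - t) * out * (1.0 - out)       # error signal at the output
            for j, h in enumerate(hidden + [1.0]):
                g_out[j] += d_out * h
            for j, h in enumerate(hidden):
                # The hidden-layer rule reuses the output-layer weight w_out[j]:
                # the error is carried backward through the rearward layer.
                d_hid = d_out * w_out[j] * h * (1.0 - h)
                for i, v in enumerate(list(x) + [1.0]):
                    g_hid[j][i] += d_hid * v
        # Gradient-descent step: move every weight downhill on the error surface.
        for j in range(n_hidden + 1):
            w_out[j] -= rate * g_out[j]
        for j in range(n_hidden):
            for i in range(n_in + 1):
                w_hid[j][i] -= rate * g_hid[j][i]
        history.append(mean_error(data, w_hid, w_out))
    return w_hid, w_out, history

xor_data = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]
w_hid, w_out, history = train(xor_data)
print(f"error before training: {history[0]:.4f}, after: {history[-1]:.4f}")
```

Whether a run like this reaches the global minimum or stalls in a local one depends on the starting weights and the step size, which is the difficulty the error-landscape picture describes.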

 


Figure 3.5.  Two versions of the Hopfield neural network showing each node connected to every other node.
All of the neural networks discussed thus far share one common feature – they are all “feed-forward” networks.  Their architectures allow signals to flow only in the direction of input to output with no return or “feedback.”  While studying the magnetic behavior of spin glasses at Caltech in the early 1980s, John Hopfield suggested a neural network with a fully connected architecture. [28]  Called the Hopfield network, its architecture features each neuron connected to every other neuron, thus enabling a feedback process.  A depiction of the Hopfield network is shown in Figure 3.5.  Although these specific networks have a fairly limited class of applications, they have been adapted very well as content-addressable memories for large databases.  For digital computers data retrieval is accomplished by accessing labeled files through the specification of a discrete address.  Because this is a serial single-point process, any error no matter how small will produce an incorrect result.  Being a form of associative memory, content addressability is extremely robust and allows the desired information to be extracted even when the specified keywords are incomplete or even incorrect.  The feedback processes lead to nonlinear effects that evolve into organized patterns that enable specific neurons to fire continuously while others remain silent.  The actual pattern depends upon the initial input, or the data being sought, and the sets of local minima produced by the Hopfield network are used as the memory cells.
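Content-addressable recall can be illustrated with a very small fully connected network.  The sketch assumes the commonly used Hebbian outer-product rule for setting the weights and one-neuron-at-a-time updates; the text does not specify either, and the patterns and function names are hypothetical.

```python
def train_hopfield(patterns):
    """Set symmetric weights from stored patterns of +1/-1 values (Hebbian rule)."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:                 # no neuron connects to itself
                    w[i][j] += p[i] * p[j] / n
    return w

def recall(w, state, sweeps=10):
    """Update neurons one at a time until the state stops changing."""
    s = list(state)
    n = len(s)
    for _ in range(sweeps):
        changed = False
        for i in range(n):
            field = sum(w[i][j] * s[j] for j in range(n))   # feedback from all others
            new = 1 if field >= 0 else -1
            if new != s[i]:
                s[i] = new
                changed = True
        if not changed:                    # settled into a stable state
            break
    return s

p1 = [1, 1, 1, 1, -1, -1, -1, -1]
p2 = [1, -1, 1, -1, 1, -1, 1, -1]
w = train_hopfield([p1, p2])

probe = [-1, 1, 1, 1, -1, -1, -1, -1]      # p1 with its first element corrupted
print(recall(w, probe) == p1)              # prints True
```

The corrupted probe plays the role of an incomplete or incorrect keyword: the feedback pulls the state back to the nearest stored pattern, so no exact address is needed.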

 

Networks involving feedback, called recurrent networks, are fascinating subjects of interest to researchers, but so far have found only limited applications.  And partly because of their very complex dynamics, they occasionally may be unstable.  Although research continues in recurrent networks, the feed-forward structure is the mainstay for most of the problem solving accomplished by neural networks.

 

The decade of the 1990s saw the real resurgence in neural networks.  Ironically, this revival is heavily attributed to the phenomenal growth the digital computer experienced over this same period.  The greatly enhanced digital computational power now permitted the simulation of neural networks in software, thus enabling rapid experimentation with novel architectures.  In addition, the steep growth of CMOS foundries that contributed to the accelerated digital computer expansion also enabled much wider and more rapid neural-network implementations in hardware.  But it was predominantly through software that researchers have been able to generate richly complex neural networks for a myriad of different applications.  Although a neural network implemented in hardware is faster than one implemented in software, the software implementation is much more flexible and easier to modify.  Hidden layers can be added or removed, neurons can be put in or taken out, the training process can be varied, and supporting algorithms can be introduced to assist this process.

 

The multilayer perceptron discussed earlier remains the most popular neural-network architecture in use, and a myriad of application techniques have been introduced recently that enhance its effectiveness.  But the resurgence of interest in this technology has encouraged the development of numerous other paradigms, as well. [29, 30, 31]  Examples include radial-basis-function networks, probabilistic neural networks, generalized-regression neural networks, and Kohonen networks, to name a few.  The domain of the radial-basis-function network is divided into various circles, each characterized by a center and radius.  Each radial unit responds to the distance of an input point from its center, and each radial unit traces out a Gaussian response surface.  The slope of this response surface can be modified, if desired, in a manner very similar to how one would modify a neuron’s sigmoid curve (see Figure 3.3).  These various radial units comprise the hidden layer of the radial-basis-function network.  The nonlinearity of these functions renders a single hidden layer sufficient to model any function, a distinct advantage over the multilayer perceptron.  Furthermore, a weighted sum of the Gaussian response surfaces is all that is necessary to combine the hidden layer with the output.  This feature allows optimization by traditional linear modeling methods that are both fast and much less prone to becoming trapped in local minima, which impede the training of the multilayer perceptron.  The net result is that training the radial-basis-function network is orders of magnitude faster.
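The point that the output weights can be found by ordinary linear methods can be shown with a one-dimensional sketch.  It rests on simplifying assumptions not taken from the text: one radial unit is centered on each training point, all units share the same radius, and the linear system is solved by plain Gaussian elimination.  The function names and sample data are hypothetical.

```python
import math

def gaussian(x, center, radius):
    """Response of one radial unit to input x."""
    return math.exp(-((x - center) ** 2) / (2.0 * radius ** 2))

def solve(a, b):
    """Solve the linear system a * w = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    w = [0.0] * n
    for i in range(n - 1, -1, -1):
        w[i] = (m[i][n] - sum(m[i][j] * w[j] for j in range(i + 1, n))) / m[i][i]
    return w

def fit_rbf(xs, ys, radius):
    """Hidden layer: one Gaussian per training point.  Output weights: linear solve."""
    phi = [[gaussian(x, c, radius) for c in xs] for x in xs]
    return solve(phi, ys)

def predict_rbf(x, centers, weights, radius):
    """Network output: weighted sum of the Gaussian responses."""
    return sum(w * gaussian(x, c, radius) for w, c in zip(weights, centers))

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [x * x for x in xs]
weights = fit_rbf(xs, ys, radius=1.0)
print(round(predict_rbf(2.0, xs, weights, 1.0), 6))   # reproduces the target 4.0
```

No iterative search over an error landscape is involved: once the centers and radius are fixed, the weights follow from a single linear solve.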

 

Probabilistic neural networks are kernel-based processes designed specifically to perform classification tasks by estimating probability-density functions from a given data set.  By knowing the probability-density functions of possible classes within the domain of interest, one can then compare the probability magnitudes of these various classes and select the most likely.  For example, for diagnosing the intensity and vectors of a disease epidemic, the data set might include the first identified breakout, the percent of population that presently have the disease, and the dates and locations of each reported confirmation.  A high probability density would be represented by a cluster of cases found close together.  Deviation from this region traces a typical Gaussian probability function.  The probabilistic neural network is very similar to the radial-basis function network in that it is made up of at least three layers – an input, radial (hidden), and output layer.  Each radial unit models a Gaussian function specific to and centered at a given training case.  Only one radial unit is assigned to any given case, and each radial unit is copied directly from the training data.  Furthermore, the network is constrained to produce only one output per class, thus requiring each radial unit to be connected only to those belonging to its class while disallowing connections from all others.  The result is that any given output represents summations of responses relevant only to a specific class.  The aggregate outputs are estimates of the probability-density functions of all the classes.  In the above example, these probability-density functions may show such factors as the increasing rate of the disease outbreak with time, the movement of the outbreak, and an estimate of misdiagnoses.  The probabilistic nature of the outputs from this network greatly simplifies the interpretation of the results.  
Another advantage of the probabilistic neural network is its very rapid training speed.  Training primarily involves merely copying the necessary input data into the network and applying an appropriate smoothing factor for control.  The smoothing factor is effectively the radial deviation of the Gaussian functions, usually chosen experimentally to provide an element of data overlap.  With this accomplished, the network is trained.  But because the network also contains all the data of all the training cases, its size can become enormous and hence slow to execute.  This factor is likely its major disadvantage.
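The sketch below shows the probabilistic network in its simplest form.  "Training" consists only of keeping the training cases, and classification compares the summed Gaussian responses for each class.  The data, labels, and function names are hypothetical, and the per-class averages are left unnormalized, which does not affect the comparison when every class uses the same smoothing factor.

```python
import math

def gaussian_kernel(x, center, sigma):
    """Response of the radial unit centered on one training case."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, center))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def pnn_densities(x, training, sigma):
    """One output per class: the average response of that class's radial units."""
    return {
        label: sum(gaussian_kernel(x, c, sigma) for c in cases) / len(cases)
        for label, cases in training.items()
    }

def pnn_classify(x, training, sigma):
    """Select the class with the largest estimated density at x."""
    densities = pnn_densities(x, training, sigma)
    return max(densities, key=densities.get)

# Hypothetical training cases; each point is copied directly into a radial unit.
training = {
    "A": [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)],
    "B": [(5.0, 5.0), (5.0, 6.0), (6.0, 5.0)],
}
print(pnn_classify((0.5, 0.5), training, sigma=1.0))   # prints A
print(pnn_classify((5.5, 5.2), training, sigma=1.0))   # prints B
```

Every query visits every stored case, which illustrates why the network is quick to train but can become slow to execute as the training set grows.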

 

The generalized regression neural network is very similar to the probabilistic neural network except that it performs regression tasks for seeking statistical relationships between the input and output variables.  However, the generalized regression network typically incorporates at least two hidden layers – the first comprising the radial units to model the Gaussian functions for each training case and the second to assist in estimating the weighting factors.  This second layer performs a specialized function in that it is made up of special units assigned to each output wherein weighted sums for each respective output are formulated.  The output, however, reflects a weighted average, which is obtained by dividing the weighted sum by the sum of the weighting factors computed in one of the special units of the second layer.  As is typical of regression problems, the second hidden layer always comprises one more unit than does the output layer.  To reduce network size and increase execution speed, the radial units of the generalized regression network can be chosen to represent clusters of training input functions rather than individual cases.  And again like the probabilistic neural network, the generalized regression neural network trains very quickly, but is inclined to be unwieldy and slow to execute.
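The weighted-average output can be written out directly.  The sketch below assumes one-dimensional inputs, a single output, and one radial unit per training case; the names and sample data are hypothetical.

```python
import math

def grnn_predict(x, xs, ys, sigma):
    """Weighted average of training outputs, weighted by Gaussian closeness to x."""
    weights = [math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2)) for c in xs]
    weighted_sum = sum(w * y for w, y in zip(weights, ys))   # one summation unit
    weight_total = sum(weights)                              # the extra unit
    return weighted_sum / weight_total

xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 2.0, 4.0, 6.0]
estimate = grnn_predict(1.5, xs, ys, sigma=0.5)
print(round(estimate, 6))   # prints 3.0
```

The two sums correspond to the special units of the second layer: one accumulates the weighted outputs, the other accumulates the weights themselves, and the output divides the first by the second.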

 

The Kohonen network, the last one we will discuss, differs considerably from all of the above networks in that it is designed primarily for unsupervised learning.  As alluded to earlier, in supervised learning the training data is composed of input variables combined with their relevant output information in order to enable the network to infer a relationship between the two.  Only the input variables are involved in unsupervised learning.  The principal application of a Kohonen network is to learn the structure of the data.  By recognizing and identifying classes of data, the network can perform classification tasks.  A typical Kohonen network comprises only an input and output layer, but the output layer is made up of radial units and is called the topological map layer.  The architecture of the Kohonen network follows that of the cerebral cortex of the brain where maps of the body components are found adjacent to each other along its surface.  Training is accomplished iteratively.  A random set of radial centers is gradually adjusted to reflect the clustering of the training set.  In addition, iterative training converges related input data clusters so that they are also in close proximity on the topological map.  This technique aids the user in visualizing and understanding otherwise very opaque data.  Once training is complete, the topological map can be examined for clusters, and as with the cerebral cortex, relationships between adjacent clusters can be inferred.  The Kohonen network is very useful for conducting exploratory data analysis where relationships among the data are too complex to easily understand and a means of clustering and classifying the data is needed.
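The iterative adjustment of radial centers can be sketched for the simplest possible case: one-dimensional data and a one-dimensional map.  The schedule used here (a learning rate that shrinks linearly, and neighbors that are pulled along only during the first half of training) is an assumption chosen for brevity, and the data and names are hypothetical.

```python
import random

def best_unit(units, x):
    """Index of the map unit whose center lies closest to x."""
    return min(range(len(units)), key=lambda i: abs(units[i] - x))

def train_som(data, n_units=4, epochs=100, seed=0):
    rng = random.Random(seed)
    lo, hi = min(data), max(data)
    units = [rng.uniform(lo, hi) for _ in range(n_units)]   # random starting centers
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs
        rate = 0.5 * frac                  # learning rate shrinks as training proceeds
        reach = 1 if frac > 0.5 else 0     # neighbors move only early in training
        for x in rng.sample(data, len(data)):
            win = best_unit(units, x)
            for i in range(max(0, win - reach), min(n_units, win + reach + 1)):
                pull = rate if i == win else rate / 2.0
                units[i] += pull * (x - units[i])   # move the center toward the input
    return units

# Two loose clusters of readings; no output labels are supplied.
readings = [0.1, 0.3, 0.2, 0.4, 9.8, 10.1, 9.9, 10.3]
units = train_som(readings)
print([round(u, 2) for u in units])
```

Because neighboring units on the map are dragged along with the winner early in training, units that end up representing similar data tend to sit next to each other, which is what makes the finished map readable.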

 

Neural networks differ distinctly from digital computers in several fundamental ways.  Because neural networks have the capacity to learn from their environment, they are capable of performing a variety of tasks found to be very difficult or impossible to execute with a conventional digital computer.  Like their biological counterparts, they are capable of dealing with non-linearities and can readily handle incomplete, noisy, or even missing data as they deal with problems with no clear-cut solution.  Furthermore, they do not rely upon the intricate programming of equations or algorithms to produce desired results – they create their own relationships from environmental information received through training.  They learn by experience.  Unlike digital computers that depend upon deductive reasoning where known rules or algorithms are applied to the input data to generate an output, neural networks utilize inductive reasoning wherein input and output data are given as training examples and the neural network itself develops the rules.  They automatically produce associations based on results of known situations learned during training and adjust or “adapt” themselves to new situations until the new situation is eventually generalized.

 

Computations performed by digital computers are serial, centralized, and synchronous, while neural networks function in a parallel, collective, and asynchronous mode.  A digital computer’s memory is discretely packaged, stored, and addressable by specific address location.  A neural network has a distributed memory that is completely internalized and is content addressable.  Neural networks are capable of working with a very large number of variables in an environment of unknown rules and noisy or partial data, whereas a digital computer requires very well defined operating instructions that apply known algorithms to produce an output.  Neural networks are very fault tolerant and experience graceful degradation when failures occur.  Digital computers, on the other hand, are not at all fault tolerant.  The failure of a single component or a miscoded command can cause the entire system to fail.  However, digital computers are very fast and exact when it comes to producing mathematical computations.  For example, they can generate a missile-ranging solution depicting the exact impact point in milliseconds.  Neural networks are not nearly as effective for such applications.  They are fairly slow and inexact when applied to “conventional” computer functions.  Nonetheless, they do very well at recognizing and matching vague, complex, or incomplete patterns, and they are especially adept at projecting likely events.  They are unequaled in speed when arriving at approximate solutions derived through parallel processing.  They are, after all, the computational paradigm selected by Nature.

 

            An excellent example of the type of problem that a neural processor handles readily where a digital computer falters is the intractable problem – a class of computation for which the time required for an exact solution grows so rapidly with problem size that it becomes effectively unbounded.  The raw computing power of the digital computer has negligible effect in developing a solution for this class of problem.  The classical example of an intractable problem is the Hamiltonian Path mentioned earlier in this chapter, the case of the traveling salesman who needs to visit a large number of cities but must plan his route so that he passes through each city only once while keeping the total distance traveled to a minimum.  In principle, the formulation of this problem is very straightforward, but the number of possible routes is so vast that no well-behaved algorithm has ever been derived that is compatible with the operational methodology of a digital computer.  For a few cities and limited options, the problem remains manageable.  For example, for only five cities there are twelve possible routes, and a digital computer handles this nicely.  But if the number of cities doubles to ten, then there are suddenly 181,440 possible routes!  The required computational time has dramatically increased.  Let the number of cities increase to only 25 and the number of possible routes becomes so large that a digital computer evaluating a million possible routes per second would require 9.8 billion years to examine them all. [32]
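
The route counts quoted above are consistent with the formula (n - 1)!/2 for n cities, which assumes a round trip from a fixed starting city and treats a route and its reverse as the same route.  Under that assumption, and taking a year as 365.25 days, the short Python sketch below reproduces the figures; the function names are illustrative.

```python
# Illustrative sketch only: counting traveling-salesman routes by brute force.
# Assumes round trips from a fixed start city, with direction of travel ignored.
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def distinct_round_trips(cities):
    """Number of distinct round trips through the given number of cities."""
    return math.factorial(cities - 1) // 2


def brute_force_years(cities, routes_per_second=1_000_000):
    """Years needed to examine every route at the given evaluation rate."""
    seconds = distinct_round_trips(cities) / routes_per_second
    return seconds / SECONDS_PER_YEAR


for n in (5, 10, 25):
    print(n, "cities:", distinct_round_trips(n), "routes,",
          f"{brute_force_years(n):.3g} years at one million routes per second")
```

Run as written, this gives twelve routes for five cities, 181,440 for ten, and roughly 9.8 billion years of exhaustive search for 25, which illustrates why faster hardware alone does little for this class of problem.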

 

            But when neural processors are applied to this same problem, the results are quite different.  In 1992, a team of university and national-laboratory researchers demonstrated a routing through 3038 cities with only eighteen months of neural-network processing time. [33] Because of the nature of neural networks, they lend themselves quite readily to this class of problem.  Over more than 40 years of application to the traveling-salesman example, they have typically been directed at large-scale solutions.  As a case in point, Shara Amin and Jose-Luis Fernandez, working with a Kohonen network at the British Telecommunications Laboratories, have been studying a self-organizing network of nodes as applied to this same problem in an attempt to derive a solution for 35,000 cities. [33] An immediate practical application of this effort is telecommunications routing, which British Telecommunications is implementing.

 

Future Trends.  Throughout this chapter we have seen how computers have evolved over the past fifty years from behemoths with very limited capability to very compact, sophisticated computational systems with enormous speed, capacity, and capability.  We have seen how two completely different computer paradigms – the digital and neural computer – have evolved in parallel.  When the original ENIAC agreement was signed on June 5, 1943, to explore six months of research and development of an electronic numerical integrator, even the most visionary prophet of that time could hardly have imagined what computers would become a half century later.  The thirty-ton ENIAC required thirty seconds to compute a sixty-second ballistic trajectory while consuming nearly two hundred kilowatts of power to perform that calculation.  Today, that same computation can be performed instantly on any five-pound laptop with only a few watts.  Furthermore, computer development has brought about the information-technology revolution that created the Internet, dramatically impacted telecommunications, and enabled the emergence of countless new industries and products that never before existed.  From what we can now see, this is only the beginning, and the driver that will sustain this trend is the tremendous economic impact the computer has imparted throughout the world.

 

            The advent of the computer, and in particular the desktop computer, has been a revolutionary contributor and catalyst for the Information Age, and with it the world is experiencing a dramatic departure in economic philosophy from that of the Industrial Revolution.  Since the mid-1980s, the United States especially has experienced a transformation in the structure, function, and rules of its economic system.  The classical job creation through standard assembly-line manufacturing processes is giving way to a knowledge- and idea-based economic system where innovation and technology embedded within services and manufactured products dominate.  Risk, change, and uncertainty have become the norm.  This is not to detract from conventional mass production of goods, especially since manufacturing and agricultural productivity in the U.S. is higher than ever, but an office economy involved with high technology and services is rapidly emerging.  It is partly this high production of staple goods that has enabled nearly 80% of the workforce to spend their time in the services and information sectors.  Robert Atkinson and Randolph Court of the Progressive Policy Institute in Washington, D.C., [34] in their assessment of the recent impact of technology, refer to this trend as the “New Economy.”

 

            The period from 1990 to 1996 saw the output of high-technology companies increase from 5.5 to 6.2 percent of the Gross Domestic Product.  And computer evolution has dominated the growth of the high-tech information industry.  IBM Corporation, which was the predominant computer producer in the last half of the Twentieth Century, had only 2500 competitors for its goods and services in 1965, but was facing as many as 50,000 in 1992.  The pace of evolution in information technology is so dynamic that the time between innovation and market introduction must become dramatically shorter just for companies to maintain a competitive edge.  Computer components, a principal factor in information technology, lose value on the order of one percent per week. [34]  Bringing a product from conception to market, which took three years in 1990, now is typically completed in fewer than two.  The demand for newer, more capable goods has become so intensive that high-technology companies with a heavy electronics and computer segment are finding a progressively higher percentage of their revenues derived from recently introduced products.  Examples include companies like 3M, which receives thirty percent of its revenues from products it has marketed for fewer than four years.  A more dramatic example is Hewlett-Packard, which derives nearly 77% of its revenues from products that have been on the market for fewer than two years. [35]

 

            The predominant motivation for the thrust to improve microelectronics is to produce ever more compact and more capable computers.  A spin-off being realized from this trend is the development and perfection of microprocessors for a myriad of other applications.  As microprocessors have become smaller and more capable, they have proliferated throughout virtually everything we touch.  Cell phones, aircraft avionics, automotive systems, laptops, kitchen appliances, and digital watches all depend upon microprocessors – and the microprocessor gives a “computer-like” performance to each item for which it is adapted.  And the market is growing dramatically as capability expands.  During the fifteen-year period from the early 1980s to 1997, the world production of microprocessors tripled to about 260 billion units, and that number is expected to increase at least half again by the year 2003.  In economic terms that first fifteen-year growth represented a fivefold market increase to well over $100 billion – all while the individual components were dropping in price.  The information technology enabled by the ubiquitous microprocessor is bringing about completely new industries through its transformation of businesses and products.  For instance, the advent of the Internet, made possible by the computer and microelectronics industry, has dramatically increased the speed of commerce and has been instrumental in the creation of thousands of new jobs.

 

            But as the economic transitions unfold for the Information Age, what influences do we see emerging that may directly encourage development of nanoelectronics?  First, the economic trend of the Information Age brought about by the computer is essentially irreversible.  Economics has driven the rush toward ever-smaller transistors for the development of ever-more capable microprocessors.  When Intel Corporation introduced its 8086 chip in 1978 comprising 29,000 transistors, the chip cost 1.2 cents per transistor, resulting in $480 per million instructions per second (MIPS).  The 386 chip with 275,000 transistors cost 0.11 cents per transistor or $50 per MIPS when it came out in 1985.  The Pentium Pro, which entered the market in 1995 complete with 5.5 million transistors, reduced computing cost to 0.02 cents per transistor or $4 per MIPS. [34]  Nanoelectronic devices will drive computing costs dramatically lower.  Their promise of increased performance will in turn provide a very strong commercial motivation for their continued research.  Molecular-size devices mean an ever-greater number can be packaged onto a single chip, resulting in enhanced capability at considerably lower cost.  While there are numerous technological barriers yet to be overcome, the motivation for lower-cost devices will continue to drive research in this area.  Research that will ultimately lead to the production of nanoprocessors will continue not only for the dramatic improvement in lower-cost processing capability, but also for its tremendous power-efficiency benefit.  As more is demanded of conventional electronic components, their energy use increases and heat dissipation becomes a major problem, thus putting greater demand on the total system.  Nanoprocessors will consume much less energy per processing step and could potentially improve computer power efficiency more than a million-fold.
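
The pace of the cost decline in the Intel figures quoted above can be made explicit with a little arithmetic.  The Python sketch below takes only the numbers given in the text; the implied chip prices are simply the product of transistor count and cost per transistor and are not figures stated by the source, and the function names are illustrative.

```python
# Illustrative sketch only: arithmetic on the Intel figures quoted in the text.
# Implied chip prices are derived here, not taken from the cited source.

CHIPS = [
    # (name, year, transistors, cents per transistor, dollars per MIPS)
    ("8086", 1978, 29_000, 1.2, 480.0),
    ("386", 1985, 275_000, 0.11, 50.0),
    ("Pentium Pro", 1995, 5_500_000, 0.02, 4.0),
]


def implied_chip_price(transistors, cents_per_transistor):
    """Chip price in dollars implied by the per-transistor cost."""
    return transistors * cents_per_transistor / 100.0


def annual_change(start_value, end_value, years):
    """Average compound yearly change; negative values mean a decline."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


for name, year, transistors, cents, dollars_per_mips in CHIPS:
    price = implied_chip_price(transistors, cents)
    print(f"{name} ({year}): about ${price:,.2f} per chip, "
          f"${dollars_per_mips:.0f} per MIPS")

first, last = CHIPS[0], CHIPS[-1]
span = last[1] - first[1]
print(f"Cost per MIPS changed by {annual_change(first[4], last[4], span):.1%} "
      f"per year over {span} years")
print(f"Cost per transistor changed by {annual_change(first[3], last[3], span):.1%} "
      f"per year over {span} years")
```

On these figures the cost per MIPS fell 120-fold in seventeen years, an average decline of roughly a quarter per year, even though the implied price of the chip itself did not fall.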
New markets will emerge from the completely new systems that will be created by this nanotechnology.  Because the domain of nanoelectronics is at the molecular scale, it enables the melding of chemical and biological elements as electronic devices to produce new components that were previously not possible.  A near-term major recipient of this benefit would be the health-care industry.  Microchips for medical diagnostics that integrate nanoelectronic devices with biological media for automated biomedical analyses are but one example.  The integration of bioproteins with electronics becomes possible at the nanodevice level.

 

            But there will also be a significant macroscale economic impact from systems derived from nanoelectronics.  As devices become smaller, the resulting processing systems become more capable and less expensive, thus encouraging the evolution of a ubiquitous digital economy.  This new economic trend, wrought from increased research and innovation, will accelerate the demand for higher-wage jobs and a more skilled workforce.  We have already seen the beginning of this effect in the 20% growth of high-paying jobs between 1989 and 1998. [34]  New information technologies will tend to increase the rates of productivity growth, which in turn will lead to higher income and reduced unemployment.

 

            As the world continues to acquire an appetite for the economic advantages offered by the Information Age, wrought by the advent of the computer and accelerated by research in nanoelectronics, industries devoted to knowledge production will play an increasingly dominant role as the principal growth engines for the new economy.  Emerging nano-processing technology will facilitate the outputs from the knowledge producers, those engineers and scientists involved in such widely diverse fields as biotechnology and information processing.  The products of these knowledge producers in turn will secure new markets for industries that manage information, such as telecommunications, advertising, and education.  The world economic paradigm can never return to its pre-existing state.  The trend toward smaller, more capable computers will be self-sustaining.  The genie is out of the bottle.  In but a very short time we have come a long way from the abacus!