Artificial intelligence (AI) is the intelligence of machines and the branch of computer science which aims to create it.
Major AI textbooks define the field as "the study and design of intelligent agents,"[1] where an intelligent agent is a system that perceives its environment and takes actions which maximize its chances of success.[2] John McCarthy, who coined the term in 1956,[3] defines it as "the science and engineering of making intelligent machines."[4]
Among the traits that researchers hope machines will exhibit are reasoning, knowledge, planning, learning, communication, perception and the ability to move and manipulate objects.[5] General intelligence (or "strong AI") has not yet been achieved and is a long-term goal of some AI research.[6]
AI research uses tools and insights from many fields, including computer science, psychology, philosophy, neuroscience, cognitive science, linguistics, ontology, operations research, economics, control theory, probability, optimization and logic.[7] AI research also overlaps with tasks such as robotics, control systems, scheduling, data mining, logistics, speech recognition, facial recognition and many others.[8]
Other names for the field have been proposed, such as computational intelligence,[9] synthetic intelligence,[9] intelligent systems,[10] or computational rationality.[11] These alternative names are sometimes used to set oneself apart from the part of AI dealing with symbols (considered outdated by many, see GOFAI) which is often associated with the term “AI” itself.
Contents |
Thinking machines and artificial beings appear in Greek myths, such as Talos of Crete, the golden robots of Hephaestus and Pygmalion's Galatea.[12] Human likenesses believed to have intelligence were built in every civilization, beginning with the sacred statues worshipped in Egypt and Greece,[13][14] and including the machines of Yan Shi,[15] Hero of Alexandria,[16] Al-Jazari[17] or Wolfgang von Kempelen.[18] It was widely believed that artificial beings had been created by Geber,[19] Judah Loew[20] and Paracelsus.[21] Stories of these creatures and their fates discuss many of the same hopes, fears and ethical concerns that are presented by artificial intelligence.[22]
Mary Shelley's Frankenstein,[23] considers a key issue in the ethics of artificial intelligence: if a machine can be created that has intelligence, could it also feel? If it can feel, does it have the same rights as a human being? The idea also appears in modern science fiction: the film Artificial Intelligence: A.I. considers a machine in the form of a small boy which has been given the ability to feel human emotions, including, tragically, the capacity to suffer. This issue, now known as "robot rights", is currently being considered by, for example, California's Institute for the Future,[24] although many critics believe that the discussion is premature.[25]
Another issue explored by both science fiction writers and futurists is the impact of artificial intelligence on society. In fiction, AI has appeared as a servant (R2D2 in Star Wars), a comrade (Lt. Commander Data in Star Trek), an extension to human abilities (Ghost in the Shell), a conqueror (The Matrix), a dictator (With Folded Hands), an exterminator (Terminator, Battlestar Galactica) and a race (Asurans in "Stargate Atlantis"). Academic sources have considered such consequences as: a decreased demand for human labor;[26] the enhancement of human ability or experience;[27] and a need for redefinition of human identity and basic values.[28]
Several futurists argue that artificial intelligence will transcend the limits of progress and fundamentally transform humanity. Ray Kurzweil has used Moore's law (which describes the relentless exponential improvement in digital technology with uncanny accuracy) to calculate that desktop computers will have the same processing power as human brains by the year 2029, and that by 2045 artificial intelligence will reach a point where it is able to improve itself at a rate that far exceeds anything conceivable in the past, a scenario that science fiction writer Vernor Vinge named the "technological singularity".[27] Edward Fredkin argues that "artificial intelligence is the next stage in evolution,"[29] an idea first proposed by Samuel Butler's Darwin Among the Machines (1863), and expanded upon by George Dyson in his book of the same name in 1998. Several futurists and science fiction writers have predicted that human beings and machines will merge in the future into cyborgs that are more capable and powerful than either. This idea, called transhumanism, which has roots in Aldous Huxley and Robert Ettinger, is now associated with robot designer Hans Moravec, cyberneticist Kevin Warwick and inventor Ray Kurzweil.[27] Transhumanism has been illustrated in fiction as well, for example on the manga Ghost in the Shell. Pamela McCorduck believes that these scenarios are expressions of an ancient human desire to, as she calls it, "forge the gods."[22]
In the middle of the 20th century, a handful of scientists began a new approach to building intelligent machines, based on recent discoveries in neurology, a new mathematical theory of information, an understanding of control and stability called cybernetics, and above all, by the invention of the digital computer, a machine based on the abstract essence of mathematical reasoning.[30]
The field of modern AI research was founded at a conference on the campus of Dartmouth College in the summer of 1956.[31] Those who attended would become the leaders of AI research for many decades, especially John McCarthy, Marvin Minsky, Allen Newell and Herbert Simon, who founded AI laboratories at MIT, CMU and Stanford. They and their students wrote programs that were, to most people, simply astonishing:[32] computers were solving word problems in algebra, proving logical theorems and speaking English.[33] By the middle 60s their research was heavily funded by the U.S. Department of Defense[34] and they were optimistic about the future of the new field:
These predictions, and many like them, would not come true. They had failed to recognize the difficulty of some of the problems they faced.[37] In 1974, in response to the criticism of England's Sir James Lighthill and ongoing pressure from Congress to fund more productive projects, the U.S. and British governments cut off all undirected, exploratory research in AI. This was the first AI Winter.[38]
In the early 80s, AI research was revived by the commercial success of expert systems[39] (a form of AI program that simulated the knowledge and analytical skills of one or more human experts). By 1985 the market for AI had reached more than a billion dollars and governments around the world poured money back into the field.[40] However, just a few years later, beginning with the collapse of the Lisp Machine market in 1987, AI once again fell into disrepute, and a second, more lasting AI Winter began.[41]
In the 90s and early 21st century AI achieved its greatest successes, albeit somewhat behind the scenes. Artificial intelligence was adopted throughout the technology industry, providing the heavy lifting for logistics, data mining, medical diagnosis and many other areas.[42] The success was due to several factors: the incredible power of computers today (see Moore's law), a greater emphasis on solving specific subproblems, the creation of new ties between AI and other fields working on similar problems, and above all a new commitment by researchers to solid mathematical methods and rigorous scientific standards.[43]
Artificial intelligence, by claiming to be able to recreate the capabilities of the human mind, is both a challenge and an inspiration for philosophy. Are there limits to how intelligent machines can be? Is there an essential difference between human intelligence and artificial intelligence? Can a machine have a mind and consciousness? A few of the most influential answers to these questions are given below.[44]
While there is no universally accepted definition of intelligence,[55] AI researchers have studied several traits that are considered essential.[5]
Early AI researchers developed algorithms that imitated the process of conscious, step-by-step reasoning that human beings use when they solve puzzles, play board games, or make logical deductions.[56] By the late 80s and 90s, AI research had also developed highly successful methods for dealing with uncertain or incomplete information, employing concepts from probability and economics.[57]
For difficult problems, most of these algorithms can require enormous computational resources — most experience a "combinatorial explosion": the amount of memory or computer time required becomes astronomical when the problem goes beyond a certain size. The search for more efficient problem solving algorithms is a high priority for AI research.[58]
It is not clear, however, that conscious human reasoning is any more efficient when faced with a difficult abstract problem. Cognitive scientists have demonstrated that human beings solve most of their problems using nonconscious reasoning, rather than the conscious, step-by-step deduction that early AI research was able to model.[59] Embodied cognitive science argues that sensorimotor skills are essential to our problem solving abilities. It is hoped that sub-symbolic methods, like computational intelligence and situated AI, will be able to model these instinctive skills. The problem of unconscious problem solving, which forms part of our commonsense reasoning, is largely unsolved[dubious ].
Knowledge representation[60] and knowledge engineering[61] are central to AI research. Many of the problems machines are expected to solve will require extensive knowledge about the world. Among the things that AI needs to represent are: objects, properties, categories and relations between objects;[62] situations, events, states and time;[63] causes and effects;[64] knowledge about knowledge (what we know about what other people know);[65] and many other, less well researched domains. A complete representation of "what exists" is an ontology[66] (borrowing a word from traditional philosophy), of which the most general are called upper ontologies.
Among the most difficult problems in knowledge representation are:
Intelligent agents must be able to set goals and achieve them.[70] They need a way to visualize the future (they must have a representation of the state of the world and be able to make predictions about how their actions will change it) and be able to make choices that maximize the utility (or "value") of the available choices.[71]
In some planning problems, the agent can assume that it is the only thing acting on the world and it can be certain what the consequences of its actions may be.[72] However, if this is not true, it must periodically check if the world matches its predictions and it must change its plan as this becomes necessary, requiring the agent to reason under uncertainty.[73]
Multi-agent planning uses the cooperation and competition of many agents to achieve a given goal. Emergent behavior such as this is used by evolutionary algorithms and swarm intelligence.[74]
Important machine learning[75] problems are:
The mathematical analysis of machine learning algorithms and their performance is a branch of theoretical computer science known as computational learning theory.
Natural language processing[77] gives machines the ability to read and understand the languages human beings speak. Many researchers hope that a sufficiently powerful natural language processing system would be able to acquire knowledge on its own, by reading the existing text available over the internet. Some straightforward applications of natural language processing include information retrieval (or text mining) and machine translation.[78]
The field of robotics[79] is closely related to AI. Intelligence is required for robots to be able to handle such tasks as object manipulation[80] and navigation, with sub-problems of localization (knowing where you are), mapping (learning what is around you) and motion planning (figuring out how to get there).[81]
Machine perception[82] is the ability to use input from sensors (such as cameras, microphones, sonar and others more exotic) to deduce aspects of the world. Computer vision[83] is the ability to analyze visual input. A few selected subproblems are speech recognition,[84] facial recognition and object recognition.[85]
Emotion and social skills play two roles for an intelligent agent:[86]
A sub-field of AI addresses creativity both theoretically (from a philosophical and psychological perspective) and practically (via specific implementations of systems that generate outputs that can be considered creative).
Most researchers hope that their work will eventually be incorporated into a machine with general intelligence (known as strong AI), combining all the skills above and exceeding human abilities at most or all of them.[6] A few believe that anthropomorphic features like artificial consciousness or an artificial brain may be required for such a project.
Many of the problems above are considered AI-complete: to solve one problem, you must solve them all. For example, even a straightforward, specific task like machine translation requires that the machine follow the author's argument (reason), know what it's talking about (knowledge), and faithfully reproduce the author's intention (social intelligence). Machine translation, therefore, is believed to be AI-complete: it may require strong AI to be done as well as humans can do it.[87]
Artificial intelligence is a young science and there is still no established unifying theory. The field is fragmented[88] and research communities have grown around different approaches.
In the 40s and 50s, a number of researchers explored the connection between neurology, information theory, and cybernetics. Some of them built machines that used electronic networks to exhibit rudimentary intelligence, such as W. Grey Walter's turtles and the Johns Hopkins Beast. Many of these researchers gathered for meetings of the Teleological Society at Princeton and the Ratio Club in England.[30]
When access to digital computers became possible in the middle 1950s, AI research began to explore the possibility that human intelligence could be reduced to symbol manipulation. The research was centered in three institutions: CMU, Stanford and MIT, and each one developed its own style of research. John Haugeland named these approaches to AI "good old fashioned AI" or "GOFAI".[89]
During the 1960s, symbolic approaches had achieved great success at simulating high-level thinking in small demonstration programs. Approaches based on cybernetics or neural networks were abandoned or pushed into the background.[99] By the 1980s, however, progress in symbolic AI seemed to stall and many believed that symbolic systems would never be able to imitate all the processes of human cognition, especially perception, robotics, learning and pattern recognition. A number of researchers began to look into "sub-symbolic" approaches to specific AI problems.[100]
The "intelligent agent" paradigm became widely accepted during the 1990s.[104] An intelligent agent is a system that perceives its environment and takes actions which maximizes its chances of success. The simplest intelligent agents are programs that solve specific problems. The most complicated intelligent agents are rational, thinking human beings.[105] The paradigm gives researchers license to study isolated problems and find solutions that are both verifiable and useful, without agreeing on one single approach. An agent that solves a specific problem can use any approach that works — some agents are symbolic and logical, some are sub-symbolic neural networks and others may use new approaches. The paradigm also gives researchers a common language to communicate with other fields—such as decision theory and economics—that also use concepts of abstract agents.
An agent architecture or cognitive architecture allows researchers to build more versatile and intelligent systems out of interacting intelligent agents in a multi-agent system.[106] A system with both symbolic and sub-symbolic components is a hybrid intelligent system, and the study of such systems is artificial intelligence systems integration. A hierarchical control system provides a bridge between sub-symbolic AI at its lowest, reactive levels and traditional symbolic AI at its highest levels, where relaxed time constraints permit planning and world modelling.[107] Rodney Brooks' subsumption architecture was an early proposal for such a hierarchical system.
In the course of 50 years of research, AI has developed a large number of tools to solve the most difficult problems in computer science. A few of the most general of these methods are discussed below.
Many problems in AI can be solved in theory by intelligently searching through many possible solutions:[108] Reasoning can be reduced to performing a search. For example, logical proof can be viewed as searching for a path that leads from premises to conclusions, where each step is the application of an inference rule.[109] Planning algorithms search through trees of goals and subgoals, attempting to find a path to a target goal, a process called means-ends analysis.[110] Robotics algorithms for moving limbs and grasping objects use local searches in configuration space.[80] Many learning algorithms use search algorithms based on optimization.
Simple exhaustive searches[111] are rarely sufficient for most real world problems: the search space (the number of places to search) quickly grows to astronomical numbers. The result is a search that is too slow or never completes. The solution, for many problems, is to use "heuristics" or "rules of thumb" that eliminate choices that are unlikely to lead to the goal (called "pruning the search tree"). Heuristics supply the program with a "best guess" for what path the solution lies on.[112]
A very different kind of search came to prominence in the 1990s, based on the mathematical theory of optimization. For many problems, it is possible to begin the search with some form of a guess and then refine the guess incrementally until no more refinements can be made. These algorithms can be visualized as blind hill climbing: we begin the search at a random point on the landscape, and then, by jumps or steps, we keep moving our guess uphill, until we reach the top. Other optimization algorithms are simulated annealing, beam search and random optimization.[113]
Evolutionary computation uses a form of optimization search. For example, they may begin with a population of organisms (the guesses) and then allow them to mutate and recombine, selecting only the fittest to survive each generation (refining the guesses). Forms of evolutionary computation include swarm intelligence algorithms (such as ant colony or particle swarm optimization)[114] and evolutionary algorithms (such as genetic algorithms[115] and genetic programming[116][117]).
Logic[118] was introduced into AI research by John McCarthy in his 1958 Advice Taker proposal. The most important technical development was J. Alan Robinson's discovery of the resolution and unification algorithm for logical deduction in 1963. This procedure is simple, complete and entirely algorithmic, and can easily be performed by digital computers.[119] However, a naive implementation of the algorithm quickly leads to a combinatorial explosion or an infinite loop. In 1974, Robert Kowalski suggested representing logical expressions as Horn clauses (statements in the form of rules: "if p then q"), which reduced logical deduction to backward chaining or forward chaining. This greatly alleviated (but did not eliminate) the problem.[109][120]
Logic is used for knowledge representation and problem solving, but it can be applied to other problems as well. For example, the satplan algorithm uses logic for planning,[121] and inductive logic programming is a method for learning.[122] There are several different forms of logic used in AI research.
Many problems in AI (in reasoning, planning, learning, perception and robotics) require the agent to operate with incomplete or uncertain information. Starting in the late 80s and early 90s, Judea Pearl and others championed the use of methods drawn from probability theory and economics to devise a number of powerful tools to solve these problems.[126][127]
Bayesian networks[128] are very general tool that can be used for a large number of problems: reasoning (using the Bayesian inference algorithm),[129] learning (using the expectation-maximization algorithm),[130] planning (using decision networks)[131] and perception (using dynamic Bayesian networks).[132]
Probabilistic algorithms can also be used for filtering, prediction, smoothing and finding explanations for streams of data, helping perception systems to analyze processes that occur over time[133] (e.g., hidden Markov models[134] and Kalman filters[135]).
A key concept from the science of economics is "utility": a measure of how valuable something is to an intelligent agent. Precise mathematical tools have been developed that analyze how an agent can make choices and plan, using decision theory, decision analysis,[136] information value theory.[71] These tools include models such as Markov decision processes,[137] dynamic decision networks,[137] game theory and mechanism design[138]
The simplest AI applications can be divided into two types: classifiers ("if shiny then diamond") and controllers ("if shiny then pick up"). Controllers do however also classify conditions before inferring actions, and therefore classification forms a central part of many AI systems.
Classifiers[139] are functions that use pattern matching to determine a closest match. They can be tuned according to examples, making them very attractive for use in AI. These examples are known as observations or patterns. In supervised learning, each pattern belongs to a certain predefined class. A class can be seen as a decision that has to be made. All the observations combined with their class labels are known as a data set.
When a new observation is received, that observation is classified based on previous experience. A classifier can be trained in various ways; there are many statistical and machine learning approaches.
A wide range of classifiers are available, each with its strengths and weaknesses. Classifier performance depends greatly on the characteristics of the data to be classified. There is no single classifier that works best on all given problems; this is also referred to as the "no free lunch" theorem. Various empirical tests have been performed to compare classifier performance and to find the characteristics of data that determine classifier performance. Determining a suitable classifier for a given problem is however still more an art than science.
The most widely used classifiers are the neural network,[140] kernel methods such as the support vector machine,[141] k-nearest neighbor algorithm,[142] Gaussian mixture model,[143] naive Bayes classifier,[144] and decision tree.[145] The performance of these classifiers have been compared over a wide range of classification tasks[146] in order to find data characteristics that determine classifier performance.
The study of artificial neural networks[140] began in the decade before the field AI research was founded. In the 1960s Frank Rosenblatt developed an important early version, the perceptron.[147] Paul Werbos developed the backpropagation algorithm for multilayer perceptrons in 1974,[148] which led to a renaissance in neural network research and connectionism in general in the middle 1980s. The Hopfield net, a form of attractor network, was first described by John Hopfield in 1982.
Common network architectures which have been developed include the feedforward neural network, the radial basis network, the Kohonen self-organizing map and various recurrent neural networks.[citation needed] Neural networks are applied to the problem of learning, using such techniques as Hebbian learning, competitive learning[149] and the relatively new field of Hierarchical Temporal Memory which simulates the architecture of the neocortex.[150]
Control theory, the grandchild of cybernetics, has many important applications, especially in robotics.[151]
AI researchers have developed several specialized languages for AI research:
AI applications are also often written in standard languages like C++ and languages designed for mathematics, such as Matlab and Lush.
How can one determine if an agent is intelligent? In 1950, Alan Turing proposed a general procedure to test the intelligence of an agent now known as the Turing test. This procedure allows almost all the major problems of artificial intelligence to be tested. However, it is a very difficult challenge and at present all agents fail.
Artificial intelligence can also be evaluated on specific problems such as small problems in chemistry, hand-writing recognition and game-playing. Such tests have been termed subject matter expert Turing tests. Smaller problems provide more achievable goals and there are an ever-increasing number of positive results.
The broad classes of outcome for an AI test are:
For example, performance at checkers (draughts) is optimal,[156] performance at chess is super-human and nearing strong super-human,[157] and performance at many everyday tasks performed by humans is sub-human.
There are a number of competitions and prizes to promote research in artificial intelligence. The main areas promoted are: general machine intelligence, conversational behaviour, data-mining, driverless cars, robot soccer and games.
Artificial intelligence has successfully been used in a wide range of fields including medical diagnosis, stock trading, robot control, law, scientific discovery and toys. Frequently, when a technique reaches mainstream use it is no longer considered artificial intelligence, sometimes described as the AI effect.[158] It may also become integrated into artificial life.