

A Review and Proposed Framework for Artificial General Intelligence

Lyle N. Long
The Pennsylvania State Univ.
233 Hammond Building
University Park, PA 16802
814-865-1172
[email protected]

Carl F. Cotner
Applied Research Lab
The Pennsylvania State Univ.
University Park, PA 16802
814-865-7299
[email protected]

Abstract— This paper reviews the current status of artificial general intelligence (AGI) and describes a general framework for developing AGI for autonomous systems. While AI has been around for about 70 years, almost everything done to date has been what is called "narrow" AI. That is, the theory and applications have mostly been focused on specific applications, not "strong" or general-purpose AI (Artificial General Intelligence, AGI). As stated in [5], Deep Blue was great at playing chess but could not play checkers. The typical AI of today is basically a collection of well-known algorithms and techniques that are each good for certain applications.

TABLE OF CONTENTS

1. INTRODUCTION
2. ELEMENTS OF ARTIFICIAL INTELLIGENCE
3. PREVIOUS AGI WORK
4. PROPOSED AGI FRAMEWORK
5. CONCLUSIONS
REFERENCES
BIOGRAPHIES

1. INTRODUCTION

There is enormous interest in unmanned autonomous systems, but these systems need significant onboard intelligence to be effective. This requires powerful computers [12], powerful algorithms, complex software, lots of training, and lots of data, all of which now exist to some extent. It is also not feasible to hand-code the software for truly intelligent autonomous systems: even the software used on manned aircraft can cost hundreds of millions of dollars and run to tens of millions of lines of code. There is an enormous need for artificial general intelligence in unmanned autonomous systems.

Machine learning will be essential to making these systems effective. The systems will need to learn much as an infant learns, by exploring their environment. Humans get their training data by exploring their environment and using all their sensors, in addition to being taught by other humans. Humans can also learn by doing thought experiments. One aspect of this scenario that scares some people is that once these systems reach human-level (or beyond) intelligence, we will be able to make copies of them easily, whereas each human requires about 20 years to train.

Humans are often thought of as the pinnacle of intelligence, but machines will eventually surpass humans. The processing power and memory of supercomputers already exceed those of humans, and human memory is really very poor compared to computers: eyewitness accounts are known to be unreliable, and Miller's number (seven plus or minus two items) demonstrates how little can be held in short-term memory.

Dictionary definitions of intelligence are often roughly:

“the capacity to acquire and apply knowledge.”

But in [2] a group of researchers defined it as:

“A very general mental capability that, among other things, involves the ability to reason, plan, solve problems, think abstractly, comprehend complex ideas, learn quickly and learn from experience. It is not merely book learning, a narrow academic skill, or test-taking smarts. Rather, it reflects a broader and deeper capability for comprehending our surroundings—"catching on," "making sense" of things, or "figuring out" what to do.”

While artificial intelligence (AI) has been around for about 70 years, almost everything done to date has been what is called "narrow" AI. That is, the theory and applications have mostly been focused on specific applications, not "strong" or general-purpose AI (Artificial General Intelligence, AGI). And these systems have been very far from having the intelligence capabilities described above. As stated in [5], Deep Blue was great at playing chess but could not play checkers. The typical AI of today is basically a collection of well-known algorithms and techniques that are each good for certain applications. A few of these will be discussed in this paper. None of these algorithms alone is currently capable of AGI. Domingos [7] claims that no one knows the ultimate master algorithm for AI, but one of the algorithms described herein may prove to be that master algorithm. There are pure AGI theories (e.g., AIXI [27] and Gödel machines [30]), but these are not practical on even the largest supercomputers.


They are essentially theoretical or mathematical constructs, but they are very useful for guiding algorithm development.

This paper will review the current status of artificial general intelligence (AGI) and describe a general framework for developing AGI for autonomous systems. The framework builds upon neural networks [36] and Monte Carlo tree search [38]. It is reminiscent of reinforcement learning (RL) [13] but does not use any of the standard RL approaches. The system would continually build models of the world and would take actions both in its mind and in the real world. To be truly intelligent, an autonomous system would need a wide variety of multi-modal sensor inputs, all processed using neural networks. Humans have an enormous number of sensory inputs from touch, hearing, vision, taste, and smell; Table 1 shows approximate figures for human sensory input.

A portion of the proposed system would be similar to that used in the world-class AlphaGo Zero [24] and AlphaZero, which was able to learn the game of Go in about three days with no human input (and chess in just four hours). The approach described in this paper expands upon this and describes an intelligent system (such as a UAV) embedded in the real world, which requires perception and a model of the world. While most UAVs have all their behaviors and actions preprogrammed, the approaches described here would allow a system to learn on its own. It is not possible to program in all the features of an intelligent system (e.g., using rule-based systems such as Soar, ACT-R, EPIC, and SS-RICS).

2. ELEMENTS OF ARTIFICIAL INTELLIGENCE

Artificial Intelligence (AI) has been discussed since the Dartmouth conference in 1956, but interest in it has been cyclic, primarily because it has been repeatedly oversold. Figure 1 shows a graph of computer power vs. year, where the computers are the fastest supercomputers of their time (i.e., IBM 701, Cray 1, ASCI Red, and TaihuLight). The trend far exceeds Moore's law due to parallel computing. From this figure it is easy to see how AI has been oversold. In the 1950s (when AI was first discussed) computers had roughly the power of a C. elegans worm. In the 1970s the Cray supercomputers were roughly equivalent to a cockroach in memory and speed. In the 1990s we saw the first teraflop computer (ASCI Red), which approached the speed of a rat nervous system. It was not until the last few years that the largest computers in the world surpassed the memory and processing power of humans. So for most of the last 70 years there was little chance that AI could achieve human-level intelligence. Now, however, AI has the potential to be very powerful (even superhuman), and people are both interested in it and afraid of it.

It should be pointed out, however, that while computers such as the TaihuLight have achieved the memory and speed of a human brain, they require about a million times more energy than the brain and are about a million times larger. General-purpose processors cannot compete with the efficiency of the brain. Recent efforts to build special-purpose neuromorphic chips promise to change that, however. IBM [1], Intel [3], and Google [4] have all developed special-purpose chips for efficiently simulating neural networks. These are in the early stages of development, and the learning algorithms in particular need much more work.

Computing machinery dates back to the abacus and Babbage's difference and analytical engines, but the theory of modern-day computing dates back to Turing [6] and the universal Turing machine. With computers now ubiquitous, it is difficult to appreciate the achievements of Turing and others, but in the 1930s it was not at all clear how to build a computer. Technology has been changing exponentially, as described by Moore's law. It is now possible to buy a compute server with fifty-six Intel cores (2.5 GHz) for only about $25K. Such a machine has roughly 10% of the peak speed of the ASCI Red machine at about 1/1000th the price.

Figure 2 shows a comparison of the memory and computational power of biological systems and human-developed systems [42]. A PC-based server today exceeds the power and memory of the Cray 2. This server also exceeds the computing capacity of a mouse, which is a very intelligent animal. The TaihuLight exceeds the power and memory of a human.

The AI of today is basically a collection of well-known algorithms and techniques that are each good for certain applications, for example:

• Neural networks
• Decision trees
• Reinforcement learning
• Supervised/unsupervised learning
• Bayesian methods
• Natural language processing
• Cognitive architectures
• Genetic algorithms
• Others

Table 1. Human Sensory Inputs

Sensor Modality   Approximate Size
---------------   ---------------------------
Vision            120 million rods/cones
Olfaction         10 million receptor neurons
Touch             100,000 neurons in the hand
Hearing           15,500 hair cells
Taste             10,000 taste buds


A few of these are described below. Most of these algorithms are good for some problems but not others. Throughout history, people have developed and used the best algorithms they could, given the computing power available at the time. As computers become more powerful, some algorithms are abandoned because better ones become feasible.

We know that intelligence exists in humans (and other animals), and it is based on neural networks. We just don't understand the detailed behavior of the neurons, the synapses, learning, or the network connectivity (the connectome [8]). Progress is being made, but building a human-like nervous system is extremely challenging. In addition, the brain itself is of little use without the massive sensor input from the nervous system. Nevertheless, we do know human intelligence is neural-network based.

Artificial neural network (ANN) algorithms have a long history, dating back to about 1943 [37]. Most have little in common with real electro-chemical biological neurons and synapses. The most common type of ANN is the feed-forward network trained using back-propagation, which in turn relies on stochastic gradient descent (SGD). Gradient descent is usually a fairly poor optimization algorithm, but SGD seems to work well on ANNs, and better methods have not yet been found. Deep learning methods (e.g., those built with TensorFlow) are based upon this approach too. Given enormous amounts of data and lengthy training periods, they are very effective and widely used. These deep networks, however, are not intelligent [45].

There is no standard mathematical definition of a neural network, but as commonly used in machine learning, a neural network is a parameterized function f(x; θ) that maps input data from R^M to R^N. It is usually used to interpolate a set of data points (X_i, Y_i). Typically this function can be factored through d − 1 intermediate Euclidean spaces R^{N_k}, and d is called the depth of the network. This is part of what makes neural networks so effective: the data often become much easier to separate when transformed into higher-dimensional spaces by nonlinear activation functions.
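To make the notation concrete, the short sketch below (illustrative only; the width, learning rate, and target function are arbitrary choices, not anything from this paper) trains a depth-2 network f(x; θ) by back-propagation with SGD to interpolate sample points (X_i, Y_i):

```python
# A minimal sketch: a depth-2 network f(x; theta) from R^1 to R^1,
# trained by backpropagation with stochastic gradient descent (SGD)
# to interpolate the points (X_i, Y_i). Constants are arbitrary.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, (200, 1))           # inputs in R^1
Y = np.sin(3 * X)                          # targets to interpolate

H = 16                                     # hidden width N_1
W1 = rng.normal(0, 0.5, (1, H)); b1 = np.zeros(H)
W2 = rng.normal(0, 0.5, (H, 1)); b2 = np.zeros(1)
lr = 0.05

for step in range(5000):
    i = rng.integers(0, len(X))            # stochastic: one sample per step
    x, y = X[i:i+1], Y[i:i+1]
    h = np.tanh(x @ W1 + b1)               # nonlinear hidden layer
    yhat = h @ W2 + b2                     # network output f(x; theta)
    err = yhat - y                         # gradient of squared error
    # Backpropagation: chain rule through the two layers.
    gW2 = h.T @ err; gb2 = err.sum(0)
    gh = err @ W2.T * (1 - h**2)           # tanh'(z) = 1 - tanh(z)^2
    gW1 = x.T @ gh; gb1 = gh.sum(0)
    for p, g in ((W1, gW1), (b1, gb1), (W2, gW2), (b2, gb2)):
        p -= lr * g                        # SGD parameter update

print("final MSE:", float(np.mean((np.tanh(X @ W1 + b1) @ W2 + b2 - Y) ** 2)))
```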

Figure 1. Computer power vs. years

Figure 2. Comparison of biological and human-made systems [42].


More interesting, but less developed, are biologically plausible spiking neural networks [12]. In these, each neuron is modeled using time-dependent differential equations, with a range of mathematical models from leaky integrate-and-fire (LIF) methods [9] to the Hodgkin-Huxley equations [10]. The simpler models have been implemented in silicon chips, as mentioned above, by Intel and IBM, but training them is not a solved problem. In addition, there are roughly 100 different kinds of neurons in the human brain, and the synapses are very complex chemical systems. One of the authors (Long) has shown that massive networks of Hodgkin-Huxley neurons can be solved very efficiently and scalably on massively parallel computers [11,12].
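As an illustration of the simplest of these models, the following sketch integrates a single LIF neuron with Euler's method; the parameter values are generic textbook-style choices, not taken from any of the cited chips or papers:

```python
# A minimal sketch of a leaky integrate-and-fire (LIF) neuron, integrated
# with Euler's method. All parameter values are illustrative.
dt, T = 0.1e-3, 0.2                # 0.1 ms time step, 200 ms of simulated time
tau, v_rest = 20e-3, -70e-3        # membrane time constant (s), rest potential (V)
v_thresh, v_reset = -50e-3, -65e-3 # spike threshold and post-spike reset (V)
R, I = 1e7, 2.5e-9                 # membrane resistance (ohm), input current (A)

v = v_rest
spike_times = []
for step in range(int(T / dt)):
    # tau * dv/dt = -(v - v_rest) + R*I   (leaky integration toward rest)
    v += dt / tau * (-(v - v_rest) + R * I)
    if v >= v_thresh:              # threshold crossing emits a spike
        spike_times.append(step * dt)
        v = v_reset                # membrane potential resets after the spike

print(len(spike_times), "spikes in", int(T * 1000), "ms")
```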

Another very powerful and biologically plausible algorithm is reinforcement learning [13]. In this approach the system continually performs actions in the environment while attempting to optimize a reward. These techniques are related to Markov decision processes (MDPs), control theory, and dynamic programming. The approach is quite different from supervised learning, as it does not need a collection of correct input-output examples. Legged robots have been shown to learn to walk without human training, using only reinforcement learning [34], and the process looks much like a pony learning to walk. The approach may not be the most efficient, but biological systems also learn very slowly; it takes roughly 20 years to train a human. One difference with machines is that if we can train one, we can then make millions of copies relatively quickly, which scares people.
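As a concrete (and deliberately tiny) illustration of this trial-and-error loop, the sketch below runs tabular Q-learning on an invented one-dimensional environment in which the agent must walk right to a goal; all constants are arbitrary:

```python
# A minimal sketch of reinforcement learning: tabular Q-learning on a toy
# 1-D "walk right to the goal" environment. Constants are illustrative.
import numpy as np

n_states, actions = 6, (-1, +1)    # states 0..5; actions: move left or right
Q = np.zeros((n_states, len(actions)))
alpha, gamma, eps = 0.5, 0.9, 0.1
rng = np.random.default_rng(0)

for episode in range(500):
    s = 0
    while s != n_states - 1:       # episode ends at the goal state
        # Explore when the state is untried or with probability eps.
        if rng.random() < eps or not Q[s].any():
            a = rng.integers(2)
        else:
            a = int(Q[s].argmax())
        s2 = min(max(s + actions[a], 0), n_states - 1)
        r = 1.0 if s2 == n_states - 1 else 0.0   # reward only at the goal
        # Q-learning update: move Q(s,a) toward r + gamma * max_a' Q(s',a')
        Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
        s = s2

print(Q.argmax(axis=1))            # greedy action per state (1 = move right)
```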

Cognitive architectures are another AI approach. These are rule-based systems that date back to Newell and Simon's general problem solver [14,15]. The most common implementations are ACT-R [16], Soar [17], SS-RICS [18], and EPIC [20]. These have been shown to simulate some aspects of human performance, but they are also very far from AGI. They are difficult to use and have very limited ability to learn. They have been implemented on mobile robots [21], but they are not well suited to that application, primarily because they have very little ability to learn. Also, people must develop essentially all the software and rules, and it is simply not feasible to build intelligent systems that rely on humans to code all the behaviors and actions. As Turing said [19]: "Instead of trying to produce a programme to simulate the adult mind, why not rather try to produce one which simulates the child's?"

Symbolic systems such as these are an example of an approach that was good at the time, but computers have gotten about 10 million times faster since the approach was developed. The key to AGI systems is learning and neural networks.

Another key algorithm is Monte Carlo tree search (MCTS) [38]. It is most often used as a way to sample game trees to find good moves when a full game-tree search would be too expensive. Instead of searching the whole subtree to evaluate the strength of a given node, one plays out random moves all the way to the end of the game many times (each such run is called a playout). The nodes involved are then weighted based on the wins and losses, and during future playouts higher-weighted nodes are preferentially chosen. UCT (Upper Confidence bounds applied to Trees) [23] provides an optimal way to balance exploitation and exploration of moves so as to minimize a mathematically defined quantity called regret.

The main approach to UCT balances exploitation and exploration, choosing nodes according to:

    UCT = w_i / n_i + c * sqrt( ln(N_i) / n_i )

where:

• w_i is the number of wins for the node considered after the i-th move

• n_i is the number of simulations for the node considered after the i-th move (wins + losses + draws)

• N_i is the total number of simulations after the i-th move (at the top node)

• c is the exploration parameter, theoretically equal to √2, but in practice usually chosen empirically

Figure 3. Example MCTS with UCT [41].

The next node in the tree to consider is the one with the highest UCT value. The first term is the exploitation term, which will be large if this choice has produced positive results in the past. The second term is the exploration term, which drives the algorithm to choose nodes that have not yet been explored enough. This approach is one form of reinforcement learning.
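A minimal sketch of this selection rule follows; it simply evaluates the formula above for each child, treating unvisited children as infinitely attractive so that each is tried at least once:

```python
# A minimal sketch of UCT child selection: pick the child maximizing
# w_i/n_i + c*sqrt(ln(N_i)/n_i), trying unvisited children first.
import math

def uct_select(children, c=math.sqrt(2)):
    """children: list of (wins, visits) pairs; returns index of best child."""
    N = sum(n for _, n in children)            # total simulations at parent
    def uct(w, n):
        if n == 0:
            return float("inf")                # force exploration first
        return w / n + c * math.sqrt(math.log(N) / n)   # exploit + explore
    return max(range(len(children)), key=lambda i: uct(*children[i]))

# Child 0 looks strong (3 wins in 4), but child 2 is unexplored, so it wins.
print(uct_select([(3, 4), (1, 3), (0, 0)]))    # -> 2
```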

Figure 3 shows how UCT works. The first panel shows the current status of the tree, the second panel shows the selection of the next node, and the third panel shows running the simulation on this tree. In the case shown the simulation produces a loss (i.e. wins/total = 0/1). The fourth panel shows how the result of the simulation is back-propagated up the tree.

One of the most interesting AI systems to date is AlphaGo Zero [24] (and AlphaZero). It uses MCTS and neural networks, running on tensor processing units [4]. This system, without human training or data, learned to play one of the most difficult games known (Go) in only 70 hours (it used a computer costing roughly $25 million, and there were 17 people listed as co-authors on the paper). It beat all other software systems and the best human Go champions, on a game that had been considered an AI-hard problem. Many Go experts spend their entire lives learning to play the game, while AlphaZero did it in three days; so this system exceeded human performance and learned roughly 3000 times faster than a human. The same system was also able to learn chess in only four hours, which demonstrates more generalizability than previous game-playing systems (e.g., Watson [25]). Most importantly, this system learns on its own through trial and error and does not need humans to teach it. This is a very exciting breakthrough and a step in the right direction, and, as discussed below, it may eventually help lead to AGI. What we propose here is to implement this approach in a mobile robot embedded in the real world.

The power of this approach comes from the combination of neural networks and MCTS/UCT. Figure 4 shows a flowchart of MCTS/UCT combined with a neural network. As the MCTS runs, the neural network is trained on the data from the simulations. The inputs to this system would be the definition of the problem (e.g., the game rules or the goals for a robot), the definition of the tree structure, and the state of the system. The other powerful feature of this approach is the relatively small amount of code required.

The other feature illustrated in Fig. 4 is the conversion of the process to unconscious processing. Once a task has been learned (the solution is known), the MCTS and neural network training can be skipped. Humans do this continuously: as skills are learned, they are converted to unconscious "programs" that can be executed without thinking about them. Only about 5% of the human brain is used for conscious processing [44], and conscious processing is too slow for many tasks.

To illustrate the MCTS/UCT approach, consider the trivial problem of tic-tac-toe. The tree structure would have layers of 1-9-8-7-6-5-4-3-2-1 nodes, and the state could be represented by 18 bits: the first 9 bits would be 0 or 1 depending on whether an X occupies each of the nine squares, and the second 9 bits would be 0 or 1 depending on whether an O does. At the start of the game the state vector would be all zeros. As the state vector changes, some leaves in the tree become unavailable. The program would choose where to put an X using UCT, and it could choose where to put an O randomly (or more intelligently). A win can be determined from the values of the state. As many different games are played, UCT is used to update the value of each node in the tree, and the neural network is trained to choose the next move given the current state. Eventually the MCTS/UCT will not be needed and the neural network alone can be used, somewhat analogous to a human developing "muscle memory" for a task, or to chunking.
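The 18-bit state encoding just described can be sketched directly (this is only an illustration of the encoding and win test, not the authors' implementation):

```python
# A minimal sketch of the 18-bit tic-tac-toe state described above:
# bits 0-8 mark X's squares, bits 9-17 mark O's squares.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),       # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),       # columns
         (0, 4, 8), (2, 4, 6)]                  # diagonals

def place(state, square, player):               # player: 0 for X, 1 for O
    return state | (1 << (square + 9 * player))

def wins(state, player):
    bits = (state >> (9 * player)) & 0x1FF      # that player's 9 board bits
    return any(all(bits >> s & 1 for s in line) for line in LINES)

s = 0                                           # empty board: all 18 bits zero
for sq in (0, 4, 8):                            # X fills the main diagonal
    s = place(s, sq, 0)
print(wins(s, 0), wins(s, 1))                   # -> True False
```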

While Go is one of the most difficult games known (orders of magnitude more difficult than chess), it is still just a game. It has well-defined structure, rules, and states; and the result is either a win or a loss. It can hardly be called an intelligent system using the definition in Section 1. And an unmanned system embedded in the real world is much more complicated. But AlphaZero shows us that given a decision tree structure, a goal, and a definition of the state vector it is possible for the system to learn how to succeed. The road ahead is still not easy, but there is now sufficient evidence of how we might proceed.

3. PREVIOUS AGI WORK

A Turing machine [6] is a formal computational model that consists of a finite state machine and an infinite tape (memory). Its primary importance is that it provides a well-defined computational model that allows one to prove theorems about computation. It is theoretically equivalent to a variety of other standard mathematically defined computational models, and most believe that anything that is computable is computable by a Turing machine. This conjecture is known as Church's thesis (or the Church-Turing thesis) [28].
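For concreteness, the finite-state-machine-plus-tape model can be sketched in a few lines; the example "program," a unary incrementer, is an arbitrary choice:

```python
# A minimal sketch of a Turing machine: a finite-state controller over an
# unbounded tape (a dict makes the tape effectively infinite). The example
# machine, a unary incrementer, is illustrative only.
def run_tm(rules, tape, state="start", head=0, blank="_"):
    while state != "halt":
        sym = tape.get(head, blank)
        sym_out, move, state = rules[(state, sym)]   # one transition per step
        tape[head] = sym_out
        head += {"L": -1, "R": +1}[move]
    return tape

# Unary increment: scan right over the 1s, write one more 1, then halt.
rules = {("start", "1"): ("1", "R", "start"),
         ("start", "_"): ("1", "R", "halt")}
tape = run_tm(rules, {0: "1", 1: "1", 2: "1"})
print("".join(tape[i] for i in sorted(tape)))        # -> 1111
```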

AIXI [27] is a precise information-theoretic formula for near-optimal reinforcement learning. It combines Solomonoff induction with sequential decision theory. 'Information-theoretic' means that it does not take computational complexity into account; indeed, it is uncomputable because of the halting problem [28,29]. Moreover, even if it were computable, it would be completely intractable as a practical algorithm. One can, however, use it as the basis of a practical algorithm by approximating it: AIXItl [27] approximates it by putting limits on the time and space used in the computation, and MC-AIXI-CTW [27] approximates it stochastically.

A Gödel machine [30-32] is a Turing machine that is a provably optimal (in run time) reinforcement 'learner'. At the moment, its primary importance is that it shows such a thing exists; although no full implementation has been created yet, one actually could be, and in the future it may become a practical solution to the general learning problem. It recursively improves itself by rewriting its own code whenever it can prove that the new code provides a better strategy.

Steunebrink and Schmidhuber [40] state:

"One can view a Gödel Machine as a program consisting of two parts. One part, which we will call the solver, can be any problem-solving program. For clarity of presentation, we will pretend the solver is a Reinforcement Learning (RL) program interacting with some external environment. This will provide us with a convenient way of determining utility (using the RL program's reward function), which will be an important topic later on. But in general, no constraints are placed on the solver. The second part of the Gödel Machine, which we will call the searcher, is a program that tries to improve the entire Gödel Machine (including the searcher) in a provably optimal way."

Additional approaches to AGI are discussed in the proceedings of the Artificial General Intelligence conferences [39].

4. PROPOSED AGI FRAMEWORK

The main message of this paper is that there is a Master Algorithm, despite what Domingos [7] says: it is neural networks. This is what all mammals use, and it is what AlphaZero uses. There are many types of computational neural networks: artificial neural networks (ANNs), spiking neural networks (SNNs), recurrent neural networks, deep neural networks, convolutional neural networks, etc. These can also be trained in many different ways: backpropagation, supervised, unsupervised, STDP, MCTS/UCT, reinforcement learning, etc.

Intelligence can be defined in a variety of mathematically equivalent ways, and one of them is the ability to learn to predict the future state of a system from the current state. For an agent in an environment, the current state includes the state of the agent and any action the agent performs.

Animals have brains so they can (imperfectly) predict what actions would result in the best outcomes for themselves. Without brains, an animal’s actions would either be random or hard-coded (reactive) like an insect, and, indeed, the actions of primitive animals and most current robots are essentially reactive. But to survive and prosper in an unknown and/or complicated environment, an animal needs to be able to learn the environment through experience (physical experiments or thought experiments) to be able to make predictions about the environment.

Figure 5 illustrates the information flow of a proposed intelligent system in its environment. It is embedded in the world and has numerous sensor inputs. This could, for example, be a mobile robot with vision, smell, touch, or other sensor inputs. The inputs are all processed using artificial neural networks (ANN). These neural networks can be trained ahead of time to recognize different sensor inputs, or they could implement reinforcement learning.

As in animals, there is a "data fusion" step where the different sensor modalities are fused together. This could be accomplished using a hierarchical neural network that takes the sensor ANNs as input. It is much more effective to use all the input data than just one modality, especially with noisy or incomplete data, which often occur. Evolution has enabled us to detect that danger (e.g., a tiger) is nearby even with incomplete sensor data: a partial view combined with a partial sound or smell might be enough to recognize the tiger.

Figure 4. Flowchart showing MCTS with UCT and neural network.


Sensor data fusion is already being used in some military systems [33].
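A minimal sketch of such hierarchical fusion appears below; the layer shapes are invented, and the fixed random weights stand in for trained sensor ANNs:

```python
# A minimal sketch of the fusion step: per-modality networks produce feature
# vectors, and a higher-level network combines them, so a weak "view" plus a
# weak "sound" can still yield a single classification. Shapes and the random
# weights are illustrative stand-ins for trained networks.
import numpy as np

rng = np.random.default_rng(0)

def modality_net(x, W):                 # stand-in for a trained sensor ANN
    return np.tanh(x @ W)

W_vis = rng.normal(size=(64, 16))       # vision ANN:   R^64 -> R^16
W_aud = rng.normal(size=(32, 16))       # hearing ANN:  R^32 -> R^16
W_fuse = rng.normal(size=(32, 4))       # fusion layer: concat -> 4 classes

vision = rng.normal(size=64) * 0.3      # weak, noisy partial view
sound = rng.normal(size=32) * 0.3       # weak, noisy partial sound

features = np.concatenate([modality_net(vision, W_vis),
                           modality_net(sound, W_aud)])
scores = features @ W_fuse              # the decision uses both modalities
print("class:", scores.argmax())
```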

Once the system perceives what is in the world, it either reacts very quickly (e.g., to danger) or reasons about how to accomplish its goals. For example, if an object is thrown at a person and could cause harm, there is no time to think; one just has to react very quickly. This is also the sort of behavior a professional baseball player needs to be successful: there is not enough time for the brain to deliberate over a very fast pitch, so the player has to react and rely on years of training. Some call this "muscle memory." This sort of processing takes place in the more primitive parts of the brain. Roughly 95% of the human brain is devoted to unconscious processing [44].

If the system does not need to react quickly, it can take the time to reason about its options and perform thought experiments. This is the sort of processing that takes place in the cortex, which handles most of the brain's complex processing, but it takes time: human reaction time to a visual stimulus is about 0.25 seconds. In the system proposed here, this component would perform neural network processing and Monte Carlo tree search, as in AlphaZero. The goal, as in AlphaZero, is for the system to learn on its own. This is essential to having a truly intelligent system, since it is not feasible to program all aspects of behavior.
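The dispatch between the two paths could be as simple as the following sketch (the threshold, timing, and function names are illustrative assumptions, not part of the proposed framework):

```python
# A minimal sketch of the two-path control described above: a fast reflex
# for threats, slower deliberation (standing in for NN + MCTS) otherwise.
# The threshold and the threat score are illustrative assumptions.
def act(percept, deliberate, reflex, threat_score):
    if threat_score(percept) > 0.9:     # danger: no time to think
        return reflex(percept)          # trained "muscle memory" response
    return deliberate(percept)          # slow path: simulate, search, plan

action = act(percept={"obj": "ball", "speed": 40.0},
             deliberate=lambda p: "plan_route",
             reflex=lambda p: "duck",
             threat_score=lambda p: min(p["speed"] / 40.0, 1.0))
print(action)                           # -> duck
```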

The only way to be able to predict the future state of an environment is to learn a model of it and then perform 'thought experiments' (simulations) in the model. If the model is precise enough and there is sufficient time to run enough simulations to explore a wide variety of possible future outcomes, then these simulations will allow an agent to pick an action with a good outcome. It is important to understand that doing 'thought experiments' allows one to explore actions with bad outcomes more cheaply and safely than performing the experiment in reality.

For example, if a human walking near the edge of a cliff is trying to decide what to do next, it is essential for survival to think about what would happen if they stepped over the edge. Actually performing that experiment in the real world would obviously not be effective.

The simulations in the learned model can be performed using the same algorithms as AlphaZero: neural networks and Monte Carlo tree search. If the model is sufficiently detailed and precise, then the results in the model will closely approximate the results in the real world when the same actions are performed. We note that this process essentially amounts to a reinforcement learning algorithm.
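A minimal sketch of such a "thought experiment" follows; the one-line world model and reward function are invented stand-ins for learned neural networks, and the cliff is only ever stepped over in simulation:

```python
# A minimal sketch of a "thought experiment": roll candidate actions forward
# in a learned model of the world and pick the one whose simulated outcome
# is best. The model and reward are illustrative stand-ins for learned
# neural networks.
def model(state, action):          # stand-in for a learned f(s, a) -> s'
    return state + action          # trivial 1-D position dynamics

def reward(state):
    if state < -2.0:               # past the cliff edge: fatal (in simulation)
        return -1e6
    return -abs(state - 5.0)       # otherwise, closeness to the goal at x = 5

def plan(state, actions=(-1.0, 0.0, 1.0), horizon=8):
    def rollout(a):                # simulate `horizon` repeats of action a
        s, total = state, 0.0
        for _ in range(horizon):
            s = model(s, a)
            total += reward(s)
        return total
    return max(actions, key=rollout)

print(plan(0.0))                   # -> 1.0 (toward the goal, away from the cliff)
```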

One of the difficult aspects of this system is determining the objectives that go into the MCTS/UCT approach for all the different goals and sub-goals. For a game (even one as difficult as Go), the objectives, tree structure, and state information are well known. But imagine a newborn baby trying to survive in the world. It has some built-in instincts, but it has to learn an enormous amount. It has urges to eat, avoid danger, etc., but to satisfy these it must learn sub-goals: learning to walk, to recognize objects and people, to use its hands, etc. Other humans might teach it some of these. Until such a system can reason about goals and sub-goals itself, humans could be used to input the goals, sub-goals, state information, and decision tree structures.

Neural networks can be used to learn both a model of the world and the 'physics' of that world, whatever the 'physics' is. For the real world, the physics is the standard physics we are accustomed to studying, but in other circumstances the 'physics' might correspond to the rules of a game or some other formal system.

Figure 5. Unmanned System Architecture [43]



One of the very interesting aspects of this approach is the small number of lines of code required, and the relatively modest amount of computation required. A simple program to implement MCTS/UCT can be written in about 20 lines of Python code, and a neural network with backpropagation can be implemented in about 500 lines of C++. These are very tiny codes compared to those in traditional systems, which can easily run to tens of millions of lines: the Soar cognitive architecture contains about 500,000 lines of C++ code, and the Boeing 787 carries about 6.5 million lines of code.
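To support that code-size claim, here is a rough, generic MCTS/UCT player in roughly 30 lines of Python; the game (take 1-3 stones, and taking the last stone wins) is an arbitrary toy chosen for this sketch:

```python
# A minimal sketch of full MCTS (selection, expansion, simulation,
# backpropagation) with UCT, applied to a toy take-away game.
# Statistics are stored per (stones, player-to-move) node.
import math, random
from collections import defaultdict

wins = defaultdict(float)   # from the view of the player who moved into the node
visits = defaultdict(int)

def moves(s):
    return range(1, min(3, s) + 1)      # legal takes: 1-3 stones

def playout(s, p):                      # random play to the end of the game
    while s > 0:
        s -= random.choice(moves(s))
        p = 1 - p
    return 1 - p                        # whoever took the last stone wins

def mcts(stones, player, iters=4000, c=1.4):
    for _ in range(iters):
        path, s, p = [], stones, player
        while s > 0 and all(visits[(s - m, 1 - p)] for m in moves(s)):
            N = sum(visits[(s - m, 1 - p)] for m in moves(s))  # parent total
            m = max(moves(s), key=lambda m:                    # UCT selection
                    wins[(s - m, 1 - p)] / visits[(s - m, 1 - p)]
                    + c * math.sqrt(math.log(N) / visits[(s - m, 1 - p)]))
            s, p = s - m, 1 - p
            path.append((s, p))
        if s > 0:                                              # expansion
            m = random.choice([m for m in moves(s) if not visits[(s - m, 1 - p)]])
            s, p = s - m, 1 - p
            path.append((s, p))
        winner = playout(s, p)                                 # simulation
        for (ns, np_) in path:                                 # backpropagation
            visits[(ns, np_)] += 1
            wins[(ns, np_)] += (winner != np_)                 # win for the mover
    return max(moves(stones), key=lambda m: visits[(stones - m, 1 - player)])

print(mcts(10, 0))   # usually 2: leaving 8 stones (a multiple of 4) is optimal
```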

5. CONCLUSIONS

This paper reviews artificial intelligence algorithms and describes a framework for artificial general intelligence that could, conceivably, be implemented in software for autonomous systems. While approaches like this were not feasible in the past, the rapid advances in computing power (now surpassing the human brain in raw capacity), sensors, and algorithms may now make them feasible.

REFERENCES

1. Modha, D. S. (2017). "Introducing a Brain-Inspired Computer." Published online at http://www.research.ibm.com/articles/brain-chip.shtml.

2. Gottfredson, Linda S. (1997). "Mainstream Science on Intelligence (editorial)." Intelligence, 24: 13-23. doi:10.1016/s0160-2896(97)90011-8. ISSN 0160-2896.

3. Davies, M., et al., "Loihi: A Neuromorphic Manycore Processor with On-Chip Learning," IEEE Micro, Jan. 2018.

4. Jouppi, N. P., Young, C., Patil, N., Patterson, D., et al. (Google, Inc.), "In-Datacenter Performance Analysis of a Tensor Processing Unit," Proceedings of ISCA '17, Toronto, ON, Canada, June 24-28, 2017.

5. Kelley, T. and Long, L.N., “Deep Blue Can't Play Checkers: the Need for Generalized Intelligence for Mobile Robots," Journal of Robotics, Vol. 2010, May, 2010.

6. Turing, A.M. (1936). "On Computable Numbers, with an Application to the Entscheidungsproblem," Proceedings of the London Mathematical Society, Ser. 2, 42 (published 1937): 230-265. doi:10.1112/plms/s2-42.1.230. (And the correction: Turing, A.M. (1938). "On Computable Numbers, with an Application to the Entscheidungsproblem: A Correction," Proceedings of the London Mathematical Society, Ser. 2, 43 (6): 544-546. doi:10.1112/plms/s2-43.6.544.)

7. Domingos, P., “The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World,” Basic Books, 2018.

8. Sporns, Olaf; Tononi, Giulio; Kötter, Rolf (2005). "The Human Connectome: A Structural Description of the Human Brain." PLoS Computational Biology, 1 (4): e42.

9. Koch, C., Biophysics of Computation: Information Processing in Single Neurons, Oxford Univ. Press, Oxford, 1999.

10. Hodgkin, A. L., and Huxley, A. F., "A Quantitative Description of Membrane Current and Its Application to Conduction and Excitation in Nerve," Journal of Physiology, Vol. 117, No. 4, 1952, pp. 500-544.

11. Skocik, M.J. and Long, L.N., "On the Capabilities and Computational Costs of Neuron Models," IEEE Transactions on Neural Networks and Learning Systems, Vol. 25, No. 8, Aug. 2014.

12. Long, Lyle N., "Toward Human-Level Massively Parallel Neural Networks with Hodgkin-Huxley Neurons," presented at the Conference on Artificial General Intelligence (AGI-16), New York, NY, July 16-19, 2016.

13. Sutton, R.S. and Barto, A.G., Reinforcement Learning: An Introduction (Adaptive Computation and Machine Learning), second edition, MIT Press, 2018.

14. Newell, A., and Simon, H. A., Human Problem Solving, Prentice-Hall, Englewood Cliffs, NJ, 1972.

15. Newell, A., Unified Theories of Cognition, Harvard Univ. Press, Cambridge, MA, 1990.

16. Anderson, J.R., Bothell, D., Byrne, M.D., Douglass, S., Lebiere, C., and Qin, Y., "An Integrated Theory of the Mind," Psychological Review, 2004, 111(4), pp. 1036-1060.

17. Laird, J.E., The Soar Cognitive Architecture, MIT Press, 2012

18. Kelley, T. D., “Symbolic and Sub-symbolic Representations in Computational Models of Human Cognition: What Can be Learned from Biology?” Theory & Psychology, Vol. 13, No. 6, 2003, pp. 847–860.

19. Turing, A.M. (1950). “Computing machinery and intelligence,” Mind, 59, 433-460.

20. Kieras, D.E. and Meyer, D.E., "An Overview of the EPIC Architecture for Cognition and Performance With Application to Human-Computer Interaction," Human-Computer Interaction, 1997, 12, pp. 391-438.


21. Hanford, S. D., Janrathitikarn, O., and Long, L. N., “Control of Mobile Robots Using the Soar Cognitive Architecture,” Journal of Aerospace Computing, Information, and Communication, Vol. 6, No. 6, 2009, pp. 69–91.

22. Abramson, B., “The Expected-Outcome Model of Two-Player Games,” Ph.D. Dissertation, Columbia Univ., 1987.

23. Kocsis, Levente and Szepesvári, Csaba, "Bandit Based Monte-Carlo Planning," Proceedings of the 17th European Conference on Machine Learning (ECML'06), Berlin, Germany, September 18-22, 2006, pp. 282-293.

24. Silver, D., et al., “Mastering the game of Go without human knowledge,” Nature, Vol. 550, October 19, 2017.

25. Ferrucci, David, Levas, Anthony, Bagchi, Sugato, Gondek, David, and Mueller, Erik T., "Watson: Beyond Jeopardy!," Artificial Intelligence, Volumes 199-200, June-July 2013, pp. 93-105.

26. Davis, Martin, ed. (1965). The Undecidable, Basic Papers on Undecidable Propositions, Unsolvable Problems And Computable Functions. New York: Raven Press.

27. Hutter, M., http://www.hutter1.net/ai/

28. Turing, A.M., 1948, "Intelligent Machinery." Reprinted in Cybernetics: Key Papers, Ed. C.R. Evans and A.D.J. Robertson, Baltimore: University Park Press, 1968, p. 31. Also reprinted in Turing, A. M. (1996). "Intelligent Machinery, A Heretical Theory," Philosophia Mathematica, 4 (3): 256. doi:10.1093/philmat/4.3.256.

29. Teuscher, C. “Alan Turing: Life and Legacy of a Great Thinker,” Springer Science & Business Media, Jun 29, 2013.

30. Schmidhuber, Jürgen (December 2006). “Gödel Machines: Self-Referential Universal Problem Solvers, Making Provably Optimal Self-Improvements,” arXiv:cs/0309048, Vers. 5, 2006.

31. Steunebrink B.R., Schmidhuber J. (2011) A Family of Gödel Machine Implementations. In: Schmidhuber J., Thórisson K.R., Looks M. (eds) Artificial General Intelligence. AGI 2011. Lecture Notes in Computer Science, vol 6830. Springer, Berlin, Heidelberg.

32. Gödel, K., "On Formally Undecidable Propositions of Principia Mathematica and Related Systems." (Original: Gödel, "Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme, I," Monatshefte für Mathematik und Physik, 38 (1931), pp. 173-198.)

33. Liggins, M., II, Hall, D., and Llinas, J., Handbook of Multisensor Data Fusion: Theory and Practice, 2nd Edition, CRC Press, 2017.

34. https://www.youtube.com/watch?v=RZf8fR1SmNY

35. https://faculty.washington.edu/chudler/facts.html

36. Haykin, S., Neural Networks: A Comprehensive Foundation, 2nd Edition, Prentice-Hall, 1999.

37. McCulloch, W.S. and Pitts, W., "A Logical Calculus of the Ideas Immanent in Nervous Activity," Bulletin of Mathematical Biophysics, Vol. 5, 1943, pp. 115-133.

38. Browne, C. B., E. Powley, D. Whitehouse, S. M. Lucas, P. I. Cowling, P. Rohlfshagen, S. Tavener, D. Perez, S. Samothrakis, and S. Colton, “A Survey of Monte Carlo Tree Search Methods,” IEEE Transactions On Computational Intelligence And AI in Games, Vol. 4, No. 1, Mar. 2012.

39. http://agi-conf.org/

40. Steunebrink B.R., Schmidhuber J. (2012) Towards an Actual Gödel Machine Implementation: a Lesson in Self-Reflective Systems. In: Wang P., Goertzel B. (eds) Theoretical Foundations of Artificial General Intelligence. Atlantis Thinking Machines, vol 4. Atlantis Press, Paris.

41. https://en.wikipedia.org/wiki/Monte_Carlo_tree_search

42. Moravec, H., Mind Children: The Future of Robot and Human Intelligence, Harvard Univ. Press, Cambridge, 1988.

43. Long, Lyle N., Kelley, Troy D., and Avery, Eric S., "An Emotion and Temperament Model for Cognitive Mobile Robots," 24th Conference on Behavior Representation in Modeling and Simulation (BRIMS), March 31-April 3, 2015, Washington, DC.

44. Lipton, B. H., The biology of belief: Unleashing the power of consciousness, matter and miracles. Santa Rosa, CA, US: Mountain of Love/Elite Books, 2005.

45. Lohr, S., “Is There a Smarter Path to Artificial Intelligence? Some Experts Hope So,” New York Times, June 20, 2018.


BIOGRAPHIES

Lyle N. Long is a Professor of Aerospace Engineering, Computational Science, Neuroscience, and Mathematics at The Pennsylvania State University. He has a D.Sc. from George Washington University, an M.S. from Stanford University, and a B.M.E. from the University of Minnesota. In 2007-2008 he was a Moore Distinguished Scholar at Caltech. He has been a visiting scientist at the Army Research Lab, Thinking Machines Corp., and NASA Langley Research Center, and was a Senior Research Scientist at the Lockheed California Company for six years. He received the Penn State Outstanding Research Award, the 1993 IEEE Computer Society Gordon Bell Prize for achieving the highest performance on a parallel computer, a Lockheed award for excellence in research and development, and the 2017 AIAA Software Engineering award and medal. He is a Fellow of the American Physical Society (APS) and the American Institute of Aeronautics and Astronautics (AIAA). He has written more than 260 journal and conference papers.

Carl F. Cotner is an Associate Research Professor in the Access and Effects Department of the Applied Research Laboratory at the Pennsylvania State University. He has a Ph.D. in Mathematics from the University of California at Berkeley.