A NEURAL NETWORK INSTANTIATION ENVIRONMENT

Dynamically creating neural nets lets you concentrate on network response characteristics

Andrew J. Czuchry, Jr.

Andy earned the A.B. degree in computer science from Dartmouth College and the M.S. degree in information and computer science from the Georgia Institute of Technology. He is presently pursuing a Ph.D. degree in information and computer science at the Georgia Institute of Technology. His research is supported by the Artificial Intelligence Branch of the Georgia Tech Research Institute. Andy has published several articles on topics concerning "intelligent" computer systems. He can be reached at the Georgia Institute of Technology, A. I. Branch, Georgia Tech Research Institute, 243 Baker Bldg., Atlanta, GA 30332.


The automatic generation of tailored neural network architectures greatly simplifies the tedious task of putting together neural networks. Typically, an architecture is assembled by manually writing and modifying a collection of software routines; automation speeds this assembly considerably. Automatic generation, however, often implies limited flexibility when applied to real-world problems. To develop useful network architectures efficiently, the research environment described here was designed to provide both task simplification and complete flexibility: it instantiates (dynamically creates) neural networks, piecing architectures together automatically from modifiable structures that represent the parameters of the assembled networks.

The incorporation of network instantiation into an entire research environment for neural networks results in a system that provides both task simplification and complete flexibility.

Task simplification can be achieved by using a variety of knowledge-representation techniques. Complete flexibility can be maintained by strictly applying standard software-modularization techniques. The merging of these two types of techniques -- knowledge representation and software modularization -- provides the foundation for the instantiation process that forms the basis of a powerful neural network research environment.

In this article, I discuss the need for such an environment and describe a working version. In so doing, I describe the knowledge-representation techniques used, and the essential integration of knowledge representation and software modularization. (The model was developed on a Symbolics Lisp machine, chosen for its flexibility and power in symbolic manipulation and for its exploratory programming environment. The implementation language is Lisp.) I also present experimental results of using the environment for a test-case network, and finally, I discuss future efforts and the evolution of the environment.

System Overview

The task of generating usable neural network architectures for real-world problems is quite challenging. Basic standard networks are merely skeletons for useful systems. For example, Fukushima's neocognitron[1] is really a class of neural networks; most often, only specific instances (class elements) are described in the literature. The skeleton of a neocognitron is a multilayer, hierarchical neural network for visual pattern recognition, consisting of a series of layers of subnetworks organized according to specific guidelines. The system's exact parameters (for example, the number of layers, the number of subnetworks per layer, and the size of each subnetwork) are often tailored to the problem being addressed. This tailoring is the meat on the skeleton and is determined by the application's processing requirements.

Such tailoring is evident in the differences between the neocognitron architectures described by Fukushima and Miyake[2] and by Fukushima.[10] Fukushima and Miyake[2] describe a seven-layer system for recognizing typewritten or stylized (that is, written to meet certain consistency specifications) numerals. Each layer of the network has 24 subnetworks, except for the input layer, which consists of a single subnetwork. In contrast, the neocognitron for handwritten numeral recognition described by Fukushima[10] is a nine-layer network with 1, 12, 8, 38, 19, 35, 23, 11, and 10 subnetworks per respective layer.

In addition to these architectural differences, the setting of various internal parameters may also vary according to the application. More noise tolerance is provided by decreasing the inhibitory ("negative") weights and shrinking the number of connections per node in a subnetwork. Finer degrees of class separation are provided by increasing inhibition and increasing the number of connections between the nodes in each subnetwork. A variety of other internal parameters can be altered as well.
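The trade-off between noise tolerance and class separation can be sketched with a toy rectified cell-output function. This is an illustrative stand-in, not Fukushima's actual cell equation, and the parameter names are assumptions:

```python
def cell_output(excitation, inhibition, r):
    # r scales the strength of inhibition: raising r (or the inhibitory
    # weights feeding `inhibition`) makes the cell more selective, while
    # lowering it makes the cell more noise tolerant.
    net = (1.0 + excitation) / (1.0 + r * inhibition) - 1.0
    return max(0.0, net)  # rectified output

# A noisy pattern drives some inhibition alongside the excitation.
tolerant  = cell_output(excitation=1.0, inhibition=0.4, r=0.5)  # cell still fires
selective = cell_output(excitation=1.0, inhibition=0.4, r=4.0)  # cell suppressed
```

With weak inhibition the noisy pattern still produces output; with strong inhibition the same pattern is rejected, giving finer class separation.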

Given the goal of efficiently establishing useful architectures and parameter settings for real-world applications, automatic generation of neural networks based upon a flexible representation of the desired characteristics is vital. This goal has been realized through the development of the research environment described in this article. The environment dynamically creates neural networks based upon the information encoded in underlying knowledge-representation structures. The research environment automatically builds these structures based upon parametric specification of the desired characteristics of the network architecture.

For example, passing the network-creation routines a network type of neocognitron, a layer count of 9, and a subnetwork size list of (1, 12, 8, 38, 19, 35, 23, 11, 10) would produce an architecture similar to the one described by Fukushima.[10] An exact match to Fukushima's architecture could be obtained through additional parameter specifications. The key point is that the research environment comprises multipurpose routines that are pieced together appropriately through the use of flexible knowledge representation structures. The environment's flexibility is maintained through the strict application of software-modularization techniques; modularization ensures, for example, that weight-calculation routines can be adjusted independently of the connection-calculation routines. These ideas are clarified in the following sections.
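A rough Python analogue of that parametric call (function and field names are hypothetical; the environment's actual entry point is the Lisp CREATE-NET shown in Listing Two) might look like:

```python
def create_net(net_type, num_layers, planes_per_layer):
    # Build a skeletal network description from the parameters alone;
    # a real instantiation would also compute connections and weights.
    assert len(planes_per_layer) == num_layers
    return {"type": net_type,
            "layers": [{"planes": [{} for _ in range(n)]}
                       for n in planes_per_layer]}

net = create_net("neocognitron", 9, [1, 12, 8, 38, 19, 35, 23, 11, 10])
```

The entire architecture follows from the specification list: nine layers, with one plane in the input layer and, for example, 38 planes in the fourth layer.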

Knowledge Representation

There are two fundamental ideas behind the use of knowledge representation. The first is that simple parametric changes can significantly alter the network architecture's final structure. For example, changing the size (number of nodes) in each subnetwork can greatly affect the specific connections between the nodes. This is of primary importance in networks such as the neocognitron[1] for two reasons:

    1. The connections are between subnetworks rather than within subnetworks.

    2. The nodes are not completely connected (that is, every node is connected to only a subset of the nodes in other subnetworks). This means that the connection architecture is heavily influenced by the size and number of subnetworks.

The second fundamental idea is that many routines for the creation of neural networks and subsequent network processing are common to entirely different architectures. As a result, these routines can be reused and, to some degree, tailored automatically by combining and adapting the modules. The realization of a research environment that automatically generates flexible neural network architectures has thus been based upon a knowledge representation in which every structure "carries around with it" all the local information for piecing itself into the network puzzle and for subsequently processing data once the architecture is assembled.
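A compact Python analogue of structures that carry their own local information (field names adapted loosely from the Lisp defstructs of Listing One; this is a sketch, not the environment's code):

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class Plane:                      # subnetwork: carries all its local information
    size: Tuple[int, int]                               # used at instantiation
    e_connections: List = field(default_factory=list)  # excitatory offsets
    e_weights: List = field(default_factory=list)      # same order as offsets
    layer: Optional["Layer"] = None                     # back-pointer to owner

@dataclass
class Layer:
    planes: List[Plane]
    layer_type: str                                     # e.g. "S" or "C"
    prev_layer: Optional["Layer"] = None                # lets routines walk back

@dataclass
class Net:
    layers: List[Layer]
    local_parameters: dict = field(default_factory=dict)

input_layer = Layer(planes=[Plane(size=(16, 16))], layer_type="input")
net = Net(layers=[input_layer])
```

Because each plane records its own size, connections, weights, and owning layer, generic routines can assemble and process it without any global description of the architecture.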

Three main knowledge structures -- NETs, LAYERs, and PLANEs -- collectively compose the knowledge representation. These structures are presented in Listing One, page 93. The NET structure consists of a list of one or more layers and a variety of local parameters specific to the global processing of the particular type of architecture (for example, the vigilance parameter in ART[3,5]). LAYER encodes a list of subnetworks and the connections between layers (e.g., a list of the connections from each subnetwork in the present layer to the subnetworks of the preceding layer). Local parameters, such as inhibition constants or gain constants, are also stored within the layer. The type of the layer (for example, S or C for the neocognitron[1] and F1 or F2 in ART[3,5]) is also recorded. A pointer to the previous layer is provided so that the routines can "get around" in the network. PLANE, a subnetwork, is used to store the connections within the subnetwork. The weights for both inter- and intraplane connections are also recorded in the PLANE structure. A size parameter for the plane is used for instantiation and is locally encoded. A pointer to the layer of which this plane is a part is stored as well. The actual nodes (cells) or processing elements are stored as an array, which is used to record output activation values.

In the future, this array will be extended so that each node is itself a knowledge structure. In this way, activation functions, output functions, and local node parameters can be maintained locally. Such an extended representation will further increase the environment's flexibility by allowing input and output activation functions to be adjusted within the overall architecture, relaxing the current convention that all cells in a subnetwork share a common activation function. That convention is not a severe restriction, but the extended representation will support novel research endeavors and thus could prove extremely valuable.

The information contained in these knowledge structures is stored at the time of network instantiation and is used as the computational map by the processing routines. The structures thus dynamically control the routines to be called, the data to be passed, and the amount of processing to be performed. Each structure has been designed to carry locally all the information necessary to direct processing through its own contents rather than through a priori routines. This design increases processing flexibility: to adapt the flow of processing, no routines need be altered; only the information in the structures is modified.
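This structure-driven control can be illustrated with a small Python dispatch sketch (routine names are invented for illustration): the type recorded in each layer structure selects the processing routine, so changing the flow of processing means changing the data, not the code.

```python
def s_layer_pass(layer):
    # feature-extracting pass (stand-in for the real computation)
    return f"S pass over {len(layer['planes'])} planes"

def c_layer_pass(layer):
    # position-tolerance pass (stand-in for the real computation)
    return f"C pass over {len(layer['planes'])} planes"

PROCESSORS = {"S": s_layer_pass, "C": c_layer_pass}

def process_net(net):
    # Nothing here is specific to one architecture: each layer structure
    # carries the type that selects its own processing routine.
    return [PROCESSORS[layer["type"]](layer) for layer in net["layers"]]

net = {"layers": [{"type": "S", "planes": [{}] * 24},
                  {"type": "C", "planes": [{}] * 24}]}
results = process_net(net)
```

Adding a new layer type requires only a new entry in the dispatch table; the driver loop never changes.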

Modularity

Great care has been taken to ensure software modularity. Its significance becomes apparent upon analyzing the power obtained by integrating knowledge representation with software modularization. Before discussing that integration, however, I will briefly describe the modularity.

Software modularity has been preserved in all three of the main phases of neural network applications: instantiation, processing, and training. The network instantiation routines, highlighted in Listing Two, page 93, are the pieces that are meshed to dynamically create network architectures. Instantiation is obviously the first step in the use of neural networks for any application. The instantiation process proceeds from generic net-level creation routines to specific layer-level creation routines that are tailored to the specific type of network. Finally, it proceeds to generic plane-level routines that perform connection and connection weight calculations in addition to creation of the plane itself. Upon completion of the instantiation process, the network is ready for training.

Listing Three, page 96, indicates the training routines. These routines are used to perform the processing all the way from the network level down to the level of the individual cell. Listing Three is abbreviated, however, and depicts the routines only down to the initial processing at the plane level. Subsequent processing occurs at the plane, connection, and cell levels. After training, the network can be used for identification tasks. Identification functions at the network level are presented in Listing Four, page 98. Further processing occurs at the layer and plane levels but is not included here.

Integration

The knowledge representation structures function as generic placeholders in which data about instantiated neural networks is recorded. As mentioned previously, the instantiation process dynamically produces an entire network based on the parametric specification of the desired characteristics. Instantiation begins by calling CREATE-NET (see Listing Two) and passing it the appropriate parameters for the desired network.

Two examples of the parametric settings and function calls for different versions of a neocognitron are depicted in Listing Five, page 98. Evaluating *neocognitron-net* (that is, (eval *neocognitron-net*)) returns a NET structure (see Listing One) containing an instantiated network that meets the characteristics specified in the parameters recorded in the *neocognitron-net* variable. More specifically, a seven-layer network would be created. The input layer would contain one subnetwork (plane), and each of the other layers would contain 24 planes. The plane in the input layer would contain a 16 x 16 array of nodes (cells). Each of the planes in the next layer would contain a 16 x 16 array of cells. Subsequent layers would be composed of planes with 10 x 10, 8 x 8, 6 x 6, 2 x 2, and 1 x 1 arrays of cells, respectively.

Each cell in a plane would have a "square" projection pattern; cells are connected to other cells that occupy a corresponding square area in another plane. Connections from the first layer to the input layer would cover a 5 x 5 array. Connections from the second layer to the first would also cover a 5 x 5 array, and similarly for all the connections up to and including the connections from the sixth layer to the fifth layer. Connections between the last (seventh) and the sixth layer, however, would cover a 2 x 2 array. Each cell, xi, would thus become an input to multiple other cells, xj, and each xj would receive inputs from many different xk cells.

Each of these connections has associated with it a connection weight. Within the knowledge representation, the connection weights are stored separately from the connections themselves so as to provide for adaptation of the weights independently of the connection structure. In addition to these elements common to all neural networks (that is, one or more layers, one or more subnetworks per layer, individual cells, connections, and connection weights), a variety of other parameters significant for the neocognitron would be set as indicated in the *neocognitron-net* variable in Listing Five.
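That separation of weights from connections can be shown in miniature (a Python sketch; the PLANE structure in Listing One keeps the analogous parallel, same-order lists):

```python
# Connection offsets and their weights live in parallel lists, in the
# same order, so training can adapt weights without touching structure.
e_connections = [(-1, -1), (-1, 0), (0, -1), (0, 0)]   # relative offsets
e_weights     = [0.10, 0.25, 0.25, 0.40]               # one weight per offset

def reinforce(weights, factor):
    # Weight adaptation: the connection list above is never modified.
    return [w * factor for w in weights]

e_weights = reinforce(e_weights, 1.5)
```

After training, the weights have changed but the connection architecture is exactly as instantiated.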

An additional comment about the assignment of connections is important. The exact connection patterns are calculated based upon the size of the sending and receiving planes and the size of the projection area. The instantiation routines are structured such that the appropriate connections are computed based upon the parameters passed to the respective routines. For example, if each element in a 5 x 5 network is to project into a 3 x 4 network such that each element has a 2 x 2 projection and all of the 3 x 4 network elements are covered, then the system would compute that the element in position (0,0) of the 5 x 5 network would be connected to the elements in positions (0,0), (0,1), (1,0), and (1,1) of the 3 x 4 network (see Figure 1). The element in position (4,4) of the 5 x 5 network would be connected to the elements in positions (1,2), (1,3), (2,2), and (2,3) of the 3 x 4 network. A variety of standard connection architectures have been presented in the neural network literature -- for example, ART[3,5,6,7], back propagation[8], Hopfield networks[9], and the neocognitron[1]. Because the connection calculation routines are parameterized and actually calculate the connection patterns, arbitrary algorithmically expressed connection patterns can be realized.
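The worked 5 x 5 into 3 x 4 example can be reproduced with a short routine. The interpolation rule below (spreading sending-cell positions linearly over the valid window positions) is one plausible reading consistent with the two corner cases above, not necessarily the environment's exact algorithm:

```python
def projection_targets(i, j, send_shape, recv_shape, proj_shape):
    # Top-left corner of the projection window: spread the sending
    # plane's positions linearly over the valid window positions,
    # which range from 0 to (receiving size - projection size).
    sr, sc = send_shape
    rr, rc = recv_shape
    pr, pc = proj_shape
    top  = round(i * (rr - pr) / (sr - 1)) if sr > 1 else 0
    left = round(j * (rc - pc) / (sc - 1)) if sc > 1 else 0
    return [(top + di, left + dj) for di in range(pr) for dj in range(pc)]

corner_a = projection_targets(0, 0, (5, 5), (3, 4), (2, 2))
corner_b = projection_targets(4, 4, (5, 5), (3, 4), (2, 2))
```

For element (0,0) this yields the window covering (0,0), (0,1), (1,0), (1,1); for element (4,4) it yields (1,2), (1,3), (2,2), (2,3), matching Figure 1.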

Experimental Results

To investigate the viability of the research environment presented in this article, a standard neural network architecture was chosen to test the environment's instantiation, training, and identification capabilities. The network chosen was the neocognitron[1] because of its large size and the complexity of the connection architecture. More specifically, the architecture presented by Fukushima and Miyake[2] was reproduced and is characterized by *neocognitron-net* in Listing Five. For this version of a neocognitron, there are a total of more than 2.3M connections, each with its own weighting factor, between the 10K cells and 145 planes in the network. Additionally, several parameters interact and affect the behavior of the network. For example, each of the layers (excluding the input layer) has an intensity-of-inhibition parameter to control the amount of noise tolerated in matching a pattern; this parameter interacts with both the excitatory and inhibitory weights as an output is computed for a particular network cell.

Instantiation of the network, described by the *neocognitron-net* variable in Listing Five, and subsequent training and identification testing yielded significant results:

    1. Different patterns produce different excitation patterns within the network (see Figure 2 and Figure 3).

    2. Training the network alters its excitation patterns (see Figure 3 and Figure 4).

    3. After training, only a single cell fires at the recognition layer in response to different stimulus patterns (Figure 4).

    4. Appropriate clusterings are achieved for multiple versions of various numeric characters (see Figure 4, Figure 5, and Figure 6).

The input pattern is depicted at the base of each of Figure 2 through Figure 5. Each layer is represented by a double row of squares (planes), labeled S1, C1, S2, C2, S3, and C3, respectively, on the right-hand edge of the figures. The colored areas within each square represent the output activities of the corresponding nodes (cells). The color scale, shown on the left-hand edge of the figures, indicates that activity ranges from a low level of black through blue, green, red, and yellow to a high level of white. These pictures are produced within the research environment as a useful utility for qualitatively observing the results of a particular instantiated architecture. As depicted here, the results correlated well with those presented by Fukushima and Miyake.[2]

Future Efforts

The research environment and corresponding instantiation of neural networks have many possible applications. From a general perspective, such applications encompass both new-model development and the analysis of standard network models. More specifically, one of the motivating ideas behind this research has been that of using digitized images as training patterns. The hypothesis is that network models such as the neocognitron should theoretically be able to extract "useful" information from such images. For example, through training an instantiated network using a variety of images that contain a tree, the network should extract the common pattern of the tree and, thus, be able to indicate the presence of a tree in subsequent test images.

A possible future research effort would investigate the size of various networks required to actually perform such recognition and would characterize any additional requirements (for example, use of Grossberg's Boundary Contour/Feature Contour System [S. Grossberg and E. Mingolla, "Neural dynamics of perceptual grouping: Textures, boundaries, and emergent segmentations," Perception & Psychophysics 38 (1985): 141 - 171] as a preprocessor to simplify processing within an appropriate version of a neocognitron). Additionally, recent extensions to the neocognitron's architecture (that is, feedback between layers[4]) could be incorporated into the currently instantiated neocognitrons and could possibly provide for segmentation of trees within the test images after training has occurred. A significant amount of work would be required to obtain such results, but a real possibility of attaining them does exist.

On the front of run-time analysis of instantiated networks, the speed and versatility of the environment's implementation should be considered. As mentioned, the present implementation, developed for in-house use, runs on a Symbolics Lisp machine and is written in Lisp. To enhance processing speed, there is a plan to port the environment to a Sun 4/280. Although the environment is currently organized to dynamically create Fukushima's neocognitron[2], its versatility will be tested by instantiating additional neural network models.

Conclusion

The main virtue of the environment described here is that it frees the user/programmer/researcher from the need to write programs that assemble neural networks; the environment automatically generates flexible neural network architectures based upon parametric specifications. Flexibility is achieved through the integration of knowledge representation and standard software modularization techniques. Together, these two types of techniques form the powerful basis of a research environment for neural networks.

The mechanisms through which the power is harnessed and utilized are the heart of this article. The key idea is that a knowledge representation has been developed so that each element of a neural network carries around with it all the information necessary for local processing (for example, what processing to perform and where to get the inputs). Because this information storage is consistent from the node level to the network level, the entire network is executed without the need for global routines to encode its structure. Generic routines become specialized processors as they are adapted by the contents of the knowledge structures.

The viability of such an environment has been demonstrated through the instantiation and testing of a standard network architecture, the neocognitron. The results correlate accurately with those of an analogous network described by Fukushima and Miyake. The present work suggests many significant applications, some of which are currently under investigation.

Knowledge representation and software modularization are key tools ideally suited for the empirical analysis of neural networks. Wrapping an environment around these fundamental tools facilitates concentration on network-response characteristics rather than on monotonous debugging of specialized routines that encode network architectures.

Acknowledgments

This work has been performed under the guidance and with the support of John Gilmore, head of the Artificial Intelligence Branch at the Georgia Tech Research Institute. Additional contributions to the initial design and development of this environment have been made by Harold Forbes and Steven Strader. I am indebted to Diane Czuchry for her assistance in the preparation of this manuscript.

Notes

    1. K. Fukushima, "Neocognitron: A self-organizing neural network for a mechanism of pattern recognition unaffected by shift in position," Biological Cybernetics 36 (1980): 193 - 202.

    2. K. Fukushima and S. Miyake, "Neocognitron: A new algorithm for pattern recognition tolerant of deformations and shifts in position," Pattern Recognition 15 (1982): 455 - 469.

    3. G. Carpenter and S. Grossberg, "A massively parallel architecture for a self-organizing neural pattern recognition machine," Computer Vision, Graphics, and Image Processing 37(1) (1987): 54 - 115.

    4. K. Fukushima, "A neural network for visual pattern recognition," IEEE Computer 21(3) (March 1988): 65 - 75.

    5. G. Carpenter and S. Grossberg, "ART 2: Self-organization of stable category recognition codes for analog input patterns," Applied Optics 26 (1987): 4919 - 4930.

    6. S. Grossberg, "Adaptive pattern classification and universal recoding: I. Parallel development and coding of neural feature detectors," Biological Cybernetics 23 (1976): 121 - 134.

    7. S. Grossberg, "Adaptive pattern classification and universal recoding: II. Feedback, expectation, olfaction, illusions," Biological Cybernetics 23 (1976): 187 - 202.

    8. D. Rumelhart, G. Hinton, and R. Williams, "Learning internal representations by error propagation," in Parallel Distributed Processing, 318 - 362 (Cambridge, Mass.: MIT Press, 1986).

    9. J. Hopfield, "Neural networks and physical systems with emergent collective computational abilities," Proceedings of the National Academy of Sciences, USA 79 (1982): 2554 - 2558.

    10. K. Fukushima, "Neocognitron: A hierarchical neural network capable of visual pattern recognition," Neural Networks 1 (1988): 119 - 130.

_INSTANTIATION OF NEURAL NETS_ by Andy Czuchry [LISTING ONE]



;;; -*- Mode: LISP; Syntax: Common-lisp; Package: andy; Base: 10 -*-


;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;
;;;
;;;     A Research Environment for the Instantiation of Neural Networks
;;;
;;;                 Andrew  J. Czuchry, Jr.
;;;
;;;                   Georgia Institute of Technology
;;;                   Georgia Tech Research Institute
;;;                   Artificial Intelligence Branch
;;;
;;;                    December 1, 1989
;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;



;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;
;;;            Knowledge Representation Structure definitions
;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;



(defstruct (NET                              ; structure for a network
             (:print-function net-printer))     ; printer function

    layers            ; list of layers
    local-parameters          ; list of local parameters for net
  )




(defstruct (LAYER                            ; structure for a layer
        (:print-function layer-printer))   ; printer function

  planes           ; list of planes
  e-connections           ; list of offsets for excitatory connections
  i-connections           ; list of offsets for inhibitory connections
  local-parameters           ; list of local parameters for layer
  prev-layer           ; "ptr" to preceding layer structure
  type              ; layer type (e.g., "S" or "C" in
                             ;  neocognitron; "F1" or "F2" in ART)
  )




(defstruct (PLANE                           ; structure for a plane (sub-network)
        (:print-function plane-printer))  ; printer function

  (cells nil :type array)     ; cell values for plane
  e-connections            ; list of offsets for excitatory connections
  i-connections            ; list of offsets for inhibitory connections
  e-weights            ; list of excitatory weights [real]
                 ;  {same order as list of connections}
  i-weights            ; list of inhibitory weights [real]
                 ;  {same order as list of connections}
  size               ; size of plane (list N x M)
  layer                       ; "ptr" back to layer of which plane is a part
  )





;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;
;;;            Structure Printer Functions
;;;
;;;                        Written by Harold S. Forbes
;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

;;; ------------------------------------------------------------------------
;;; The function OBJECT-ADDRESS gets the memory address of any LISP object.

(defun OBJECT-ADDRESS (object)
  ;; Symbolics implementation.
  (sys:%pointer object)
  )


(defun NET-PRINTER (structure stream ignore)
  (declare (ignore ignore))
  (format stream "#<net ~X>" (object-address structure)))


(defun LAYER-PRINTER (structure stream ignore)
  (declare (ignore ignore))
  (format stream "#<~A-layer ~X>" (layer-type structure)
     (object-address structure)))


(defun PLANE-PRINTER (structure stream ignore)
  (declare (ignore ignore))
  (format stream "#<plane ~X>" (object-address structure)))



[LISTING TWO]


;;; -*- Mode: LISP; Syntax: Common-lisp; Package: andy; Base: 10 -*-


;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;
;;;
;;;     A Research Environment for the Instantiation of Neural Networks
;;;
;;;                 Andrew  J. Czuchry, Jr.
;;;
;;;                   Georgia Institute of Technology
;;;                   Georgia Tech Research Institute
;;;                   Artificial Intelligence Branch
;;;
;;;                    December 1, 1989
;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;



;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;
;;;                    Net CREATION functions
;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;



;; Create a new net.
;   Creates a version of a 5-layer neocognitron by default.

(defun CREATE-NET (&key
         (num-of-layers 5)
         (num-of-planes-per-layer-list '(1 24 24 24 24))
         (plane-size-list '((16 16) (16 16) (10 10) (4 4)
                                      (1 1)))
         (connection-pattern 'square)
         (mask-size-list '((5 5) (5 5) (5 5) (2 2)))
         (net-parameters '(: 0.5))
         (additional-parameter-list
          `(:net-type neocognitron
            (:r-val-list '(4.0 1.5 1.5))
            (:q-val-list '(1.0 16.0 16.0))
            (:b-val-list '(0.0 0.0 0.0))
            (:orientation-list '(8 1 8 1))
            )))
  (let* ((layer-list
     (create-layer-list num-of-layers
              num-of-planes-per-layer-list
              plane-size-list
              connection-pattern
              mask-size-list
              additional-parameter-list))  ; create layers
                                                     ;  meeting specification
                                                     ;  parameters
    (net (make-net :local-parameters net-parameters
         :layers layer-list)))             ; create knowledge
                                                          ;  structure storing
                                 ;  new net
    net)
  )



;; Create a list of NUM-OF-LAYERS layers of the appropriate type (type
;;   key recorded in ADDITIONAL-PARAMETER-LIST).

(defun CREATE-LAYER-LIST (num-of-layers planes-per-layer-list
               plane-size-list con-pattern
               mask-size-list
               additional-parameter-list)
  (let ((layer-type (zl:lexpr-funcall #'extract-type-key
                   additional-parameter-list)))

; determine type of net to which layers are to belong
;  and create appropriate type of layers

    (zl:selectq  layer-type
   (neocognitron
      (create-neocognitron-layer-list
         num-of-layers planes-per-layer-list
         plane-size-list con-pattern mask-size-list
         additional-parameter-list))
   (ART2
      (create-ART2-layer-list
         num-of-layers planes-per-layer-list
         plane-size-list con-pattern mask-size-list
         additional-parameter-list))
   (backpropagation
      (create-backprop-layer-list
         num-of-layers planes-per-layer-list
         plane-size-list con-pattern mask-size-list
         additional-parameter-list))
       )
    )
  )



;; Extract the NET-TYPE keyed value

(defun EXTRACT-TYPE-KEY (&key net-type &allow-other-keys)
  net-type)




;; Create a list of NUM-OF-LAYERS layers for the neocognitron

(defun CREATE-NEOCOGNITRON-LAYER-LIST (num-of-layers planes-per-layer-list
                       plane-size-list
                       con-pattern
                       mask-size-list
                       additional-parameter-list)

; extract parameters specific to neocognitron

  (let* ((r-val-list (zl:lexpr-funcall #'extract-r-val-list
                   additional-parameter-list))
    (q-val-list (zl:lexpr-funcall #'extract-q-val-list
                   additional-parameter-list))
    (b-val-list (zl:lexpr-funcall #'extract-b-val-list
                   additional-parameter-list))
    (orientation-list (zl:lexpr-funcall #'extract-orientation-list
             additional-parameter-list))
    (total-number-of-layers (+ (* 2 num-of-layers) 1))
    (number-of-processing-layers
     (- total-number-of-layers 1)))

; error checking and layer creation

    (cond ((not (= num-of-layers (length r-val-list)
         (length q-val-list) (length b-val-list)))      ; check extracted parameters
      (ferror "Improper parameters for a net with ~D layers:
                        r value list = ~s,
                        q value list = ~s,
                        b value list = ~s."
         num-of-layers r-val-list q-val-list b-val-list))
     ((not (= total-number-of-layers
         (length planes-per-layer-list)
         (length plane-size-list)))                     ; check passed parameters
      (ferror "Improper parameters for a net with ~D layers:
                        Either not enough planes-per-layer counts listed in ~s, OR
                         not enough plane sizes listed in ~s."
         total-number-of-layers planes-per-layer-list
         plane-size-list))
     ((not (= number-of-processing-layers (length mask-size-list)))  ; check projection masks
      (ferror "Improper parameters for a net with ~D layers beyond
                     input layer:
                         Not enough connection mask sizes listed in ~s."
         number-of-processing-layers mask-size-list))

; Create appropriate number of layers, one at a time, and record as a list.
; For each layer, extract appropriate parameter settings and sizes.

     (t
      (do* ((i 1 (+ i 1))
       (r-val-list r-val-list (cdr r-val-list))
       (r-val (car r-val-list) (car r-val-list))
       (q-val-list q-val-list (cdr q-val-list))
       (q-val (car q-val-list) (car q-val-list))
       (b-val-list b-val-list (cdr b-val-list))
       (b-val (car b-val-list) (car b-val-list))
       (rest-orientations orientation-list
                (cddr rest-orientations))
       (s-orientations (car rest-orientations)
             (car rest-orientations))
       (c-orientations (cadr rest-orientations)
             (cadr rest-orientations))
       (prev-plane-num (car planes-per-layer-list)
             (cadr planes-per-layer))
       (planes-per-layer (cdr planes-per-layer-list)
               (cddr planes-per-layer))
       (num-of-s-planes (car planes-per-layer)
              (car planes-per-layer))
       (mask-list mask-size-list (cddr mask-list))
       (mask-size (car mask-list) (car mask-list))
       (plane-sizes-list (cdr plane-size-list)
               (cddr plane-sizes-list))
       (prev-c-plane-size (car plane-size-list)
                c-plane-size)
       (s-plane-size (car plane-sizes-list)
                (car plane-sizes-list))
       (c-plane-size (cadr plane-sizes-list)
                (cadr plane-sizes-list))
; create input layer

       (input-layer
        (make-layer :planes
               (create-plane-list
                1 prev-c-plane-size
                (do ((i 1 (+ i 1))
                (times (apply #'* prev-c-plane-size))
                (res '((0)) (cons '(0) res)))
               ((>= i times) res))
                0 0 1 mask-size 'C)
               :type 'C))
; create connections

       (s-connection-list
        (create-connection-list 1 prev-c-plane-size s-plane-size
                 con-pattern mask-size 'S)        ; Connections same for
        (create-connection-list 1 prev-c-plane-size s-plane-size ;  all planes
                 con-pattern mask-size 'S))
       (c-connection-list
        (create-connection-list 1 s-plane-size c-plane-size
                 con-pattern (cadr mask-list) 'C) ; C-cells connect to
        (create-connection-list 1 s-plane-size c-plane-size      ;  one S-plane
                 con-pattern (cadr mask-list) 'C))
; create planes
       (s-planes
        (create-plane-list num-of-s-planes s-plane-size s-connection-list
                 prev-plane-num b-val s-orientations mask-size 'S)
        (create-plane-list num-of-s-planes s-plane-size s-connection-list
                 prev-plane-num b-val s-orientations mask-size 'S))
       (c-planes
        (create-plane-list (cadr planes-per-layer) c-plane-size c-connection-list
                 num-of-s-planes b-val c-orientations mask-size 'C)
        (create-plane-list (cadr planes-per-layer) c-plane-size c-connection-list
                 num-of-s-planes b-val c-orientations mask-size 'C))
;assign layers
       (new-s-layer (make-layer :planes s-planes :connections s-connection-list
                 :r r-val :q q-val :prev-layer input-layer :type 'S)
               (make-layer :planes s-planes :connections s-connection-list
                 :r r-val :q q-val :prev-layer (car (last layers))
                 :type 'S))
       (new-c-layer (make-layer :planes c-planes :connections c-connection-list
                 :r r-val :q q-val :prev-layer new-s-layer :type 'C)
               (make-layer :planes c-planes :connections c-connection-list
                 :r r-val :q q-val :prev-layer new-s-layer :type 'C))
; add new layers to layer list

       (layers      ; S and C layers
        (list input-layer new-s-layer new-c-layer)
        (append layers
           (list new-s-layer new-c-layer))))
      ((>= i num-of-layers) layers)))           ; return list of layers
     )
    )
  )




[LISTING THREE]


;;; -*- Mode: LISP; Syntax: Common-lisp; Package: andy; Base: 10 -*-


;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;
;;;
;;;     A Research Environment for the Instantiation of Neural Networks
;;;
;;;                 Andrew  J. Czuchry, Jr.
;;;
;;;                   Georgia Institute of Technology
;;;                   Georgia Tech Research Institute
;;;                   Artificial Intelligence Branch
;;;
;;;                    December 1, 1989
;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;



;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;
;;;
;;;                     Net TRAINING functions
;;;
;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;


;; Top level function for training.  Trains on all patterns in PATTERN-LIST.
;;   Sets mappings.
;     Trains on *neocognitron* by default.

(defun TRAIN-MAIN-LOOP (&key
         (net *neocognitron*)
         (pattern-list (scl:send lrn-pattrns :item-list))
         (iteration-list '(4 4 4))
         print-data)

  (do* ((patterns pattern-list (cdr patterns))                 ; loop over all patterns
   (pattern-item (car patterns) (car patterns))
   (pattern (if pattern-item (item-misc pattern-item))
       (if pattern-item (item-misc pattern-item))))  ; check if pattern found
       ((null patterns))
       (cond (print-data
         (format t "~%~%")
         (time:print-current-time)
         (format t "~% ** training ~s on pattern ~s.  (~d more patterns.)"
            net pattern (length (cdr patterns)))))

       (train pattern net iteration-list)                      ; perform training
       )

  (create-mappings net pattern-list)                           ; record mapping between pattern and
                                                               ;  most active "output layer" cell

  (cond (print-data
    (format t "~%~%")
    (time:print-current-time)))
  )
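

;; A hypothetical top-level invocation (the net variable and pattern
;;  list here are assumptions; only the keywords come from the
;;  definition above):
;;
;;    (train-main-loop :net *neocognitron*
;;                     :pattern-list (scl:send lrn-pattrns :item-list)
;;                     :iteration-list '(4 4 4)
;;                     :print-data t)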



;; Trains a net to recognize a pattern
;   Returns the plane/cell of the final layer which responds most actively
;    to the pattern.

(defun TRAIN (pattern-plane net
         &optional (iterations-per-layer-list '(4 4 1)))
  (let* ((layers-to-be-trained (cdr (net-layers net)))
    (num-layers (length layers-to-be-trained)))
    (cond ((not (= num-layers (length iterations-per-layer-list)))
      (ferror "~% Improper training iteration count list for training ~D layers: ~s"
        num-layers iterations-per-layer-list))
     (t
      (setf (plane-cells (car (layer-planes (car (net-layers net)))))
       pattern-plane)                                           ; assign input pattern
      (do* ((layer-list (cdr (net-layers net)) (cdr layer-list))     ; loop over all layers
       (layer (car layer-list) (car layer-list))
       (iteration-list iterations-per-layer-list
             (cdr iteration-list))
       (iterations (car iteration-list) (car iteration-list))   ; # times to train layer
       (dummy (train-layer layer iterations)                     ; train each layer in succession
         (train-layer layer iterations))
       (result (caar (update-layer layer))                     ; update the layer
          (caar (update-layer layer))))
      ((null (cddr layer-list)) result))                        ; return most active cell in final layer
      )
     )
    )
  )



; Trains a layer in a net to recognize a pattern
;
;  Continues training until updating layer produces no more changes in the representative
;   list (returned by UPDATE-LAYER)

;  At some point I'd like to remove the ITERATIONS parameter and work only from changes in rep list, but
;   it is computationally prohibitive in the current version of the system.

(defun TRAIN-LAYER (layer &optional (iterations 1))
  (do* ((old-reps nil new-reps)                                         ; record most active cell
   (new-reps (update-layer layer) (update-layer layer))      ; re-adjust after training
   (i 0 (+ i 1)))
    ((or (equal old-reps new-reps) (>= i iterations)) new-reps)         ; check if training complete
    (train-layer-aux layer new-reps))                          ; perform training
  )




; Trains a layer in a net to recognize a pattern
;

(defun TRAIN-LAYER-AUX (layer &optional (representative-data-list (update-layer layer)))
    (let ((layer-type (zl:lexpr-funcall #'extract-type-key
                (layer-local-parameters layer))))
; select appropriate training routine

      (zl:selectq  layer-type
   (neocognitron
      (train-neocognitron-layer-aux
         layer representative-data-list))
   (ART2
      (train-ART2-layer-aux
         layer representative-data-list))
   (backpropagation
      (train-backprop-layer-aux
         layer representative-data-list))
       )
      )
    )



; Trains a layer in a neocognitron to recognize a pattern
;

(defun TRAIN-NEOCOGNITRON-LAYER-AUX (layer &optional (representative-data-list (update-layer layer)))
  (cond ((> *trace* 3)
    (format t "~%training layer.~%  Chosen Representative list:~%  ~s"
       representative-data-list)))
  (mapcar
   #'(lambda (representative-data)
       (let* ((plane (car representative-data))
         (pos (cadadr representative-data))
         (prev-layer (layer-prev-layer layer))
         (q-val (layer-q layer))
         (results
          (do* ((all-connections (layer-connections layer)
                  (cdr all-connections))
           (connections (connections-for-pos layer plane pos)) ; extract connections
           (all-i-vals-list (plane-i-weights plane)
                  (cdr all-i-vals-list)) ; extract current weights
           (old-i-vals (car all-i-vals-list)
             (car all-i-vals-list))
           (all-e-vals-list (plane-e-weights plane)
                  (cdr all-e-vals-list))
           (all-e-vals (car all-e-vals-list)
             (car all-e-vals-list))
           (all-e-weights-list (plane-e-weights plane)
                (cdr all-e-weights-list))
           (all-e-weights (car all-e-weights-list)
                (car all-e-weights-list))
           (raw-result (train-plane prev-layer connections old-i-vals
                     all-e-vals all-e-weights q-val) ; perform training on
               ;  excitatory connections
             (train-plane prev-layer connections old-i-vals
                     all-e-vals all-e-weights q-val))
           (result raw-result
              (list (append (car result)
                  (car raw-result))
               (+ (cadr result)
                  (cadr raw-result))))) ; record result
          ((null (cdr all-connections)) result))) ; return result
         (new-i-weight-list (car results))   ; adjust inhibitory weights
         (new-b-val (* q-val (compute-inhib-input (connections-for-pos layer plane pos)
                         (c-weights-for-pos  plane pos)
                         prev-layer))))
    (setf (plane-i-weights plane)
          (transpose-on-type new-i-weight-list (layer-type layer)))
    (setf (plane-b plane) new-b-val)))
   representative-data-list)
  )




;Trains a plane's CONNECTIONS to all planes in previous layer

(defun TRAIN-PLANE (prev-layer connections old-i-val-lists all-e-val-lists all-e-weight-lists q-val)
  (do* ((connection-list connections (cdr connection-list))      ; extract connections
   (connection (car connection-list) (car connection-list))
   (c-val-list all-e-val-lists (cdr c-val-list))            ; extract current weights
   (c-vals (car c-val-list) (car c-val-list))
   (c-weight-list all-e-weight-lists (cdr c-weight-list))
   (c-weights (car c-weight-list) (car c-weight-list))
   (rev-old-i-vals (reverse old-i-val-lists))
   (old-i-vals (car old-i-val-lists)
          (nth (- (length c-val-list) 1) rev-old-i-vals))
   (con-vals (train-plane-aux prev-layer connection old-i-vals
               c-vals c-weights q-val)
        (train-plane-aux prev-layer connection old-i-vals
               c-vals c-weights q-val))    ; perform actual training
   (new-i-weights (list (car con-vals))
             (nconc new-i-weights
                (list (car con-vals))))
   (vtotal (cadr con-vals) (+ vtotal (cadr con-vals))))
       ((null (cdr connection-list)) (list (list new-i-weights) vtotal))) ; return new connection weights
  )





[LISTING FOUR]


;;; -*- Mode: LISP; Syntax: Common-lisp; Package: andy; Base: 10 -*-


;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;
;;;
;;;     A Research Environment for the Instantiation of Neural Networks
;;;
;;;                 Andrew  J. Czuchry, Jr.
;;;
;;;                   Georgia Institute of Technology
;;;                   Georgia Tech Research Institute
;;;                   Artificial Intelligence Branch
;;;
;;;                    December 1, 1989
;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;



;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;
;;;
;;;                      IDENTIFICATION functions
;;;
;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;




; Attempts to recognize a pattern using NET as a trained net
;  Returns the plane of the final layer which responds to the pattern.

(defun IDENTIFY (pattern-plane net)
  (setf (plane-cells (car (layer-planes (car (net-layers net)))))
   pattern-plane)                                            ; assign input pattern
  (caaar (last (update-net net))))                                ; update net and return most active cell of final layer









; Updates entire net
;  Returns maximum value as a nested set of lists:
;   (((plane (value pos)) ... (plane (value pos)) layer1)
;    ((plane (value pos)) ... (plane (value pos)) layer2) ...)

(defun UPDATE-NET (net)
  (do* ((layer-list (cdr (net-layers net)) (cdr layer-list))     ; loop over all layers
   (layer (car layer-list) (car layer-list))
   (layer-max (update-layer layer) (update-layer layer))         ; update the layer
   (max (list (append layer-max (list layer)))                   ; max value (((value, pos) plane) ... layer) list
        (append max (list (append layer-max (list layer)))))) ;  append each layer
       ((null (cdr layer-list)) max)
    )
  )








[LISTING FIVE]


;;; -*- Mode: LISP; Syntax: Common-lisp; Package: andy; Base: 10 -*-


;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;
;;;
;;;     A Research Environment for the Instantiation of Neural Networks
;;;
;;;                 Andrew  J. Czuchry, Jr.
;;;
;;;                   Georgia Institute of Technology
;;;                   Georgia Tech Research Institute
;;;                   Artificial Intelligence Branch
;;;
;;;                    December 1, 1989
;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;



;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;
;;;           Sample Variables of networks to be instantiated
;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;



(defvar *neocognitron-net*
  '(create-net
    :num-of-layers 7
    :num-of-planes-per-layer-list '(1 24 24 24 24 24 24)
    :plane-size-list '((16 16) (16 16) (10 10) (8 8) (6 6) (2 2) (1 1))
    :connection-pattern 'square
    :mask-size-list '((5 5) (5 5) (5 5) (5 5) (5 5) (2 2))
    :net-parameters '(: 0.5)
    :additional-parameters '(:net-type neocognitron
                   :r-val-list '(4.0 1.5 1.5)
                   :q-val-list '(1.0 16.0 16.0))
    )
  )


(defvar *neocognitron-net2*
  '(create-net
    :num-of-layers 7
    :num-of-planes-per-layer-list '(1 15 20 20 24 20 10)
    :plane-size-list '((16 16) (16 16) (12 12) (10 10) (8 8) (4 4) (1 1))
    :connection-pattern 'square
    :mask-size-list '((7 7) (5 5) (3 3) (5 5) (5 5) (4 4))
    :net-parameters '(: 0.5)
    :additional-parameters '(:net-type neocognitron
                   :r-val-list '(4.0 1.5 1.5)
                   :q-val-list '(10.0 16.0 16.0))
    )
  )
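

;; Since each sample variable above stores a quoted CREATE-NET call,
;;  one plausible way to instantiate a net from a spec is simply to
;;  evaluate the stored form.  This helper is an assumption, not part
;;  of the original listings:

(defun INSTANTIATE-NET-FROM-SPEC (spec)
  (eval spec))                     ; run the stored (create-net ...) call

;; e.g.  (setq *neocognitron* (instantiate-net-from-spec *neocognitron-net*))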