Directed Acyclic Graph Unification

An object-oriented approach to building constraint systems

David Perelman-Hall

David earned his PhD in computational linguistics at the University of Texas at Austin. He can be contacted at phall@ccwf.cc.utexas.edu.

One approach to parsing natural language is to build a two-part parsing system, where one part is the grammar and the second part is a set of rules associated with the grammar. What I call the "Part One" grammar, commonly a set of rewrite rules in Backus-Naur form (BNF) used to recognize the tokens of the language, lays down the entire road map of the basic parsing process. The rules in "Part Two" constrain the parser's actions and serve as navigators, indicating which paths the Part One parser can investigate. Each constraint represents an assertion which linguists have determined to be a requirement for grammaticality.

The familiar BNF rules that usually form Part One contain a left-hand side (LHS) and a right-hand side (RHS) separated by an arrow. The rules describe how one side of the arrow may be written as the other side, as in Example 1. These rules are order independent, so the arrows in Example 1 don't actually imply that action must occur from left to right (top-down, from start node S to the terminal symbols).

Linguists frequently refer to the rewrite rules of Part One as "phrase-structure rules" and use them to parse input; for example, the sentence, "the cow eats the grass" would be accepted by Example 1. However, using only Part One to parse natural language tends to fail because Part One accepts ungrammatical input strings as though they were grammatical. Suppose, for example, that you admit in Example 1 the perfectly legitimate verb "eat" by introducing the rewrite rule: V --> "eat". The grammar then would accept the ungrammatical input, "the cow eat the grass." With Part Two, you can at least try to prevent acceptance of ungrammatical input such as this.

What linguists attempt to codify in Part Two is the inherent capacity of humans to determine what is or isn't grammatical. This is not an easy thing to do; most people can tell you only that a sentence is ungrammatical, but not why. The rule set which holds this knowledge, and its application, is what differentiates between a dumb token recognizer and an expert system.

A commonly accepted method of employing Part Two is to associate one or more constraints (taken from the rule set of Part Two) with the terms of phrase-structure rules in such a way that the phrase-structure rules can be successfully applied only if all associated constraints are satisfied.

For example, you can associate a value for grammatical number with the input terms forming verbs and nouns. Therefore, "eat" would have the value plural, while "eats," "grass," and "cow" would have the value singular. Now the rule S --> NP VP can be rewritten only if the values for grammatical number match. In the association of the elaborated phrase-structure rule in Example 2, the value for number has been percolated up through the rule system from its original entry with the input terms. Use of this feature-value equivalence metric prevents the parsing system from accepting the ill-formed sentence, "the cow eat the grass," because the required match between VP number and NP number would fail, and therefore the rewrite rule would not be applicable.

In this article, I'll describe "feature-value unification," a general, object-oriented method of enacting such a constraint system adapted from directed acyclic graph (DAG) unification. While feature-value unification can be used in natural-language parsing, it can also be used in applications where you want an object to enter into relations only with other objects that meet the constraints specified in the feature-value DAGs. In an object-oriented system, you would include DAG objects as data members in those objects whose behaviors must be constrained and require their successful unification before allowing the objects to interact. There are numerous issues surrounding natural-language parsing that I won't cover, including building DAGs directly from the elaborated phrase-structure rules and implementing a parsing engine. Instead, I'll focus on feature-value (or DAG) unification through an example drawn from natural-language processing.

Dispensing with the Formalities

A directed acyclic graph is rooted at a single node. Traversal through the graph is unidirectional (from root to node) and noncyclic (no arc can provide a path for returning to its originating node). DAG unification can net two results:

A Boolean value based on the success of the unification.

A new DAG, bearing the results of the unification.

Figure 1 shows a DAG which illustrates the number agreement between subject and predicate phrases of a sentence, and is the representation of the S node of the elaborated grammar in Example 3. This rule applies only if the value of case for the noun phrase (NP) is nominative, the subcat for the verb phrase (VP) is intransitive, and the agreement for NP matches that of VP. The value of agreement itself in this example is another DAG, which expands into feature-value constructs such as number -->

singular, gender -->

masculine, animacy -->

animate, and so on.

Structures in the feature-value notation which are merged during unification include:

Atomic values, as for features case and subcat in Figure 1.
Variable values, or equivalencies, as for feature agreement of Figure 1.
Dag values, which themselves are comprised of atomic, variable, or DAG values, and so on, iteratively.

Successful unification produces a DAG that results from merging the information from two or more DAGs. Unification proceeds by iterating through the feature-value sets of both DAGs thus:

Where the features of both DAGs match, and the associated values match if they are either atomic or variables instantiated with atomic values, add the feature-value to the resulting DAG.
Where the features of both DAGs match, but the value for the feature in one of the DAGs is a variable, add the feature-value found in the other DAG to the resulting DAG.
Where the features of both DAGs match, but the value for the feature in both of the DAGs is a variable, add a feature-value indicating the identity of the variables to the resulting DAG.
Where only one DAG contains the feature, add its feature-value pair to the resulting DAG.

DAG-valued features obey this algorithm recursively. Unification is an additive procedure. It can succeed or fail. Failed unification does not produce a resulting DAG. Failure occurs when atomic values do not match for like features. Possible results of unification are pictured in Figure 2. (In principle, you can unify as many DAGs as you want, but in practice I unify them two at a time.)

C++ Requirements

To use DAGs, you need a structure and mechanism to construct them and an algorithm for performing unification. I'll build DAGs using inheritance and C++ templates and unify them using virtual functions and double indirection. I have typedefed sets, lists, and maps for use throughout the code, making extensive use of templates.

In its abstract essence, feature-value unification relies on DAGs built from objects of types Feature and Value. A Feature is typedefed as a String. A Value is either an Atomic, Variable, or Dag value--all of which are formed by inheritance from an abstract base Value class; see Figure 3. The physical representation, and hence the behavior, differs among the derived concrete atomic-, variable-, and DAG-valued classes; however, the types of activity in which they participate--most importantly, unification--are identical. This allows you to declare in an abstract base class a virtual unify() function that anything considered a Value should uniquely define in its derived concrete class. This lets you use polymorphism in the implementation of unification among differently valued DAGs.

Listing One is the declaration of the Value and ValuePtr classes. (Listing Two is the Value class header file.) Notice that the Value class declares no data members; it merely proposes the behavior that a value should be able to implement. To take advantage of polymorphism with this class, I've implemented a smart-pointer wrapper class, ValuePtr, which takes as one of its constructors a reference to a Value object, but maintains as a data member a pointer to an object of type Value. The three Values--Atomic, Variable, and Dag--are handled as ValuePtrs throughout the program. Because ValuePtrs internally handle newing, copying, and deleteing of pointed-to Values, you can safely ignore issues of memory management with DAGs by using ValuePtrs. ValuePtrs actually alias Values, so they can be used to implement polymorphism.

The "Value" of Polymorphism

Every member function of the Value class is declared virtual, and many of them are pure virtual functions. This means that every function in the Value class can be redefined in classes derived from Value, and pure virtual functions must be redefined in derived classes in order for an object of that class to be constructed. We want the functionality of copying, writing to output, equality testing, and unifying to differ depending on the type of value.

Notice in the base class that the operator==() and unify() functions each have four prototypes, differing in the type of the value parameter, taking references to a base Value class object and to all three derived classes. This arrangement lets you employ polymorphism and double indirection to gain default definitions for inequality and failed unification.

Let's use the equality function to examine how polymorphism achieves the default behavior. The base class contains the only pure virtual version of the equality test, taking a base Value class reference, as might be expected. The base class also defines three (not pure) virtual equality tests. These tests take a reference to each of the derived types and by default returns False. If an object of the base Value class (actually a pointer or reference will be required) ever makes the call ValueObject --> operator==(SomeDerivedValueType& DV); for any of the three derived Value types, it returns False by default.

In Listings Three through Eight, each derived class defines two versions of the equality test: one taking a reference to an object of its own type, the other taking a reference to the base Value type. For example, the derived concrete AtomicValue class (Listing Three) defines an operator==() taking an AtomicValue reference, and one taking a Value reference. During program execution, the only type of object calling operator==() can be one of the concrete derived types. If one atomically valued DAG seeks to test equality with another, AtomicValue::operator==(AtomicValue&) is called. This function performs an actual equality test between AtomicValue objects. Every equality test in a derived class defines the actual test for objects of its own class.

If, however, the type of the object being tested for equality differs from the calling type, this scenario occurs: Because the type of the parameter differs from the type of the calling object, it can not test equality with an object of its own type, so by default the operator==(Value&) of the derived class is called. This works because in each derived class there is no operator==() prototyped to take the other two derived classes, and each derived concrete value is by inheritance an instance of the base Value class. In each derived class, the equality test prototyped for a base Value reference is the same: It turns the Value reference parameter into the calling object, passing it the derived object as the parameter. This calls the specific Value::operator==(someDerivedType&) that corresponds to the type of the derived-object parameter.

In the example of the AtomicValue, the call would resolve to the base Value part of an AtomicValue testing equality with an AtomicValue (the call would be: Value::operator==(AtomicValue&)). Recall that--as defined in the base class--any equality test made by a base Value object with a derived object returns False. Hence, the test between unlike derived classes for operator==() defaults to False in each derived class without actually testing the elements of the value objects. Figure 4 shows the calling sequence for both AtomicValue::operator==(AtomicValue& avr) and AtomicValue::operator==(DagValue&dvr).

This handy method of relying on polymorphism to implement a default base behavior has a compile-time drawback: It results in warnings letting you know that the derived functions prototyped for the base class hide the base-class functions prototyped for the derived class. These warnings are correct, but in this case can be safely ignored because you make no calls to the base-class functions prototyped for the hidden derived classes.

Unifying DAGs

Unifying DAGs is more complicated than testing for equality of Values. The same basic arrangement of double indirection exists (including that for each derived Value type, there is a definition of what it means for that type to unify with itself). DAG unification is also complicated by the fact that it produces more than a True or False; it yields both a resulting DAG (if successful) and a Boolean indication of the attempt to unify. Furthermore, as the unification results in Figure 2 show, more than one possible unification of types has a legitimate chance of success. This means that, because VariableValues can unify with any type of Value, there must be a way of tracking the Values that VariableValues acquire while unification takes place so that the final resulting DAG contains the proper Values which instantiated the VariableValues.

The simple part is implementing atomically valued unification. If the atomically valued elements are identical, just add the identical Feature-AtomicValue pair to the resulting DAG (see Listing Three).

Here the simplicity ends. The result of a call not prototyped for an AtomicValue depends on the type of the non-AtomicValue. Possibilities are either DagValue or VariableValue. The call prototyped for a DagValue, AtomicValue::unify(DagValue&,_), has no definition in the AtomicValue class. Eventually, it defaults to AtomicValue::unify(Value&,_) via double indirection, returning FALSE from the base class without performing any unification.

The same call in the VariableValue class, however, is a different story. A VariableValue has to attempt to unify with any type of Value. Consequently, the VariableValue class can't merely pass unification on to a base class for default behavior. Indeed, you can see in Listings Five and Six that the VariableValue class defines a unify function for all derived Value class parameters. In this case, AtomicValue::unify(VariableValue&,_) is turned into the call VariableValue::unify(AtomicValue&,_), which has a legitimate chance of succeeding and therefore must have a definition.

DagValues can unify with themselves; however, the DagValue class's implementation of unify() uses double indirection (see Listings Seven and Eight). DAGs can't unify with atoms, so double indirection directs the calling sequence toward failure. DAGs might unify with Variables, so a call to VariableValue::unify(DagValue&, _) might succeed. The qualification exists because there is always a chance that the variable has already been instantiated to some non-VariableValue which will not successfully unify with the other Value.

DAG unification is implemented as a global function, taking the two DAGs under consideration. The complete set of features in both DAGs is culled together and iterated over. The Dag::value(Feature&) method is used to acquire from both DAGs the values associated with the current feature. If the feature exists in both DAGs, attempt to unify the acquired values and add the resulting Feature-Value pair to the resulting DAG; if it exists in one DAG only, add that Feature-Value pair to the resulting DAG. If both values are DagValues, then DagValue::unify(DagValue&, _) calls the global unify function.

The Substitution List

DAG unification can be destructive or nondestructive. Destructive unification manipulates one of the original DAGs, changing its data structure so that it houses the unification of the two DAGs. The original DAG is permanently changed, hence the term "destructive." I've opted to build the resulting unified DAG from scratch, nondestructively incorporating the elements of the participating DAGs into the newly created resulting DAG. To do this, there has to be a mechanism for tracking the various instantiations which Variable-valued DAGs take on. This tracking is done in the SubstitutionList class. To test or perform the unification of two DAGs, you need to supply a SubstitutionList object to track the variable instantiations.

The SubstitutionList inherits privately from a Map template instantiated with Variables as keys and ValuePtrs as values. The source code for the SubstitutionList class is provided electronically; see "Availability," page 3. Private inheritance models the "has-a" relationship, and no portion of the base class is accessible outside of the inheriting class proper. The interface to the SubstitutionList class, therefore, is comprised of methods unique to the derived class which you declare public. These consist mostly of methods for setting Values into the SubstitutionList, setting variables identical in the SubstitutionList, checking whether the SubstitutionList contains a variable, checking whether it is empty, returning from the SubstitutionList a ValuePtr associated with a variable, returning the set of variables set equal to each other.

Not only is the SubstitutionList a map (thereby containing a private data-storage facility; see Figure 5), but it also contains as a private (protected) member a list of sets of variables. This protected member is a nested template instantiated as ListOf <SetOf<VariableValues>>. Each separate set of variables is a set for which all variables will bear the same value upon successful unification. If the value of at least one and only one variable within a set should instantiate to a nonvariable value (meaning that it is either Dag valued or Atomic valued), the value for that entire set of variables is obtained from the base variable-to-value map class. This is because if at least one and only one variable of a set bears a nonvariable value, then one variable from that set will also be in the map, acting as a kind of place holder to indicate the value to which every member of the set will instantiate (by mapping). The code prevents more than one place holder from being built. A set which will not instantiate to a nonvariable value will be a set of variables, all of which bear the same identity as a variable.

Before unifying DAGs, a SubstitutionList is created to hold all substitutions made for variables during unification. During the actual unification, the SubstitutionList is filled with any substitutions made for variables. Immediately upon completion of unification, any variables in the resulting DAG that have substitutions as defined by the contents of the SubstitutionList are then replaced by the appropriate substitutions. In this sense, in order to accommodate variables in DAG unification, unification is a two-phase process: The first phase is unifying anything which isn't a variable, and the second is the exchange of values for variables.

How They All Operate Together

To illustrate the concepts presented here, I've written a demonstration program (available electronically) that builds a DAG compliant with Grammar 2 from an NP of "the cow" and a VP of "eats grass". The construction of both DAGs involves building an inner DAG for the set of Feature-Value arcs which contain requirements under the agreement node. I have determined that these should be animacy and number. As long as the atomic values for animacy and number match, the agreement subDAGs will unify. The values for animacy and number can be supplied by the lexical items themselves, so that the word "cow", for instance, supplies "animate" and "singular" for the animacy and number features, and "eats" supplies values for number and subcategorization.

For a natural-language parser, once a sentence is presented for parsing, a lexical look-up routine can build the DAGs for each lexical item as each item is encountered (such a DAG factory is not supplied with this article). This allows you to create phrase-structure rewrite rules for terminal elements which bear fully informational DAGs--that is, adumbrating Grammar 1 with requirements for DAG unification to produce a flushed out Grammar 2. Once parsing starts, if a rewrite rule successfully applies to a terminal element, the result of the rewrite will contain a new DAG created by the unification of the DAGs specified in the addenda to the rewrite rule. In this way, each successfully applied rule results in the passing-on of DAGs so that the final end-state node (S in this case, working bottom-up) contains a single DAG bearing the complete history of its parse. This percolating activity is handy for investigating grammatical reasons for parse failure because it builds up a structure which may be examined for the semantics attached in the form of Feature-Values.

In fact, for grammar sticklers, the sentence being parsed in main() in accordance with Grammar 2 purposefully contains an additional inconsistency which should be remedied in order to conform with good grammar. Perhaps finding this breach of grammar is enough indication of the difficulty that people have in pointing out why some input is ungrammatical.

References

Kay, M. "Functional Grammar." 5th Annual Meeting of the Berkeley Linguistic Society, 1979. Berkeley: Berkeley Linguistics Society.

Knight, Kevin. "Unification: A Multidisciplinary Survey." ACM Computing Surveys. Vol. 21, 1989.

Figure 1 DAG with equivalence for "agreement." Figure 2 Possible results of unification. Figure 3 Value inheritance hierarchy and ValuePtr class relationship. Figure 4 Double-indirection function-call sequence. Figure 5 SubstitutionList's private data storage.

Example 1: Part One grammar.

S        NP VP   (Sentence may be written as Noun Phrase then Verb Phrase)
VP       V NP    (Verb Phrase may be written as Verb then Noun Phrase)
NP       DET N   (Noun Phrase may be written as Determiner then Noun)
N        "cow"   (Noun may be written as "cow")
N        "grass" (Noun may be written as "grass")
V        "eats"  (Verb may be written as "eats")
DET      "the"   (Determiner may be written as "the")

Example 2: The value for number has been percolated up through the rule system.

S  NP VP
<NP number>=<VP number>

Example 3: Part Two grammar.

SNP VP
<NP agreement>=<VP agreement>
<NP case>=nominative
<VP subcat>=transitive
VPV
<VP agreement>=<V agreement>
<VP subcat>=<intransitive>
<VP subcat>=<V subcat>
VPV NP
<VP agreement>=<V agreement>
<VP subcat>=<transitive>
<VP subcat>=<V subcat>
NPDET N
<NP agreement>=<N agreement>
<NP agreement number>=<DET number>
N"cow"
<N number>=<singular>
N"grass"
<N number>=<singular>
V"eat"
<V number>=<plural>
<VP subcat>=<transitive>
V"eats"
<V number>=<singular>
<VP subcat>=<transitive>
DET"the"
<N number>=<N agreement>

Listing One


/*************************************************************************
       Base Value class and ValuePtr class 
       Copyright David Perelman-Hall & Jamshid Afshar 1994.
*************************************************************************/

#ifndef VALUE_H
#define VALUE_H

#include "misc.h"    // GATHERED INCLUDES.  HELPS WITH PRE-COMPILED HEADERS.

// Forward References
class SubstitutionList;
class Value;

// Pointer (or Reference) class to a Value
class ValuePtr {
friend ostream& operator << (ostream& os, const ValuePtr& vp);
private:
   Value* _ptr;
public:
   ValuePtr();
   ValuePtr(const ValuePtr& vp);
   ValuePtr(const Value& v);
   ~ValuePtr();
   void operator=(const ValuePtr& vp);
   bool operator==(const ValuePtr& vp) const;
   bool operator!=(const ValuePtr& vp) const;
   const Value* operator->() const;
   const Value& operator*() const;
   operator const void*() const { return _ptr; }
};
// FORWARD REFERENCES TO CLASSES DERIVING FROM VALUE
class AtomicValue;
class VariableValue;
class DagValue;
class Value {
   public:
      //void CTOR
      Value() {}
      virtual Value* copy() const = 0;
      virtual void write(ostream& os, int level=0) const = 0;
      virtual bool operator == (const Value& value) const = 0;
      virtual bool operator == (const AtomicValue&) const { return FALSE; }
      virtual bool operator == (const VariableValue&) const { return FALSE; }
      virtual bool operator == (const DagValue&) const { return FALSE; }
      virtual bool unify(const Value& value, SubstitutionList&, 
                                                          ValuePtr&) const = 0;
      virtual bool unify(const AtomicValue&, SubstitutionList&, 
                                                              ValuePtr&) const;
      virtual bool unify(const VariableValue&, SubstitutionList&, 
                                                              ValuePtr&) const;
      virtual bool unify(const DagValue&, SubstitutionList&, ValuePtr&) const;
      virtual ValuePtr substitute(const SubstitutionList& substList) const = 0;
};
inline
ostream& operator << (ostream& os, const Value& v)
{
   v.write(os);
   return os;
}
inline ValuePtr::ValuePtr() : _ptr(0) { }
inline ValuePtr::ValuePtr(const ValuePtr& vp) : _ptr(vp._ptr->copy()) { }
inline ValuePtr::ValuePtr(const Value& v) : _ptr(v.copy()) { }
inline ValuePtr::~ValuePtr() { delete _ptr; _ptr=0; }
inline void ValuePtr::operator=(const ValuePtr& vp){
   if(_ptr!=0)
      delete _ptr; 
   if(vp._ptr != 0)
      _ptr = vp._ptr->copy();
   else
      _ptr = 0;
}
inline bool ValuePtr::operator==(const ValuePtr& vp) const
   { return (*_ptr) == *(vp._ptr); }
inline bool ValuePtr::operator!=(const ValuePtr& vp) const
   { return !( *this == vp ); }
inline const Value* ValuePtr::operator->() const { assert(_ptr != 0); 
                                                                return _ptr; }
inline const Value& ValuePtr::operator*() const { assert(_ptr != 0); 
                                                                return *_ptr; }
inline ostream& operator << (ostream& os, const ValuePtr& vp)
{
   assert(vp._ptr != 0);
   os << *(vp._ptr);
   return os;
}
#endif

Listing Two


#include "misc.h"    // GATHERED INCLUDES.  HELPS WITH PRE-COMPILED HEADERS
#pragma hdrstop      // END OF PRE-COMPILED HEADER INCLUDES

//PROTOTYPED FOR EACH DERIVED CLASS
bool Value::unify(const AtomicValue&, SubstitutionList&, ValuePtr&) const
   { return FALSE; }
bool Value::unify(const VariableValue&, SubstitutionList&, ValuePtr&) const
   { return FALSE; }
bool Value::unify(const DagValue&, SubstitutionList&, ValuePtr&) const
   { return FALSE; }

Listing Three


/*************************************************************************
   AtomicValue class -- Copyright David Perelman-Hall & Jamshid Afshar 1994.
*************************************************************************/

#ifndef ATOMIC_H
#define ATOMIC_H

#include <cstring.h>    // BORLAND HEADER
#include "value.h"      // BASE VALUE & VALUEPTR CLASSES

typedef string String;
#define Atomic String

class AtomicValue : public Value, public Atomic {
   public:
      //EMPTY CTOR
      AtomicValue(){}
      //COPY CTOR
      AtomicValue( const AtomicValue& value ) : Atomic(value) { }
      //CONSTRUCTOR
      AtomicValue( const String& str ) : Atomic(str) { }
      //CONSTRUCTOR
      AtomicValue( const char* s ) : Atomic(s) { }
      //RETURN POINTER TO NEW ATOMIC VALUE COPY CONSTRUCTED FROM THIS
      Value* copy() const { return new AtomicValue(*this); }
      // EXPLICIT STRING CONVERSION
      const String& str() const { return (const Atomic&)*this; }
      //ASSIGNMENT OPERATOR
      void operator = (const AtomicValue& value) { Atomic::operator=(value); }
      //EQUALITY
      virtual bool operator == (const Value& value) const;
      virtual bool operator == (const AtomicValue& value) const;
      // UNIFY
      virtual bool unify(const Value& value, SubstitutionList& subst, 
                                                      ValuePtr& result) const;
      virtual bool unify(const AtomicValue& value, SubstitutionList& /*subst*/,
                                                      ValuePtr& result) const;
      // OUTPUT
      virtual void write(ostream& os, int level) const 
                                                { os << (const Atomic&)*this; }
      // SUBSTITUTE
      virtual ValuePtr substitute(const SubstitutionList& /*substList*/) const
         {  return *this; }
};
#endif

Listing Four


#include "misc.h"
#pragma hdrstop

//PROTOTYPED FOR BASE CLASS
bool AtomicValue::operator == (const Value& value) const
   { return value==*this; }
//PROTOTYPED FOR ATOMIC VALUE
bool AtomicValue::operator == (const AtomicValue& value) const
   { return (Atomic&)*this == (Atomic&)value; }
//PROTOTYPED FOR BASE CLASS
bool AtomicValue::unify(const Value& value, SubstitutionList& subst, 
                                                       ValuePtr& result) const
      { return value.unify(*this, subst, result); }
//PROTOTYPED FOR ATOMIC VALUE
bool AtomicValue::unify(const AtomicValue& value, SubstitutionList& /*subst*/,
                                                        ValuePtr& result) const
{ 
   if (value == *this) {
      result = *this;
      return TRUE; 
   }
   else {
      return FALSE; 
   }
}

Listing Five


/*************************************************************************
VariableValue class -- Copyright David Perelman-Hall & Jamshid Afshar 1994.
*************************************************************************/

#ifndef VARIABLE_H
#define VARIABLE_H

#include <cstring.h>    // BORLAND HEADER
#include "value.h"      // BASE VALUE CLASS

typedef string Variable;  // BORLAND

Variable gensym();

class VariableValue : public Value {
private:
   Variable _v;
public:
   //empty constructor
   VariableValue(){}
   //copy constructor
   VariableValue( const VariableValue& value ) : _v(value._v) { }
   VariableValue( const Variable& var ) : _v(var) { }
   Value* copy() const { return new VariableValue(*this); }
   const Variable& str() const { return _v; }

   //assignment operator
   void operator = (const VariableValue& value) { _v = value._v; }
   virtual bool operator == (const Value& value) const;
   virtual bool operator == (const VariableValue& value) const;
   virtual bool unify(const Value& value, SubstitutionList& subst, 
                                                       ValuePtr& result) const;
   virtual bool unify(const AtomicValue&, SubstitutionList&, 
                                                       ValuePtr& result) const;

   virtual bool unify(const VariableValue& var, SubstitutionList& subst, 
                                                       ValuePtr& result) const;
   virtual bool unify(const DagValue&, SubstitutionList&, 
                                                       ValuePtr& result) const;
   virtual void write(ostream& os, int level) const { os << _v; }
   virtual void read(istream& is) { is >> _v; }
   virtual ValuePtr substitute(const SubstitutionList& substList) const;
};
#endif

Listing Six


#include "misc.h"
#pragma hdrstop
#include <stdio.h>

Variable gensym()
{
   static unsigned i = 0;
   char sym[20];
   sprintf(sym, "_X%ud", i++);
   return Variable(sym);
}
bool VariableValue::unify(const Value& value, SubstitutionList& subst, 
                                                        ValuePtr& result) const
{ 
   if (!value.unify(*this, subst, result)) {
      if (!subst.set(_v, value)) 
         return FALSE;
      result = value;
   }
  return TRUE;
}
bool VariableValue::unify(const AtomicValue& value, SubstitutionList& subst, 
                                                        ValuePtr& result) const
{ 
   if( !subst.set(_v, value) )
      return FALSE;
  result = value; 
  return TRUE;
}
bool VariableValue::unify(const VariableValue& var, SubstitutionList& subst, 
                                                        ValuePtr& result) const
{ 
   if( !subst.set_identical(_v, var.str()) ) {
      //cerr << "unify vars (" << *this << "," << var << ")\n"; //##
      return FALSE;
   }
   result = var; 
   return TRUE;
}
bool VariableValue::unify(const DagValue& value, SubstitutionList& subst, 
                                                        ValuePtr& result) const
{
   if( !subst.set(_v, value) )
      return FALSE;
   result = value; 
   return TRUE; 
}
ValuePtr VariableValue::substitute(const SubstitutionList& substList) const
{ 
   if( substList.contains(_v) )
      return substList.val(_v);
   else
      return VariableValue(_v);
}
bool VariableValue::operator == (const Value& value) const
{ return value==*this; }
bool VariableValue::operator == (const VariableValue& /*value*/) const
{ return TRUE;/*## _v == value._v;*/ }

Listing Seven


/*************************************************************************
   DagValue class -- Copyright David Perelman-Hall & Jamshid Afshar 1994.
*************************************************************************/

#ifndef DAG_H
#define DAG_H

#include "misc.h"
#include "value.h"
#include "atomic.h"
#include "variable.h"

typedef MapOf<Feature, ValuePtr> FVList;
typedef MapOfIter<Feature, ValuePtr> FVListIter;

class Dag {
public:
   // CONSTRUCTOR
   Dag():_datap(new Data){}
   // COPY CONSTRUCTOR 
   Dag(const Dag& dag): _datap(dag._datap){}
   // DESTRUCTOR
   ~Dag(){}
   // ASSIGNMENT OPERATOR
   void operator=(const Dag& dag){_datap=dag._datap;}
   // ADD A FEATURE-VALUE PAIR TO THIS DAG
   void add(const Feature& f, const ValuePtr& vp){own_fvlist().enter(f,vp);}
   // EQUALITY OPERATORS
   bool operator==(const Dag& dag) const;
   bool operator!=(const Dag& dag) const {return !operator==(dag);}
   // CLEAR DATA FROM THIS DAG
   void clear(void){own_fvlist().clear();}
   // TEST FOR EXISTENCE OF FEATURE IN THIS DAG
   bool contains(const Feature& feature) const {
      return fvlist().contains(feature);
   }
   // RETRIEVE VALUE ASSOCIATED WITH FEATURE 
   ValuePtr value(const Feature& feature) const {
      if(!fvlist().contains(feature)){
         cerr << "Dag " << *this << " does not have feature " << feature
         << endl;
      }
      assert(fvlist().contains(feature));
      return fvlist().valueOf(feature);
   }
   // RETRIEVE THE LIST OF FEATURES OF THIS DAG
   FeatureSet list_features() const;
   // SET SUBSTITUTION FOR VARIABLES 
   Dag substitute(const SubstitutionList& substList) const;
   // APPLY SUBSTITUTION
   static Dag apply_substitution(const Dag& dag,
      const SubstitutionList& substList){return dag.substitute(substList);}
   // OUTPUT OPERATOR
   void write(ostream& os, int level) const;
   // IOSTREAM OPERATORS
   friend ostream& operator << ( ostream& os, const Dag& dag );
   //friend istream& operator >> ( istream& is, Dag& dag );
private:
   struct Data {
      FVList fvs;
      Data() {}
      Data(const FVList& map) : fvs(map) {}
   };
   Data *_datap;
   const FVList& fvlist() const {return _datap->fvs;}
   FVList& own_fvlist() {return _datap->fvs;}
};
/*
inline
istream& operator >> ( istream& is, Dag& dag )
{ 
   is >> dag.own_fvlist(); 
   return is;
}
*/
// DECLARATION OF GLOBAL UNIFY WHICH ACTUALLY UNIFIES TWO DAGs
bool unify(const Dag& dag1, const Dag& dag2, SubstitutionList& subst,
                                                                  Dag& result);
// DECLARATION & DEFINITION OF GLOBAL UNIFY TAKING TWO DAG REFERENCES,
// DOES NOT ACTUALLY UNIFY, BUT RETURNS BOOLEAN TEST ON THEIR UNIFICATION
inline
bool unify(const Dag& dag1, const Dag& dag2){
   SubstitutionList subst;
   Dag result;
   return unify(dag1, dag2, subst, result);
}
// MAKE DAG_LIST CLASS
typedef ListOf<Dag> DagList;
typedef ListOfIter<Dag> DagIter;

ostream& operator<<(ostream& os, const ListOf<Dag>& list);

class DagValue : public Value {
private:
   Dag _d;
public:
   //empty constructor
   DagValue(){}
   //copy constructor
   DagValue( const DagValue& value ) : _d(value._d) { }
   DagValue( const Dag& dag ) : _d(dag) { }
   Value* copy() const { return new DagValue(*this); }
   //assignment operator
   void operator = (const DagValue& value) { _d = value._d; }

   virtual bool operator == (const Value& value) const
      { return value==*this; }
   virtual bool operator == (const DagValue& value) const
      { return _d==value._d; }
   virtual bool unify(const Value& value, SubstitutionList& subst, 
                                                        ValuePtr& result) const
      { return value.unify(*this, subst, result); }
   virtual bool unify(const DagValue& value, SubstitutionList& subst, 
                                                       ValuePtr& result) const;
   virtual void write(ostream& os, int level) const;
   //virtual void read(istream& is) { is >> _d; }
   virtual ValuePtr substitute(const SubstitutionList& substList) const;
};
#endif

Listing Eight


#include "misc.h"
#pragma hdrstop

// DAG SPECIFIC OSTREAM OPERATOR
ostream& operator << ( ostream& os, const Dag& dag )
{
   dag.write(os, 0);
   return os;
}
// DAG SPECIFIC OUTPUT OPERATOR
void Dag::write(ostream& os, int level) const
{
   os << "[";
   for (FVListIter i(&fvlist()); !i.offEnd(); i.next()) {
      os << i.key() << " : ";
      i.value()->write(os, level+1);
      if (!i.last()) {
               os << "\n";
               for (int i=0; i<level; i++)
                  os << "   ";
               os << " ";
         os << " | ";
         }
      }
   os << "]";
}
// RETRIEVE THE LIST OF FEATURES OF THIS DAG
FeatureSet Dag::list_features() const 
{
   // CREATE A SET OF KEYS TO RETURN
   FeatureSet tmp_features(fvlist().keys(),fvlist().size());
   // RETURN IT
   return tmp_features;
}
// EQUALITY OPERATOR
bool Dag::operator==(const Dag& d) const
{
   // TYPEDEF AN ITERATOR OVER A MAP FROM FEATURES TO VALUE POINTERS
   typedef MapOfIter<Feature, ValuePtr> FVIter;
   // STEP SIUMULTANEOUSLY THROUGH LIST OF FEATURES 
   for (FVIter fv1(&this->fvlist()), fv2(&d.fvlist()); 
                       !fv1.offEnd() && !fv2.offEnd(); fv1.next(), fv2.next()){
      // IF EITHER CURRENT FEATURES OR CURRENT VALUES DON'T MATCH, RETURN FALSE
      if (fv1.key()!=fv2.key() || fv1.value()!=fv2.value())
         return FALSE;
   }
   // RETURN TRUE IF STEPPED ENTIRELY THROUGH BOTH LISTS, ELSE RETURN FALSE
   return fv1.offEnd() && fv2.offEnd();
}
// SUBSTITUTE
Dag Dag::substitute(const SubstitutionList& substList) const
{
   // RETURN DAG
   Dag tmpDag;
   // LIST OF FEATURES IN THIS DAG
   FeatureSet features=list_features();
   // STEP THROUGH LIST OF FEATURES FROM THIS DAG
   for (SetOfIter<Feature> feature(&features); feature; ++feature)
      // ADD TO RETURN DAG THE FEATURE AND ANY VALUE LISTED FOR IT IN THE
      // SUBSTITUTION LIST, AS FOLLOWS:
      // 1) IF THERE IS A SUBSTITUTION FOR IT AND THE VALUE IS A
      //    VARIABLE, ADD WHATEVER EXISTS IN THE SUBSTITUTION LIST
      //    FOR THAT VARIABLE,
      // 2) IF THERE IS NO SUBSTITUTION FOR IT AND THE VALUE IS A
      //    VARIABLE, ADD THE VARIABLE,
      tmpDag.add(feature(),value(feature())->substitute(substList));
   // RETURN THE CREATED DAG
   return tmpDag;
}
// OUTPUT OPERATOR FOR A DAGVALUE
void DagValue::write(ostream& os, int level) const
{
   _d.write(os, level);
}
bool DagValue::unify(const DagValue& value, SubstitutionList& subst, 
                                                        ValuePtr& result) const
{
   Dag unification;
   // CALL GLOBAL FUNCTION, PASSING 2 DAGs
   if (::unify(_d, value._d, subst, unification)) {
      result = DagValue(unification);
      return TRUE;
   }
   else {
      return FALSE;
   }
}
ValuePtr DagValue::substitute(const SubstitutionList& substList) const
{ return DagValue(_d.substitute(substList)); }
// GLOBAL UNIFY FUNCTION
bool unify(const Dag& dag1,const Dag& dag2,SubstitutionList& subst,Dag& result)
{
   // GET SET OF ALL FEATURES FROM BOTH DGAS
   FeatureSet features = dag1.list_features() + dag2.list_features();
   {
   for (SetOfIter<Feature> feature(&features); feature; ++feature) {
      // TRY TO UNIFY VALUES WHERE FEATURE IS COMMON TO BOTH DAGS
      if (dag1.contains(feature()) && dag2.contains(feature())) {
         // cout << "Attempting to unify feature " << feature() 
                                                   << " in both dags." << endl;
         ValuePtr unified_value; // HOLD THE RESULT OF SUCCESSFUL UNIFICATION

         // CALL VALUE::UNIFY THROUGH VALUEPTR WRAPPER, BUILD UP SUBSTITUTION 
         // LIST & RESULTING DAG
         if (dag1.value(feature())->unify(*(dag2.value(feature())), subst, 
                                                              unified_value)) {
            result.add(feature(), unified_value);
         }
         else { // FAILED TO UNIFY VALUES FOR FEATURE COMMON TO BOTH DAGS
            cout << dag1.value(feature()) << " and " << dag2.value(feature()) 
                                                  << " did not unify." << endl;
            return FALSE;
            }
      }
      else { // ADD FEATURE-VALUE WHERE EXISTS IN ONE DAG ONLY
         if(dag1.contains(feature()))
            result.add(feature(), dag1.value(feature()));
         else
            result.add(feature(), dag2.value(feature()));
      }
   }
   }
   Dag result_with_subst;
   {
   // DO VARIABLE SUBSTITUTIONS BUILT UP IN PREVIOUS LOOP
   for (SetOfIter<Feature> feature(&features); feature; ++feature) {
      result_with_subst.add(feature(), 
                                   result.value(feature())->substitute(subst));
   }
   }
   result = result_with_subst;
   return TRUE;
}     End Listings