How To Erase Reference Domains from a Project
Prolog Development Center
Tutorial Version 1.0
Last updated: 21-06-06
1
Erasing Reference Domains
In this article, I will explain how to remove the use of
reference domains from a program. Or rather, I will explain how it is done in
the PIE (Prolog Interpreter Engine) program. The method is rather simple and
can be applied to a wide range of reference domain usages. The overall
principle of the method is to explicitly introduce variables in the form of
objects with determ facts. If the fact is set, the variable is bound; if it is
unset, the variable is free. The method for setting the variable will unbind the
variable when backtracking across the variable. It is necessary to write our
own unification algorithm, but this is rather trivial.
2 PIE
with Reference Domains
In the PIE, reference domains are used to represent terms in
the target language (i.e. an Edinburg like Prolog language). Using reference
domains makes it possible to use the built in unification from Visual Prolog for
performing the unification in the target language.
The term domain looks like this:
domains
refint = reference integer.
refsymb = reference string.
refstr = reference string.
refchar = reference char.
vid = reference string.
domains
terml = reference term*.
term = reference
var(vid);
cmp(refsymb,terml);
list(term,term);
nill;
atom(refsymb);
int(refint);
str(refstr);
char(refchar).
It is only the reference on the term domain itself that
is interesting, all the others are there for technical reasons, which are
irrelevant for this problem.
The domain is interpreted as follows:
·
a free term represents a free term in the target language;
·
var(vid) represent a variable in its textual form in the target
language e.g., var("X") represents
X;
·
cmp(refsymb,terml) represents a compound term, e.g.,
cmp("a", [var("X")]) represents
a(X);
·
list(term,term)
represents a list term with head and tail, e.g.,
list(var("H"),var("T"))
represents [H|T];
·
nill represents the empty
list [];
·
atom(refsymb) represents an atomic symbol, e.g.
atom("hello")
represents hello;
·
int(refint) represents an integer literal, e.g.
int(17) represents
17;
· str(refstr) represents a string literal, e.g.
str("Hello World!")
represents "Hello World!";
·
char(refchar) represents a character literal, e.g. char('a')
represents 'a'.
In this context, the only really interesting thing is that
a free term represents a free term in the target language, since this is where
the reference domain comes in play.
3 PIE
without Reference Domains
The way to eliminate the reference domains is by
introducing an explicit class that represents variables. Here I think about the
terms that can be either free or bound, rather than the syntactic variables in
the target language (i.e. var(vid)).
We add an extra functor to the term domain and remove all
the reference stuff, like this:
domains
terml = term*.
term =
refVar(variable);
var(string);
cmp(string,terml);
list(term,term);
nill;
atom(string);
int(integer);
str(string);
char(char).
Notice that the actual code in PIE differs from
the one presented here. This is due to other considerations than those
presented in this paper. I will briefly explain them below.
variable
is an interface describing objects that can represent a term or represent being
unbound.
interface variable
predicates
getTerm : () -> engine::term Term.
predicates
setTerm : (engine::term Term) nondeterm.
end interface variable
getTerm/0-> will return a term, this will either be the
term that the variable is bound to, or a term refVar(V) representing the
variable itself.
setTerm/1 is used for setting the term.
Given this, we can write a unification algorithm. First,
we will, however, create a normalization routine:
clauses
normalize(refVar(Var)) = Var:getTerm() :-
!.
normalize(T) = T.
clauses
normalizeList([]) = [].
normalizeList([H|T]) = [normalize(H)|normalizeList(T)].
The purpose of the normalization is to normalize
variables, that is to replace a variable by the term it is bound to (or if
unbound the variable itself).
The unification algorithm is quite simple:
clauses
unify(A, B) :-
unify_1(normalize(A), normalize(B)).
clauses
unify_1(list(H1, T1), list(H2, T2)) :-
!,
unify(H1, H2),
unify(T1, T2).
unify_1(cmp(ID, L1), cmp(ID, L2)) :-
!,
unify_list(L1, L2).
unify_1(T, T) :-
!.
unify_1(T, refVar(Var)) :-
!,
Var:setTerm(T).
unify_1(refVar(Var), T) :-
!,
Var:setTerm(T).
unify_1(atom(T), str(T)).
unify_1(str(T), atom(T)).
clauses
unify_list([], []).
unify_list([A|AL], [B|BL]) :-
unify(A, B),
unify_list(AL, BL).
The algorithm simply unifies terms structurally, the most
interesting part is where V:setTerm is called. This is where a variable is
bound to a term (which can also be a variable).
The implementation of the variable class looks like this:
implement variable
facts
term_fact : (engine::term Term) determ.
clauses
getTerm() = getTerm_term(Term) :-
term_fact(Term),
!.
getTerm() = var(This).
class predicates
getTerm_term : (engine::term Term1) -> term Term0.
clauses
getTerm_term(var(V)) = V:getTerm() :-
!.
getTerm_term(T) = T.
clauses
setTerm(Term) :-
assert(term_fact(Term)).
setTerm(_) :-
retractAll(term_fact(_)).
fail.
end implement variable
Most of the code is trivial, there is deterministic
term_fact for being free or bound, getTerm will return the last term in a
(potential) chain of variables.
setTerm is more interesting, the first clause is obvious,
the second is there to release the variable if backtracking takes place.
Let us look at some code to see how it works:
…, V:setTerm(T), fail.
When executing this code the following takes place:
·
we enter setTerm and assert the
term_fact
· setTerm is left with a backtrack point to the second clause of
setTerm
·
we meet fail and therefore backtrack to the second clause of
setTerm
·
The second clause of setTerm retracts the
term_fact and then fails
·
Therefore we backtrack to the previous backtrack point
So the net effect of the second clause is to unbind the
variable on backtracking, and otherwise let backtracking proceed as it would
without the second clause.
The rest of the updates necessary is to replace the build
in unification with calls to our own unification routine. In the old code, we,
for example, have this clause:
call("date", [int(Y), int(M), int(D)]) :-
free(Y), free(M), free(D),
!,
platformSupport5x::date(Y, M, D).
Here (even though it may not be obvious)
Y from the head is unified
with the first term returned from platformSupport5x::date, etc. We replace this implicit built-in
unification with an explicit call to our own unification algorithm:
call("date", [TY, TM, TD]) :-
!,
platformSupport5x::date(Y, M, D),
unify(TY, int(Y)),
unify(TM, int(M)),
unify(TD, int(D)).
TY
can be a free refVar in which case
unify will bind it to int(Y).
Alternatively, it can be bound. If it is bound, the
unify will only succeed if it is bound
to the correct integer literal.
It is worth noticing that the updates necessary to remove
reference domains from PIE are rather well behaved. Naturally, we had to change
the reference domain itself. We also had to implement the variable class and a
unification algorithm, but, as you can see, this is rather simple. Finally, we
have to replace built-in unification with our own unification predicate. These
updates have the good property that they are local, one clause it rewritten to
another clause, but the overall predicate is unchanged. All in all, the main
structure of PIE is completely unchanged. There are only updates in the engine
module and the module for writing terms.
4
Deviations in the Real PIE
As mentioned above the real updates made to PIE are
slightly different from what is presented here. The principle is however
completely the same. The main difference is that the
var() functor is removed
from the term domain. The reason for this is that terms that contain variables
in PIE does not need to be able to represent terms that contain syntactic
variables, as they do not exist in PIE syntax. Also refVar() functor has been
renamed to var() functor. On the other hand, there are also some updates
relating to writing terms with variables.
5
Conclusion
We have presented a quite simple method for eliminating
the use of reference variables in a program, which maintains the overall
structure of the program. It is my belief that the method can be applied in
most if not all cases. The update of PIE was made in aprx. 16 hours. So, even
though the PIE implementation was deeply based on reference domains the
elimination of these was rather simple.