HELP FACETS                                         R. Evans, June 1983
                                         Revised by Fran Evelyn, Aug 85

This help file describes the FACETS package, a facility for the semantic
interpretation of sentences for use with LIB GRAMMAR. The package is
loaded by typing

    LIB FACETS

(in addition to LIB GRAMMAR - but the order doesn't matter).

The package allows you to define FACETS of meaning of sentences and
phrases, and is best described by way of an example. Suppose you have
the following simple phrase structure grammar, which you are going to
use with LIB GRAMMAR:

    s     -> np + vp
    np    -> det + noun | det + adj + noun | propn
    vp    -> vi | vt + np
    det   -> A | AN | THE
    noun  -> MAN | WOMAN | BLOCK
    adj   -> RED | BLUE | GREEN
    propn -> JOHN | MARY
    vi    -> RUNS | SMILES
    vt    -> LIKES | TOUCHES

The normal way of coding this grammar for LIB GRAMMAR to use is as
follows:

    lib grammar;
    vars grammar, lexicon;

    [
        [s  [np vp] ]
        [np [det noun]
            [det adj noun]
            [propn] ]
        [vp [vi]
            [vt np] ]
    ] -> grammar;

    [
        [det   a an the]
        [noun  man woman block]
        [adj   red blue green]
        [propn john mary]
        [vi    runs smiles]
        [vt    likes touches]
    ] -> lexicon;

    setup(grammar, lexicon);

Having done this we can ask for syntax trees of sentences and other
phrases (eg noun phrases, transitive verbs etc) to be printed out.

Now we add some semantics - what sort of things would we like to know
about various parts of the sentence? Well, for a start we might want to
know what the SUBJECT and OBJECT (if any) are, and what ACTION the
sentence describes. But we might also want to know things about parts of
the sentence - for any noun phrase we could ask what GENDER it is, and
also what COLOUR it is. (More likely we would mean the gender and colour
of the referent of the noun phrase - but this is a distinction we shall
ignore here). In fact there are lots of other things we might add - what
sort of object a noun phrase is, when the event described happened, what
mood the people in it are in etc etc, but for this example let's stick
to the five given above.

First of all (but after typing LIB FACETS) we tell the system that they
are names of facets. To do this we use the 'facet' statement which is
like a 'vars' statement. So we say:

    facet subject, object, action, colour, gender;

NOTE: once a variable name has been used as a facet, it cannot be used
as an ordinary variable as well.

Now we can talk about the facet values of phrases - that is, of subtrees
of the syntax trees produced by LIB GRAMMAR. The simplest thing we can
do is assign and print out values for the facets. So we could say:

    "blue" -> colour(np([the block]));

or

    colour(np([the block]))=>

If you try this, you will get a mishap message following the second
command - the system seems to have forgotten about your first statement!
In fact what has happened is that LIB FACETS thinks of these two noun
phrases as different objects because they were parsed separately. Every
time you parse something the facet values for it start afresh. If you
want it to remember a noun phrase you must save it in a variable like
this:

    vars x;
    np([the block]) -> x;
    "blue" -> colour(x);
    colour(x)=>

That works. Now if you try

    gender(x)=>

you get a mishap message - the same one as before. This is because the
system doesn't know the answer and you haven't specified a way to work
it out. So now let's move on to specifying SEMANTIC RULES.

Think about how you might get the colour of a noun phrase like THE RED
BLOCK. It's pretty easy - you just use the colour specified by the
adjective. How about THE BLOCK WHICH IS RED ? Again, it's the colour of
the adjective, but it will be in a different place in the syntax tree.
Clearly, how to find a facet for a phrase depends on what syntactic
structure that phrase has. So the sensible place to put rules for
working out the facet values is with the syntax rules. In a moment we
shall see how we do this.

First of all we have another question - where does the information come
from ultimately? The answer to this is - THE WORDS THEMSELVES, i.e.
the lexicon. So we can begin to get an idea of the way we construct
facets - we specify values for facets for words in the lexicon, and then
with each grammar rule, we specify how to combine the facet values of
the subcategories to get the values for the main category. So we see
that we must add semantic rules, assignments, etc. to each syntactic
rule and lexicon entry.

-- ADDING THE SEMANTIC RULES ------------------------------------------------

Before looking at some actual semantic rules, let us see how we specify
the new grammar. The grammar set out below is the same grammar as that
given above, but with a slightly different layout to accommodate the new
bits:

    lib grammar;
    lib facets;
    vars grammar, lexicon;

    facet colour, gender, action, subject, object;

    defgram
        [s  [np vp]         semrule ...(pop11 code)... endsemrule
        ]
        [np [det noun]      semrule ...(pop11 code)... endsemrule
            [det adj noun]  semrule ...(pop11 code)... endsemrule
            [propn]         semrule ...(pop11 code)... endsemrule
        ]
        [vp [vi]            semrule ...(pop11 code)... endsemrule
            [vt np]         semrule ...(pop11 code)... endsemrule
        ]
    endgram -> grammar;

    deflex
        [det    a           semrule ...(pop11 code)... endsemrule
                an          semrule ...(pop11 code)... endsemrule
                the         semrule ...(pop11 code)... endsemrule
        ]
        [noun   man         semrule ...(pop11 code)... endsemrule
                woman       semrule ...(pop11 code)... endsemrule
                block       semrule ...(pop11 code)... endsemrule
        ]
            .
            .(etc)
            .
        [vt     likes       semrule ...(pop11 code)... endsemrule
                touches     semrule ...(pop11 code)... endsemrule
        ]
    endlex -> lexicon;

    setup(grammar, lexicon);

NOTES:
-----
The outermost list brackets in the definition of grammar and lexicon
have been replaced by DEFGRAM and ENDGRAM, DEFLEX and ENDLEX
respectively.

Each alternative decomposition in the grammar rules, and each word in
the lexicon, has been extended by adding a semantic rule, which is
bracketed by semantic rule brackets (i.e. SEMRULE and ENDSEMRULE).

-- SEMANTIC RULES -----------------------------------------------------------

The body of a semantic rule is a piece of POP-11 code - just like a
procedure. This code specifies how to determine the values of any facets
associated with the grammar rule it is paired with. So for example, if
we have a facet COLOUR for a noun phrase, then our semantic rules for
noun phrases might look like this:

    [np [det noun]      semrule
                            colour is "dont_know";
                        endsemrule
        [det adj noun]  semrule
                            adj1 gives colour;
                        endsemrule
        [propn]         semrule
                            colour is "dont_know";
                        endsemrule
    ]

In this example, colour is given by the adjective. If there is no
adjective, then the colour facet is set to 'dont_know'; otherwise, you
get the colour of the noun phrase from the colour of the adjective.

Notice the following features of the rules:

LIB FACETS provides two special operators IS and GIVES to make defining
rules easier. If you write

    <facet> is <value>;

this is shorthand for assigning the <value> to the <facet>; i.e. it is
like

    <value> -> <facet> (<the bit of tree matching the rule-pattern>);

If you write

    <category> gives <facet>;

this is shorthand for "the value of this facet for the bit of tree
matching the rule pattern is the same as the value of the facet for this
constituent category"; i.e. it is like

    <facet> (<category>) -> <facet> (<the bit of tree matching the rule-pattern>);

In fact, for more complicated code, these two operations might not be
sufficient, in which case you might want to use <the bit of the tree
matching the rule-pattern> itself; and so it is contained in a special
variable SELF.

Also each syntax rule gives a decomposition in terms of subcategories,
and for each of these there is a subtree. To access these subtrees, you
have variables like ADJ1, DET1, ADJ2 etc. which contain them.
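
As a rough sketch of the equivalence just described, the first two NP
rules could also be written out in longhand using SELF. This is a
fragment meant to sit inside the np entry of a DEFGRAM, not a complete
program, and it assumes the longhand forms behave exactly as the
shorthand descriptions above say they do.

```pop11
;;; Fragment only: two alternatives inside the np entry of DEFGRAM ... ENDGRAM.
[det noun]      semrule
                    ;;; longhand for:  colour is "dont_know";
                    "dont_know" -> colour(self);
                endsemrule
[det adj noun]  semrule
                    ;;; longhand for:  adj1 gives colour;
                    colour(adj1) -> colour(self);
                endsemrule
```

The shorthand is usually easier to read; the longhand becomes useful when
a rule has to test or combine values before storing one.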

Thus in the first rule of the noun-phrase example, repeated here,

    [np [det noun]      semrule
                            colour is "dont_know";
                        endsemrule
        [det adj noun]  semrule
                            adj1 gives colour;
                        endsemrule
        [propn]         semrule
                            colour is "dont_know";
                        endsemrule
    ]

we have the following relations:

    self  = [np ^det1 ^noun1]
    det1  = [det .....]
    noun1 = [noun ......]

If a syntax rule has more than one instance of a particular category,
the instances are counted from left to right; e.g. the syntax rule

    [vp [v np np]]

would have the following variables in its semantic rule:

    self = [vp ^v1 ^np1 ^np2]
    v1   = [v .......]
    np1  = [np .......]
    np2  = [np .......]

Finally, notice what happens in the second semantic rule given above.
When the system executes

    adj1 gives colour;

it first has to find out what 'colour(adj1)' is. To do this, it uses the
semantic rule provided for adjectives: adj1 has the value [adj .....],
which gets matched against a syntax rule, and the appropriate semantic
rule is run. So we might have rules in the lexicon like:

    [adj    red     semrule colour is "red";       endsemrule
            blue    semrule colour is "blue";      endsemrule
            big     semrule colour is "dont_know"; endsemrule
    ]

In the last case, the adjective doesn't tell you the colour, so you have
to return 'dont_know'. In general, if your semantic rule does not
specify what the value of some facet is, then the value returned is
'undef'.

Thus there are three ways to signal that the information required is not
available. Firstly, if there is no semantic rule to apply, a mishap
occurs; secondly, you can choose some value to return, e.g. 'dont_know';
and thirdly, when there is a rule but it doesn't provide a value, the
value remains 'undef'.

When it has found the colour of the adjective, the original rule can use
it in any way it wishes - in the example, it just passes it up as the
value of the colour facet of the noun phrase (using GIVES).
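
The left-to-right numbering matters as soon as a rule contains a
repeated category. Here is a hedged sketch for the [vp [v np np]]
pattern mentioned above. The category v, the facet RECIPIENT and the
choice of which np plays which role are all illustrative assumptions;
none of them belongs to the example system in this file.

```pop11
;;; Illustrative fragment only. It assumes a facet declared elsewhere with
;;;     facet recipient;
;;; and a lexicon that supplies ACTION for words of category v.
;;; Inside DEFGRAM ... ENDGRAM:
[vp [v np np]   semrule
                    v1 gives action;    ;;; action comes from the verb
                    recipient is np1;   ;;; leftmost np
                    object is np2;      ;;; second np, counting from the left
                endsemrule
]
```

Swapping np1 and np2 here would silently swap the two roles, so it is
worth checking the numbering against the rule pattern.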

Armed with these facts, we can write the semantic rules for our example
system:

    lib grammar;
    lib facets;
    vars grammar, lexicon;

    facet colour, gender, action, subject, object;

    defgram
        [s  [np vp]         semrule
                                subject is np1;
                                vp1 gives action;
                                vp1 gives object;
                            endsemrule
        ]
        [np [det noun]      semrule
                                colour is "dont_know";
                                noun1 gives gender;
                            endsemrule
            [det adj noun]  semrule
                                adj1 gives colour;
                                noun1 gives gender;
                            endsemrule
            [propn]         semrule
                                colour is "dont_know";
                                propn1 gives gender;
                            endsemrule
        ]
        [vp [vi]            semrule
                                vi1 gives action;
                            endsemrule
            [vt np]         semrule
                                vt1 gives action;
                                object is np1;
                            endsemrule
        ]
    endgram -> grammar;

    deflex
        [det    a           ;;; our simple system doesn't have rules for
                an          ;;; determiners - so we just leave them out.
                the
        ]
        [noun   man         semrule gender is "male";    endsemrule
                woman       semrule gender is "female";  endsemrule
                block       semrule gender is "neuter";  endsemrule
        ]
        [adj    red         semrule colour is "red";     endsemrule
                blue        semrule colour is "blue";    endsemrule
                green       semrule colour is "green";   endsemrule
        ]
        [propn  john        semrule gender is "male";    endsemrule
                mary        semrule gender is "female";  endsemrule
        ]
        [vi     runs        semrule action is "run";     endsemrule
                smiles      semrule action is "smile";   endsemrule
        ]
        [vt     likes       semrule action is "like";    endsemrule
                touches     semrule action is "touch";   endsemrule
        ]
    endlex -> lexicon;

    setup(grammar, lexicon);

Try loading all that (mark it as a range and then do ENTER lmr) and then
try the following:

    vars x;
    s([john touches a red block]) -> x;
    ;;; x now has the syntax tree for the sentence

    colour(subject(x)) =>
    colour(object(x)) =>
    gender(object(x)) =>
    gender(x) =>

You should get the following results:

    ** dont_know    ;;; this is the value we assigned for noun phrases whose
                    ;;; colour we don't know
    ** red
    ** neuter
    ** undef        ;;; x is a sentence and we didn't specify how to work out
                    ;;; the gender of a sentence

The difference between questions which produce 'dont_know' and questions
which produce 'undef' demonstrated above allows us to distinguish
between "silly questions"
(e.g. 'what is the gender of a sentence?') and "questions which are
sensible but to which we don't know the answer" (e.g. 'what colour is
John?').

-- CLEARING DOWN THE FACET DATA ---------------------------------------------

There are two ways of making the FACETS package forget things, one
working through the semantic rules and one through the facet values:

    resetfacets()
        This causes all the semantic rules to be forgotten, and should
        be used when you want to reload your grammar - otherwise any old
        rules left over from an earlier load might interfere with the
        new ones.

    clearfacet(<facet>)
        Clear down all the values stored for <facet>. This is probably
        rarely necessary, but might result in greater speed in some
        cases.

-- FURTHER POSSIBILITIES ----------------------------------------------------

The grammar system above is, of course, very simple. Here are a few
suggestions of ways it could be extended:

(1) Adding DEFAULT values. For example, some nouns have default colours
(e.g. grass, the sun etc). This colour could be included in the semantic
rule for the noun itself. Then the rule for NP's with adjectives could
be modified so that if an attempt to get the colour from the adjective
produces "dont_know", we use the default colour given by the noun, e.g.

    semrule
        vars clr;
        colour(adj1) -> clr;
        if clr = "dont_know" then
            colour(noun1)
        else
            clr
        endif -> colour(self);
    endsemrule

Notice the use of SELF here instead of IS - both work equally well. Of
course, to see this work properly you must add some non-colour
adjectives to the lexicon!

(2) Turning sentences into assertions in the database. Try writing a
routine which inputs and parses a sentence, and then adds assertions
about it to the database. For example, from the sentence

    [john touches a red block]

we might get

    [touch john block]
    [colour block red]
    [gender block neuter]
    [gender john male]

(3) Giving names to noun phrases.
The data from (2) would be better if it gave a different name to each
block it encountered, e.g.

    [a red block touches a green block]

gives

    [touch block block]
    [colour block red]
    [colour block green]
    ...

which is not very helpful. To improve things, add another facet 'name'
which returns the name of an np. If the name is a proper noun then it
should be the person's name; if an ordinary noun then *GENSYM could be
used to produce a new unique name for it.

(4) Once you have names, you might like to introduce pronouns. A pronoun
specifies a few facets (e.g. GENDER) which could be turned into database
assertions with an 'unknown' in them (eg [gender ?obj male]). Then you
could match this with the assertions actually in the database (from
earlier sentences) to get a value for OBJ (the referent of the pronoun)
and then use its name facet where you would have used the pronoun.

(5) Add some facets for the determiners which give you information about
quantifiers etc. For example with the determiner 'the' you might want to
do some database matching (as for pronouns) to find out which object is
referred to, while for 'a' you might generate a new object (with a new
name - see (3) above) and return a structure like

    [exists block3 *]

which is then processed by the SENTENCE semantic rule - for instance it
might look for the '*' and replace it with assertions about block3 from
other parts of the sentence. For example, the sentence

    [a red block smiles]

could generate

    [exists block1 *]                 ... from the np
    smile                             ... action from the vp
    [exists block1 [smile block1]]    ... the sentence rule puts them
                                          together.

-- TRACING SEMANTIC RULES ---------------------------------------------------

LIB FACETS provides four commands for tracing semantic rules:

    ftrace  unftrace  ftraceall  unftraceall

You use these just like trace, untrace, untraceall (there's no
traceall!) - for the first two you supply names of semantic rules to be
traced (see below) and the last two refer to ALL the semantic rules
(i.e.
trace them all or untrace them all).

Each semantic rule you specify has a name of the following form. Names
of rules in the GRAMMAR start with "g_", names of rules in the LEXICON
start with "l_". Then comes the category name on the left hand side of
the rule, followed by a number to distinguish it from other rules for
the same category. (Note: this is when you use DEFGRAM and DEFLEX only -
freestanding rules as described in *MOREFACETS have their name as part
of the definition of the rule.) So, for example, the three semantic
rules in the NP grammar rule from above are named as follows:

    [np [det noun]      semrule ... endsemrule      - g_np1
        [det adj noun]  semrule ... endsemrule      - g_np2
        [propn]         semrule ... endsemrule      - g_np3
    ]

and the lexicon rules for NOUN are named:

    [noun   man         semrule ... endsemrule      - l_noun1
            woman       semrule ... endsemrule      - l_noun2
            block       semrule ... endsemrule      - l_noun3
    ]

Note: rules for the same category are numbered from top to bottom.

These names do not normally appear anywhere except in the trace messages
discussed above, and sometimes in mishap messages, but you use them in
FTRACE and UNFTRACE commands to refer to the rules. For example, if you
type:

    ftrace g_np1 l_noun1;

then you will get trace messages only for those rules. If you then type

    unftrace l_noun1;

you will be left with only the rule named g_np1 traced. (You can specify
as many rule names as you like in these two statements). To trace some
more rules (in addition to those traced already), you just give another
FTRACE command with more rule names, and similarly you can untrace any
of the rules which are currently traced.

When a rule is being traced, it produces tracing messages (just like
TRACE) whenever it runs (i.e. whenever its pattern successfully matches
a bit of tree). These look just like procedure calls with two arguments
(the tree and the name of the facet) and one output (the facet value if
specified).
So if you FTRACE g_np2 above (the rule [np [det adj noun]]) and ask for

    colour(np([the red block]));

you would get tracing messages:

    > g_np2 [np [det the] [adj red] [noun block]] colour
    < g_np2 red

-- TRACING FACETS THEMSELVES ------------------------------------------------

The actual facets (eg COLOUR) can be traced by using TRACE in the normal
way, e.g.

    trace colour;

Two sorts of trace messages will appear. If your program looks up the
value of a facet (for example,

    adj1 gives colour;

causes colour(adj1) to be looked up) then the trace message will look
like

    > colour [adj green]
    < colour green

But if you SET the value of a facet, you will also get a tracing
message - e.g. if you do

    colour is "green";

(which causes "green" -> colour(self)) you get

    > colour green [adj green]
    < colour

This may look a bit confusing at first - you have been warned! - for
those interested HELP *UPDATER will explain what's going on, although it
is not important for the use of FACETS.

See also

    HELP *MOREFACETS - for a more detailed discussion of the FACETS package
    HELP *GRAMMAR    - for references to GRAMMAR packages
    TEACH *GRAMMAR   - for an introduction to LIB GRAMMAR

--- C.all/help/facets
--- Copyright University of Sussex 1992. All rights reserved. ----------