close

Вход

Забыли?

вход по аккаунту

код для вставкиСкачать
Syntactical analysis
and Prolog
1
Methods for syntactical analysis
Notation:
Alphabet - finate set of symbols.
Word of alphabet A is a finite sequence of symbols from A.
A*
=
the set of all word nade from symbols of the alphabet A.
Grammar G is defined by <N, T, P, S>, where S (start) is one fixed element of N
denoting the starting symbol and the other symbols are sets with following content
 N
nonterminal symbols,
 T
terminal symbols,
 P
rewriting rules
such that:

N and T are disjunctive.

Rewriting rule has the form L --> R,
s.t. L, R are some words in the alphabet N U T meeting following constraints
L contains at least 1 nonterminal (condition referred to as ntl ).
In other words, P is a set of pairs of words s.t.

P is a subset of (N U T)* N (N U T)* x (N U T)*
2 / 23
Umělá inteligence I.
Language and grammars
Language LG generated by grammar G consists of all words in the alphabet
of terminals, ie. from T*, which can be obtained from the starting symbol S
using the rewriting rules P.
Let us denote concatenation of finite sequence of applications of a rule by -->*.
LG = { ω T* : S -->* ω }
Chomsky’s hiearchy of grammars based on the format of rewriting rules
Notation: Let
α, β be any words from (N U T)*
γ nonempty words from (N U T)*
G. of typ 0: The most general case the rewriting rules have to meet ntl
condition, only. These languages (of type 0) are exactly those languages, which
are recognized by Turing machines.
G. of typ 1: Context grammars. Any rewriting rule must have the form
α X β --> α γ β, where X is a symbol from N.
G. of typ 2: Context-free grammars. Any rewriting rule must have the form
X --> α, where X is a symbol from N.
G. of typ 3: Regular grammars. Any rewriting rule must have the form X --> aY
or X --> a, where X, Y are some symbols from N and a is fromUmělá
T. inteligence I.
3 / 23
.
Notation: Let a, b, c be terminals, X, Y and S nonterminals
Example 1:
Context-free grammar G1 with 2 rules: S --> a S b, S --> ab
LG1 = {an bn : n is any natural number }
Example 2:
Context-free grammar G2 with this set of rules:
S --> a S a, S --> b S b, S --> aa, S --> bb, S --> a, S --> b,
LG2 = { α α R : α is any word from symbols a,b, α R is reverse of α }
LG2 contains all the palindroms of the considered alphabet (p. is a word with
same reading from left and from the right).
Example 3:
Find the simplest grammar generating just the words an bn cn, where n is any
natural number.
4 / 23
Umělá inteligence I.
Eaxmple 4:
Find out what which words are created by the grammer G3 with this set of
rewriting rules:
S --> LR, L --> LaA, L --> LbB, L --> ε
AR -->Ra, BR --> Rb, R --> ε, Aa --> aA, Ab -->bA, Ba -->aB,Bb -->bB ,
where ε is the empty word.
Indicate the type of this grammar.
5 / 23
Umělá inteligence I.
NLP and the means of Prologu
Context-free grammars correspond to rules of Prolog. The rule
sentence
--> noun_phrase, verb_phrase.
Can be expressed as
sentence(S):noun_phrase(N),verb_phrase(V),append(N,V,S).
Definite Clause Grammars (DCG)
Prolog has build in operator “-->”, complemented by several useful predicates,
e.g..
phrase(Non_terminal, List_of_words, Remainder).
Success if List_of_words = (List_of_words1 concatenated with
Remainder) and List_of_words1 is correctly created from Non_terminal
using the rules of considered grammar
…
6 / 23
Umělá inteligence I.
DCG in Prologu
Shortcut
notation:
a --> [].
a --> [z],a.
a --> []|[z],a.
--> build-in binary operator,
[Expression] is
a terminal symbol.
To generate by all the Words of the considered
grammar, we can apply predicate phrase/3:
phrase(Non_terminal,
String,
Remainder_which_could_not_be_analyzed_as_the non_terminal)
?- phrase(a,Y, []).
Y=[z]; Y=[z,z]; Y=[z,z,z];…
7 / 23
Umělá inteligence I.
NLP and means of Prolog
sentence
--> noun_phrase, verb_phrase.
noun_phrase
--> proper_noun.
noun_phrase
--> article,adjective,noun.
noun_phrase
--> article,noun.
verb_phrase
--> itransitive_verb.
verb_phrase
--> transitive_verb,noun_phrase.
article
--> [the].
adjective
--> [lazy];[rapid].
proper_noun
--> [achilles].
noun
--> turtle];[president];[queen];[cat];[fish].
itransitive_verb
--> [sleeps].
transitive_verb
--> [beats].
?- phrase(sentence,Y, []). Y = [achilles,sleeps]
?- phrase(sentence,[achilles, beats, the, lazy, turtle], []).
8 / 23
Umělá inteligence I.
How to ensure matching between subject
and the other parts of the sentence?
A)
sentence
--> noun_phrase_singular,
verb_phrase_singular.
sentence --> noun_phrase_plural,
verb_phrase_plural.
B)
sentence --> noun_phrase(case),
verb_phrase(case).
DCG allow for the solution B) with a new argument: this
solution is elegant, transparent and easily modifiable!!!
9 / 23
Umělá inteligence I.
Example: Achilles
sentence(s(NP,VP)) --> noun_phrase(NP),verb_phrase(VP).
noun_phrase(np(N)) --> proper_noun(N).
noun_phrase(np(Art,Adj,N))--> article(Art),adjective(Adj),noun(N).
noun_phrase(np(Art,N)) --> article(Art),noun(N).
verb_phrase(vp(IV))
--> itransitive_verb(IV).
verb_phrase(vp(TV,NP))--> transitive_verb(TV),noun_phrase(NP).
article(art(the)) --> [the].
adjective(adj(lazy)) --> [lazy].
adjective(adj(rapid)) --> [rapid].
proper_noun(pn(achilles)) --> [achilles].
noun(n(turtle)) --> [turtle].
noun(n(president)) --> [president].
noun(n(cat)) --> [cat].
itransitive_verb(iv(sleeps)) --> [sleeps].
transitive_verb(tv(beats)) --> [beats].
?- phrase(sentence(X),S,[]).
X = s(np(pn(achilles)),vp(tv(beats), np(art(the), adj(lazy), n(turtle))))
S = [achilles, beats, the, lazy, turtle]
10 / 23
Umělá inteligence I.
Utilization of arguments in DCG
?- phrase(sentence (X), Sent).
X = s( np (pn (achilles)), vp( iv (sleeps)) )
Veta
= [achilles, sleeps], …
Arguments help to identify selected parts of the sentence.
?- phrase(sentence (T), [achilles,beats,the, lazy,turtle]).
T = s(np(pn(achilles)),
vp(tv(beats),
np(art(the),
adj(lazy),
n(turtle))))
It can be complemented by a self-explanatory printing:
-----s-----np-----pn---achilles
-----vp-----tv-----beats
-----np-------art------the
-------adj-----lazy
---------n----turte
Argument can have a numeric value – see example „numbers“
11 / 23
Umělá inteligence I.
Example: Achilles - continued
?- phrase(sentence(X), [the, rapid, cat, sleeps], []).
X = s(np(art(the), adj(rapid), n(cat)), vp(iv(sleeps)))
?- phrase(sentence(X), [the, rapid, cat, beats, achilles,
with, bag], []).
No (0.00s cpu)
?- phrase(sentence(X), [the, rapid, cat, beats, achilles,
with, bag], Z).
X = s(np(art(the), adj(rapid), n(cat)), vp(tv(beats),
np(pn(achilles))))
Z = [with, bag]
The last argument returns the rest of the input string, which could not be
analyzed wrt. considered nonterminal.
12 / 23
Umělá inteligence I.
Example: numbers
numeral(N) --> n_999(N).
numeral(N) --> n1_9(N1),[thousand], n_999(N2),{N is N1*1000 + N2}.
n_999(N) --> n_99(N).
n_999(N) --> n1_9(N1), [hundred], n_99(N2), {N is N1*100 + N2}.
n_99(N)
--> n0_9(N).
n_99(N)
--> n10_19(N).
n_99(N)
--> n20_90(N).
n_99(N)
--> n20_90(N1), n1_9(N2), {N is N1 + N2}.
n0_9(0) --> [].
n0_9(N) --> n1_9(N).
n1_9(1) --> [one].
n1_9(2) --> [two].
n10_19(10) --> [ten]. n10_19(11) --> [eleven].
n20_90(20) --> [twenty].
n20_90(30) --> [thirty].
?- phrase(numeral(X), Y, []). X = 1 Y = [one];X = 2 Y = [two]
13 / 23
Umělá inteligence I.
Example: numbers - continued
?- phrase(numeral(X), [one, thousand, two, hundred, eleven], []).
X = 1211
?- phrase(numeral(X), [one, thousand, two], []).
X = 1002
?- phrase(numeral(X), [thousand, two, hundred, one], []).
No
?- phrase(numeral(X), [two, thousand, two, hundred, one], []).
X = 2201
?- phrase(numeral(X), [two, thousand, eleven], []).
X = 2011
?- phrase(numeral(X), [two, thousand], []).
X = 2000
Caution!
?- phrase(numeral(X), [zero], V).
X = 0 V = [zero]
More (0.00s cpu)
No
14 / 23
Umělá inteligence I.
p.140
Interpretation
The meaning of the proper noun ‘Socrates’ is the term
socrates
proper_noun(socrates)
--> [socrates].
The meaning of the property ‘mortal’ is a mapping from
terms to literals containing the unary predicate mortal
property(X=>mortal(X))
--> [mortal].
The meaning of a proper noun - verb phrase sentence is a
clause with empty body and head obtained by applying
the meaning of the verb phrase to the meaning of the
proper noun
sentence((L:-true))
--> proper_noun(X),verb_phrase(X=>L).
?-phrase(sentence(C),[socrates,is,mortal].
C = (mortal(socrates):-true)
15 / 23
Umělá inteligence I.
A transitive verb is a binary mapping from a
pair of terms to literals
transitive_verb(Y=>X=>likes(X,Y)) --> [likes].
A proper noun instantiates one of the
arguments, returning a unary mapping
verb_phrase(M) -->
transitive_verb(Y=>M),proper_noun(Y).
16 / 23
Umělá inteligence I.
sentence((L:-true)) --> proper_noun(X),verb_phrase(X=>L).
sentence((H:-B))--> [every],noun(X=>B),verb_phrase(X=>H).
% NB. separate ‘determiner’ rule removed, see later
verb_phrase(M)
--> [is],property(M).
property(M) --> [a],noun(M).
property(X=>mortal(X)) --> [mortal].
proper_noun(socrates)
--> [socrates].
noun(X=>human(X)) --> [human].
?-phrase(sentence(C),S).
C = human(X):-human(X)
S = [every,human,is,a,human];
C = mortal(X):-human(X)
S = [every,human,is,mortal];
C = human(socrates):-true
S = [socrates,is,a,human];
C = mortal(socrates):-true
S 17=/ 23[socrates,is,mortal];
Umělá inteligence I.
18 / 23
Umělá inteligence I.
1/--страниц
Пожаловаться на содержимое документа