Evaluating the Inferential Utility of
Lexical-Semantic Resources
Shachar Mirkin
Joint work with: Ido Dagan, Eyal Shnarch
EACL-09
[email protected]
Quick Orientation
You are here
Mirkin, Dagan, Shnarch
EACL-09
Evaluating the Inferential Utility of Lexical-Semantic Resources 2
Quick Orientation – Lexical Inference
Who won the football match between Israel and Greece on Wednesday?
ATHENS, April 1 (Reuters) – Greece beats Israel 2-1 in their World Cup Group Two qualifier.
Motivation
• Common knowledge: lexical relations are useful for semantic inference
• Common practice: exploit lexical-semantic resources
  • WordNet: synonymy, hyponymy
  • Distributional similarity
• Yet, no clear picture:
  • Which semantic relations are needed?
  • How and when should they be utilized?
  • What's available in current resources and what's missing?
⇒ Our goal: clarify the picture through comparative evaluation
A Textual Entailment Perspective
• Generic framework for semantic inference
  • Recognizing that one text (h) is entailed by another (t)
  • Addresses variability in text
  • Applied semantic inference is reducible to entailment
⇒ Useful for generic evaluation of lexical inference
Lexical-semantic relationships
t: Dear EACL 2009 Participant,
We are sorry to inform you that an Air Traffic and Public Transportation strike has been announced for Thursday 2 April, 2009.
h: Athens' Metro services disrupted in April 2009.
Lexical-semantic relationships
t: Dear EACL 2009 Participant,
We are sorry to inform you that an Air Traffic and Public Transportation strike has been announced for Thursday 2 April, 2009.
h: Athens' Metro services disrupted in April 2009.
Relations linking t to h: strike → disrupt (verb entailment); Public Transportation – Metro (hypernymy); Athens – Metro (located in).
Terminology:
• Lexical Entailment
• Entailment Rules: LHS → RHS, e.g. strike → disrupt
  • Should be found in knowledge resources, but often not available
• Rule Application
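The LHS → RHS rule formalism can be sketched as a tiny data structure. This is a hedged illustration: the class and function names are mine, not the paper's, and real systems match lemmas rather than surface forms.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class EntailmentRule:
    """Directed lexical entailment rule: LHS entails RHS."""
    lhs: str
    rhs: str
    relation: str  # e.g. "verb entailment", "hypernymy"

# One rule annotated on the slide's t/h pair.
RULES = [EntailmentRule("strike", "disrupt", "verb entailment")]

def applicable_rules(t_lemmas, h_lemmas, rules):
    """A rule application maps a lemma in t (LHS) to a lemma in h (RHS)."""
    return [r for r in rules if r.lhs in t_lemmas and r.rhs in h_lemmas]

# Lemmatized content words of the slide's t and h (hand-lemmatized here).
t_lemmas = {"public", "transportation", "strike", "announce"}
h_lemmas = {"athens", "metro", "service", "disrupt"}
print(applicable_rules(t_lemmas, h_lemmas, RULES))  # the strike -> disrupt rule fires
```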
Lexical-semantic relationships
t: Dear EACL 2009 Participant,
We are sorry to inform you that an Air Traffic and Public Transportation strike has been announced for Thursday 2 April, 2009.
h: Athens' Metro disruptions
• Same inference applies when h is a lexical phrase (e.g. an IR query)
Evaluating Lexical-Semantic Resources
Resources for Lexical-Semantic Relationships
• Plenty of resources are out there
  • None dedicated to lexical entailment inference
• We evaluated 7 popular resources of varying nature:
  • Construction method
  • Relation types
• Extracted relations which:
  • Are commonly used in applications
  • Correspond to lexical entailment
Evaluated Resources
• Based on human knowledge: WordNet, Wiki
• Statistical extensions of WordNet: Snow, XWN
• Corpus-based: CBC, Lin-Dep, Lin-Prox
Evaluation Rationale
• Evaluation goal: assess the practical utility of resources
• A resource's utility depends on the validity of its rule applications
  • vs. the % of correct rules: many correct & incorrect rules may hardly ever be applied
⇒ Simulate rule applications and judge their validity
• Instance-based evaluation (rather than rule-based)
Evaluation Scheme
Input:
• Entailment rules from each resource
• A sample of test hypotheses
  • 25 noun-noun queries from TREC 1-8, e.g. railway accidents; outpatient surgery; police deaths
• Texts from which the hypotheses may be inferred
  • TREC corpora
Evaluation flow:
• Apply rules to find possibly entailing texts
• Judge rule applications
  • Human annotation avoids dependence on a specific system
Evaluation Methodology
Generate intermediate
hypotheses
for each word in h
Test Hypotheses
h = water pollution
…
h’1 = lake pollution
h’2 = soil pollution
Rules Resource
r1 = lake  water
r2 = soil  water
…
valid rule
application
corpus
does
t entail
h?
yes
Chemicals dumped into
the lake are the main
cause for its pollution
P
#
# # ( )
Mirkin, Dagan, Shnarch
Retrieve
matching texts
t1 , t2 , t3 , …
yes
does
t entail
h’ ?
no
no
invalid rule application
sample
texts
High levels of air pollution were
measured around the lake
t is discarded
Soil pollution happens when
contaminants adhere to the soil
EACL-09
Evaluating the Inferential Utility of Lexical-Semantic Resources18
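The evaluation flow above can be sketched as a small loop. All names here are illustrative, and the entailment judgments in the paper come from human annotators, not code; the toy oracle below only stands in for them.

```python
def evaluate_resource(rules, hypotheses, retrieve, entails):
    """Count valid/invalid rule applications for one resource.

    rules:      dict mapping a word in h to its entailing LHS terms (rule: lhs -> word)
    hypotheses: list of word-tuples, e.g. [("water", "pollution")]
    retrieve:   corpus search, h' -> candidate texts
    entails:    judgment oracle (a human annotator in the paper)
    """
    valid = invalid = 0
    for h in hypotheses:
        for i, word in enumerate(h):
            for lhs in rules.get(word, ()):           # rule: lhs -> word
                h_prime = h[:i] + (lhs,) + h[i + 1:]  # intermediate hypothesis
                for t in retrieve(h_prime):
                    if not entails(t, h_prime):
                        continue                      # t is discarded
                    if entails(t, h):
                        valid += 1                    # valid rule application
                    else:
                        invalid += 1                  # invalid rule application
    return valid, invalid

# Toy corpus with two of the slide's example texts (abridged).
corpus = [
    "chemicals dumped into the lake are the main cause for its pollution",
    "soil pollution happens when contaminants adhere to the soil",
]

def retrieve(h_prime):
    return [t for t in corpus if all(w in t for w in h_prime)]

def toy_entails(t, h):  # crude stand-in for the human judgment
    if h == ("water", "pollution"):
        return "lake" in t  # lake pollution implies water pollution
    return all(w in t for w in h)

print(evaluate_resource({"water": ["lake", "soil"]},
                        [("water", "pollution")], retrieve, toy_entails))
# -> (1, 1): lake -> water applied validly, soil -> water invalidly
```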
Results – Metrics

Resource     Precision (%)   Recall-share (%)
Original h        72               –
Snow30k           56               8
WordNet           55              24
XWN               51               9
Wiki              45               7
CBC               33               9
Lin-Dep           28              45
Lin-Prox          24              36

Precision: percentage of valid rule applications for the resource.
The total number of texts entailing the hypothesis is unknown ⇒ absolute recall cannot be measured.
Recall-share: % of entailing texts retrieved by the resource's rules, relative to all entailing texts retrieved by both the original hypothesis and the rules.
Macro-average figures.
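The two metrics can be written down directly; this is a sketch, and the sets below are invented for illustration, not the paper's data.

```python
def precision(valid, total):
    """Percentage of rule applications judged valid."""
    return 100.0 * valid / total

def recall_share(by_rules, by_original):
    """Share of entailing texts contributed by the resource's rules,
    relative to all entailing texts retrieved by either the original
    hypothesis or the rules (absolute recall is unmeasurable)."""
    pool = by_rules | by_original  # all entailing texts we know about
    return 100.0 * len(by_rules) / len(pool)

by_rules = {"t1", "t2"}           # entailing texts found via rule applications
by_original = {"t2", "t3", "t4"}  # entailing texts matching h directly
print(precision(56, 100))                   # 56.0
print(recall_share(by_rules, by_original))  # 2 of 4 known texts -> 50.0
```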
Results
Precision:
• Precision is generally quite low
• Relatively high precision for resources based on human knowledge, vs. corpus-based methods
• Snow: still relatively high precision
Recall:
• Some resources obtain very little recall
• WordNet's recall is limited
• Many more relations are found within (inaccurate) distributional-similarity resources
Results Analysis: Current Scope and Gaps
Missing Relations
• Coverage of most resources is limited
• Lin's coverage is substantially larger than WordNet's
  • But not usable due to low precision
• Missing instances of existing WordNet relations:
  • Proper names
  • Open-class words
• Missing non-standard relation types ⇒ next slide
Non-Standard Entailment Relations
• Such relations had a significant impact on recall
  • Don't comply with any WordNet relation
  • Mostly in Lin's resources (1/3 of their recall)
• Sub-type examples:
  • Topical entailment: IBM (company) → computers
  • Consequential: childbirth → motherhood
  • Entailment of arguments by predicate: breastfeeding → baby
• Often non-substitutable
Required Auxiliary Info (1)
• Additional information is needed for proper rule application:
  • Should be attached to rules in resources, and considered by inference systems
• Rule priors
  • Likelihood of a rule being correctly applied in an arbitrary context
  • Some such information is available (WordNet's sense order, Lin's ranks)
  • Empirically tested: not sufficient on its own (too much recall is lost)
    • Using only the top-50 rules, Lin-Prox loses 50% of its relative recall
    • Using the first sense only, WordNet loses 60%
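The prior-based filtering tested above (top-k rules, first sense only) amounts to a simple cut over rule scores. The scores below are invented for illustration.

```python
def filter_by_prior(rules, k):
    """Keep the k rules with the highest prior; the prior approximates the
    likelihood of the rule being correctly applied in arbitrary context
    (e.g. derived from Lin's similarity rank or WordNet's sense order)."""
    return sorted(rules, key=lambda r: r[2], reverse=True)[:k]

rules = [("lake", "water", 0.8),
         ("soil", "water", 0.3),
         ("strike", "disrupt", 0.6)]
print(filter_by_prior(rules, 2))
# keeps the two highest-prior rules; every dropped rule is lost recall,
# which is why filtering by prior alone costs so much (50-60% above)
```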
Required Auxiliary Info (2)
Lexical context
• Known issue: rules should be applied only in appropriate contexts
• Main reason for WordNet's relatively low precision
• Addressed by WSD or context-matching models
Logical context
• Some frequently-ignored relations in WordNet are significant:
  • efficacy → ineffectiveness (antonymy)
  • arms → guns (hypernymy)
  • government → official (holonymy)
  • 1/7 of Lin-Dep's recall
• These require certain logical conditions to hold
⇒ Include info about suitable lexical & logical contexts of rules
  • Combine priors with context-model scores (Szpektor et al. 2008)
⇒ Needed: a typology of relations by inference type
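One simple way to combine a rule's prior with a per-application context score is a product with a threshold. This is only a sketch in the spirit of the combination cited above (Szpektor et al. 2008); the formula, numbers, and threshold are invented here, not taken from that work.

```python
def application_score(prior, context_match, threshold=0.5):
    """Combine a rule's context-independent prior with a score for how well
    the current context fits the rule; the rule fires only if the combined
    score clears the threshold. (Illustrative combination, not the paper's.)"""
    score = prior * context_match
    return score, score >= threshold

# "arms -> guns" (downward hypernymy) is risky in arbitrary context but
# fine under the right logical conditions; the numbers are made up.
print(application_score(0.8, 1.0))  # strong context match -> rule fires
print(application_score(0.8, 0.5))  # weak context match -> rule blocked
```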
Conclusions
• Current resources are far from sufficient
• Lexical relations should be evaluated relative to applied inference
  • Rather than on correlations with human associations or with WordNet
• Dedicated resources for lexical inference rules are needed:
  • Acquire missing rule instances
  • Specify and add missing relation types
  • Add the auxiliary information needed for rule application
Conclusions – Community Perspective
• Observation: feedback about resource utility for inference in applications is missing
  • Resources and applications are typically developed separately
  • Tighter feedback between them is needed
• Community effort required:
  • Publicly available resources for lexical inference
  • Publicly available inference applications
  • Application-based evaluation datasets
  • Standardized formats/protocols for their integration
Thank you!
Shachar Mirkin
[email protected]