Research/Blog

CellStrat > Research/Blog > Artificial Intelligence > Natural Language Processing > Dependency Parsing for NLP

Dependency Parsing for NLP

April 16, 2020
Posted by: vsinghal
Category: Natural Language Processing

No Comments

#CellStratAILab #disrupt4.0 #WeCreateAISuperstars

Last Saturday, AI Researcher Indrajit Singh presented a superb workshop on Dependency Parsing used in NLP.

The topics covered in this workshop included :-

Understand Dependency Parsing
Syntactic Structure: Consistency and Dependency
Dependency Grammar and Treebanks
Transition-based dependency parsing

Dependency Parsing involves detecting which words depend on which other words.

Dependencies are often good for semantic tasks, as related words are close in the tree.

It is also possible to create labeled dependencies, that explicitly show the relationship between words.

Ambiguities in natural language :-

Check this sentence – it can be interpreted two ways :-

I saw a girl with a telescope.

We have prepositional phrase (PP) attachment ambiguity.

PP attachment ambiguity is a problem for parsing. As it is syntactically ambiguous, to resolve the ambiguity properly, some form of semantics is needed.

A relatively simple way to solve the problem is to use examples of PP attachment ambiguity where the result is known.

(https://pdfs.semanticscholar.org/568c/a3fa2184a431a9dda216b0c4b3623414fe7e.pdf)

In English, the basic ambiguity occurs when the sequence VP – NP – PP occurs.

The above ambiguity can multiply in larger sentences. A key parsing decision is how we ‘attach’ various constituents : PPs, adverbial or participial phrases , infinities , coordination’s. E.g. :-

The board approved [its acquisition] [by Royal Trustco Ltd. ] [of Toronto] [for $27 a Share] [at its monthly meeting].

A scope ambiguity (or coordination scope ambiguity) is an ambiguity that occurs when two quantifiers or similar expressions can take scope over each other in different ways in the meaning of a sentence. Here are some examples.

“Every man loves a woman.”

Is it every man loves same woman (specific person) or that every man loves some woman (gender).

We can have verb phrase attachment ambiguity.

“He looked at the dog with one eye.”

Is it that he looked with one eye OR dog has one eye ?

Types of Parsing :-

There are two types of parsing :-

Dependency – focuses on relations between words

Phrase structure – focuses on identifying phrases and their recursive structure

Dependency Grammar :-

Let’s look at dependency grammar. It can be of two types :-

Typed – Label indicating relationship between words

Untyped – Only which words depend

When we speak about Dependency structure :-

Here “AI” is modifier, dependent, child, subordinate.

Whereas, “Lab” is head, governor, parent, regent.

Formal Definition of DP :-

“Dependency parser analyzes the grammatical structure of a sentence, establishing relationships between “head” words and words which modify those heads. “

In dependency parsing the goal is to produce a directed tree representing the syntactic structure of the input sentence.

A sentence is parsed by choosing for each word what other word (including ROOT) is it a dependent of.

Usually some constraints:

Only one word is dependent of ROOT
Don’t want cycles A –> B , B –> A

This makes the dependencies a Tree.

Annotated Data :-

There is some annotated data (Universal Dependencies treebanks) available on the websites like http://universaldependencies.org/.

What are the advantages of a treebank ?

When starting off, building a treebank seems a lot slower and less useful than building a grammar. But a treebank gives us many things :-

Reusability of the labor
- Many parsers, part-of-speech taggers, etc. can be built on it
- Valuable resource for linguistics
Broad coverage, not just a few intuitions
Frequencies and distributional information
A way to evaluate systems

Methods of Dependency Parsing :-

Dynamic Programming
- Eisner (1996) gives a clever algorithm with complexity O(n3), by producing parse items with heads at the ends rather than in the middle.
Constraint Satisfaction
- Edges are eliminated that don’t satisfy hard constraints. Karlsson (1990), etc.
Graph Algorithms
- You create a Minimum Spanning Tree for a sentence
- McDonald et al.’s (2005) MSTParser scores dependencies independently using an ML classifier (he uses MIRA, for online learning, but it can be something else)
- Neural graph-based parser: Dozat and Manning (2017)
“Transition Based Parsing” or “deterministic dependency parsing“.
- Greedy choice of attachments guided by good machine learning classifiers
- E.g., MaltParser (Nivre et al. 2008). Has proven highly effective.

Current methods for DP :-

Nowadays – we use two main approaches for DP :-

Graph-based: better results
Transition-based: far faster

Google uses transition-based dependency parsing for their MT systems.

The standard methodology for evaluating dependency parsers, as well as other kinds of parsers, is to apply them to a test set taken from a treebank and compare the output of the parser to the gold standard annotation found in the treebank.

Dependency parsing has been evaluated with many different evaluation metrics. The most widely used metrics are listed below.

Current state-of-the-art results give us upto 93.9% accuracy.

Merge of transition-based and graph-based
With features of 2nd and 3rd level
The precision “jump” occurs when you look at the history – (what the parser has done so far) in order to make decisions

The most commonly used metrics are the labeled attachment score (LAS) and the unlabeled attachment score (UAS). (https://www.aclweb.org/anthology/K15-1033.pdf)

CellStrat Training Course on “Natural Language Processing with Deep Learning” :-

Learn advanced NLP with CellStrat’s hands-on course on “Natural Language Processing with Deep Learning”.

Details and Enrollment : https://bit.ly/CSNLPC

Questions ? Please contact us at +91-9742800566 !