Natural language processing (Computer science); Parsing (Computer grammar); Sequential machine theory
Larger-first partial parsing is a primarily top-down approach to partial parsing that is opposite to current easy-first, or primarily bottom-up, strategies. A rich partial tree structure is captured by an algorithm that assigns a hierarchy of structural tags to each of the input tokens in a sentence. Part-of-speech tags are first assigned to the words in a sentence by a part-of-speech tagger. A cascade of Deterministic Finite State Automata then uses this part-of-speech information to identify syntactic relations primarily in a descending order of their size. The cascade is divided into four specialized sections: (1) a Comma Network, which identifies syntactic relations associated with commas; (2) a Conjunction Network, which partially disambiguates phrasal conjunctions and llly disambiguates clausal conjunctions; (3) a Clause Network, which identifies non-comma-delimited clauses; and (4) a Phrase Network, which identifies the remaining base phrases in the sentence. Each automaton is capable of adding one or more levels of structural tags to the tokens in a sentence. The larger-first approach is compared against a well-known easy-first approach. The results indicate that this larger-first approach is capable of (1) producing a more detailed partial parse than an easy first approach; (2) providing better containment of attachment ambiguity; (3) handling overlapping syntactic relations; and (4) achieving a higher accuracy than the easy-first approach. The automata of each network were developed by an empirical analysis of several sources and are presented here in detail.
If this is your thesis or dissertation, and want to learn how to access it or for more information about readership statistics, contact us at STARS@ucf.edu
Doctor of Philosophy (Ph.D.)
College of Engineering and Computer Science
Department of Electrical Engineering and Computer Science
Electrical Engineering and Computer Science
Written permission granted by copyright holder to the University of Central Florida Libraries to digitize and distribute for nonprofit, educational purposes.
Length of Campus-only Access
Doctoral Dissertation (Open Access)
Dissertations, Academic -- Engineering; Engineering -- Dissertations, Academic
Van Delden, Sebastian Alexander, "Larger-first partial parsing" (2003). Retrospective Theses and Dissertations. 1059.