Abstracts for Literary and Linguistic Computing
Volume 11, Number 4


TABLE OF CONTENTS

  1. "Automatic Morphological Analysis of Basque"
    Inaki Alegria, Xabier Artola, and Kepa Sarasola

  2. "Feature-Finding for Text Classification"
    Richard S. Forsyth and David I. Holmes

  3. "Tampering with the Text to Increase Awareness of Poetry's Art"
    Estelle Irizarry

  4. "Virginia Woolf's 'The Waves' in French and German Waters: Computer Assisted Study in Literary Translation"
    Jan-Mirko Maczewski

  5. "A Hybrid Disambiguation Model for Prepositional Phrase Attachment"
    Haodong Wu and Teiji Furugori

ABSTRACTS

"Automatic Morphological Analysis of Basque"
Inaki Alegria, Xabier Artola, and Kepa Sarasola

Abstract
This paper describes the components of a robust and wide-coverage morphological analyser for Basque. The analyser is based on the two-level formalism and has been designed in an incremental way with three main modules: the standard analyser, the analyser of linguistic variants, and the analyser without lexicon which can recognize word-forms without having their lemmas in the lexicon. Using lexical transducers for our analyser we have improved both the performance of the different components of the system and the description itself. The analyser is a basic tool for current and future work on automatic processing of Basque and its first two applications are a commercial spelling corrector and a general purpose lemmatizer/tagger.

"Feature-Finding for Text Classification"
Richard S. Forsyth and David I. Holmes

Abstract
Stylometrists have proposed and used a wide variety of textual features or markers, but until recently very little attention has been focused on the question: where do textual features come from? In many text-categorization tasks the choice of textual features is a crucial determinant of success, yet is typically left to the intuition of the analyst. We argue that it would be desirable, at least in some cases, if this part of the process were less dependent on subjective judgement. Accordingly, this paper compares five different methods of textual feature finding that do not need background knowledge external to the texts being analysed (three proposed by previous stylometrists, two devised for this study). As these methods do not rely on parsing or semantic analysis, they are not tied to the English language only. Results of a benchmark test on ten representative text-classification problems suggest that the technique here designated Monte-Carlo Feature-Finding has certain advantages that deserve consideration by future workers in this area.

"Tampering with the Text to Increase Awareness of Poetry's Art"
Estelle Irizarry

Abstract
Theoreticians have linked the act of poetic creation inextricably to the principle of linguistic 'play'. A number of Hispanic poets have experimented with transformational and permutational creativity of the type that computers can accomplish quite easily. Such computer-induced play enhances the study of poetry by imbuing the poetic text with a new and dynamic dimension in which on-screen manipulation destabilizes the text, allowing the reader to explore it more thoroughly than is possible in the fixed printed medium and to appreciate it as a unique blend of word, structure and pattern. Well-known poems from writers who have themselves experimented with textual alteration, as well as works of others who have not, serve to illustrate diverse modalities of textual alteration, which are grouped by the types of transformation carried out by the computer.

"Virginia Woolf's 'The Waves' in French and German Waters: Computer Assisted Study in Literary Translation"
Jan-Mirko Maczewski

Abstract
The case study analyses the first chapter of Virginia Woolf's novel The Waves and its two French and three German translations with the help of the PALIMPSEST suite of programs. Created specifically for such tasks, the software provides assistance with viewing the texts in an interlinear format and offers facilities for the automatic generation of multilingual and -textual word and phrase based concordances and statistics. Pursuing the typical aims of literary translation studies, the investigation focuses on an analysis of the relationships between the translations and the original text as well as on a consideration of the influences that can be identified within the corpus of translations; apart from encoding, technical matters are not elaborated upon. The new ways of assessing literary translations offered by PALIMPSEST yield noteworthy results which contribute new empirical evidence to the critical debate on translation in general and on the translations of The Waves in particular. As a result, computer assisted literary translation studies appear as a field of research worth exploring further.

"A Hybrid Disambiguation Model for Prepositional Phrase Attachment"
Haodong Wu and Teiji Furugori

Abstract
Prepositional Phrase (PP) attachment is a major cause of structural ambiguity in natural language. Many proposals have increasingly relied on large-scale corpora to resolve this problem. However, this approach encounters the notorious sparse-data problem, which produces poor disambiguation results. In this paper we offer a hybrid method which integrates a corpus-based approach with knowledge-based techniques for PP attachment disambiguation. It explores a wide variety of information, including co-occurrence frequencies from annotated corpora, conceptual relationships and conceptual features from a machine-readable dictionary, and syntactic clues from our linguistic observations. We use dictionary definitions and human knowledge to overcome the sparse-data problem. In an experiment, our method achieves an accuracy rate of 87.7% over 3043 sentences of real English text that contain ambiguous PPs. This result is better than that of any existing method.


The full text of the articles to which these abstracts refer will be published in Literary & Linguistic Computing, Vol 11, No 4, which is due to be published in December 1996. If you would like further details about the journal, including details of subscription rates, please contact:

Caroline Lock
Journals Marketing (H-CLC)
Oxford University Press
Walton Street
Oxford OX2 6DP
UK

Tel: +44 (0)1865 56767
Fax: +44 (0)1865 267782
lockc@oup.co.uk

Further information on all Oxford journals can be found on the OUP Web site at: http://www.oup.co.uk/

Copyrights to preprints remain with Oxford University Press. Users may download preprints and reproduce them for their own personal use, but downloading of preprints for any other activity, including reposting to other electronic bulletin boards or archives, may not be done without the written consent of Oxford University Press. It is the responsibility of Oxford University Press to notify the H-CLC editors when they wish to have abstracts removed from the H-CLC WWW site.
