Archive for the ‘Reviews’ Category

Review: Pycha, Nowak, Shin and Shosted (2003)

2008 Feb 1 in Reviews | Comments (0)

Anne Pycha, Pawel Nowak, Eurie Shin, and Ryan Shosted (2003) Phonological Rule-Learning and Its Implications for a Theory of Vowel Harmony. In Proceedings of WCCFL 22.

Gist: Formal simplicity (using fewer phonological features) is more important than phonetic naturalness for the learnability of morphological rules. English speakers can learn a vowel harmony morphological alternation or vowel disharmony alternation more quickly than an arbitrarily conditioned alternation. The phonetic naturalness of harmony did not give it a statistically significant advantage over disharmony.

In the introduction, they set up the problem of the role of phonetics in phonology. They contrast phonetics-based phonology, in which ease of perception and articulation are encoded directly in the grammar during learning, with evolutionary phonology, in which they affect grammar only diachronically, by shaping misperception and reanalysis. They note that the synchronic productivity of vowel harmony processes has been tested, and that some studies of formal simplicity versus phonetic naturalness in learnability have also been done, but that these studies were flawed.

Their experiment was an artificial language-learning experiment with three conditions:
- In the Vowel Harmony condition, the vowel in the suffix agreed in backness with the stem vowel, so it was formally simple and phonetically natural.
- In the Vowel Disharmony condition, the vowel in the suffix disagreed in backness with the stem vowel, so it was formally simple but phonetically unnatural.
- In the Arbitrary Control condition, the vowel in the suffix was front for [i, æ, ʊ] and back for [ɪ, ɑ, u], so it was formally complex and phonetically unnatural.
Each condition was assigned 10 English speakers, and there were three phases of the procedure: a passive listening phase, a learning phase with guessing and feedback, and a testing phase with no feedback.
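
To make the contrast between formally simple and formally complex rules concrete, the three conditions can be sketched as a small function. This is only an illustration, not the authors' materials: the function name, the condition labels, and the front/back classification of the six stem vowels are assumptions made here. Harmony and Disharmony each need only the single feature of backness, while the Arbitrary condition has to list its vowels.

```python
# Illustrative sketch only; names and the feature table are assumptions,
# not taken from Pycha et al. (2003).
FRONT_VOWELS = {"i", "ɪ", "æ"}   # assumed [-back] stem vowels
BACK_VOWELS = {"u", "ʊ", "ɑ"}    # assumed [+back] stem vowels

# In the Arbitrary condition the grouping cuts across backness, so the rule
# cannot be stated with one feature and the vowels must be listed.
ARBITRARY_BACK_SUFFIX = {"ɪ", "ɑ", "u"}


def suffix_backness(stem_vowel, condition):
    """Return 'front' or 'back' for the suffix vowel in a given condition."""
    if stem_vowel not in FRONT_VOWELS | BACK_VOWELS:
        raise ValueError(f"unknown stem vowel: {stem_vowel!r}")
    stem_is_back = stem_vowel in BACK_VOWELS
    if condition == "harmony":
        suffix_is_back = stem_is_back          # agree in backness
    elif condition == "disharmony":
        suffix_is_back = not stem_is_back      # disagree in backness
    elif condition == "arbitrary":
        suffix_is_back = stem_vowel in ARBITRARY_BACK_SUFFIX
    else:
        raise ValueError(f"unknown condition: {condition!r}")
    return "back" if suffix_is_back else "front"
```

Under these assumptions, the stem vowel [ʊ] takes a back suffix in Harmony but a front suffix in both Disharmony and Arbitrary.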

The participants in the Harmony and Disharmony conditions learned the patterns significantly better than those in the Arbitrary condition. Accuracy in the Harmony condition was a little higher than in the Disharmony condition, but this difference was not significant and was smaller than the difference from Arbitrary. This suggests that formal simplicity is more important for learnability than phonetic naturalness. That is, since Harmony and Disharmony are synchronically just as easy, the cross-linguistic preference for harmony arises from diachronic misperception and reanalysis of variation.

This study has some potential confounds, however. Pycha et al. note that there may be some interference from rounding or orthography, and mention that one participant in the Arbitrary condition apparently gave up. In fact, all three conditions have outliers at the low-accuracy end, which is left unexplained. The possible interference from orthography could be quite serious. The task’s paradigm pushes participants to think symbolically with familiar explicit concepts. Orthography figures prominently in how naive learners conceptualize language, so even though the stimuli were presented aurally, the orthographic association is unavoidable, and the alphabetic sequence and typical orthographic similarities among the vowels likely influence the result.

Review of Wedel 2007

2007 Oct 16 in Reviews | Comments (0)

Wedel, A. B. (2007). Feedback and regularity in the lexicon. Phonology, 24:147–185.

This paper shows how simple simulations can model phonological phenomena as emergent from the dynamics of a rich memory model. Using an exemplar-based model of the lexicon, Wedel discusses two sets of simulations.

One shows how cross-linguistic patterns of allophony can be seen as different attractors in a dynamic system. The pattern modeled resembles the way some languages use only apical /l/, some use only velar /l/, and some use both in different contexts. Random production or perception error and articulatory or perceptual markedness preferences, together with a similarity bias resulting from analogical error over the variety of contexts in the lexicon, produce three metastable states, corresponding to the three kinds of patterns witnessed in natural languages. Furthermore, this series of simulations shows how strict dominance in OT often holds but can nevertheless sometimes be violated. The model predicts that strict dominance is violated when a majority of examples in the lexicon have the constraints in conflict, whereas it holds when the constraints conflict less often in the lexicon.

The second simulation shows how word stress patterns can emerge from a similar interaction of kinds of error, driven by the distribution of features in the lexicon. The simulation is set up with a bias toward alternating stress and a bias toward vowel+sonorant as the minimum heavy rhyme. The interaction of these constraints in the lexicon gives two different patterns, depending on how large a portion of syllables in the lexicon have vowel+sonorant rhymes. If the number is small, the alternating stress bias dominates, and the stress pattern becomes consistently alternating, basically ignoring the heavy syllable constraint. However, if the number is large, the pattern generalizes so that any vowel+consonant rhyme also counts as heavy, because the [+sonorant] feature together with half of the [+consonantal] segments that are in alternating syllables looks like a [+cons] constraint. This matches the pattern observed in the languages of the world: CVV = heavy is common, and CVV + CVC = heavy is common, but CVV + CV[+son] = heavy is rare.

These studies provide a new perspective on the causes underlying phonological patterns. The model apparently still lacks an analytic formulation, but the general patterns exhibited here suggest that a more phonetically driven, dynamical model of phonology promises a better understanding of how phonological patterns arise.

Review of Ussishkin, Twist and Velan 2007

2007 Oct 2 in Reviews | Comments (0)

Ussishkin, A., Twist, A., and Velan, H. (2007). Lexical organization in Semitic: Psycholinguistic evidence from Modern Hebrew and Maltese. Distributed by email.

This paper describes two studies that begin a research program to investigate the psychological validity of consonantal-root versus whole-word theories of Semitic morphology. The traditional analysis of Semitic verbal and nominal morphology states that each lemma class is based on a single consonantal root, such that each word form is produced by combining the consonantal root with a syllabic (vowel and consonant) affix into a bare (C*V*)* template. However, other analyses have proposed that words might be accessed as whole words, and that the template merely reflects historical word formation processes. For example, a previous proposal by Ussishkin theorized that the word forms in a semantically related lemma class are all derived from a single base form, with overriding infixes.

There are two experiments described here, one in Hebrew and one in Maltese. The basic experimental paradigm is that of lexical decision. Whereas previous related research has typically used visual presentation of stimuli, in these studies the stimuli were presented auditorily. Since the orthography of Semitic languages typically gives preferential status to consonants, orthographic presentation of stimuli creates a potential confound. The primary stimuli were words and pseudo-words created using two common conjugation patterns (called binyanim for Hebrew, themes for Maltese). The consonants were also common roots, so that the pseudo-words were merely unattested forms rather than morphologically illicit ones.

In both studies, the frequency effect was not properly controlled: in the first study because of an error in the design (Ussishkin, personal communication, without elaboration), and in the Maltese study because no corpus was available at that time. The results of these studies are difficult to interpret, but together with a later study (presented in person) they support the theory that both the consonantal root and the whole word are stored in long-term memory. The facilitatory effects of word form frequency and of morphological family size (the number of word forms in a lemma class) indicate that information about whole words is stored in memory. At the same time, the small or absent delay for more complex forms and the organization into lemma classes, along with other studies, make it clear that the consonantal root does have strong psychological validity.

Review of Carbone et al. 2004

2007 Sep 17 in Reviews | Comments (0)

Carbone, M., Gal, Y., Shieber, S., and Grosz, B. (2004). Unifying annotated discourse hierarchies to create a gold standard. In Proceedings of 4th SIGDIAL Workshop on Discourse and Dialogue.

This paper discusses work attempting to create more authoritative discourse annotations by automatically combining annotations produced by a few human annotators. It uses the Boston Directions Corpus, which has discourse annotations based on Grosz and Sidner’s theory. They note in the introduction that work has focused on LDS over HDS because of the difficulties in evaluating HDS: annotation is more difficult and takes more time, annotation is more subjective (i.e. lower inter-annotator agreement), and it is unclear what metric to use for measuring agreement or similarity between two annotations.

They consider 5 different methods of automatically combining annotations (consensus, i.e. full agreement, with and without hierarchical information; majority consensus with and without hierarchical information; and conflict-free union) and compare them to complete union and to taking the best single annotation as measured by inter-annotator agreement. They then evaluate the original annotations against the unified annotation, using kappa, recall, precision, and non-crossing brackets. As is typically the case, the methods with high recall had low precision. The high-recall methods had high kappa, and the high-precision methods had high non-crossing-brackets scores. The conflict-free union method and flat majority consensus did well on the kappa and recall metrics, but the authors suggest that hierarchical majority consensus is better for the purpose of having a high-precision, hierarchical gold standard.
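
The flat (non-hierarchical) combination methods are easy to sketch if each annotation is reduced to a set of segment-boundary positions. That reduction is an assumption made here for illustration, as are the function names; the hierarchical variants and the conflict-free union are not attempted. The precision and recall function treats the unified annotation as the reference, which is one reading of "evaluate the original annotations against the unified annotation".

```python
from collections import Counter

# Illustrative sketch; representing an annotation as a set of boundary
# positions is an assumption, not the paper's data structure.


def full_consensus(annotations):
    """Boundaries marked by every annotator."""
    return set.intersection(*annotations)


def majority_consensus(annotations):
    """Boundaries marked by more than half of the annotators."""
    counts = Counter(b for ann in annotations for b in ann)
    return {b for b, n in counts.items() if n > len(annotations) / 2}


def complete_union(annotations):
    """Boundaries marked by at least one annotator."""
    return set.union(*annotations)


def precision_recall(annotation, unified):
    """Score one annotator's boundaries against a unified annotation."""
    overlap = len(annotation & unified)
    precision = overlap / len(annotation) if annotation else 0.0
    recall = overlap / len(unified) if unified else 0.0
    return precision, recall
```

With three made-up annotations `{3, 7, 12}`, `{3, 7}` and `{3, 12, 15}`, full consensus keeps only boundary 3, majority consensus keeps 3, 7 and 12, and complete union keeps all four. The stricter the combination, the smaller the unified set, which is why recall and precision pull in opposite directions across methods.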

They compare the combined annotation with the contributing annotations, instead of with other annotations created for the purpose, “for the sake of scientific validity”. However, considering that only three annotations are used to create the gold standard, the similarities used here are probably artificially high, like testing on training data. This method gives no indication of how applicable these comparisons are to other annotations. For example, it would have been better to compare these against similarly prepared annotations unified from the non-specialist annotations.

Review of Hirschberg and Nakatani 1996

2007 Sep 11 in Reviews | Comments (0)

Hirschberg, J. and Nakatani, C. (1996). A prosodic analysis of discourse segments in direction-giving monologues. In Proceedings of the 34th ACL.

This article describes an analysis of annotation reliability for the hierarchical discourse segmentation of the Boston Directions Corpus. The Boston Directions Corpus is a set of direction-giving monologues collected by Hirschberg and Grosz. Speakers were prompted to tell another person how to accomplish 9 navigation tasks around Boston. Oral reading samples for the same tasks were also obtained by having the subjects return several weeks later to read aloud transcripts of their own monologues. The transcripts were annotated by linguists familiar with ToBI prosodic annotation conventions and Grosz and Sidner’s theory of discourse structure. In this study, the sub-corpus from one speaker is used to compare the reliability of annotation with versus without access to the audio corpus. Using raw agreement, the kappa coefficient and Flammia’s generalized kappa, the annotation with the audio corpus was shown to be markedly better, bumping kappa from “unreliable” levels near .5 to “reliable” levels near .7. Furthermore, they report on acoustic features that were found to correlate with a phrase’s position within its discourse segment. On average, initial phrases were found to be higher in pitch and louder, and to have longer pauses before and shorter pauses after. Medial phrases and final phrases had lower pitch and volume and shorter pauses before. They differ in that final phrases are spoken faster and have long pauses after, whereas medial phrases have short pauses after.
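
For readers unfamiliar with the kappa coefficient mentioned above, here is a minimal sketch of the standard two-annotator (Cohen's) form, which corrects raw agreement for the agreement expected by chance. Whether the paper uses exactly this variant is an assumption, and the labels in the usage note are made up; Flammia's generalized kappa is not covered.

```python
from collections import Counter

# Illustrative sketch of Cohen's kappa for two annotators; the function name
# and input format (parallel lists of labels) are assumptions.


def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two equal-length label sequences."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(labels_a)
    # Observed agreement: proportion of items labeled identically.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement: chance of matching given each annotator's
    # own label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)
```

As a usage example, take boundary judgments (1 = boundary, 0 = none) for ten phrases: `[1, 0, 0, 1, 0, 0, 1, 0, 0, 0]` and `[1, 0, 0, 0, 0, 0, 1, 0, 1, 0]`. Raw agreement is 0.8, but chance agreement is 0.58, so kappa comes out at about 0.52. This shows how an annotation with seemingly high raw agreement can still sit at the "unreliable" level.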

None of the results they report are surprising, but they do confirm a large number of previous studies, using a fairly reliable and quantitative methodology. Of course using audio files helps with identifying discourse structure! But here we see how much it helps: without the audio files, they would not be able to achieve reliable annotation, whereas with them they do. On the other hand, the segments where there is agreement show much the same intonation correlations in the text-only annotation and the text+speech annotation. Again, of course we know there are pauses at discourse boundaries, and it’s not too hard to notice that pitch and loudness descend during a discourse segment. But here, by being quantitative and comprehensive, they provide a foundation for further studies.