Creative Positivity: Characters, Subsets, Rules, Lexicon

For this lab, I introduced 4 characters to the alphabet. They are described below.

J is used to represent the verb coger. The lexical form of coger is coJer; J surfaces as g by default and takes the surface form j before a suffix beginning with o or a.

The g-j mutation is formulated in the rule "J realized as j". This rule uses the subset B to represent a and o. It simply ensures that J is realized as j only when it is followed by an environment starting with a character from subset B.
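The effect of this rule can be sketched in a few lines of Python. This is only an illustration of the behaviour described above, not the rule file itself: the function name `realize_J` and the use of `+` as a morpheme boundary in the input strings are my assumptions.

```python
# Subset B from the rule: the vowels that trigger the g-j mutation.
SUBSET_B = {"a", "o"}


def realize_J(lexical: str) -> str:
    """Map a lexical string containing J (and '+' boundaries) to a surface string."""
    surface = []
    for i, ch in enumerate(lexical):
        if ch == "+":
            continue  # the morpheme boundary has no surface form
        if ch == "J":
            # Look past any boundary to the next real character.
            rest = lexical[i + 1:].replace("+", "")
            ch = "j" if rest[:1] in SUBSET_B else "g"
        surface.append(ch)
    return "".join(surface)


print(realize_J("coJ+o"))   # cojo
print(realize_J("coJ+er"))  # coger
```

J falls back to g whenever the next character is not in subset B, which matches the "g by default" behaviour.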

The z-c mutation is handled by the "z realized as c" rule. This rule converts lexical z to c only when the following morpheme starts with e. The same situation can arise through e-insertion in pluralization, and the rule takes that into account. It interacts with pluralization because z is also a consonant. In this rule, we explicitly allow an e to follow a character from the consonant set CO.

Pluralization, likewise, takes the z-c mutation into account. It takes a word ending with a consonant and inserts an e if the string takes the plural suffix s.
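The interaction between the z-c mutation and e-insertion can be sketched procedurally in Python. This is a rough illustration, not a two-level rule: the names `realize_z` and `pluralize`, the `+` boundary notation, and the plain consonant list standing in for subset CO are all assumptions of mine.

```python
# Stand-in for the consonant subset CO (assumed contents).
CONSONANTS = set("bcdfghjklmnpqrstvwxyz")


def realize_z(lexical: str) -> str:
    """Turn z into c when the next morpheme starts with e, then drop boundaries."""
    return lexical.replace("z+e", "c+e").replace("+", "")


def pluralize(stem: str) -> str:
    """Attach plural s, inserting e after a consonant-final stem."""
    if stem and stem[-1] in CONSONANTS:
        # e-insertion feeds the z-c mutation: luz + es -> luces
        return realize_z(stem + "+es")
    return stem + "s"


print(pluralize("luz"))        # luces
print(pluralize("papel"))      # papeles
print(realize_z("cruz+e"))     # cruce
```

Because the inserted e is placed right after the boundary, the same z-to-c step covers both verb suffixes starting with e and plurals of nouns ending in z.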

The other characters introduced are for verbs in the lexicon. The irregular verb cocer is represented as KOQ (coc), and its surface form is cue. Since the examples in the batch file only use cocer’s irregular forms starting with cue, this solution should suffice.

For irregular verbs such as ejercer, the character C was used. Lexical C becomes null in the surface form. There are two groups of irregular verbs written with C. The first group, conocer and parecer, drops the c in the root of the verb and takes zc in its conjugation (the suffixes are written concatenated with zc, as in zcamos). The second group, vencer and ejercer, drops the c in the root and takes z instead. These verbs and their suffixes are represented similarly to the first group.
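As a small illustration of the C-to-null idea, here is a Python sketch that deletes lexical C and lets the suffix supply the zc or z. The lexical spellings used here (conoC, venC, ejerC, with `+` as boundary) and the function name `realize_C` are my guesses for the sake of the example, not copies of the actual lexicon entries.

```python
def realize_C(lexical: str) -> str:
    """Delete lexical C and morpheme boundaries; the suffix carries zc or z."""
    return lexical.replace("C", "").replace("+", "")


# First group: the suffix is written with zc.
print(realize_C("conoC+zcamos"))  # conozcamos
# Second group: the suffix is written with z.
print(realize_C("venC+za"))       # venza
print(realize_C("ejerC+za"))      # ejerza
```

The sketch only covers the irregular forms discussed above; how the regular forms of these verbs keep their c is not shown here.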

There are other irregular verbs, such as cruzar, which has irregularities in the indicative and subjunctive.

I must note that wherever a suffix was ambiguous, a set of suffixes was created to avoid multiple and incorrect interpretations of a verb. E.g., we group a suffix for all kinds of verbs (ar, er, ir, and irregular ones) together:
V_SUFFIX6:
+e V_Suffix6 (pres.subj,1p,sg)
+a V_Suffix6 (pres.subj,1p,sg)
+zca V_Suffix6 (pres.subj,1p,sg)
+za V_Suffix6 (pres.subj,1p,sg)
Now, when I have an irregular verb form like cruce, it may also be interpreted as 3p,sg, because it takes a suffix group that aggregates suffixes rather than explicitly relating each group of verbs to its own suffixes. I might have created redundant suffixes in the end.

The big design issue for me was representing irregular verbs while staying as close to the original as possible.

As for extensibility, I feel I have improved over my one-automaton solution in the first assignment. The rules are self-contained except where they interact with other rules.
I believe the same goes for the lexicon. The representation of verb irregularity may be my one concern, as with this approach it seems you can run out of surface characters quickly if you try to handle a lot of irregularities.

Friday, March 10, 2006
