Anatomy of Estonian declension, part I

2023-01-28 18:48:00 +0000
Reading time: 17 mins

What is “declension”? As per Wikipedia:

In linguistics, declension (verb: to decline) is the changing of the form of a word, generally to express its syntactic function in the sentence, by way of some inflection. Declensions may apply to nouns, pronouns, adjectives, adverbs, and articles to indicate number (e.g. singular, dual, plural), case (e.g. nominative case, accusative case, genitive case, dative case), gender (e.g. masculine, neuter, feminine), and a number of other grammatical categories.

Estonian is one of the languages where declension is quite important. Maybe not as much as in the Slavic languages, but far more than in English or Dutch. Estonian nominals (primarily nouns and adjectives) have 14 cases (15 by some definitions), but most of those are simple agglutinations, i.e. formed by appending a fixed suffix to some base form. However, some cases are not so simple, and getting used to them takes quite some time. In this blog post, we’ll try to identify some of the ways Estonian declension works.

Let’s go! I have a database of Estonian lexemes (described in another blog post), and we’ll be using it for our research. I will not be using Wikidata this time, but it should be fully possible to reproduce the results with Wikidata Lexemes as well.

Let’s begin by formally identifying the “real” cases, i.e. the ones where forms are NOT (or not always) produced by simply appending a suffix to the base form. To do that, we’ll find the lexemes (words) where base form + case suffix differs from the actually observed form.

Let’s begin with the usual suspects: the “nina taga” group of cases (i.e. forms ending in -ni, -na, -ta and -ga):

-- Base forms: form_type_combination_id = 2 (the base form, i.e. singular genitive)
with base as (
    select r.representation, ef.paradigm_id as id
    from ekilex_forms ef
    join representations r on ef.word_representation_id = r.id
    where ef.form_type_combination_id = 2
      and r.representation <> '-'  -- skip '-' entries (presumably placeholders for missing forms)
),
-- Observed -ga forms: form_type_combination_id = 14
declined as (
    select r.representation, ef.paradigm_id as id
    from ekilex_forms ef
    join representations r on ef.word_representation_id = r.id
    where ef.form_type_combination_id = 14
      and r.representation <> '-'
)
-- Keep only the paradigms where the observed form differs from base form + 'ga'
select base.id,
       base.representation,
       declined.representation as inflected,
       concat(base.representation, 'ga') as suffixed
from base
join declined on base.id = declined.id
where declined.representation <> concat(base.representation, 'ga');

This produces ~70 results, which is basically nothing. In all cases, they can be explained by multiple observed base forms (the base form in Estonian is omastav, i.e. the genitive case). So, obviously, where there is more than one omastav, we’ll see “discrepancies”.

If we throw away all the words with more than one singular genitive, we’ll get just these 4 results:

| Word | Base form | Translation | Actual -ga form | Formula-based form |
|---|---|---|---|---|
| keegi | kellegi | someone | kellegagi | kellegiga |
| kumbki | kummagi | either of the two | kummagagi | kummagiga |
| miski | millegi | something / nothing | millegagi | millegiga |
| ükski | ühegi | no one / anyone | ühegagi | ühegiga |
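
The comparison behind this table can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the query that was actually run: the `paradigms` list is a hand-made stand-in for the database, the function name is made up, and the regular words in it (maja, raamat) are added here purely for contrast.

```python
# Hypothetical stand-in for the database:
# (word, singular genitives, observed -ga forms)
paradigms = [
    ("maja", ["maja"], ["majaga"]),
    ("raamat", ["raamatu"], ["raamatuga"]),
    ("keegi", ["kellegi"], ["kellegagi"]),
    ("miski", ["millegi"], ["millegagi"]),
]


def discrepancies(paradigms, suffix):
    """Return (word, observed, expected) rows where observed != genitive + suffix.

    Words with more than one singular genitive are skipped, as in the text.
    """
    rows = []
    for word, genitives, observed_forms in paradigms:
        if len(genitives) != 1:
            continue
        expected = genitives[0] + suffix
        for observed in observed_forms:
            if observed != expected:
                rows.append((word, observed, expected))
    return rows


result = discrepancies(paradigms, "ga")
for row in result:
    print(row)
```

Only the two pronouns from the sample come back; the regular words pass the formula check and are filtered out.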

And here we see a very interesting phenomenon, a remnant of a system that is still very much present in Finnish: the relative positioning of suffixes. -gi/-ki is the so-called emphatic suffix, i.e. you add it to the end of a word when you want to put special emphasis on it, to highlight this word in a phrase. And, as we can see, it has to be placed in the last position, after the case suffix -ga. By the way, words with -gi/-ki normally never make it into the dictionary because, being emphatic, the suffix can easily be applied to any form of any word. But these particular words, initially just emphatic forms of other words, ended up being very important and deserving their own place in the vocabulary.

Now that we know of these exceptions, it’s safe to exclude them as well, and to drop the -ga case (which is called comitative, and means “with, or including, this word”) from consideration as a real inflectional case.
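
The ordering of the case suffix and the emphatic suffix can be sketched as follows. The function name and the `has_clitic` flag are assumptions made for this example: a word can end in -gi or -ki without that being the emphatic suffix, so the caller has to say explicitly when it is.

```python
CLITICS = ("gi", "ki")


def add_case_suffix(genitive, suffix, has_clitic=False):
    """Attach a case suffix to a genitive form.

    If the form carries an emphatic -gi/-ki suffix (has_clitic=True),
    the case suffix goes before it, so the emphatic suffix stays last.
    """
    if has_clitic and genitive.endswith(CLITICS):
        stem, clitic = genitive[:-2], genitive[-2:]
        return stem + suffix + clitic
    return genitive + suffix


for base in ("kellegi", "kummagi", "millegi", "ühegi"):
    print(base, "->", add_case_suffix(base, "ga", has_clitic=True))
```

For the four words in the table this reproduces the dictionary forms rather than the formula-based ones.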

Let’s now look at the -ta case (abessive, which means “without this word”). Same thing! In all of the cases (except those we’ve already excluded), the observed form (as provided by our reference dictionary) matches the formula-based one.

Same results for -na (essive) and -ni (terminative) cases.

We get some interesting results with the translative (-ks) case:

| Word | Base form | Translation | Actual case form | Formula-based form |
|---|---|---|---|---|
| see | selle | this | seks | selleks |
| seesama | sellesama | this same | sekssamaks | sellesamaks |
| seesama | sellesama | this same | sellekssamaks | sellesamaks |

These observed discrepancies are in fact again consequences of duplication, but in this case not of the base form but of the “cased” form. The word see has 2 forms for the singular translative, seks/selleks, so we get 1 result where the dictionary form differs from the “base form + -ks” formula. It’s even more “interesting” with seesama, which according to the dictionary has 2 translative forms, sellekssamaks and sekssamaks, neither of which matches the formula. So this word can be considered the one and only true deviation from the formula for the translative case in the whole of the Estonian language (in the singular only; the plural is an entirely different story).

Now, let’s look at the so-called locative cases, which are used (for the most part, but not only) to express the position of an object or the direction of an action. Let’s begin with the ablative case, which has the suffix -lt.

| Word | Base form | Translation | Actual case form | Formula-based form |
|---|---|---|---|---|
| see | selle | this | selt | sellelt |
| seesama | sellesama | this same | selleltsamalt | sellesamalt |
| seesama | sellesama | this same | seltsamalt | sellesamalt |
| too | tolle | that | tolt | tollelt |

Again, the usual suspects, which we should probably exclude from further analysis. These words and forms apparently occur so often in everyday speech that they have developed multiple realizations, selt/sellelt and, by analogy, tolt/tollelt, all of which have gained enough traction to be included in the dictionary.

Next one: the adessive case, with the suffix -l. Having excluded the words from the tables above, we see these new forms popping up in the list of naughty forms:

| Word | Base form | Translation | Actual case form | Formula-based form |
|---|---|---|---|---|
| kes | kelle | who | kel | kellel |
| mis | mille | what | mil | millel |

Same story as above, but there’s also a lesson lurking here: we observe that the -lle at the end of the base form is very frequently dropped before the case suffix: millel/mil, sellelt/selt, kellel/kel, tollelt/tolt, and so on. This is probably our first hint at the existence of two parallel paradigms (spoiler: a longer one and a shorter one) for many pronouns, about which we will probably discover more going forward.
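
As a rough sketch of this observation, the long and the short variant can be produced from the same base form. The function name is made up for this example, and the rule is only checked against the pairs mentioned in this post; it is not claimed that every pronoun has both variants in every case.

```python
def long_and_short(genitive, suffix):
    """Return the formula-based form and, for base forms ending in -lle,
    the shortened variant built without that -lle."""
    forms = [genitive + suffix]
    if genitive.endswith("lle"):
        forms.append(genitive[:-3] + suffix)
    return forms


for genitive, suffix in [("selle", "lt"), ("tolle", "lt"),
                         ("kelle", "l"), ("mille", "l"), ("selle", "ks")]:
    print(genitive, "+", suffix, "->", long_and_short(genitive, suffix))
```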

Next one: the allative case, with the ending -le.

| Word | Base form | Translation | Actual case form | Formula-based form |
|---|---|---|---|---|
| ma | mu | I | mulle | mule |
| sa | su | you (informal) | sulle | sule |
| ta | ta | he/she | talle | tale |

These are, again, three instances of the shortened paradigm for personal pronouns, and we can observe that -le is applied as -lle in these words. I have no idea so far as to “why”. I guess it just sounds more natural this way to a native speaker!

With this, we have covered the so-called “exterior” group of locative cases (the ones that might be translated using “on” or “onto”).
Next, we’ll cover the “interior” ones (that might be translated with “in” or “into”).

First, the elative case, with the -st formula. Here we observe one exception that occurs in many words, but all of these words are just compounds with kodu as the base component:

| Word | Base form | Translation | Actual case form | Formula-based form |
|---|---|---|---|---|
| isakodu | isakodu | father’s home | isakodunt | isakodust |
| kodu | kodu | home | kodunt | kodust |
| koolkodu | koolkodu | school house | koolkodunt | koolkodust |

So, one more exception in our collection! Not sure how it is explained, though.

Next, inessive, with -s formula. No exception! Phew…

And the final locative case: illative, with the -sse formula. No exception again! Wait a minute, this is… fishy. I know for a fact that in this case there are tons of words that do not comply with the formula. Let’s check out some paradigm, for example that very same kodu, on Sõnaveeb. Aha! Here we see what has happened: Sõnaveeb (a website maintained by Eesti Keele Instituut, i.e. the Institute of the Estonian Language) simply considers all such “exceptions” to be instances of a separate case, the so-called “lühike sisseütlev”, i.e. “short illative”. This is where it diverges from the English Wikipedia on the subject (or, rather, Wikipedia diverges a bit).

Anyway, this “short illative case” does not have a formula. We have thus established that Estonian has 4 “real” cases, at least in the singular: nominative, genitive, partitive, and “short illative”. All the rest can, by and large, be produced with a simple formula (minus some exceptions).

If we look at the plurals, it is for the most part the same situation:

| Case | Suffix | Number of exceptions |
|---|---|---|
| Comitative | -ga | 0 |
| Abessive | -ta | 95 |
| Essive | -na | 0 |
| Terminative | -ni | 148 |
| Translative | -ks | 74,556 |

Wow, there clearly is something going on there! Let’s have a look at those massive discrepancies for the plural translative case:

| Word | Base form | Translation | Actual case form | Formula-based form |
|---|---|---|---|---|
| aabits | aabitsate | alphabet | aabitsaiks | aabitsateks |
| äädikas | äädikate | vinegar | äädikaiks | äädikateks |
| aadel | aadlite | nobility | aadleiks | aadliteks |

and so on…

So, what we’re dealing with here is a so-called ghost form, aabitsai/äädikai/aadlei, which is not given in dictionaries but is used as a base form by some words for building some of the plural cases. Now, how prevalent is it? Well, in my database there are 85,261 nominal paradigms, and 74,556 of those seem to use this shadow “root plural” form (for the translative at least), which is ~87%.

Let’s see how many discrepancies other cases have:

| Case | Suffix | Number of exceptions |
|---|---|---|
| Comitative | -ga | 0 |
| Abessive | -ta | 95 |
| Essive | -na | 0 |
| Terminative | -ni | 148 |
| Translative | -ks | 74,556 |
| Ablative | -lt | 74,556 |
| Adessive | -l | 74,556 |
| Allative | -le | 74,556 |
| Elative | -st | 74,556 |
| Inessive | -s | 74,556 |
| Illative | -sse | 74,556 |

Surprising, even suspicious, uniformity. Let’s have a look at, say, translative case forms and try to extract a plural root form out of those.

N days later…

Now… once we have identified the mysterious, shadow “root plural” form, let’s build a list of discrepancies while taking also these forms into account.
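
How the shadow form was actually extracted is not described here, so the following is only one plausible sketch, with made-up function names: strip the case suffix from an observed plural form, keep the remainder if it is not the plural genitive, and then accept a form as regular if it is built on either base. The sample data is the translative table above.

```python
def root_plural(plural_genitive, observed_form, suffix):
    """Strip the case suffix from an observed plural form to recover its base.

    Returns None if the form does not end in the suffix, or if its base
    is simply the plural genitive (i.e. there is nothing new to learn).
    """
    if not observed_form.endswith(suffix):
        return None
    base = observed_form[: -len(suffix)]
    return None if base == plural_genitive else base


def is_regular(observed_form, suffix, plural_genitive, root=None):
    """True if the form is plural genitive + suffix or root plural + suffix."""
    bases = [plural_genitive] + ([root] if root else [])
    return any(observed_form == base + suffix for base in bases)


# (plural genitive, observed plural translative) from the table above
samples = [("aabitsate", "aabitsaiks"),
           ("äädikate", "äädikaiks"),
           ("aadlite", "aadleiks")]
for genitive, observed in samples:
    root = root_plural(genitive, observed, "ks")
    print(observed, "-> root plural:", root,
          "| regular now:", is_regular(observed, "ks", genitive, root))
```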

What do we have for the translative case? 0. Cool.

Let’s have a look at (plural) ablative case now. 0 discrepancies again.

Ok, let’s compose a table now, with shadow “root-plural” form included as a base for producing suffixed forms:

| Case | Suffix | Number of exceptions |
|---|---|---|
| Translative | -ks | 0 |
| Ablative | -lt | 0 |
| Adessive | -l | 0 |
| Allative | -le | 0 |
| Elative | -st | 0 |
| Inessive | -s | 0 |
| Illative | -sse | 0 |

So, basically, (almost) all the discrepancies we’ve seen above, for the plural forms, can be explained by “root plural” form, which is defined on ~87% of all lexemes.

Now let’s get back to the small number of discrepancies for abessive and terminative cases that we have seen above.

For abessive case these are:

| nominal | base | case | inflected | suffixed |
|---|---|---|---|---|
| ebajalg | ebajalge | genitive | ebajaluta | ebajalgeta |
| ebajalg | ebajalgade | genitive | ebajaluta | ebajalgadeta |
| eesjalg | eesjalge | genitive | eesjaluta | eesjalgeta |
| eesjalg | eesjalgade | genitive | eesjaluta | eesjalgadeta |
| esijalg | esijalge | genitive | esijaluta | esijalgeta |
| esijalg | esijalgade | genitive | esijaluta | esijalgadeta |

… and so on, but all of these are lexemes that have jalg (leg) as their root. And we can see that *jalu is the shadow “root plural” form for these words. So, the exception to memorize here is that Xjalg also uses the shadow “root plural” form for this case, unlike other words, which only use the plural genitive form.

And we’ve also seen discrepancies for the terminative case, let’s look at those:

| nominal | meaning | base | case | inflected | suffixed |
|---|---|---|---|---|---|
| kõrv | ear | kõrvade | genitive | kõrvuni | kõrvadeni |
| põlv | knee | põlvede | genitive | põlvini | põlvedeni |
| rind | breast | rindade | genitive | rinnuni | rindadeni |
| rind | breast | rinde | genitive | rinnuni | rindeni |
| silm | eye | silmade | genitive | silmini | silmadeni |
| silm | eye | silme | genitive | silmini | silmeni |

And a lot of other words, all based on one of these roots. So, another exception to remember is that words based on one of these roots also use their shadow “root plural” form to build the terminative case.
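
These two exceptions can be written down as a small lookup. This is a sketch with made-up names, and it makes one simplifying assumption: “based on a root” is approximated by “the word ends with that root”, which may be too crude for real data. The root plural forms used below (ebajalu, kõrvu, silmi) are the observed forms from the tables minus the case suffix.

```python
PLURAL_SUFFIXES = {"abessive": "ta", "terminative": "ni"}

# Roots that additionally build the given plural case on the root plural form
ROOT_PLURAL_EXCEPTIONS = {
    "abessive": ("jalg",),
    "terminative": ("silm", "põlv", "kõrv", "rind"),
}


def plural_forms(word, case, plural_genitive, root_plural):
    """Plural abessive/terminative forms: always plural genitive + suffix,
    plus root plural + suffix for the exceptional roots."""
    suffix = PLURAL_SUFFIXES[case]
    forms = [plural_genitive + suffix]
    if word.endswith(ROOT_PLURAL_EXCEPTIONS[case]):
        forms.append(root_plural + suffix)
    return forms


print(plural_forms("ebajalg", "abessive", "ebajalgade", "ebajalu"))
print(plural_forms("kõrv", "terminative", "kõrvade", "kõrvu"))
print(plural_forms("aabits", "abessive", "aabitsate", "aabitsai"))
```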

Let’s also quickly confirm probably the first rule one ever learns about Estonian morphology:

nominative plural = genitive singular + -d

My teacher even used to say:

Don’t memorize the second form (a.k.a the genitive), memorize the plural instead! You’ll get two for the price of one.

Oh wow, there are some exceptions indeed. But all of them are pronouns, which are irregular anyway, and declension of pronouns just has to be remembered.
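
The check for this rule is the same comparison as before, just with -d as the suffix. A minimal sketch, where the three sample rows are hand-picked for illustration and do not come from the database:

```python
# (word, singular genitive, nominative plural); hand-picked sample rows
samples = [
    ("raamat", "raamatu", "raamatud"),
    ("maja", "maja", "majad"),
    ("see", "selle", "need"),
]


def breaks_plural_rule(singular_genitive, nominative_plural):
    """True if the nominative plural is NOT singular genitive + -d."""
    return nominative_plural != singular_genitive + "d"


exceptions = [word for word, genitive, plural in samples
              if breaks_plural_rule(genitive, plural)]
print(exceptions)  # ['see']
```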

And finally, let’s summarize what we have learned about oh-so-scary Estonian case system:

Declension of nominals (no pronouns)

| Case (en.) | Case (est.) | Singular | Plural |
|---|---|---|---|
| Nominative | Nimetav | the word as in the dictionary | singular genitive + -d |
| Genitive | Omastav | has to be memorized | has to be memorized |
| Partitive | Osastav | has to be memorized | has to be memorized |
| Short illative | Lühike sisseütlev | has to be memorized | has to be memorized |
| Illative | Sisseütlev | singular genitive + -sse | plural genitive + -sse, or root plural + -sse |
| Inessive | Seesütlev | singular genitive + -s | plural genitive + -s, or root plural + -s |
| Elative | Seestütlev | singular genitive + -st (*except for words ending in kodu: these have the -nt ending) | plural genitive + -st, or root plural + -st |
| Allative | Alaleütlev | singular genitive + -le | plural genitive + -le, or root plural + -le |
| Adessive | Alalütlev | singular genitive + -l | plural genitive + -l, or root plural + -l |
| Ablative | Alaltütlev | singular genitive + -lt | plural genitive + -lt, or root plural + -lt |
| Translative | Saav | singular genitive + -ks | plural genitive + -ks, or root plural + -ks |
| Terminative | Rajav | singular genitive + -ni | plural genitive + -ni (*except for words based on silm, põlv, kõrv, rind: these also have root plural + -ni) |
| Essive | Olev | singular genitive + -na | plural genitive + -na |
| Abessive | Ilmaütlev | singular genitive + -ta | plural genitive + -ta (*except for words based on the root jalg: these also have root plural + -ta) |
| Comitative | Kaasaütlev | singular genitive + -ga | plural genitive + -ga |

So, as we can see: out of 30 possible case forms, only 8 have to be memorized; all the rest can be easily produced by simply adding a well-known suffix to one of three base forms. And there are just 6 roots that produce certain exceptions.
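
To make the summary concrete, here is a minimal sketch of a generator for the formula-based cases. The function and variable names are made up for this example. It deliberately ignores the exceptions listed above (the kodu, jalg, silm, põlv, kõrv and rind roots) as well as pronouns, and it does not produce the forms that have to be memorized.

```python
# Case suffixes as listed in the summary table
SUFFIXES = {
    "illative": "sse", "inessive": "s", "elative": "st",
    "allative": "le", "adessive": "l", "ablative": "lt",
    "translative": "ks", "terminative": "ni", "essive": "na",
    "abessive": "ta", "comitative": "ga",
}

# Cases whose plural can also be built on the shadow "root plural" form
ROOT_PLURAL_CASES = {"illative", "inessive", "elative", "allative",
                     "adessive", "ablative", "translative"}


def decline_singular(singular_genitive):
    """Singular forms: singular genitive + case suffix."""
    return {case: singular_genitive + suffix
            for case, suffix in SUFFIXES.items()}


def decline_plural(plural_genitive, root_plural=None):
    """Plural forms: plural genitive + suffix and, where applicable,
    root plural + suffix as a second variant."""
    forms = {}
    for case, suffix in SUFFIXES.items():
        variants = [plural_genitive + suffix]
        if root_plural is not None and case in ROOT_PLURAL_CASES:
            variants.append(root_plural + suffix)
        forms[case] = variants
    return forms


singular = decline_singular("kodu")
plural = decline_plural("aabitsate", "aabitsai")
print(singular["inessive"])    # kodus
print(plural["translative"])   # ['aabitsateks', 'aabitsaiks']
print(plural["comitative"])    # ['aabitsatega']
```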

I hope this post helps de-mystify the subject and bring down your fear of learning Estonian!!

Palju õnne, ja ruttu nägemiseni! (which means “Good luck, and see you soon!”)