Anatomy of Estonian declension, part I

2023-01-28 18:48:00 +0000

What is “declension”? As per Wikipedia:

In linguistics, declension (verb: to decline) is the changing of the form of a word, generally to express its syntactic function in the sentence, by way of some inflection. Declensions may apply to nouns, pronouns, adjectives, adverbs, and articles to indicate number (e.g. singular, dual, plural), case (e.g. nominative case, accusative case, genitive case, dative case), gender (e.g. masculine, neuter, feminine), and a number of other grammatical categories.

Estonian is one of the languages where declension really matters. Perhaps not as much as in the Slavic languages, but far more than in English or Dutch. Estonian nominals (primarily nouns and adjectives) have 14 cases (15 by some definitions), but most of those are simple agglutinations, i.e. a fixed suffix appended to some basic form. Some cases are not so simple, however, and getting used to them takes quite some time. In this blog post, we’ll try to identify some of the ways Estonian declension works.

Let’s go! I have a database of Estonian lexemes (described in another blog post), and we’ll be using it for our research. I will not be using Wikidata this time, although it should be fully possible to reproduce this with Wikidata Lexemes as well.

Let’s begin by formally identifying the “real” cases, i.e. the ones where forms are NOT (or not always) produced by simply appending a suffix to the basic form. To do that, we’ll find the lexemes (words) where basic form + case suffix differs from the actually observed form.

Let’s begin with the usual suspects: the “nina taga” group of cases (i.e. forms ending in -ni, -na, -ta and -ga):

-- Base forms: form type 2 is the singular genitive (omastav).
-- Rows whose representation is '-' are skipped.
with base as (
select r.representation, ef.paradigm_id as id
from ekilex_forms ef
join representations r on ef.word_representation_id = r.id
where ef.form_type_combination_id = 2 and r.representation <> '-'
),
-- Observed forms: form type 14 is the singular comitative (-ga).
declined as (
select r.representation, ef.paradigm_id as id
from ekilex_forms ef
join representations r on ef.word_representation_id = r.id
where ef.form_type_combination_id = 14 and r.representation <> '-'
)
-- Keep only the paradigms where the observed form differs from base + 'ga'.
select base.id, base.representation, declined.representation as inflected, concat(base.representation, 'ga') as suffixed
from base
join declined on base.id = declined.id
where declined.representation <> concat(base.representation, 'ga');

This produces ~70 results, which is basically nothing. All of them can be explained by multiple observed basic forms (the basic form in Estonian is omastav, i.e. the genitive case). So, obviously, wherever there is more than one omastav, we’ll see “discrepancies”.

If we throw away all the words with more than one singular genitive, we’ll get just these 4 results:

Word	Base form	Translation	Actual `-ga` form	Formula-based form
keegi	kellegi	someone	kellegagi	kellegiga
kumbki	kummagi	either (of two)	kummagagi	kummagiga
miski	millegi	something / nothing	millegagi	millegiga
ükski	ühegi	no one / anyone	ühegagi	ühegiga
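The comparison and the "exactly one genitive" filter can also be sketched outside the database. This is only a minimal illustration in Python, not the query I actually ran: the function name, the dictionaries and the paradigm ids are made up for the example, and paradigm 3 uses placeholder strings rather than real Estonian.

```python
def find_discrepancies(genitives, observed, suffix):
    """Return (id, base, observed form, formula form) where the two differ.

    Paradigms with more than one singular genitive are skipped, because
    any of their genitives could have served as the base.
    """
    results = []
    for paradigm_id, bases in genitives.items():
        if len(bases) != 1:
            continue
        base = bases[0]
        expected = base + suffix
        for form in observed.get(paradigm_id, []):
            if form != expected:
                results.append((paradigm_id, base, form, expected))
    return results


# Made-up paradigm ids. Paradigm 3 uses placeholder strings (not real
# Estonian) just to show a paradigm with two genitives being skipped.
genitives = {1: ["kellegi"], 2: ["kodu"], 3: ["xa", "xe"]}
comitatives = {1: ["kellegagi"], 2: ["koduga"], 3: ["xaga", "xega"]}

print(find_discrepancies(genitives, comitatives, "ga"))
# [(1, 'kellegi', 'kellegagi', 'kellegiga')]
```

Without the length check, paradigm 3 would be reported twice, once for each "wrong" pairing of genitive and comitative, which is the kind of noise the ~70 initial results consisted of.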

And here we see a very interesting phenomenon, a remnant of a system that is still very much present in Finnish: the relative positioning of suffixes. -gi/-ki is the so-called emphatic suffix, i.e. you add it to the end of a word when you want to put special emphasis on it, to highlight this word in a phrase. And, as we can see, it has to be placed in the last position, after the case suffix -ga. By the way, words with -gi/-ki normally never make it into the dictionary because, being emphatic, the suffix can be applied to any form of any word. But these particular words, initially just emphatic forms of other words, ended up being important enough to deserve their own place in the vocabulary.

Now that we know of these exceptions, it’s safe to exclude them as well, and to drop the -ga case (which is called comitative, and means “with, or including, this word”) from consideration as a real inflectional case.
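The "emphatic suffix goes last" ordering seen in the four words above can be written down as a tiny rule. A minimal sketch in Python, assuming the caller already knows which lexemes carry the emphatic suffix (the `has_clitic` flag and the function name are my own invention; an ordinary stem may end in the same letters, so the ending alone is not enough to decide):

```python
CLITICS = ("gi", "ki")


def attach_suffix(base, suffix, has_clitic=False):
    """Append a case suffix, keeping an emphatic -gi/-ki in last position."""
    if has_clitic:
        for clitic in CLITICS:
            if base.endswith(clitic):
                stem = base[: -len(clitic)]
                # The case suffix goes before the emphatic suffix.
                return stem + suffix + clitic
    return base + suffix


for base in ("kellegi", "kummagi", "millegi", "ühegi"):
    print(base, "->", attach_suffix(base, "ga", has_clitic=True))
```

With the flag set, the four genitives from the table produce the dictionary forms (kellegagi, kummagagi, millegagi, ühegagi) instead of the formula-based ones.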

Let’s now look at the -ta case (abessive, which means “without this word”). Same thing! In all cases (except those we’ve already excluded), the observed form (as provided by our reference dictionary) matches the formula.

Same results for -na (essive) and -ni (terminative) cases.

We get some interesting results with translative -ks case:

Word	Base form	Translation	Actual case form	Formula-based form
see	selle	this	seks	selleks
seesama	sellesama	this same	sekssamaks	sellesamaks
seesama	sellesama	this same	sellekssamaks	sellesamaks

These discrepancies are again consequences of duplication, but this time not of the base form but of the “cased” form. The word see has 2 forms for the singular translative, seks/selleks, so we get 1 result where the dictionary form differs from the formula “base form + -ks”. It’s even more “interesting” with seesama, which according to the dictionary has 2 translative forms, sellekssamaks and sekssamaks, neither of which matches the formula. So this word can be considered the one and only true deviation from the translative formula in the whole of the Estonian language (in the singular only; the plural is an entirely different story).
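One way to tell these two situations apart is to treat the observed forms of a paradigm as a set and ask whether the formula-based form is in it. The sketch below is just an illustration of that idea in Python; the function name and the three labels are hypothetical, not something the dictionary or my database provides.

```python
def classify(base, suffix, observed_forms):
    """Compare the formula form (base + suffix) with all observed forms."""
    expected = base + suffix
    observed = set(observed_forms)
    if observed == {expected}:
        return "regular"
    if expected in observed:
        # The formula form exists, next to at least one variant.
        return "regular with variants"
    return "deviation"


print(classify("selle", "ks", ["seks", "selleks"]))
print(classify("sellesama", "ks", ["sekssamaks", "sellekssamaks"]))
```

Here see comes out as "regular with variants" (selleks is among its forms), while seesama is a "deviation", since the formula-based sellesamaks is not listed at all.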

Now, let’s look at the so-called locative cases, which are used (mostly, but not only) to express the position of an object or the direction of an action. Let’s begin with the ablative case, which has the -lt suffix.

Word	Base form	Translation	Actual case form	Formula-based form
see	selle	this	selt	sellelt
seesama	sellesama	this same	selleltsamalt	sellesamalt
seesama	sellesama	this same	seltsamalt	sellesamalt
too	tolle	that	tolt	tollelt

Again, the usual suspects, which we should probably exclude from further analysis. These words and forms apparently occur so often in everyday speech that they have developed multiple realizations, selt/sellelt and, by analogy, tolt/tollelt, all of which have gained enough traction to be included in the dictionary.

Next one: the adessive case, with an -l suffix. Having excluded the words from the tables above, we see these new forms popping up in the naughty forms list:

Word	Base form	Translation	Actual case form	Formula-based form
kes	kelle	who	kel	kellel
mis	mille	what	mil	millel

Same story as above, but there’s also a lesson lurking here: the -lle- in these forms is very frequently dropped: millel/mil, sellelt/selt, kellel/kel, tollelt/tolt, and so on. This is probably our first hint at the existence of two parallel paradigms (spoiler: a longer one and a shorter one) for many pronouns, about which we will probably discover more going forward.
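In every long/short pair listed so far, the two forms differ by exactly one substring, lle. Here is a minimal Python sketch of that observation; the function name is made up, and the rule is only a pattern read off these few examples, not a general law of Estonian.

```python
def short_form(long_form):
    """Drop the first 'lle' from a long pronoun form, if there is one."""
    return long_form.replace("lle", "", 1)


pairs = {
    "millel": "mil",
    "sellelt": "selt",
    "kellel": "kel",
    "tollelt": "tolt",
    "selleks": "seks",
}
for long_form, dictionary_short in pairs.items():
    print(long_form, "->", short_form(long_form), short_form(long_form) == dictionary_short)
```

All five pairs from the tables above check out with this one replacement.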

Next one: allative case with -le ending.

Word	Base form	Translation	Actual case form	Formula-based form
ma	mu	I	mulle	mule
sa	su	you (informal)	sulle	sule
ta	ta	he/she	talle	tale

These are, again, three instances of the shortened paradigm for personal pronouns, and we can observe that -le is applied to these words as -lle. So far I have no idea why. I guess it just sounds more natural this way to a native speaker!

With this, we have covered the so-called “exterior” group of locative cases (the ones that might be translated using “on” or “onto”).
Next, we’ll cover the “interior” ones (that might be translated with “in” or “into”).

First, the elative case, with the -st formula. Here we observe one exception that occurs in many words, but all of these words are just compounds with kodu as the base component:

Word	Base form	Translation	Actual case form	Formula-based form
isakodu	isakodu	father’s home	isakodunt	isakodust
kodu	kodu	home	kodunt	kodust
koolkodu	koolkodu	school house	koolkodunt	koolkodust

So, one more exception for our collection! Not sure how it is explained, though.

Next, the inessive, with the -s formula. No exceptions! Phew…

And the final locative case: the illative, with the -sse formula. No exceptions again! Wait a minute, this is… fishy. I know for a fact that tons of words do not comply with the formula in this case. Let’s check out some paradigm, for example that very same kodu, on Sõnaveeb. Aha! Here we see what has happened: Sõnaveeb (a website maintained by Eesti Keele Instituut, i.e. the Institute of the Estonian Language) simply considers all such “exceptions” to be instances of a separate case, the so-called “lühike sisseütlev”, i.e. “short illative”. This is where it diverges from the English Wikipedia on the subject (or, rather, Wikipedia diverges a bit).

Anyway, this “short illative case” does not have a formula. So we have established that the Estonian language has 4 “real” cases, at least in the singular: nominative, genitive, partitive, and “short illative”. All the rest can, by and large, be produced with a simple formula (minus some exceptions).

If we look at the plurals, it is for the most part the same situation:

Case	Suffix	Number of exceptions
Comitative	-ga	0
Abessive	-ta	95
Essive	-na	0
Terminative	-ni	148
Translative	-ks	74 556

Wow, there clearly is something going on there! Let’s have a look at those massive discrepancies for a plural translative case:

Word	Base form	Translation	Actual case form	Formula-based form
aabits	aabitsate	alphabet	aabitsaiks	aabitsateks
äädikas	äädikate	vinegar	äädikaiks	äädikateks
aadel	aadlite	nobility	aadleiks	aadliteks

and so on…

So, what we’re dealing with here is a so-called ghost form: aabitsai / äädikai / aadlei, which is not given in dictionaries, but is used as the base form for building some of the plural cases of some words. Now, how prevalent is it? Well, in my database there are 85,261 nominal paradigms, and 74,556 of those seem to use this shadow plural root form (for the translative at least), which is ~87%.

Let’s see how many discrepancies other cases have:

Case	Suffix	Number of exceptions
Comitative	-ga	0
Abessive	-ta	95
Essive	-na	0
Terminative	-ni	148
Translative	-ks	74 556
Ablative	-lt	74 556
Adessive	-l	74 556
Allative	-le	74 556
Elative	-st	74 556
Inessive	-s	74 556
Illative	-sse	74 556

A surprising, even suspicious, uniformity. Let’s have a look at, say, the translative case forms and try to extract a plural root form out of those.

N days later…

Now… once we have identified the mysterious, shadow “root plural” form, let’s build a list of discrepancies while also taking these forms into account.

What do we have for the translative case? 0. Cool.

Let’s have a look at (plural) ablative case now. 0 discrepancies again.

Ok, let’s compose a table now, with shadow “root-plural” form included as a base for producing suffixed forms:

Case	Suffix	Number of exceptions
Translative	-ks	0
Ablative	-lt	0
Adessive	-l	0
Allative	-le	0
Elative	-st	0
Inessive	-s	0
Illative	-sse	0

So, basically, (almost) all the discrepancies we’ve seen above, for the plural forms, can be explained by “root plural” form, which is defined on ~87% of all lexemes.
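The extraction step itself is simple enough to sketch: take the non-formula plural translative, strip the -ks, and reuse what is left as a second base. The Python below is a simplified illustration with made-up function names, not the code behind the numbers above. The candidate forms it prints are generated by the formula and have not been checked against the dictionary one by one.

```python
def root_plural(translative_plural):
    """Recover the shadow "root plural" by stripping the translative -ks."""
    if not translative_plural.endswith("ks"):
        raise ValueError("expected a translative form ending in -ks")
    return translative_plural[:-2]


def plural_candidates(plural_genitive, root, suffix):
    """Both formula-based plural forms: genitive + suffix and root + suffix."""
    return {plural_genitive + suffix, root + suffix}


root = root_plural("aabitsaiks")
print(root)                                      # aabitsai
print(plural_candidates("aabitsate", root, "lt"))
```

An observed plural form then counts as a discrepancy only if it matches neither of the two candidates.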

Now let’s get back to the small number of discrepancies for abessive and terminative cases that we have seen above.

For abessive case these are:

nominal	base	case	inflected	suffixed
ebajalg	ebajalge	genitive	ebajaluta	ebajalgeta
ebajalg	ebajalgade	genitive	ebajaluta	ebajalgadeta
eesjalg	eesjalge	genitive	eesjaluta	eesjalgeta
eesjalg	eesjalgade	genitive	eesjaluta	eesjalgadeta
esijalg	esijalge	genitive	esijaluta	esijalgeta
esijalg	esijalgade	genitive	esijaluta	esijalgadeta

… and so on, but all of these are lexemes that have jalg (leg) as their root. And we can see that *jalu is the shadow “root plural” form for these words. So, the exception to memorize here is that Xjalg also uses the shadow “root plural” form for this case, unlike other words, which only use the plural genitive form for it.

And we’ve also seen discrepancies for the terminative case, let’s look at those:

nominal	meaning	base	case	inflected	suffixed
kõrv	ear	kõrvade	genitive	kõrvuni	kõrvadeni
põlv	knee	põlvede	genitive	põlvini	põlvedeni
rind	breast	rindade	genitive	rinnuni	rindadeni
rind	breast	rinde	genitive	rinnuni	rindeni
silm	eye	silmade	genitive	silmini	silmadeni
silm	eye	silme	genitive	silmini	silmeni

And a lot of other words, all based on one of these roots. So, another exception to remember is that words based on one of these roots also use their shadow “root plural” form to build the terminative case.

Let’s also quickly confirm probably the first rule one ever learns about Estonian morphology:

nominative plural = genitive singular + -d

My teacher even used to say:

Don’t memorize the second form (a.k.a the genitive), memorize the plural instead! You’ll get two for the price of one.

Oh wow, there are some exceptions indeed. But all of them are pronouns, which are irregular anyway, and declension of pronouns just has to be remembered.
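The rule and the check against observed plurals fit in a few lines. A small Python sketch with hypothetical function names; the example forms are common words I am sure of, not rows pulled from my database.

```python
def nominative_plural(genitive_singular):
    """Formula-based nominative plural: singular genitive + -d."""
    return genitive_singular + "d"


def follows_d_rule(genitive_singular, observed_plural):
    """True if the observed nominative plural matches the formula."""
    return nominative_plural(genitive_singular) == observed_plural


print(follows_d_rule("kodu", "kodud"))   # True
print(follows_d_rule("jala", "jalad"))   # True
print(follows_d_rule("selle", "need"))   # False: the pronoun see is irregular
```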

And finally, let’s summarize what we have learned about oh-so-scary Estonian case system:

Declension of nominals (no pronouns)

Case (en.)	Case (est.)	Singular	Plural
Nominative	Nimetav	the word as in dictionary	singular genitive + `-d`
Genitive	Omastav	has to be memorized	has to be memorized
Partitive	Osastav	has to be memorized	has to be memorized
Short illative	Lühike sisseütlev	has to be memorized	has to be memorized
Illative	Sisseütlev	singular genitive + `-sse`	plural genitive + `-sse` or root plural + `-sse`
Inessive	Seesütlev	singular genitive + `-s`	plural genitive + `-s` or root plural + `-s`
Elative	Seestütlev	singular genitive + `-st`, except for words ending in `kodu`, which have the `-nt` ending	plural genitive + `-st` or root plural + `-st`
Allative	Alaleütlev	singular genitive + `-le`	plural genitive + `-le` or root plural + `-le`
Adessive	Alalütlev	singular genitive + `-l`	plural genitive + `-l` or root plural + `-l`
Ablative	Alaltütlev	singular genitive + `-lt`	plural genitive + `-lt` or root plural + `-lt`
Translative	Saav	singular genitive + `-ks`	plural genitive + `-ks` or root plural + `-ks`
Terminative	Rajav	singular genitive + `-ni`	plural genitive + `-ni`, except for words based on `silm`, `põlv`, `kõrv`, `rind`, which also have root plural + `-ni`
Essive	Olev	singular genitive + `-na`	plural genitive + `-na`
Abessive	Ilmaütlev	singular genitive + `-ta`	plural genitive + `-ta`, except for words based on the root `jalg`, which also have root plural + `-ta`
Comitative	Kaasütlev	singular genitive + `-ga`	plural genitive + `-ga`

So, as we can see: out of 30 possible case forms, only 8 have to be memorized; all the rest can be produced by simply adding a well-known suffix to one of three basic forms. And there are just 6 roots that produce certain exceptions.
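To show how little machinery the formula-based part needs, here is a rough Python sketch that generates the suffixed cases from the three basic forms. Everything in it (names, data layout) is made up for illustration. It covers only the rows of the table that have a formula, and it keeps the regular `-st` form next to `-nt` for kodu words, which is my assumption rather than something the table states.

```python
SUFFIXES = {
    "illative": "sse", "inessive": "s", "elative": "st",
    "allative": "le", "adessive": "l", "ablative": "lt",
    "translative": "ks", "terminative": "ni", "essive": "na",
    "abessive": "ta", "comitative": "ga",
}
# Cases whose plural may also be built on the shadow "root plural".
ROOT_PLURAL_CASES = {
    "illative", "inessive", "elative", "allative",
    "adessive", "ablative", "translative",
}
# Roots that additionally use the root plural in one more case.
EXTRA_ROOT_PLURAL = {
    "terminative": ("silm", "põlv", "kõrv", "rind"),
    "abessive": ("jalg",),
}


def decline(nominative, gen_sg, gen_pl, root_pl=None):
    """Return {case: (singular forms, plural forms)} for the formula cases."""
    table = {}
    for case, suffix in SUFFIXES.items():
        singular = [gen_sg + suffix]
        if case == "elative" and nominative.endswith("kodu"):
            # Assumption: the -nt form is listed next to the regular one.
            singular.append(gen_sg + "nt")
        plural = [gen_pl + suffix]
        uses_root = case in ROOT_PLURAL_CASES or nominative.endswith(
            EXTRA_ROOT_PLURAL.get(case, ())
        )
        if root_pl is not None and uses_root:
            plural.append(root_pl + suffix)
        table[case] = (singular, plural)
    return table


forms = decline("jalg", "jala", "jalgade", root_pl="jalu")
print(forms["comitative"])  # (['jalaga'], ['jalgadega'])
print(forms["abessive"])    # (['jalata'], ['jalgadeta', 'jaluta'])
```

The nominative, genitive, partitive and short illative are deliberately left out: those are the forms that have to be memorized.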

I hope this post helps de-mystify the subject and bring down your fear of learning Estonian!!

Palju õnne, ja ruttu nägemiseni! (which means “Good luck, and see you soon!”)