Lexicography and Dictionary Structure Principles
Lexicology and Lexicography
The adjective lexicographic is related to the names of two linguistic disciplines: lexicology and lexicography, both of which come from the Greek word lexikón meaning ‘speech’, ‘way of speaking’ or, simply, ‘word’. They are not the same. Lexicology is a theoretical discipline focused on the study of words; i.e., it explores the form, structure, and meaning of words and the processes whereby new words are formed. In contrast, although using information from lexicology, lexicography is a practical discipline that deals with the concrete forms of words—that is, it collects, explores, and describes lexical units (words and word combinations) in dictionaries and other lexicographic tools. In other words, lexicology provides the theoretical basis for lexicography, and lexicography may be seen as applied lexicology.
The History and Definition of Dictionaries
Dictionaries are typical examples of reference books, that is, books that provide information about things or people rather than extensive reading material. The Oxford English Dictionary defines a dictionary as a book that deals with the individual words of a language (or certain specified classes of them), so as to set forth their orthography, pronunciation, signification, and use, their synonyms, derivation, and history, or at least some of these facts. For convenience of reference, the words are arranged in some stated order (now, in most languages, alphabetical), and in larger dictionaries the information given is illustrated by quotations from literature. The same entry also glosses a dictionary as a word-book, vocabulary, or lexicon (www.oed.com). However, dictionaries have not always included this information about words.
The earliest dictionaries in English were glossaries: lists of foreign words (usually French, Italian, and Latin) with their English equivalents. In fact, the word dictionary comes from Medieval Latin dictionarium, meaning ‘collection of words and phrases’ (from Latin dictionarius ‘of words’ and dictio ‘word’). These early glossaries compiled the words in Latin manuscripts that were considered difficult for readers, and they were inserted at the end of the texts in the form of lists. At first, the glosses did not follow any particular order or criterion, but in the 15th and 16th centuries, glossary authors started to organize them alphabetically. The main goal of these glossaries was to help learners master Latin, which was for a long time the language of educated and educational contexts.
The Evolution of Modern Lexicography
However, the early glossaries were too extensive and random, and finding words in them was not easy. This situation started to change in the late Renaissance, when growing attention to vernacular languages (i.e., the languages used by ordinary people, as opposed to the literary or official languages used by scholars, the church, or royalty) led to the emergence of bilingual and multilingual dictionaries. Even so, these early dictionaries were rather limited: their main goal was to help readers understand highly valued texts (usually religious and literary), so the words chosen came from those texts and were the terms that dictionary makers considered hardest to understand.
In the 18th century, dictionaries started to include words other than difficult ones and also paid attention to their etymology. Their goal was to educate ‘ignorant’ or illiterate readers (remember that access to education was limited at that time). Two very influential people in this respect were the English writer Samuel Johnson (1709-1784) and the American lexicographer Noah Webster (1758-1843). Their main contributions to lexicography are Samuel Johnson’s A Dictionary of the English Language and Noah Webster’s An American Dictionary of the English Language.
Samuel Johnson wrote his dictionary in two stages. In 1747, he published The Plan of a Dictionary of the English Language, where he explained his motivation and goals. A Dictionary of the English Language (two volumes) was finally published in 1755 and included 42,773 words. As stated on its cover, this was “a dictionary in which the words are deduced from their originals and illustrated in their different significations by examples from the best writers.” Johnson’s main goal was to preserve and standardize the English language; therefore, his dictionary was essentially prescriptive. In order to “fix” English, the dictionary traced the origins and history of words, illustrating them with quotations from Middle English to show changes in meaning, offered readers guidance on pronunciation, and provided information on all sorts of words (not just ‘hard’ ones), such as vulgar words, professional jargon, and barbarisms.
Another important figure is Noah Webster. He was an influential writer and lexicographer in the United States of America and published his first dictionary in 1806 under the name A Compendious Dictionary of the English Language. However, his main work was An American Dictionary of the English Language (with 70,000 words), first published in 1828 and republished in two volumes in 1840. In contrast to Johnson’s dictionary, Webster’s dictionary was basically descriptive and synchronic, and the words included in it were described on the basis of usage examples rather than the works of canonical authors. The dictionary was highly influential in its own time. After its acquisition by Charles Merriam in 1843, it remained the main lexicographic reference in the US through the various Merriam-Webster dictionaries currently in use.
Glossaries and Encyclopedias
Glossaries consist of alphabetical lists of terms with their definitions or brief explanations. They are typically placed at the end of books and include words from a particular domain of knowledge which are (a) considered important in their respective domains, (b) newly introduced, or (c) highly specialized. The main goal of glossaries is to explain the concepts referred to by such specialized words and, in this sense, they are related to defining vocabularies and ontologies. For instance, at the end of this course book, you will find a glossary that includes the most important terms and notions seen in this subject.
The word encyclopaedia (also spelled encyclopedia) comes from the Greek enkyklios paideia, meaning ‘general education’. An encyclopaedia is a reference work that offers general and comprehensive information about topics and subjects of human knowledge, such as historical events, geographical features, and short biographies of important people. Because of their large scope, encyclopaedias typically consist of several volumes. The information is usually organized in alphabetical order and is introduced or ‘summarized’ in a subject index located either at the back of the encyclopaedia or in a separate volume. Encyclopaedias relate words to things and events in the extra-linguistic world and provide extensive information about them. They often include graphic information (pictures, maps, charts, etc.) alongside verbal explanations. Finally, because they focus on explaining concepts in the world around us, their headwords are nouns; they do not include entries for prepositions, conjunctions, or verbs.
Encyclopedic Dictionaries and Linguistic Corpora
An encyclopaedic dictionary is a reference book that offers in-depth explanations of concepts arranged alphabetically. It is a combination of an encyclopaedia and a dictionary in that it often incorporates images into the verbal explanations and covers not only nouns but also verbs and adjectives.
Linguistic corpora (singular: corpus) are compilations or collections of texts (either written texts or transcriptions of oral speech). Although these collections do not provide explicit descriptions or explanations about the data they contain, they can be searched and, therefore, allow for exploring words in their real context of use. Thanks to technology, contemporary corpora are machine-readable and are the main source of data in lexicography.
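To make the idea of searching a corpus more concrete, here is a minimal Python sketch of a keyword-in-context (KWIC) search, one common way of viewing a word in its real context of use. The three-sentence corpus and the function name kwic are invented for illustration. Real corpora contain millions of words and are normally queried with dedicated tools.

```python
import re

# A tiny invented "corpus"; a real corpus is vastly larger.
CORPUS = [
    "She did not realize how late it was.",
    "They hope to realize their plans next year.",
    "It was late when the heavy rain started.",
]

def kwic(corpus, keyword, width=2):
    """Return each occurrence of `keyword` with up to `width` words of context on each side."""
    hits = []
    for text in corpus:
        # Lowercase the text and keep only alphabetic tokens (a deliberately simple tokenizer).
        tokens = re.findall(r"[a-z]+", text.lower())
        for i, token in enumerate(tokens):
            if token == keyword:
                left = " ".join(tokens[max(0, i - width):i])
                right = " ".join(tokens[i + 1:i + 1 + width])
                hits.append((left, token, right))
    return hits

# Print every hit with the keyword aligned in the middle.
for left, word, right in kwic(CORPUS, "realize"):
    print(f"{left:>15} [{word}] {right}")
```

Each hit shows the keyword with a little context on both sides, which is the kind of evidence a lexicographer might inspect before deciding how many senses a word has.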
Summary of Lexicographic Tools
Lexicographic tools play an important role in learning a language. With their help, you can look up words and explore things such as their spelling, pronunciation, collocations, meanings in various contexts, translations, syntactic properties, and usage. To use a lexicographic tool effectively, you need to consider:
- What you really want to find
- The type of information each tool offers
- How each tool works
In other words, you must learn how to find information accurately and quickly. You may also need to use several tools. For instance, if you want to know how to use the verb realize, you will need to consider:
- The spelling (realize or realise?)
- How many senses the verb has
- The translation in your own language
- Whether the meaning depends on the context
In order to find out, you may start by using a monolingual or bilingual dictionary, but you may also need a thesaurus, a glossary, or a textual corpus.
Structure and Language of Dictionaries
Dictionaries are reference books, meaning people use them to find specific information rather than reading them from beginning to end. They are characterized by a typical textual structure with four main levels: megastructure, macrostructure, mesostructure, and microstructure.
Megastructure
The megastructure refers to the global physical organization of the dictionary, including:
- A lemma list: The main body of the dictionary.
- Front matter: Sections before the lemma list (preface, introduction, user guide, abbreviations).
- Back matter: Sections after the lemma list (grammatical information, irregular verbs, conversion tables, bibliography).
Macrostructure
The macrostructure concerns the organization of words within the main body. There are two main types:
Alphabetical Macrostructure
Conventional dictionaries follow a semasiological (form-based) approach: they start from the form of a word and lead the user to its meaning, so lemmas are presented in alphabetical order. General-purpose dictionaries use a strict alphabetical order, while learner dictionaries might group lemmas in horizontal clusters (niches and nests) to show morphological relationships.
Thematic Macrostructure
Some dictionaries follow an onomasiological (meaning-based) approach: they start from a concept and lead the user to the words that express it. This avoids hiding relations between synonyms like tub and vat, which would otherwise be far apart in an alphabetical list. Technical dictionaries and thesauri often use this systematic thematic order.
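The contrast between the two macrostructures can be sketched in a few lines of Python. The mini-lexicon and its theme labels are invented for illustration (only tub and vat come from the text above), and the function names are hypothetical.

```python
# Invented mini-lexicon: each lemma is paired with a hypothetical theme label.
LEXICON = {
    "tub": "containers",
    "apple": "fruit",
    "vat": "containers",
    "tulip": "flowers",
    "pear": "fruit",
    "barrel": "containers",
    "rose": "flowers",
}

def alphabetical_macrostructure(lexicon):
    """Order lemmas by their form alone, as in a conventional dictionary."""
    return sorted(lexicon)

def thematic_macrostructure(lexicon):
    """Group lemmas under their theme, as in a thesaurus."""
    groups = {}
    for lemma, theme in lexicon.items():
        groups.setdefault(theme, []).append(lemma)
    # Sort the themes and the lemmas inside each theme for a stable presentation.
    return {theme: sorted(lemmas) for theme, lemmas in sorted(groups.items())}

print(alphabetical_macrostructure(LEXICON))
print(thematic_macrostructure(LEXICON))
```

In the alphabetical list, tub and vat are separated by an unrelated word, whereas the thematic grouping places them side by side under the same heading.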
Mesostructure
The mesostructure (or mediostructure) refers to the set of cross-references that take users from one place to another. These help users locate information and understand word relationships. Cross-references can be:
- Lexical: Directives like see, look up, or cf.
- Visual: Symbols like arrows (→), pointing fingers (☞), or typographic changes (bold, italics, SMALL CAPITALS).
- Hypertext: Links in online dictionaries.
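As a rough illustration of how a lexical cross-reference takes the user from one entry to another, the following Python sketch follows "see" directives until it reaches an actual definition. The entries, the "see " convention, and the function name look_up are all assumptions made for this example. They do not reproduce the format of any real dictionary.

```python
# Invented entries: a value starting with "see " acts as a lexical cross-reference.
ENTRIES = {
    "realise": "see realize",
    "realize": "to become aware of something",
}

def look_up(entries, lemma, max_hops=5):
    """Follow 'see ...' cross-references until a definition is reached."""
    for _ in range(max_hops):
        text = entries[lemma]
        if not text.startswith("see "):
            return lemma, text
        lemma = text[len("see "):]  # jump to the referenced entry
    # Give up if the references never reach a definition (e.g. two entries pointing at each other).
    raise ValueError("too many cross-references; possible loop")

print(look_up(ENTRIES, "realise"))
```

The hop limit is a simple safeguard against circular references, where two entries send the user back and forth without ever giving a definition.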
Microstructure
The microstructure (entry structure) refers to how information about a lemma is presented within an entry. It includes:
- Formal aspects: Spelling, pronunciation, and grammar.
- Semantic aspects: Definitions, senses, connotations, and semantic relations (synonyms, antonyms).
- Pragmatic aspects: Register, currency, and dialectal variants.
- Additional information: Etymology, usage notes, and illustrations.
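The four kinds of information listed above can be pictured as the fields of a record. The Python sketch below is one possible way to model an entry, not a standard format. The class and field names are hypothetical, and the sample values for realize are illustrative rather than copied from any published dictionary.

```python
from dataclasses import dataclass, field

@dataclass
class Sense:
    """Semantic and pragmatic information for one meaning of a lemma."""
    definition: str
    synonyms: list[str] = field(default_factory=list)
    register: str = "neutral"  # pragmatic label, e.g. "formal" or "informal"

@dataclass
class Entry:
    """One dictionary entry: formal aspects first, then senses and extras."""
    lemma: str
    pronunciation: str
    part_of_speech: str
    senses: list[Sense] = field(default_factory=list)
    etymology: str = ""
    usage_note: str = ""

def summarize(entry):
    """Return a one-line overview of an entry."""
    return f"{entry.lemma} ({entry.part_of_speech}): {len(entry.senses)} sense(s)"

# Illustrative sample entry with two senses.
realize = Entry(
    lemma="realize",
    pronunciation="/ˈriːəlaɪz/",
    part_of_speech="verb",
    senses=[
        Sense("to become aware of something", synonyms=["understand"]),
        Sense("to make something happen", synonyms=["achieve"], register="formal"),
    ],
    usage_note="Also spelled realise.",
)

print(summarize(realize))
```

Keeping each sense as its own record reflects the fact that register labels and synonyms usually apply to one meaning of a word rather than to the lemma as a whole.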
Classification and Types of Dictionaries
Dictionaries are classified according to three main criteria:
- Number of languages: Monolingual, bilingual, or multilingual.
- Target user: General users, language learners, or professionals (terminology and jargon).
- Temporal span: Diachronic (historical) or synchronic (current usage).
Defining the Word and Collocation
What is a Word?
Leonard Bloomfield defined the word as a “minimum free form.” Words can also be delimited phonologically (by potential pauses in speech), orthographically (by spaces in writing), or semantically (as units that carry a meaning of their own). However, no single criterion works in every case: words like have and been in “have been studying” do not fulfill the meaning criterion on their own. Linguists also use terms like word form, lexeme, and lemma to clarify these concepts.
What is Collocation?
Collocation is the tendency of words to occur with other words. For example, handsome typically collocates with men, while beautiful collocates with women. While big rain is grammatically possible, native speakers automatically use the collocation heavy rain. Collocation is a matter of natural usage rather than just grammatical correctness.
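One simple way to observe collocation in data is to count which words appear immediately before a given word. The Python sketch below does this for rain in four invented sentences. The function name left_collocates is hypothetical, and raw counts are a simplification, since corpus linguists usually rely on much larger samples and on statistical association measures.

```python
from collections import Counter
import re

# Invented example sentences; a real study would use a large corpus.
SENTENCES = [
    "Heavy rain fell all night.",
    "The heavy rain caused floods.",
    "A light rain began at noon.",
    "He carried a heavy bag.",
]

def left_collocates(sentences, node):
    """Count the words that appear immediately before `node`."""
    counts = Counter()
    for sentence in sentences:
        # Lowercase the sentence and keep only alphabetic tokens.
        tokens = re.findall(r"[a-z]+", sentence.lower())
        # Walk through adjacent word pairs and record the word preceding `node`.
        for previous, current in zip(tokens, tokens[1:]):
            if current == node:
                counts[previous] += 1
    return counts

print(left_collocates(SENTENCES, "rain").most_common())
```

In this small sample, heavy precedes rain more often than any other word, and big does not appear at all, which mirrors the preference for heavy rain described above.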
