Most recent update: 2023-3-19
Elexis
This site implements the ELEXIS Protocol for accessing dictionaries (version 1.0).
- A brief overview of the interface is here.
- A more complete description is here.
The queries are simply added to the base URL:
- https://old-norse.net/about/CleasbyVigfusson
- https://old-norse.net/list/CleasbyVigfusson?limit=30&offset=16900
- https://old-norse.net/lemma/CleasbyVigfusson/há?partOfSpeech=NOUN
- https://old-norse.net/json/CleasbyVigfusson/h:372:1
- I've included partOfSpeech filtering on the main /list/ query (the specs only list this as an option for /lemma/ queries).
- I've included a non-standard query "orig-html" that uses the same unique IDs as the Elexis lemmas for requesting the fully tagged HTML5 entry for each headword.
- https://old-norse.net/orig-html/CleasbyVigfusson/h:372:1
- Note: I'm still proofreading the auto-tagged meta-data, so there may be minor changes in the unique IDs from time to time as I update specific entries.
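As a rough illustration of how the example URLs above are put together, here is a minimal Python sketch. The helper name `elexis_url` and the constants are hypothetical (they are not part of the site or of the ELEXIS specification); the endpoint names, the dictionary ID, and the query parameters are taken from the examples above. The sketch only builds URL strings and does not contact the server.

```python
from urllib.parse import quote, urlencode

# Values taken from the example URLs above.
BASE_URL = "https://old-norse.net"
DICTIONARY = "CleasbyVigfusson"


def elexis_url(endpoint, *path_parts, **params):
    """Build a query URL: base / endpoint / dictionary / extra parts ? params.

    Hypothetical helper for illustration only.
    """
    parts = [endpoint, DICTIONARY, *path_parts]
    # Percent-encode non-ASCII characters (e.g. "á") but keep ":" readable,
    # since the unique IDs look like "h:372:1".
    path = "/".join(quote(part, safe=":") for part in parts)
    query = "?" + urlencode(params) if params else ""
    return f"{BASE_URL}/{path}{query}"


about = elexis_url("about")
page = elexis_url("list", limit=30, offset=16900)
lemma = elexis_url("lemma", "há", partOfSpeech="NOUN")
entry = elexis_url("json", "h:372:1")
tagged = elexis_url("orig-html", "h:372:1")  # the non-standard extension
```

Note that `há` comes out percent-encoded as `h%C3%A1`, which is the form a browser would normally send for the lemma URL shown above.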
Help Improve the Dictionary
If you find typos, formatting errors, OCR errors, and/or internal references that are missing a link (or existing links that point to the wrong location), please
The quality of the search function depends on the accuracy of the semantic tags and the meta-data (these are used to generate the search database). If you want to help proof-read these semantic tags you can view this color-coded version of the text. However, to see the meta-data, you'll need to look at the raw HTML of the fully tagged files.
Background
This version of An Icelandic-English Dictionary builds on the work of the Germanic Lexicon Project. I've made regular use of the original online version since 2004 and I always found it a valuable and convenient resource. However, the project stopped incorporating corrections prior to 2008 and, over the years, I found myself wanting to clean up those remaining errors, convert the text encoding entirely to Unicode, and make it easier to navigate the dictionary (especially when using the scanned images). I hope that my efforts in improving the online dictionary will be found useful by students and hobbyists alike.
After fiddling with conversion scripts off and on for a couple of years, I finally attacked the project as a whole in the summer of 2019 and have worked on it, as time permitted, since then. Specifically, I've completed the following:
- Converted the text to UTF-8 and embedded a font (Junicode, see below) so all characters are displayed correctly. The main text contained quite a few placeholders (e.g., &aolig; &avlig; &aalig; etc.) as well as rune, uncertain, unknown, illegible, and image tags that I've replaced with the correct Unicode characters or images as needed.
- Converted the markup to HTML5.
- Improved the formatting of the dictionary entries to use layers of hanging indents so it's easier to identify head-words and follow the logic of the longer entries.
- Proof-read/corrected the entire introductory section. This had not been open to the crowd-sourced proof-reading in the original project and was not included in Thomas Stridman's OCR improvements. This also included building tables for the grammar portion.
- Proof-read the Errata and Addenda that Thomas Stridman incorporated into the main text, adding all the missing material and marking all changes to the text with strike-through and footnotes.
- Moved the historical/etymological discussion of each letter into the introductory material (following the introduction to the alphabet) and moved the lists of rivers and Gaelic words from the appendix into the introduction (following the lists of verbs and irregular forms).
- Duplicated the list of negatives under ú- to a list of negatives under ó-. This was a common source of frustration for students as modern editions tend to use ó- for the negative prefix.
- Added links to all internal references within the dictionary (both to other main entries and to grammar sections).
- Further modified the hanging indents and full indents to make it easier to see multiple definitions mixed with quotes and entries where a word is combined with different prepositions, etc.
- Fixed remaining OCR errors. This is an ongoing process; however, through targeted searching and visually proofreading each entry, I have found and corrected a large number of errors that were still present in the text.
Search functionality:
- Added anchors to every head-word and every line in sub-entries.
- Created scripts to auto-generate search indices from the tagged files.
- Created the search interface to filter searching by type of information (headword vs. definition vs. quote vs. translation, etc.)
- Created a search algorithm that allows regular expressions and Unicode characters as well as a display of results with links to the original entries.
- Tagged the introduction/grammar sections so they could be included in searches.
- Visually inspected each page to ensure everything is tagged and tagged correctly (first draft).
- Modified the search algorithm so definitions are included in the search results when searching headwords (including the definitions from subsequent levels of large multi-level entries). This does slow down the search process, but the results are much more useful.
- Optimized the search algorithm (biggest improvement was for headwords with definitions attached, but noticeable improvement on other searches, too).
- Fixed the special character buttons so they insert at the current cursor position and don't steal focus from the text box. Also fixed some odd situations where special characters would insert at a position offset from the current cursor position.
- Added options to modify the spelling of the search term (when searching headwords) to take u-umlaut, i-umlaut, and other common variations (e.g., æ/œ, ǫ/ö) into account.
- Added option to sort by ranked order or alphabetic order. Currently, the ranking is based on how close the match is to the original search term and how many matches there are (e.g., when searching quotes). Penalties are currently applied for unmatched characters in words longer than the match and (when searching headwords with alternates enabled) for matching alternates instead of the original vowel.
- Modified the search engine so it makes use of the Elexis data structures and meta-data. Searching headwords is now much faster and definitions from multiple levels are returned (with links to each level). Definitions from related words are also returned if the data-equivid meta-data is pointing to the referred word.
- Searching headwords now also searches the inflected forms, when available. Dr. Langeslag has generously provided his curated set of verb conjugations; including compound words, this results in ~1400 fully inflected verbs in the index. I have also added the inflected forms of words ending in -ligr/-legr, -liga/-lega, -igr/-ugr; words ending in maðr and the kinship nouns (bróðir, etc.); ~60 common adjectives; and the common pronouns (~7000 words in all). More inflected forms will continue to be added to the index.
- Added -st/-zt variants for -sk/-zk endings because these endings are used in some editions.
Improvements for mobile devices:
- Modified the layout/design so it's easier to read through page scans and simpler to switch between page scans and the text version.
- Split all multi-column images into a single column for easier reading and a more convenient size on mobile devices.
- Note: in some places, the quality of the .PNG images from the original project can be poor, so you may at times need to refer to the higher-resolution .TIFF images in the original project.
- Added the ability to request a search from the URL itself. This is useful for e-readers that allow custom dictionaries. See notes under the instructions section of the search page.
- Added lazy image loading so the page scan view loads much faster.
- Added file compression to improve the speed of transferring/opening files.
- Step size for links is now calculated based on the total number of links to avoid overwhelming the screen on small devices.
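The step-size calculation mentioned in the last item could look something like the following Python sketch. The formula, the function name, and the default cap of 20 visible links are all assumptions made for illustration; the actual calculation used on the site is not described here.

```python
import math


def link_step(total_links, max_visible=20):
    """Pick a step so that at most max_visible links are shown.

    Hypothetical formula: show every step-th link, never less than 1.
    """
    return max(1, math.ceil(total_links / max_visible))


# With 101 links and a cap of 20, every 6th link would be shown.
step = link_step(101)
visible = list(range(0, 101, step))
```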
Elexis interface:
- Built an interface to handle Elexis queries.
- Wrapped each complete headword entry in div tags with a variety of data-* attributes for storing meta data for easier creation of index files.
- Created scripts to auto-tag the entries with various information needed for Elexis queries (e.g., parts of speech).
- Added an extension for requesting the complete, fully tagged, HTML5 entries for a given Elexis unique ID.
- The inflected word forms (see above) are also available within Elexis using ?inflected=1 or ?inflected=TRUE in a /lemma/ query.
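As a sketch of how the data-* attributes on the entry divs can feed an index, the following Python example collects them with the standard-library HTML parser. The sample markup is invented for illustration: only the attribute names data-pos and data-equivid are mentioned on this page, and the real entries carry more attributes and a different structure. The class and function names are hypothetical.

```python
from html.parser import HTMLParser


class EntryMetaParser(HTMLParser):
    """Collect the data-* attributes of every div that has any."""

    def __init__(self):
        super().__init__()
        self.entries = []

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        # Strip the "data-" prefix: data-pos -> pos, data-equivid -> equivid.
        meta = {
            name[len("data-"):]: value
            for name, value in attrs
            if name.startswith("data-")
        }
        if meta:
            self.entries.append(meta)


def extract_meta(html):
    parser = EntryMetaParser()
    parser.feed(html)
    return parser.entries


# Invented sample markup; the values are placeholders, not real entries.
SAMPLE = (
    '<div data-pos="NOUN" data-equivid="h:1:1"><b>headword</b> text</div>'
    '<div class="note">no meta-data here</div>'
)
records = extract_meta(SAMPLE)
```

Each record is a plain dictionary, so writing the results out as lines of an index file or filtering them by part of speech is straightforward.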
In progress:
- Updating meta-data to improve the results for the new search engine. For example, adding data-equivid tags when the definition refers to another word (id., v., cp., =, etc.); proofreading the part-of-speech tags (data-pos) and info like gender so that the search results can be filtered by part of speech, etc.
- Adding more inflected forms of the words for the main dictionary and the Elexis lemma query (inflected=true option).
- Verifying that links to homographs and multi-section entries are pointing to the correct location.
Future:
- Convert my current search engine (which is basically a set of flat text files with convoluted regular expressions) to a proper database approach. This will improve the speed of searches and make filtering the search results easier.
- Include the inflection information from Beygingarlýsing íslensks nútímamáls and link these modern inflection tables to the corresponding words in C.V. This will be particularly helpful for students and others reading electronic texts with modernized spelling.
- Add a tool to the main dictionary pages that allows viewing the inflection tables for a given word (both standardized Old Norse and modern, as available).
I have included the many OCR corrections from Thomas Stridman's version of this dictionary. His improvements to the original project were particularly valuable for the Greek words and phrases (none of which were interpreted correctly by the original OCR process). He also moved most of the Errata and Addenda into the main text.
I have embedded the Junicode font for the text. This typeface is based on George Hickes's Linguarum vett. septentrionalium thesaurus grammatico-criticus et archaeologicus (Oxford, Sheldonian Theatre, 1703–1705). It's specifically designed for medievalists (i.e., it implements the Medieval Unicode Font Initiative, version 4.0) and contains several features useful for this project like runic characters, proper Nordic shapes for þ and ð, and a number of unusual characters found in manuscripts that are sometimes used in the dictionary (e.g., and ).
—Scott Burt
Contact:
Information from the Original Project
Cleasby/Vigfusson is the most comprehensive and authoritative dictionary on Old Icelandic.
The copyright on this dictionary is expired. You are welcome to copy this data, post it on other web sites, create derived works, or use the data in any other way you please. As a courtesy, please credit the Germanic Lexicon Project.
Work on this project started in 2003. It was one of the major ongoing projects, with correction work being performed by volunteers worldwide. The goal was to produce a fully corrected document, marked up in XML.
Sean Crist initiated and was directing the project, and also did the OCR, major software design and programming, and ongoing global corrections. There was a hiatus in the project from 2008-2020, but it looks like things are moving forward again.
Scanning and preparation of these page images was made possible by a grant from the American-Scandinavian Foundation.
Germanic Lexicon Project
White Supremacy and Norse/Medieval Studies
There is a long history of people intentionally misrepresenting the history of medieval Europe and co-opting medieval symbols to support their racist political agendas. The Nazis infamously appropriated many facets of Nordic heritage and used it to justify genocide. Modern white supremacists and Neo-Nazis continue to use medieval symbols and distorted views of medieval Europe in their rallies and propaganda. Indeed, this has become so prevalent in recent times that those of us who are interested in various facets of medievalism or Norse history must be actively anti-racist lest our silence be taken as implicit support for these white supremacist groups and their hateful ideologies.
In other words, white supremacists and related bigots can find another source to use. You're not welcome here.
Here are some articles and resources related to the growing need for anti-racism in medieval studies and hobbies that include medieval and Norse topics:
- How Hate Groups are Hijacking Medieval Symbols While Ignoring the Facts Behind Them (article with some interviews)
- We Must Protect Our History From White Supremacists (article with some interviews)
- A Special Edition of The Public Medievalist: Race, Racism and the Middle Ages (this is a collection of scholarly articles on this topic)