The DAG < APT database contains Daghestanian lexemes that originate from Arabic, Persian and Turkic languages, as well as the original donor lexemes. The data was extracted from existing literature on lexical borrowing.

Arabic, Persian and Turkic languages[1] are prolific borrowing sources for Daghestanian languages due to their cultural importance. Arabic is the language of religion, and Persian and Turkic were important languages for trade and the exchange of knowledge. Of course, these languages have also influenced each other, so the borrowing of an Arabic lexeme into a Daghestanian language may have been mediated by Persian and/or a local Turkic language such as Azerbaijani.

The aim of the DAG < APT project is twofold. First, it brings together information from the rich literature on lexical borrowing from Arabic, Persian and Turkic into various Daghestanian languages in a single, searchable database that can be used for reference. Second, it creates a base of target lexemes that can be compared in terms of adaptation patterns and geographical distribution. Such comparison can help to uncover different historical and regional layers of borrowing, and perhaps to identify cases of mediated borrowing in a more reliable and systematic way.


DAG < APT consists of:

  • the main database: a collection of target lexemes and their origins extracted from available literature
  • an overview of attested donor lexemes and their translations
  • a list of sources on borrowings in Daghestanian languages


Samira Verhees came up with the idea for the database and cleaned the first batch of data. George Moroz wrote the first script to convert word lists from digital versions of books into a table format. Timofei Dedov created the website and is responsible for adding new data. Timofei Balahanov transliterated donor concepts from Arabic.

How to cite

Plain text

Balahanov, T., T. Dedov, and S. Verhees (2024). DAG < APT, An online database of borrowed vocabulary in the languages of Daghestan, v. 1.0.0. Moscow: Linguistic Convergence Laboratory, NRU HSE. http://lingconlab.ru/dagatlas.


BibTeX

@misc{dagapt2024,
  title = {DAG < APT, An online database of borrowed vocabulary in the languages of Daghestan, v. 1.0.0},
  author = {Timofei Balahanov and Timofei Dedov and Samira Verhees},
  year = {2024},
  publisher = {Linguistic Convergence Laboratory, NRU HSE},
  address = {Moscow},
  url = {http://lingconlab.ru/dagatlas},
}


Please contact Timofei Dedov (tgdedov@gmail.com) if you have any questions about the data. You can report technical issues in the DAG-APT repository on GitHub.


  1. We use the generalizing term “Turkic languages” following existing literature. The various Turkic languages that have had some influence on Daghestanian languages (Kumyk and Azerbaijani, and to a lesser extent Nogai and Turkish) are often grouped together because they can be difficult, if not impossible, to distinguish as lexical donors.