Cyrillic script

The Cyrillic script is a writing system used for various languages across Eurasia and is used as the national script in various Slavic-, Turkic- and Iranic-speaking countries in Eastern Europe, the Caucasus, Central Asia, and Northern Asia.

In the 9th century AD the Bulgarian Tsar Simeon I the Great, following the cultural and political course of his father Boris I, commissioned a new script, the Early Cyrillic alphabet, to be made at the Preslav Literary School in the First Bulgarian Empire, which would replace the Glagolitic script, produced earlier by Saints Cyril and Methodius and the same disciples that created the new Slavic script in Bulgaria. The usage of the Cyrillic script in Bulgaria was made official in 893. The new script became the basis of alphabets used in various languages, especially those of Orthodox Slavic origin, and non-Slavic languages influenced by Russian. As of 2019, around 250 million people in Eurasia use it as the official alphabet for their national languages, with Russia accounting for about half of them. With the accession of Bulgaria to the European Union on 1 January 2007, Cyrillic became the third official script of the European Union, following Latin and Greek.

Cyrillic is derived from the Greek uncial script, augmented by letters from the older Glagolitic alphabet, including some ligatures. These additional letters were used for Old Church Slavonic sounds not found in Greek. The script is named in honor of the two Byzantine brothers, Saints Cyril and Methodius, who created the Glagolitic alphabet earlier on. Modern scholars believe that Cyrillic was developed and formalized by the early disciples of Cyril and Methodius, particularly by Clement of Ohrid.

In the early 18th century, the Cyrillic script used in Russia was heavily reformed by Peter the Great, who had recently returned from his Grand Embassy in western Europe. The new letterforms became closer to those of the Latin alphabet; several archaic letters were removed and several letters were personally designed by Peter the Great. West European typography culture was also adopted.


1. Letters

Cyrillic script spread throughout the East Slavic and some South Slavic territories, being adopted for writing local languages, such as Old East Slavic. Its adaptation to local languages produced a number of Cyrillic alphabets, discussed below.

Capital and lowercase letters were not distinguished in old manuscripts.

Yeri (Ы) was originally a ligature of Yer and I (Ъ + І = Ы). Iotation was indicated by ligatures formed with the letter І: Ꙗ (not an ancestor of modern Ya, Я, which is derived from Ѧ), Ѥ, Ю (a ligature of І and ОУ), Ѩ, Ѭ. Sometimes different letters were used interchangeably, for example И = І = Ї, as were typographical variants like О = Ѻ. There were also commonly used ligatures like ѠТ = Ѿ.

The letters also had numeric values, based not on Cyrillic alphabetical order, but inherited from the letters' Greek ancestors.

The early Cyrillic alphabet is difficult to represent on computers. Many of the letterforms differed from those of modern Cyrillic, varied a great deal in manuscripts, and changed over time. Few fonts include glyphs sufficient to reproduce the alphabet. In accordance with Unicode policy, the standard does not include letterform variations or ligatures found in manuscript sources unless they can be shown to conform to the Unicode definition of a character.

The Unicode 5.1 standard, released on 4 April 2008, greatly improves computer support for the early Cyrillic and the modern Church Slavonic language. In Microsoft Windows, the Segoe UI user interface font is notable for having complete support for the archaic Cyrillic letters since Windows 8.


2. Letterforms and typography

The development of Cyrillic typography passed directly from the medieval stage to the late Baroque, without a Renaissance phase as in Western Europe. Late Medieval Cyrillic letters, still found on many icon inscriptions today, show a marked tendency to be very tall and narrow, with strokes often shared between adjacent letters.

Peter the Great, Tsar of Russia, mandated the use of westernized letter forms in the early 18th century. Over time, these were largely adopted in the other languages that use the script. Thus, unlike the majority of modern Greek fonts, which retained their own set of design principles for lower-case letters, modern Cyrillic fonts are much the same as modern Latin fonts of the same font family. The development of some Cyrillic computer typefaces from Latin ones has also contributed to the visual Latinization of Cyrillic type.

Cyrillic uppercase and lowercase letter forms are not as differentiated as in Latin typography. Upright Cyrillic lowercase letters are essentially small capitals, although a good-quality Cyrillic typeface will still include separate small-caps glyphs.

Cyrillic fonts, like Latin ones, have roman and italic types. However, the native font terminology in most Slavic languages (for example, in Russian) does not use the words "roman" and "italic" in this sense. Instead, the nomenclature follows German naming patterns:

  • Roman type is called pryamoy shrift ("upright type"); compare with Normalschrift ("regular type") in German.
  • Italic type is called kursiv ("cursive") or kursivniy shrift ("cursive type"), from the German word Kursive, meaning italic typefaces and not cursive writing.
  • Cursive handwriting is rukopisniy shrift ("handwritten type") in Russian; in German it is Kurrentschrift or Laufschrift, both literally meaning "running type".

As in Latin typography, a sans-serif face may have a mechanically sloped oblique type (naklonniy shrift, "sloped" or "slanted type") instead of italic.

As in Latin fonts, the italic and cursive forms of many Cyrillic letters (typically lowercase; uppercase only for handwritten or stylish types) are very different from their upright roman forms. In certain cases, the correspondence between uppercase and lowercase glyphs does not coincide in Latin and Cyrillic fonts: for example, italic Cyrillic ⟨т⟩, which resembles a Latin ⟨m⟩, is the lowercase counterpart of ⟨Т⟩, not of ⟨М⟩.

A boldfaced type is called poluzhirniy shrift ("semi-bold type"), because there existed fully boldfaced shapes that have been out of use since the beginning of the 20th century. A bold italic combination (bold slanted) does not exist for all font families.

In Standard Serbian, as well as in Macedonian, some italic and cursive letters may take different forms so that they more closely resemble handwritten letters. The regular upright shapes are generally standardized among languages and there are no officially recognized variations.

The following table shows the differences between the upright and italic Cyrillic letters of the Russian alphabet. Italic forms significantly different from their upright analogues, or especially confusing to users of a Latin alphabet, are highlighted.

Note: in some fonts or styles, lowercase italic Cyrillic ⟨д⟩ may look like Latin ⟨g⟩ and lowercase italic Cyrillic ⟨т⟩ may look exactly like a capital italic ⟨T⟩, only smaller.


3. Cyrillic alphabets

Among others, Cyrillic is the standard script for writing the following languages:

  • Non-Slavic languages: Abkhaz, Aleut (now mostly in church texts), Bashkir, Chuvash, Erzya, Kazakh (to be replaced by Latin script by 2025), Kildin Sami, Komi, Kyrgyz, Dungan, Mari, Moksha, Mongolian, Ossetic, Papar, Papar Kadazan, Romani (some dialects), Sakha/Yakut, Tajik, Tatar, Tlingit (now only in church texts), Tuvan, Udmurt, Yuit (Siberian Yupik), and Yupik (in Alaska).
  • Slavic languages: Belarusian, Bulgarian, Macedonian, Russian, Rusyn, Serbo-Croatian, Ukrainian

The Cyrillic script has also been used for languages of Alaska, Slavic Europe (except for Western Slavic and some Southern Slavic), the Caucasus, Siberia, and the Russian Far East.

The first alphabet derived from Cyrillic was Abur, used for the Komi language. Other Cyrillic alphabets include the Molodtsov alphabet for the Komi language and various alphabets for Caucasian languages.


4. Name

Since the script was conceived and popularised by the followers of Cyril and Methodius, rather than by Cyril and Methodius themselves, its name denotes homage rather than authorship. The name "Cyrillic" often confuses people who are not familiar with the script's history, because it does not identify a country of origin (in contrast to the "Greek alphabet"). Among the general public, it is often called "the Russian alphabet", because Russian is the most popular and influential alphabet based on the script. Some Bulgarian intellectuals, notably Stefan Tsanev, have expressed concern over this, and have suggested that the Cyrillic script be called the "Bulgarian alphabet" instead, for the sake of historical accuracy. Strictly speaking, an alphabet is not the same as a script, so on this view the more accurate name would be the "Bulgarian script".

In Bulgarian, Macedonian, Russian, and Serbian, the Cyrillic alphabet is also known as azbuka, derived from the old names of the first two letters of most Cyrillic alphabets (just as the term alphabet came from the first two Greek letters, alpha and beta). In the Russian language, syllabaries, especially the Japanese kana, are commonly referred to as syllabic azbukas rather than syllabic scripts.


5. History

The Cyrillic script was created in the First Bulgarian Empire. Its first variant, the Early Cyrillic alphabet, was created at the Preslav Literary School. It is derived from the letters of the Greek uncial script, augmented by ligatures and consonants from the older Glagolitic alphabet for sounds not found in Greek. Tradition holds that Cyrillic and Glagolitic were formalized either by Saints Cyril and Methodius, who brought Christianity to the southern Slavs, or by their disciples. Paul Cubberley posits that although Cyril may have codified and expanded Glagolitic, it was his students in the First Bulgarian Empire under Tsar Simeon the Great who developed Cyrillic from the Greek letters in the 890s as a more suitable script for church books. Cyrillic later spread among other Slavic peoples, as well as among non-Slavic Vlachs.

Cyrillic and Glagolitic were used for the Church Slavonic language, especially the Old Church Slavonic variant. Hence expressions such as "И is the tenth Cyrillic letter" typically refer to the order of the Church Slavonic alphabet; not every Cyrillic alphabet uses every letter available in the script.

The Cyrillic script came to dominate Glagolitic in the 12th century. The literature produced in the Old Bulgarian language soon spread north, and the language became the lingua franca of the Balkans and Eastern Europe, where it also came to be known as Old Church Slavonic. The alphabet used for the modern Church Slavonic language in Eastern Orthodox and Eastern Catholic rites still resembles early Cyrillic. However, over the course of the following millennium, Cyrillic adapted to changes in spoken language, developed regional variations to suit the features of national languages, and was subjected to academic reform and political decrees. A notable example of such linguistic reform is that of Vuk Stefanović Karadžić, who updated the Serbian Cyrillic alphabet by removing certain graphemes no longer represented in the vernacular and introducing graphemes specific to Serbian (i.e. Љ Њ Ђ Ћ Џ Ј), distancing it from the Church Slavonic alphabet in use prior to the reform. Today, many languages in the Balkans, Eastern Europe, and northern Eurasia are written in Cyrillic alphabets.


6.1. Relationship to other writing systems: Latin script

A number of languages written in a Cyrillic alphabet have also been written in a Latin alphabet, such as Azerbaijani, Uzbek, Serbian and Romanian (in the Republic of Moldova until 1989, in Romania throughout the 19th century). After the disintegration of the Soviet Union in 1991, some of the former republics officially shifted from Cyrillic to Latin. The transition is complete in most of Moldova (except the breakaway region of Transnistria, where Moldovan Cyrillic is official), Turkmenistan, and Azerbaijan. Uzbekistan still uses both systems, and Kazakhstan has officially begun a transition from Cyrillic to Latin, scheduled to be complete by 2025. The Russian government has mandated that Cyrillic must be used for all public communications in all federal subjects of Russia, to promote closer ties across the federation. This act was controversial for speakers of many Slavic languages; for others, such as Chechen and Ingush speakers, the law had political ramifications. For example, the separatist Chechen government mandated a Latin script, which is still used by many Chechens. Those in the diaspora especially refuse to use the Chechen Cyrillic alphabet, which they associate with Russian imperialism.

Standard Serbian uses both the Cyrillic and Latin scripts. Cyrillic is nominally the official script of Serbia's administration according to the Serbian constitution; however, the law does not regulate scripts in the standard language, or the standard language itself, by any means. In practice the scripts are equal, with Latin being used more often in a less official capacity.

The Zhuang alphabet, used between the 1950s and 1980s in portions of the People's Republic of China, used a mixture of Latin, phonetic, numeral-based, and Cyrillic letters. The non-Latin letters, including Cyrillic, were removed from the alphabet in 1982 and replaced with Latin letters that closely resembled the letters they replaced.


6.2. Relationship to other writing systems: Romanization

There are various systems for Romanization of Cyrillic text, including transliteration to convey Cyrillic spelling in Latin letters, and transcription to convey pronunciation.

Standard Cyrillic-to-Latin transliteration systems include:

  • The Working Group on Romanization Systems of the United Nations recommends different systems for specific languages. These are the most commonly used around the world.
  • Scientific transliteration, used in linguistics, is based on the Bosnian and Croatian Latin alphabet.
  • ISO 9:1995, from the International Organization for Standardization.
  • GOST 16876, a now defunct Soviet transliteration standard, replaced by GOST 7.79, which is equivalent to ISO 9.
  • BGN/PCGN Romanization (1947, United States Board on Geographic Names & Permanent Committee on Geographical Names for British Official Use).
  • Various informal romanizations of Cyrillic, which adapt the Cyrillic script to Latin and sometimes Greek glyphs for compatibility with small character sets.
  • American Library Association and Library of Congress Romanization tables for Slavic alphabets (ALA-LC Romanization), used in North American libraries.

See also Romanization of Belarusian, Bulgarian, Kyrgyz, Russian, Macedonian and Ukrainian.
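
As a rough illustration of transliteration (conveying spelling, letter for letter) as opposed to transcription, the sketch below maps the 33 modern Russian letters to Latin using what is assumed here to be the ISO 9:1995 table. The table was typed in from memory for illustration and should be checked against the standard before any real use; the function name `transliterate` is hypothetical.

```python
import unicodedata

# Assumed ISO 9:1995 correspondences for the modern Russian alphabet.
# Each Cyrillic letter maps to exactly one Latin letter (with diacritics
# where needed), so the mapping is reversible in principle.
ISO9_RUSSIAN = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "ё": "ë",
    "ж": "ž", "з": "z", "и": "i", "й": "j", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ф": "f", "х": "h", "ц": "c", "ч": "č", "ш": "š", "щ": "ŝ", "ъ": "ʺ",
    "ы": "y", "ь": "ʹ", "э": "è", "ю": "û", "я": "â",
}

# Build a translation table covering both lowercase and uppercase letters.
_TABLE = {}
for cyr, lat in ISO9_RUSSIAN.items():
    _TABLE[ord(cyr)] = lat
    _TABLE[ord(cyr.upper())] = lat.upper()


def transliterate(text: str) -> str:
    """Transliterate Russian Cyrillic text letter by letter.

    Characters outside the table (spaces, digits, Latin) pass through
    unchanged. The result is normalized to NFC so accented Latin letters
    come out as single code points.
    """
    return unicodedata.normalize("NFC", text.translate(_TABLE))


print(transliterate("Москва"))  # Moskva
```

Because every letter maps to a single Latin letter, this style of table favors round-tripping over readability; the other systems listed above make different trade-offs.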


6.3. Relationship to other writing systems: Cyrillization

Representing other writing systems with Cyrillic letters is called Cyrillization.


7.1. Computer encoding: Unicode

As of Unicode version 13.0, Cyrillic letters, including national and historical alphabets, are encoded across several blocks:

  • Cyrillic Supplement: U+0500–U+052F
  • Cyrillic: U+0400–U+04FF
  • Phonetic Extensions: U+1D2B, U+1D78
  • Cyrillic Extended-A: U+2DE0–U+2DFF
  • Cyrillic Extended-C: U+1C80–U+1C8F
  • Combining Half Marks: U+FE2E–U+FE2F
  • Cyrillic Extended-B: U+A640–U+A69F
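
The block list above can be turned into a small lookup. The following is a minimal sketch that hard-codes only the ranges given in this section (Unicode 13.0); the names `CYRILLIC_RANGES` and `cyrillic_block` are hypothetical, and the ranges cover whole blocks, so they may include code points that are not assigned.

```python
from typing import Optional

# (block name, first code point, last code point), taken from the list above.
# For Phonetic Extensions only the two Cyrillic code points are included.
CYRILLIC_RANGES = [
    ("Cyrillic", 0x0400, 0x04FF),
    ("Cyrillic Supplement", 0x0500, 0x052F),
    ("Cyrillic Extended-C", 0x1C80, 0x1C8F),
    ("Phonetic Extensions", 0x1D2B, 0x1D2B),
    ("Phonetic Extensions", 0x1D78, 0x1D78),
    ("Cyrillic Extended-A", 0x2DE0, 0x2DFF),
    ("Cyrillic Extended-B", 0xA640, 0xA69F),
    ("Combining Half Marks", 0xFE2E, 0xFE2F),
]


def cyrillic_block(ch: str) -> Optional[str]:
    """Return the name of the listed block containing ch, or None."""
    code_point = ord(ch)
    for name, first, last in CYRILLIC_RANGES:
        if first <= code_point <= last:
            return name
    return None


print(cyrillic_block("Я"))       # Cyrillic
print(cyrillic_block("\ua640"))  # Cyrillic Extended-B
print(cyrillic_block("A"))       # None (Latin capital A)
```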

The characters in the range U+0400 to U+045F are basically the characters from ISO 8859-5 moved upward by 864 positions. The characters in the range U+0460 to U+0489 are historic letters, not used now. The characters in the range U+048A to U+052F are additional letters for various languages that are written with Cyrillic script.
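
The offset of 864 (0x360) can be checked with Python's built-in `iso8859_5` codec. This sketch decodes every byte in the upper half of ISO 8859-5 and collects the bytes that do not land exactly 864 positions higher in Unicode; the helper name `offset_exceptions` is hypothetical. The few exceptions it finds are non-letter characters, which is why the text says "basically".

```python
OFFSET = 864  # 0x360, the shift described above


def offset_exceptions() -> list:
    """Return the ISO 8859-5 bytes (0xA0-0xFF) that do NOT map to byte + 864."""
    exceptions = []
    for byte in range(0xA0, 0x100):
        ch = bytes([byte]).decode("iso8859_5")
        if ord(ch) != byte + OFFSET:
            exceptions.append(byte)
    return exceptions


# Capital A (А) is byte 0xB0 in ISO 8859-5 and U+0410 in Unicode.
assert ord(bytes([0xB0]).decode("iso8859_5")) == 0xB0 + OFFSET == 0x0410

# The exceptions are the no-break space, the soft hyphen, the numero sign
# and the section sign, which keep or get code points outside the block.
print([hex(b) for b in offset_exceptions()])  # ['0xa0', '0xad', '0xf0', '0xfd']
```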

Unicode as a general rule does not include accented Cyrillic letters. A few exceptions include:

  • a few Old and New Church Slavonic combinations: Ѷ, Ѿ, Ѽ;
  • combinations that are considered separate letters of their respective alphabets, like Й, Ў, Ё, Ї, Ѓ, Ќ, as well as many letters of non-Slavic alphabets;
  • the two most frequent combinations orthographically required to distinguish homonyms in Bulgarian and Macedonian: Ѐ, Ѝ.

To indicate stressed or long vowels, combining diacritical marks can be used after the respective letter (for example, U+0301 ◌́ COMBINING ACUTE ACCENT: ы́ э́ ю́ я́, etc.).
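
The difference between a stressed vowel and a letter such as Й can be seen with Python's standard `unicodedata` module. This is a small sketch; the helper name `stress` is hypothetical. Normalizing to NFC leaves the stressed vowel as two code points, because no precomposed character exists for it, while a base letter plus breve composes into the single code point for Й.

```python
import unicodedata

ACUTE = "\u0301"  # COMBINING ACUTE ACCENT


def stress(vowel: str) -> str:
    """Append a combining acute accent to a vowel and normalize to NFC."""
    return unicodedata.normalize("NFC", vowel + ACUTE)


stressed = stress("ы")
# Stays as base letter + combining mark: there is no precomposed form.
print([f"U+{ord(c):04X}" for c in stressed])  # ['U+044B', 'U+0301']

# И (U+0418) followed by COMBINING BREVE (U+0306) composes to Й (U+0419),
# because Й is encoded as a letter in its own right.
short_i = unicodedata.normalize("NFC", "\u0418\u0306")
print(f"U+{ord(short_i):04X}")  # U+0419

# Going the other way, NFD splits Ё into Е plus a combining diaeresis.
print([f"U+{ord(c):04X}" for c in unicodedata.normalize("NFD", "Ё")])
# ['U+0415', 'U+0308']
```

Software that counts characters or compares strings therefore needs to treat a stressed vowel as one user-perceived character made of two code points.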

Some languages, including Church Slavonic, are still not fully supported.

Unicode 5.1, released on 4 April 2008, introduced major changes to the Cyrillic blocks. Revisions to the existing Cyrillic blocks, and the addition of Cyrillic Extended-A (U+2DE0–U+2DFF) and Cyrillic Extended-B (U+A640–U+A69F), significantly improved support for the early Cyrillic alphabet, Abkhaz, Aleut, Chuvash, Kurdish, and Moksha.


7.2. Computer encoding: Other

Punctuation for Cyrillic text is similar to that used in European Latin-alphabet languages.

Other character encoding systems for Cyrillic:

  • CP866 – 8-bit Cyrillic character encoding established by Microsoft for use in MS-DOS, also known as GOST-alternative. Cyrillic characters go in their native order, with a "window" for pseudographic characters.
  • GOST-main.
  • MIK – 8-bit native Bulgarian character encoding for use in Microsoft DOS.
  • KOI8-R – 8-bit native Russian character encoding. Invented in the USSR for use on Soviet clones of American IBM and DEC computers. The Cyrillic characters go in the order of their Latin counterparts, which allowed the text to remain readable after transmission via a 7-bit line that removed the most significant bit from each byte: the result became a very rough, but readable, Latin transliteration of Cyrillic. It was the standard encoding of the early 1990s for Unix systems and the first Russian Internet encoding.
  • JIS and Shift JIS – Principally Japanese encodings, but there are also the basic 33 Russian Cyrillic letters in upper- and lower-case.
  • KOI8-U – KOI8-R with addition of Ukrainian letters.
  • Windows-1251 – 8-bit Cyrillic character encoding established by Microsoft for use in Microsoft Windows. The simplest 8-bit Cyrillic encoding: 32 capital characters in native order at 0xc0–0xdf, 32 lowercase characters at 0xe0–0xff, with the rarely used "YO" characters somewhere else. No pseudographics. Former standard encoding in some GNU/Linux distributions for Belarusian and Bulgarian, but currently displaced by UTF-8.
  • GB 2312 – Principally a simplified Chinese encoding, but there are also the basic 33 Russian Cyrillic letters in upper- and lower-case.
  • ISO/IEC 8859-5 – 8-bit Cyrillic character encoding established by the International Organization for Standardization.
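
Two of the layout properties described in the list above can be observed with Python's built-in `koi8_r` and `cp1251` codecs. This is a minimal sketch; the helper name `strip_high_bit` is hypothetical and simply simulates a 7-bit channel. Note that the case comes out inverted, since lowercase Cyrillic sits where uppercase Latin would be once the top bit is cleared.

```python
def strip_high_bit(text: str) -> str:
    """Encode text as KOI8-R, clear the top bit of each byte, read as ASCII."""
    raw = text.encode("koi8_r")
    return bytes(b & 0x7F for b in raw).decode("ascii")


# KOI8-R degrades into a rough Latin transliteration on a 7-bit line.
print(strip_high_bit("Привет"))  # pRIWET
print(strip_high_bit("Москва"))  # mOSKWA

# Windows-1251 keeps the 32 capitals and 32 lowercase letters in native
# alphabetical order in two contiguous runs.
print("А".encode("cp1251").hex(), "Я".encode("cp1251").hex())  # c0 df
print("а".encode("cp1251").hex(), "я".encode("cp1251").hex())  # e0 ff

# The "YO" letters live outside those runs (0xA8 and 0xB8 in this codec).
print("Ё".encode("cp1251").hex(), "ё".encode("cp1251").hex())  # a8 b8
```

The trade-off is visible here: KOI8-R gives up alphabetical byte order for graceful degradation, while Windows-1251 keeps alphabetical order (apart from Ё), which makes naive byte-wise sorting of Russian text mostly work.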


7.3. Computer encoding: Keyboard layouts

Each language has its own standard keyboard layout, adopted from typewriters. With the flexibility of computer input methods, there are also transliterating or phonetic/homophonic keyboard layouts made for typists who are more familiar with other layouts, like the common English QWERTY keyboard. When practical Cyrillic keyboard layouts or fonts are unavailable, computer users sometimes use transliteration or look-alike "volapuk" encoding to type in languages that are normally written with the Cyrillic alphabet.