Nick Character Sets

From UnrealIRCd documentation wiki

In UnrealIRCd you can specify which "character sets" or languages should be allowed in nicknames. You do this in set::allowed-nickchars.

Available UTF8 character sets

UnrealIRCd 4.0.17 adds experimental support for UTF8 character encoding.

Note that many Services packages do not permit registration of nicks containing such characters. See also the Important notes section.

The following languages are available:

Name Description Script/Alphabet Allowed extra characters (other than the default)
hebrew-utf8 hebrew characters Hebrew script אבגדהוזחטיךכלםמןנסעףפץצקרשת
latin-utf8 latin characters Latin script ÀÁÂÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÖØÙÚÛÜÝÞßàáâäåæçèéêëìíîïðñòóôöøùúûüýþÿĂăĄąĆćČčĎďĘęĚěĹĺĽľŁłŃńŇňŐőŔŕŘřŚśŞşŠšŢţŤťŮůŰűŹźŻżŽž
french-utf8 french characters Latin script ÀÂÇÈÉÊËÎÏÔÙÛÜàâçèéêëîïôùûüÿ
slovak-utf8 slovak characters Latin script ÁÄÉÍáäéíóôúýČčĎďĹĺĽľňŔŕŠšŤťŽž
icelandic-utf8 icelandic characters Latin script ÁÆÍÐÓÖÚÝÞáæíðóöúýþ
danish-utf8 danish characters Latin script ÅÆØåæø
swedish-utf8 swedish characters Latin script ÄÅÖäåö
catalan-utf8 catalan characters Latin script ÀÇÈÉÍÏÒÓÚÜàçèéíïòóú
italian-utf8 italian characters Latin script ÀÈÉÌÍÒÓÙÚàèéìíòóùú
spanish-utf8 spanish characters Latin script ÁÉÍÑÓÚÜáéíñóúü
hungarian-utf8 hungarian characters Latin script ÁÉÍÓÖÚÜáéíóöúüŐőŰű
czech-utf8 czech characters Latin script ÁÉÍÓÚÝáéíóúýČčĎďĚěŇňŘřŠšŤťŮůŽž
romanian-utf8 romanian characters Latin script ÂÎâîĂăŞşŢţ
swiss-german-utf8 swiss-german characters Latin script ÄÖÜäöü
german-utf8 german characters Latin script ÄÖÜßäöü
turkish-utf8 turkish characters Latin script ÇÖÜçöüĞğıŞş
dutch-utf8 dutch characters Latin script èéëïöü
polish-utf8 polish characters Latin script ÓóĄąĆćĘęŁłŃńŚśŹźŻż
latvian-utf8 latvian characters (5.0.7+) Latin script
estonian-utf8 estonian characters (5.0.7+) Latin script
lithuanian-utf8 lithuanian characters (5.0.7+) Latin script
greek-utf8 greek characters Greek script ΆΈΉΊΌΎΏΐΑΒΓΔΕΖΗΘΙΚΛΜΝΞΟΠΡΣΤΥΦΧΨΩΪΫάέήίΰαβγδεζηθικλμνξοπρςστ
ukrainian-utf8 ukrainian characters Cyrillic script ЄІЇАБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЬЮЯабвгдежзийклмнопрстуфхцчшщьюяєіїҐґ
russian-utf8 russian characters Cyrillic script ЁАБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯабвгдежзийклмнопрстуфхцчшщъыьэюяё
cyrillic-utf8 cyrillic characters Cyrillic script ЁЄІЇЎАБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯабвгдежзийклмнопрстуфхцчшщъыьэюяёєіїўҐґ
belarussian-utf8 belarussian characters Cyrillic script ЁІЎАБВГДЕЖЗЙКЛМНОПРСТУФХЦЧШЫЬЭЮЯабвгдежзйклмнопрстуфхцчшыьэюяёіў
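To make the effect of the table concrete, the following Python sketch models the character check only. It is not UnrealIRCd's implementation: the names `BASIC_NICK_CHARS`, `EXTRA_CHARS` and `nick_chars_allowed` are hypothetical, only two rows of the table are copied in, and other nick rules (length, first character and so on) are ignored.

```python
# Illustrative model only; names here are hypothetical, not UnrealIRCd code.

# Basic nick characters that are always allowed (see Important notes).
BASIC_NICK_CHARS = set(
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "0123456789"
    "[\\]^_-{|}"
)

# Extra characters per set, copied from two rows of the table above.
EXTRA_CHARS = {
    "swedish-utf8": set("ÄÅÖäåö"),
    "danish-utf8": set("ÅÆØåæø"),
}


def nick_chars_allowed(nick, enabled_sets):
    """Return True if every character is basic or in an enabled set."""
    allowed = set(BASIC_NICK_CHARS)
    for name in enabled_sets:
        allowed |= EXTRA_CHARS[name]
    return all(ch in allowed for ch in nick)


print(nick_chars_allowed("Björn", ["swedish-utf8"]))
print(nick_chars_allowed("Søren", ["swedish-utf8"]))
print(nick_chars_allowed("Søren", ["swedish-utf8", "danish-utf8"]))
```

In this model, "Björn" passes with swedish-utf8 alone, while "Søren" only passes once danish-utf8 is enabled as well, because ø appears only in the Danish row.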

Available non-utf8 character sets

Table of all available "old" character sets (not using UTF8):

Name Description Character set / encoding
catalan Catalan characters iso8859-1 (latin1)
danish Danish characters iso8859-1 (latin1)
dutch Dutch characters iso8859-1 (latin1)
french French characters iso8859-1 (latin1)
german German characters iso8859-1 (latin1)
swiss-german Swiss-German characters (no es-zett) iso8859-1 (latin1)
icelandic Icelandic characters iso8859-1 (latin1)
italian Italian characters iso8859-1 (latin1)
spanish Spanish characters iso8859-1 (latin1)
swedish Swedish characters iso8859-1 (latin1)
latin1 catalan, danish, dutch, french, german, swiss-german, spanish, icelandic, italian, swedish iso8859-1 (latin1)
hungarian Hungarian characters iso8859-2 (latin2), windows-1250
polish-iso Polish characters (note that polish-w1250 is more common!) iso8859-2 (latin2)
romanian Romanian characters iso8859-2 (latin2), windows-1250, iso8859-16
latin2 hungarian, polish-iso, romanian iso8859-2 (latin2)
polish-w1250 Polish characters, windows variant windows-1250
slovak-w1250 Slovak characters, windows variant windows-1250
czech-w1250 Czech characters, windows variant windows-1250
windows-1250 polish-w1250, slovak-w1250, czech-w1250, hungarian, romanian windows-1250
greek Greek characters iso8859-7
turkish Turkish characters iso8859-9
russian-w1251 Russian characters windows-1251
belarussian-w1251 Belarussian characters windows-1251
ukrainian-w1251 Ukrainian characters windows-1251
windows-1251 russian-w1251, belarussian-w1251, ukrainian-w1251 windows-1251
hebrew Hebrew characters iso8859-8-I/windows-1255
chinese-simp Simplified Chinese Multibyte: GBK/GB2312
chinese-trad Traditional Chinese Multibyte: GBK
chinese-ja Japanese Hiragana/Pinyin Multibyte: GBK
chinese chinese-* Multibyte: GBK
gbk chinese-* Multibyte: GBK
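The encoding column matters because the same letter is sent as different bytes depending on the client's character set. As a rough illustration using Python's standard codecs (this says nothing about how UnrealIRCd itself stores or compares nicks), the sketch below encodes a few sample letters. The helper name `encoded_hex` is made up for this example.

```python
# Illustrative only: shows how one character maps to different bytes
# under the encodings named in the table above.

def encoded_hex(ch, codec):
    """Return the bytes of ch in the given codec as a hex string."""
    return ch.encode(codec).hex()


samples = [
    ("\u00f6", ["latin-1", "utf-8"]),               # ö
    ("\u0105", ["iso8859_2", "cp1250", "utf-8"]),   # ą
    ("\u4e2d", ["gbk", "utf-8"]),                   # 中
]

for ch, codecs in samples:
    for codec in codecs:
        print(ch, codec, encoded_hex(ch, codec))
```

The Polish letter ą is a single byte in both iso8859-2 and windows-1250, but a different byte in each, which is why polish-iso and polish-w1250 are separate entries. The Chinese sets are multibyte: each character takes more than one byte in GBK.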

Important notes

A few notes:

  • The following basic nick characters are always allowed/included: a-z A-Z 0-9 [ \ ] ^ _ - { | }
  • Some combinations cannot be handled properly and produce an error. For example, UnrealIRCd refuses a configuration that combines latin* with chinese-*. Mixing other charsets may cause display problems, and UnrealIRCd prints a warning if you try to mix latin1, latin2, greek or other incompatible groups.
  • Most Services do not permit registration of UTF8 nicks.
  • Casemapping (deciding which lowercase character corresponds to which uppercase one) follows US-ASCII. Characters such as ö and Ö are therefore not recognized as 'the same', so one user can hold the nick álpha while another holds Álpha at the same time. This is a limitation of the current system and of IRCd standards; work is underway at the IRCv3 working group to solve this, and people should be aware of it in the meantime. The same limitation already existed in channel names, which have always accepted nearly any character and have always been casemapped in US-ASCII.
  • There is also no "similar looking character" or "identical looking character" checking. In particular, if you enable Cyrillic script (e.g. russian-utf8), characters such as Cyrillic А and Latin A look the same. This could be abused to impersonate another user by substituting the identical-looking character.
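The last two notes can be illustrated with a short Python sketch. It is a simplified model, not UnrealIRCd code: `ascii_lower` and `same_nick` are hypothetical helpers that fold only A-Z to a-z, which is the assumption taken here for US-ASCII casemapping.

```python
import unicodedata

# Illustrative model only; helper names are hypothetical.


def ascii_lower(text):
    """Lowercase only A-Z, leaving every other character untouched."""
    return "".join(
        chr(ord(ch) + 32) if "A" <= ch <= "Z" else ch
        for ch in text
    )


def same_nick(a, b):
    """Compare two nicks under the ASCII-only folding above."""
    return ascii_lower(a) == ascii_lower(b)


# ASCII letters fold together, accented letters do not.
print(same_nick("Alpha", "alpha"))
print(same_nick("\u00c1lpha", "\u00e1lpha"))  # Álpha vs álpha

# Latin A and Cyrillic А look alike but are different code points.
latin_a = "\u0041"
cyrillic_a = "\u0410"
print(unicodedata.name(latin_a))
print(unicodedata.name(cyrillic_a))
print(same_nick("Admin", cyrillic_a + "dmin"))
```

Under this model "Alpha" and "alpha" collide, but "Álpha" and "álpha" are two distinct nicks, and a nick beginning with the Cyrillic letter is distinct from "Admin" even though both render the same in most fonts.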

Examples

Western languages

For users in Europe and other countries with Latin-script languages (requires UnrealIRCd 4.0.17 or later):

set {
    allowed-nickchars {
        latin-utf8;
    };
};

Or, to use the old latin1 characters in Western Europe:

set {
    allowed-nickchars {
        latin1;
    };
};

Chinese language

This allows nicknames to contain both Simplified Chinese and Traditional Chinese characters (GBK encoding):

set {
    allowed-nickchars {
        chinese-simp;
        chinese-trad;
    };
};