Windows-1254

MIME / IANA: windows-1254
Alias(es): cp1254 (Code page 1254)
Language(s): Turkish, English, Italian, French, German, Spanish, Portuguese, Danish, Swedish, Finnish, Norwegian, Luxembourgish, Rotokas, Tswana, Azeri (except the ə character, substituted by ä)
Created by: Microsoft
Standard: WHATWG Encoding Standard
Classification: extended ASCII, Windows-125x
Extends: ISO 8859-9 (without single-byte C1 controls)

Windows-1254 is a code page used under Microsoft Windows (and on the web) to write Turkish, the language it was designed for and its dominant use, although it can also be used for some other languages. Characters with code points A0 through FF are compatible with ISO 8859-9, but the range 80 through 9F (the CR range), which ISO 8859 reserves for C1 control codes, is instead used for additional characters (analogous to the relationship between ISO-8859-1 and Windows-1252).
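The relationship between the two encodings can be checked with Python's standard codecs. The following is a minimal sketch, assuming Python's built-in `cp1254` and `iso8859_9` codecs implement the mappings described above; the variable names are illustrative only.

```python
# Bytes A0..FF should decode to the same characters under both codecs.
upper_half = bytes(range(0xA0, 0x100))
same_upper_half = upper_half.decode("cp1254") == upper_half.decode("iso8859_9")

# Byte 0x80 is a printable character in Windows-1254 (the euro sign),
# but a C1 control code (U+0080) in ISO 8859-9.
byte_80_in_1254 = b"\x80".decode("cp1254")
byte_80_in_8859_9 = b"\x80".decode("iso8859_9")

# A right single quotation mark (U+2019) lives in the 80..9F range,
# so it can be encoded in Windows-1254 but not in ISO 8859-9.
quote_in_1254 = "\u2019".encode("cp1254")
try:
    "\u2019".encode("iso8859_9")
    quote_fits_8859_9 = True
except UnicodeEncodeError:
    quote_fits_8859_9 = False

print(same_upper_half, repr(byte_80_in_1254), repr(byte_80_in_8859_9))
print(quote_in_1254, quote_fits_8859_9)
```

Text that uses only the A0 through FF range round-trips identically through either codec; the two diverge only for bytes 80 through 9F.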

The WHATWG Encoding Standard, which specifies the character encodings that are permitted in HTML5 and that compliant browsers must support,[1] includes Windows-1254 and uses it for both the Windows-1254 and ISO-8859-9 labels.[2][3] Unicode is preferred for modern applications; authors of new pages and designers of new protocols are instructed to use UTF-8 instead.[2] As of 2022, less than 0.05% of all web pages use Windows-1254, and less than 0.06% use ISO-8859-9,[4][5] which the WHATWG also requires web browsers to handle as Windows-1254.[2] Of websites located in Turkey, 1.9% declare ISO-8859-9 and a further 1.3% declare Windows-1254, so in effect 3.2% of websites there use Windows-1254.[6]
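The label handling described above can be sketched as a small lookup. This is an illustration rather than a reference implementation: the function name `resolve_label` and the constant `WINDOWS_1254_LABELS` are hypothetical, and the label set is transcribed from the Encoding Standard's "Names and labels" table as best understood here, so it should be checked against the standard before being relied on.

```python
from typing import Optional

# Labels that the Encoding Standard is understood to map to windows-1254.
# Note that the ISO-8859-9 labels are included (assumption: transcribed list).
WINDOWS_1254_LABELS = frozenset({
    "cp1254", "csisolatin5", "iso-8859-9", "iso-ir-148", "iso8859-9",
    "iso88599", "iso_8859-9", "iso_8859-9:1989", "l5", "latin5",
    "windows-1254", "x-cp1254",
})

# ASCII whitespace that is stripped from both ends of a label.
ASCII_WHITESPACE = "\t\n\x0c\r "


def resolve_label(label: str) -> Optional[str]:
    """Return "windows-1254" if the label names it, otherwise None."""
    normalized = label.strip(ASCII_WHITESPACE).lower()
    return "windows-1254" if normalized in WINDOWS_1254_LABELS else None


print(resolve_label(" ISO-8859-9 "))
print(resolve_label("utf-8"))
```

One caveat when pairing such a lookup with Python's `cp1254` codec: Python raises an error for the bytes that Microsoft leaves unassigned, whereas the Encoding Standard's index maps them to the corresponding C1 control characters.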

IBM uses code page 1254 (CCSID 1254 and euro sign extended CCSID 5350) for Windows-1254.[7][8][9]

Character set

The following table shows Windows-1254. Each character is shown with its Unicode equivalent.

Windows-1254[10][11][12][13][14][15]
0 1 2 3 4 5 6 7 8 9 A B C D E F
0x NUL SOH STX ETX EOT ENQ ACK BEL BS HT LF VT FF CR SO SI
1x DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN EM SUB ESC FS GS RS US
2x  SP  ! " # $ % & ' ( ) * + , - . /
3x 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
4x @ A B C D E F G H I J K L M N O
5x P Q R S T U V W X Y Z [ \ ] ^ _
6x ` a b c d e f g h i j k l m n o
7x p q r s t u v w x y z { | } ~ DEL
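8x € n/a ‚ ƒ „ … † ‡ ˆ ‰ Š ‹ Œ n/a n/a n/a
9x n/a ‘ ’ “ ” • – — ˜ ™ š › œ n/a n/a Ÿ
Ax NBSP ¡ ¢ £ ¤ ¥ ¦ § ¨ © ª « ¬ SHY ® ¯
Bx ° ± ² ³ ´ µ ¶ · ¸ ¹ º » ¼ ½ ¾ ¿
Cx À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï
Dx Ğ Ñ Ò Ó Ô Õ Ö × Ø Ù Ú Û Ü İ Ş ß
Ex à á â ã ä å æ ç è é ê ë ì í î ï
Fx ğ ñ ò ó ô õ ö ÷ ø ù ú û ü ı ş ÿ
(n/a: byte not assigned to a character in Microsoft's Windows-1254 mapping.)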
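Differences from Windows-1252: bytes D0, DD, DE, F0, FD and FE hold Ğ, İ, Ş, ğ, ı and ş (Ð, Ý, Þ, ð, ý and þ in Windows-1252), and bytes 8E and 9E (Ž and ž in Windows-1252) are unassigned.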
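The byte-level differences between Windows-1254 and Windows-1252 can be listed by comparing the two code pages position by position. The following is a minimal sketch, assuming Python's built-in `cp1252` and `cp1254` codecs reflect Microsoft's mappings; `decode_or_none` is a hypothetical helper introduced for this example.

```python
from typing import Optional


def decode_or_none(byte_value: int, codec: str) -> Optional[str]:
    """Decode a single byte, returning None where the codec has no mapping."""
    try:
        return bytes([byte_value]).decode(codec)
    except UnicodeDecodeError:
        return None


# Map each differing byte to its (Windows-1252, Windows-1254) characters.
differences = {}
for byte_value in range(256):
    in_1252 = decode_or_none(byte_value, "cp1252")
    in_1254 = decode_or_none(byte_value, "cp1254")
    if in_1252 != in_1254:
        differences[byte_value] = (in_1252, in_1254)

for byte_value, (in_1252, in_1254) in sorted(differences.items()):
    print(f"{byte_value:02X}: {in_1252!r} -> {in_1254!r}")
```

Bytes that are unassigned in one code page but assigned in the other appear with `None` on the unassigned side.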

See also

References

  1. "8.2.2.3. Character encodings". HTML 5.1 2nd Edition. W3C. User agents must support the encodings defined in the WHATWG Encoding standard, including, but not limited to […]
  2. van Kesteren, Anne. "Names and labels". Encoding Standard. WHATWG.
  3. van Kesteren, Anne. "Legacy single-byte encodings". Encoding Standard. WHATWG.
  4. "Historical trends in the usage of character encodings for websites". w3techs.com.
  5. "Frequently Asked Questions". w3techs.com.
  6. "Distribution of character encodings among websites that use Turkey". w3techs.com. Retrieved 2022-10-23.
  7. "Code page 1254 information document". Archived from the original on 2016-03-03.
  8. "CCSID 1254 information document". Archived from the original on 2016-03-26.
  9. "CCSID 5350 information document". Archived from the original on 2014-11-29.
  10. Unicode mapping table for Windows 1254
  11. Unicode mappings of windows 1254 with "best fit"
  12. Code Page CPGID 01254 (PDF), IBM
  13. Code Page CPGID 01254 (txt), IBM
  14. International Components for Unicode (ICU), ibm-1254_P100-1995.ucm, 2002-12-03
  15. International Components for Unicode (ICU), ibm-5350_P100-1998.ucm, 2002-12-03

External links