Windows-1256

From Wikipedia, the free encyclopedia
MIME / IANA: windows-1256
Alias(es): cp1256 (Code page 1256)
Language(s): Arabic, Persian, Urdu, English, French (except capital letters with diacritics)
Created by: Microsoft
Standard: WHATWG Encoding Standard
Classification: extended ASCII, Windows-125x

Windows-1256 is a code page used under Microsoft Windows to write Arabic and other languages that use Arabic script, such as Persian and Urdu.

This code page is not compatible with either the ISO-8859-6 or the MacArabic encoding.
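The incompatibility can be seen by decoding the same byte under both encodings. The following is a minimal sketch that assumes Python's standard `cp1256` and `iso8859_6` codecs; the helper name `decode_both` is made up for this illustration.

```python
def decode_both(raw: bytes) -> tuple[str, str]:
    """Decode the same bytes as Windows-1256 and as ISO-8859-6."""
    return raw.decode("cp1256"), raw.decode("iso8859_6")


# Byte 0xE1 is LAM (U+0644) in Windows-1256 but FEH (U+0641) in ISO-8859-6.
win, iso = decode_both(b"\xe1")
print(f"Windows-1256: U+{ord(win):04X}, ISO-8859-6: U+{ord(iso):04X}")
```

Text written in one encoding and read as the other therefore shows the wrong letters wherever the two mappings differ.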

Windows-1256 encodes each abstract letter of the basic Arabic alphabet rather than each concrete isolated, initial, medial, final or ligatured shape variant (that is, it encodes characters, not glyphs). The Arabic letters in the C0-FF range are in Arabic alphabetical order, with some Latin characters interspersed among them. These are Windows-1252 Latin characters used for French, a language with historic relevance in former French colonies in North Africa such as Morocco and Algeria. Their inclusion allows French and Arabic text to be intermixed in Windows-1256 without any need for code-page switching, although upper-case letters with diacritics are not included.
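These properties can be checked with a short sketch. It assumes Python's standard `cp1256` codec, which is strict and does not apply any "best fit" substitutions; the helper `encodable` and the sample string are invented for this illustration.

```python
import unicodedata


def encodable(text: str) -> bool:
    """Return True if every character in text has a Windows-1256 byte."""
    try:
        text.encode("cp1256")
        return True
    except UnicodeEncodeError:
        return False


# French and Arabic in one string: "cafe" with e-acute, then SEEN LAM ALEF MEEM.
mixed = "caf\u00e9 \u0633\u0644\u0627\u0645"
print(mixed.encode("cp1256"))  # one byte per character, no code-page switch

print(encodable("\u00e9"))  # lower-case e-acute is present
print(encodable("\u00c9"))  # upper-case E-acute is not
print(encodable("\ufe8f"))  # BEH ISOLATED FORM is a glyph variant, not encoded

# Compatibility normalization maps the presentation form to the abstract letter.
print(unicodedata.normalize("NFKC", "\ufe8f") == "\u0628")
```

Text that contains Arabic presentation forms would need a normalization step like the last line before it could be stored in this code page.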

IBM uses code page 1256 (CCSID 1256, euro sign extended CCSID 5352, and the further extended CCSID 9448) for Windows-1256.[1][2][3][4]

Unicode is preferred over Windows-1256 in modern applications, especially on the Internet, where UTF-8 is the dominant encoding for web pages. Unicode also covers the Arabic script completely (see Arabic script in Unicode), whereas Windows-1256 and ISO-8859-6 do not cover the extra characters. As of September 2019, fewer than 0.1% of all web pages used Windows-1256.[5][6]
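Migrating legacy text to UTF-8 amounts to decoding the Windows-1256 bytes and re-encoding the result. A minimal sketch, assuming Python's standard `cp1256` codec (the function name `cp1256_to_utf8` is hypothetical):

```python
def cp1256_to_utf8(raw: bytes) -> bytes:
    """Transcode Windows-1256 bytes to UTF-8."""
    return raw.decode("cp1256").encode("utf-8")


legacy = b"\xd3\xe1\xc7\xe3"  # SEEN LAM ALEF MEEM in Windows-1256
converted = cp1256_to_utf8(legacy)
print(len(legacy), len(converted))  # each Arabic letter takes two bytes in UTF-8
```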

Character set

The original specification left 9 byte values marked as "NOT USED";[7] these bytes were later assigned to additional characters needed for the Perso-Arabic script (for the Persian and Urdu languages), plus the euro sign.[8]
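As a rough check that a given implementation follows the extended version, a few of the Perso-Arabic letters and the euro sign can be decoded directly. The sketch below assumes Python's standard `cp1256` codec; the byte values are taken from the table in this section, and the sample does not claim to list exactly which bytes were originally unused.

```python
# Sample byte values and the characters the extended table assigns to them.
SAMPLES = {
    0x80: "\u20ac",  # EURO SIGN
    0x81: "\u067e",  # ARABIC LETTER PEH
    0x8D: "\u0686",  # ARABIC LETTER TCHEH
    0x8E: "\u0698",  # ARABIC LETTER JEH
    0x90: "\u06af",  # ARABIC LETTER GAF
}


def decode_byte(value: int) -> str:
    """Decode a single Windows-1256 byte to its Unicode character."""
    return bytes([value]).decode("cp1256")


for value, expected in SAMPLES.items():
    status = "ok" if decode_byte(value) == expected else "MISMATCH"
    print(f"0x{value:02X} -> U+{ord(decode_byte(value)):04X} {status}")
```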

The following table shows the extended version of Windows-1256. Each character is shown with its Unicode code point in hexadecimal; the number beside each row label is the decimal value of the first byte in that row.

Here every Arabic letter is shown in isolated form. The actual forms of the letters inside Arabic words are rendered by a combination of software rules and appropriate font support.
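The code points in any row of the table can be regenerated from a codec rather than copied by hand. This is a small sketch assuming Python's standard `cp1256` codec, which defines all 256 byte values; the function name `table_row` is made up for this example.

```python
def table_row(high_nibble: int) -> list[str]:
    """Return the 16 Unicode code points (hex) for one row of the table."""
    cells = []
    for low_nibble in range(16):
        byte_value = high_nibble * 16 + low_nibble
        char = bytes([byte_value]).decode("cp1256")
        cells.append(f"{ord(char):04X}")
    return cells


# Row C_ starts at decimal 192 and holds the first Arabic letters.
print(" ".join(table_row(0xC)))
```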

Windows-1256[8][9][10][11][12][13][14]
_0 _1 _2 _3 _4 _5 _6 _7 _8 _9 _A _B _C _D _E _F
0_ (0):  NUL 0000  SOH 0001  STX 0002  ETX 0003  EOT 0004  ENQ 0005  ACK 0006  BEL 0007  BS 0008  HT 0009  LF 000A  VT 000B  FF 000C  CR 000D  SO 000E  SI 000F
1_ (16):  DLE 0010  DC1 0011  DC2 0012  DC3 0013  DC4 0014  NAK 0015  SYN 0016  ETB 0017  CAN 0018  EM 0019  SUB 001A  ESC 001B  FS 001C  GS 001D  RS 001E  US 001F
2_ (32):  SP 0020  ! 0021  " 0022  # 0023  $ 0024  % 0025  & 0026  ' 0027  ( 0028  ) 0029  * 002A  + 002B  , 002C  - 002D  . 002E  / 002F
3_ (48):  0 0030  1 0031  2 0032  3 0033  4 0034  5 0035  6 0036  7 0037  8 0038  9 0039  : 003A  ; 003B  < 003C  = 003D  > 003E  ? 003F
4_ (64):  @ 0040  A 0041  B 0042  C 0043  D 0044  E 0045  F 0046  G 0047  H 0048  I 0049  J 004A  K 004B  L 004C  M 004D  N 004E  O 004F
5_ (80):  P 0050  Q 0051  R 0052  S 0053  T 0054  U 0055  V 0056  W 0057  X 0058  Y 0059  Z 005A  [ 005B  \ 005C  ] 005D  ^ 005E  _ 005F
6_ (96):  ` 0060  a 0061  b 0062  c 0063  d 0064  e 0065  f 0066  g 0067  h 0068  i 0069  j 006A  k 006B  l 006C  m 006D  n 006E  o 006F
7_ (112):  p 0070  q 0071  r 0072  s 0073  t 0074  u 0075  v 0076  w 0077  x 0078  y 0079  z 007A  { 007B  | 007C  } 007D  ~ 007E  DEL 007F
8_ (128):  € 20AC  پ 067E  ‚ 201A  ƒ 0192  „ 201E  … 2026  † 2020  ‡ 2021  ˆ 02C6  ‰ 2030  ٹ 0679  ‹ 2039  Œ 0152  چ 0686  ژ 0698  ڈ 0688
9_ (144):  گ 06AF  ‘ 2018  ’ 2019  “ 201C  ” 201D  • 2022  – 2013  — 2014  ک 06A9  ™ 2122  ڑ 0691  › 203A  œ 0153  ZWNJ 200C  ZWJ 200D  ں 06BA
A_ (160):  NBSP 00A0  ، 060C  ¢ 00A2  £ 00A3  ¤ 00A4  ¥ 00A5  ¦ 00A6  § 00A7  ¨ 00A8  © 00A9  ھ 06BE  « 00AB  ¬ 00AC  SHY 00AD  ® 00AE  ¯ 00AF
B_ (176):  ° 00B0  ± 00B1  ² 00B2  ³ 00B3  ´ 00B4  µ 00B5  ¶ 00B6  · 00B7  ¸ 00B8  ¹ 00B9  ؛ 061B  » 00BB  ¼ 00BC  ½ 00BD  ¾ 00BE  ؟ 061F
C_ (192):  ہ 06C1  ء 0621  آ 0622  أ 0623  ؤ 0624  إ 0625  ئ 0626  ا 0627  ب 0628  ة 0629  ت 062A  ث 062B  ج 062C  ح 062D  خ 062E  د 062F
D_ (208):  ذ 0630  ر 0631  ز 0632  س 0633  ش 0634  ص 0635  ض 0636  × 00D7  ط 0637  ظ 0638  ع 0639  غ 063A  ـ 0640  ف 0641  ق 0642  ك 0643
E_ (224):  à 00E0  ل 0644  â 00E2  م 0645  ن 0646  ه 0647  و 0648  ç 00E7  è 00E8  é 00E9  ê 00EA  ë 00EB  ى 0649  ي 064A  î 00EE  ï 00EF
F_ (240):  ً 064B  ٌ 064C  ٍ 064D  َ 064E  ô 00F4  ُ 064F  ِ 0650  ÷ 00F7  ّ 0651  ù 00F9  ْ 0652  û 00FB  ü 00FC  LRM 200E  RLM 200F  ے 06D2



References

  1. "Code page 1256 information document". Archived from the original on 2016-03-03.
  2. "CCSID 1256 information document". Archived from the original on 2016-03-27.
  3. "CCSID 5352 information document". Archived from the original on 2014-11-29.
  4. "CCSID 9448 information document". Archived from the original on 2014-11-29.
  5. "Historical trends in the usage of character encodings for websites, September 2019". w3techs.com.
  6. "Frequently Asked Questions". w3techs.com.
  7. Archiveddocs. "Code Page 1256 Windows Arabic". docs.microsoft.com.
  8. "cp1256 to Unicode table" (PDF). www.unicode.org. Retrieved 2019-05-31.
  9. Unicode mappings of windows 1256 with "best fit"
  10. Code Page CPGID 01256 (PDF), IBM
  11. Code Page CPGID 01256 (txt), IBM
  12. International Components for Unicode (ICU), ibm-1256_P110-1997.ucm, 2002-12-03
  13. International Components for Unicode (ICU), ibm-5352_P100-1998.ucm, 2002-12-03
  14. International Components for Unicode (ICU), ibm-9448_X100-2005.ucm, 2005-11-15
