7-bit character sets

ASCII, ISO 646 and IA5 are well-known 128-character sets that are fundamental to computing. Several versions of these standards were released in the past. This article discusses the history of ASCII and related national 7-bit character sets as well as the differences between their versions. Character tables are included.

When computers were young in the early 1960s, it was decided that text should be represented with 7 bits for each character. Seven bits would be enough to represent 128 different characters, including letters, numbers, symbols and required control codes. 6 bits were too few. 8 bits were considered too much. The standard became 7.
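The arithmetic behind the choice of 7 bits can be sketched as follows. This is only an illustration; the helper name fits_in_7_bits is an assumption made for this example and does not come from any standard.

```python
# Number of distinct codes available for a given number of bits per character.
for bits in (6, 7, 8):
    print(bits, "bits ->", 2 ** bits, "codes")


def fits_in_7_bits(data: bytes) -> bool:
    """Return True if every byte value lies in the 7-bit range 0x00..0x7F."""
    return all(b < 0x80 for b in data)


print(fits_in_7_bits(b"Hello, world!"))            # plain ASCII text
print(fits_in_7_bits("na\u00efve".encode("latin-1")))  # contains 0xEF, needs 8 bits
```

Six bits give only 64 codes, which cannot hold both upper and lower case letters together with digits, punctuation and control codes, while 7 bits give 128.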

ASCII (American Standard Code for Information Interchange) was the first 7-bit character set to be standardized. Over the years, several revisions of ASCII were published. ASCII-based character sets became immensely widespread. Most character sets in current use are based on ASCII in one way or another.

ISO 646 and IA5 (CCITT International Alphabet No. 5) are the international counterparts of ASCII. ISO 646 and IA5 are ASCII-like standards that define more or less the same 128-character set as ASCII. The main difference is that these standards are international. ASCII is the U.S. national version of ISO 646.

National character sets. From the very beginning it was realized that 128 characters were not enough for international use. Each country required its own national characters with letters, accents and other diacritics. To meet this need, ISO 646 defined an International Reference Version (IRV), which each country could adapt to its national needs. Certain character positions could be replaced by national characters. A large number of country-specific versions were standardized. The versions were registered in the ISO/IEC International Register of Coded Character Sets (IR).

In addition to these sets, several other 7-bit character sets were defined. 7-bit sets have since been replaced with 8-bit, 16-bit, even 32-bit character sets. Still, the 7-bit sets are the fundamental building blocks of almost all of today's character encoding systems.

Revisions of ASCII

ASCII has undergone several revisions to become the character set we know today. The history of ASCII is not always fully understood. As an example, IANA lists ASCII as the same thing as ANSI_X3.4-1968 and ANSI_X3.4-1986. This is not entirely accurate. The 1968 revision was ambiguous. The ambiguities were fixed later, making the 1986 revision different from the 1968 revision.
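Software that follows the IANA registry treats these names as aliases of a single encoding. As a small illustration, and assuming the alias table shipped with CPython's standard codecs module, all of the following names resolve to the same codec:

```python
import codecs

# Names registered as aliases of the same character set.
for name in ("ANSI_X3.4-1968", "ANSI_X3.4-1986", "US-ASCII"):
    print(name, "->", codecs.lookup(name).name)
```

The lookup cannot tell the revisions apart, so the historical differences described here are invisible at this level.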

The ASCII version that became standard was first published in 1967 and 1968. The character set of these versions is identical. However, it is not actually what we currently think of as ASCII. The Number Sign (#) could be replaced by the symbol £. The character in position 7C was actually a broken vertical bar (¦). It was broken to prevent confusion with logical OR (|). In later standards this position is either a vertical bar (|) or a national character. According to ASCII-1967 and ASCII-1968, three characters could be "stylized". "!" could be stylized as "|" to represent logical OR. "^" could be stylized as "¬" (logical NOT). The character "˜" was called an overline when used as punctuation, and as a tilde when used as a diacritic. It could also be used for another accent.

ASCII-1963 (ASA standard X3.4-1963) was the initial release of ASCII. It was in many ways different from the ASCII in current use. ASCII-1963 did not gain wide acceptance. One reason is that IBM chose to use EBCDIC, an IBM proprietary character set, in its successful System/360 series of computers released in 1964.

ASCII-1965 was an unpublished major revision. It looked a lot like the current ASCII, even though there were differences with certain characters. ASCII-1965 was accepted as a standard, but it went unpublished and unused.

ASCII-1967 (USAS X3.4-1967) was a major revision of the previous versions of ASCII. This was the version that eventually evolved to the ASCII we know today.

ASCII-1967 was not exactly what we currently think of as ASCII. The differences are as follows. ASCII-1967 offered some options for certain characters, and one character was totally ambiguous. The Number Sign (#) could be replaced by the symbol £. Two characters could be stylized. The Exclamation Point (!) could be stylized as a logical OR (|) and the Circumflex (^) could be stylized as a logical NOT (¬). Character 7C, even though called a Vertical Line, looked like a broken vertical bar (¦). It looked that way to avoid confusion with a solid vertical bar (|) used as a logical OR. In other words, since character 21 could sometimes look like (|), 7C had to look like (¦).

Character 7E was ambiguous. This character had three functions. It was 1) Overline when used as punctuation, 2) Tilde when used as a diacritic, and 3) General Accent, yet another diacritic which could be used for other accents not specifically provided. The character appeared in two shapes, upper tilde (˜) and midline tilde (~), interchangeably. No explanation was provided as to which shape to use and when. The character did not look like an overline (¯), even when it was called Overline. It seems the authors of the standard could not decide what this character was really for. The midline shape (~) may have been unintentional. The midline position conflicts with the intended use either as a diacritic or as an overline. Ambiguity regarding the shape seems to have originated in ASCII-1965, where it may have been a typographical error or restriction.

ASCII-1968 (USAS X3.4-1968) was a minor revision. It didn't change any of the graphic characters. The only change was to the "newline" function. LF could now be used alone as a newline. The previous versions required the use of CR LF (or LF CR). The 1968 standard also gave the code its name ASCII or USASCII.
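A reader of text files may meet all three newline conventions mentioned above. The following is a minimal sketch of how they could be folded into a single LF; the function name normalize_newlines is an assumption made for this example.

```python
import re


def normalize_newlines(text: str) -> str:
    """Replace CR LF, LF CR and a lone LF with a single LF."""
    # The two-character sequences are listed first so they are
    # consumed as one newline rather than as two.
    return re.sub(r"\r\n|\n\r|\n", "\n", text)


sample = "first\r\nsecond\n\rthird\nfourth"
print(normalize_newlines(sample).split("\n"))
```

A lone CR is left untouched, since the standards discussed here do not define it as a newline on its own.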

ASCII-1977 (ANSI X3.4-1977) fixed some of the ambiguities of ASCII-1967 and ASCII-1968. The Number Sign (#) could no longer be replaced by the Pound (£). Character 7C was now a Vertical Line (|) that no longer looked like a broken vertical bar. One could no longer stylize the Exclamation Point (!) as a (|) or the Circumflex (^) as a logical NOT (¬). Overline was no longer present; it was simply a Tilde (˜, not ~). That character could no longer be used as a General Accent either. ASCII-1977 also changed the definitions of several control characters. The changes did not necessarily change the intended use of these characters. An essential change was with VT and FF: it was now possible to allow an "optional implicit CR" after VT and FF the same way it was already possible with LF. More changes can be found in Control characters in ASCII and Unicode.

ASCII-1986 (ANSI X3.4-1986) did not change the character set nor the control characters.

Revisions of ISO 646

ASCII was accepted, with modifications, as an ISO recommendation in 1967. ISO 646 (officially, 7-bit coded character set for information processing interchange) was an inherently international standard. The basis was "IRV", an International Reference Version, which could be adapted to national needs. ASCII was the US national version of ISO 646. Other national versions were published for Canada, Finland, France and so on by replacing certain graphic characters with national characters.

ISO R 646-1967 was the first official version of the standard (then called a recommendation). This version didn't provide an IRV yet, but only a skeleton chart to be filled by national standards organizations. The character set was similar to that of ASCII-1977 with the following differences: In place of the Number sign (#) there was a Currency symbol (£). Characters \ { | } were totally missing; their locations were empty. Character 7E was an Overline (¯).

National versions could be produced by assigning national characters in place of those characters that in ASCII are @ [ \ ] { | }. In ISO R 646-1967, though, only @ [ and ] were in place, and the remaining slots were empty. When more national characters were needed, characters ^ ` ¯ could also be replaced by national characters. Specifically, character 7E (¯) could be used as ˜ or another diacritical sign. £ could be replaced by # in countries where £ was not needed.

A special Sterling rule existed for the two characters immediately succeeding digit 9, namely the colon (:) and semicolon (;). These characters could be replaced by symbols for 10 and 11, respectively. This was to facilitate the adoption of ASCII in the sterling monetary area. In the old British monetary system, a pound was 20 shillings and a shilling was 12 pence.
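The pre-decimal arithmetic explains why symbols for 10 and 11 were wanted: a pence column runs from 0 to 11. The sketch below splits an amount into pounds, shillings and pence and shows the code position a single pence value would take. The mapping 0x30 + value is an assumption drawn from the description above (10 and 11 taking the two positions right after digit 9); the function names are hypothetical.

```python
PENCE_PER_SHILLING = 12
SHILLINGS_PER_POUND = 20


def split_lsd(total_pence: int) -> tuple:
    """Split an amount in pence into (pounds, shillings, pence)."""
    shillings, pence = divmod(total_pence, PENCE_PER_SHILLING)
    pounds, shillings = divmod(shillings, SHILLINGS_PER_POUND)
    return pounds, shillings, pence


def pence_code(pence: int) -> int:
    """Code position of a single pence value 0..11 (assumed mapping)."""
    if not 0 <= pence < PENCE_PER_SHILLING:
        raise ValueError("pence value must be in 0..11")
    # 0x30..0x39 are the digits 0..9; 0x3A stands for 10 and 0x3B for 11.
    return 0x30 + pence


print(split_lsd(1000))          # 1000 pence as pounds, shillings, pence
print(hex(pence_code(10)), hex(pence_code(11)))
```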

ISO 646-1973 was the second version of the standard. This was the first version to define an IRV. The IRV was similar to ASCII-1977 with the following differences: In place of the Dollar sign ($) there was a Currency sign (¤). Character 7E (¯) was called Overline, Tilde, but it was supposed to look like an overline in the IRV.

National versions could be produced by assigning national characters in place of characters @ [ \ ] { | }. When more national characters were needed, characters ^ ` ¯ could be used for the same purpose. Specifically, character 7E (¯) could be used as ˜ or another diacritical sign. Thus, national characters would appear at the same positions as before. The allowed characters in the "currency positions" were now (£ or #) for position 23 and ($ or ¤) for position 24. The Sterling rule was dropped now that the British Isles had moved to a decimal monetary system (in 1971).
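The effect of such replacements can be sketched with the German version. The mapping below follows DIN 66 003 (ISO-IR 021) as commonly documented, which is an assumption of this example rather than something taken from the text above; the names DIN_66003 and decode_german are hypothetical.

```python
# Positions that the German national version reassigns.
# Keys are the characters US-ASCII has in those positions.
DIN_66003 = str.maketrans({
    "@": "§", "[": "Ä", "\\": "Ö", "]": "Ü",
    "{": "ä", "|": "ö", "}": "ü", "~": "ß",
})


def decode_german(data: bytes) -> str:
    """Interpret 7-bit data as text in the German national version."""
    return data.decode("ascii").translate(DIN_66003)


print(decode_german(b"Gr}~e aus K|ln"))
print(decode_german(b"if (a[i]) { x; }"))
```

The second call shows the well-known side effect: program source written with brackets and braces turns into umlauts when displayed in a national version.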

ISO 646-1983 was not available at the time of writing this article. Based on references in other sources, the IRV kept the Currency sign (¤). A change appears to have been made in the IRV as regards the Overline or Tilde character. Different interpretations of this character have been made in related standards. ECMA-6 (1985) lists this character as TILDE, OVERLINE (~). In IA5 (1988) the character was Tilde, overline (¯). In the IBM codepage 1009, which is based on ISO 646-1983, it is (˜).

The ISO International Registry appears to list the ISO 646-1983 character set as set number 002, but the registered document is in fact from ISO 646-1973.

ISO/IEC 646:1991 is the current release. The IRV of 1991 replaced the Currency sign (¤) by the dollar ($).

Revisions of International Alphabet No. 5 (IA5 / IRA)

CCITT standardized the International Alphabet No. 5 (or just IA5). It was meant for data transmission on the general telephone network or on telegraph networks. IA5 is closely related to ISO 646.

IA5, 1968 version (V.3) was the initial standard. This standard was not available at the time of writing this article.

IA5, 1972 version (V.3) amended the 1968 version. IA5 of 1972 is an almost word-for-word copy of ISO 646-1973. Character 7E (¯) was called Overline, tilde. It looked like (¯) in the IRV, but could be used as ˜ or another diacritical sign in national use.

IA5, 1988 version (T.50) corresponds to ISO 646-1983. The character set was an exact copy of that of the 1972 version. Confusingly, the IRV character 7E (¯) was now called Tilde, overline. It looked like an overline with no alternative representation as a tilde. No explanation was given as to its use. In national versions of IA5, no specific character was given in this position, but it should vary from nation to nation.

IRA, 1992 version (T.50) is the current standard. IA5 is now called IRA, International Reference Alphabet. IRA is technically equivalent to ISO/IEC 646:1991. Changes in the IRV relative to the 1988 version of IA5 were as follows. The Currency sign (¤) was replaced by the Dollar sign ($). Character 7E was now Tilde. Confusingly, the Tilde appears as ~ on page 9 and as ˜ on page 12 of the document.

Tilde

The tilde (position 7E) is a character that Unicode and ASCII disagree on. Tilde was originally meant as a diacritic. Its location was higher up on the line (˜) rather than in the middle (~), making it possible to use as a diacritic to form characters such as õ or ñ. Tilde looks like (˜) in ASCII-1977 and ASCII-1968, and also in ISO 646 and IA5.

The midline tilde (~) seems to have originated from a typographical error or restriction. The tilde first appeared in ASCII-1965, as printed in ACM Vol 8 Nr 4. This version had a tilde, intended as a diacritic only, but printed as both upper and midline tilde interchangeably. There was a character table with an upper tilde (˜) but in the text, the midline version was used instead. The text clearly refers to use as a diacritic only. This would not make sense with a midline tilde. Thus, the midline version was not intended. The same ambiguity was inherited by ASCII-1967 and ASCII-1968. Their text seems to require the upper position as well (see above).

Unicode 1.0 re-defined the character as TILDE (U+007E), which was a spacing character, not a diacritic. Unicode 1.0 accepted both versions (~ and ˜) as alternative representations of the same character. In addition, three other tildes were encoded: ASCII style "upper tilde" (˜) became available as two additional characters, SPACING TILDE (U+02DC) and NON-SPACING TILDE (U+0303). A midline tilde was also encoded as TILDE OPERATOR (U+223C). Since Unicode 2.0, the regular TILDE is represented as a midline tilde (~). Later Unicode versions have added even more tildes.
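The four code points mentioned above can be inspected with Python's unicodedata module. Note that the module reports the character names of the Unicode version bundled with the interpreter, which differ from the Unicode 1.0 names SPACING TILDE and NON-SPACING TILDE quoted above.

```python
import unicodedata

for cp in (0x007E, 0x02DC, 0x0303, 0x223C):
    ch = chr(cp)
    print(f"U+{cp:04X}", unicodedata.name(ch), unicodedata.category(ch))

# Only the combining tilde acts as a diacritic: a base letter followed
# by U+0303 composes to a single precomposed letter under NFC.
composed = unicodedata.normalize("NFC", "n\u0303")
print(composed, len(composed))
```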

Table of differences

The following table lists the differences of the character sets with respect to ASCII-1986. The reference line "ASCII" is on the top. An empty cell means the same character is used both in ASCII-1986 and the other set. A gray cell means no character was defined in that position. A cell with 2 or 3 characters means alternative characters were available in that position.

Reference (ASCII): [ ] { }
ASCII-1965
ASCII-1967 and ASCII-1968: #£ ^¬
ASCII-1977 and ASCII-1986: IR 006, CP 367
ISO646 Invariant: IR 170
ISO R / 646-1967: £# (@) ([) (]) ¯˜
ISO646 IRV (1973), IA5 IRV (1973, 1988): IR 002
ISO646 IRV (1991), IA5 IRV (1992)
ISO646-CA Canada
ISO646-CA2 Canada
ISO646-CN China
ISO646-CU Cuba: ] [
ISO646-DE German: IR 021, CP 1011
ISO646-ES Spanish (Olivetti): IR 017
ISO646-ES2 Spanish languages: IR 085, CP 1014
ISO646-FI Finland, ISO646-SE Swedish: IR 010, CP 1018
Finland Extended version: #£ ¤$ ^\| `{[ ¯}]
ISO646-SE2 Swedish for official writing of names: IR 011
ISO646-FR1 French (1973): IR 025, CP 1114
ISO646-FR French (1982)
ISO646-GB UK
ISO646-HU Hungarian
ISO646-IT Italian (Olivetti): IR 015, CP 1012
ISO646-JP Japanese Roman: IR 014
ISO646-JP-OCR-B Japanese OCR-B: IR 092
ISO646-NO Norwegian: IR 060, CP 1016
ISO646-NO2 Norwegian v2 (withdrawn): IR 061
ISO646-PT Portuguese (Olivetti): IR 016
ISO646-PT2 Portuguese (IBM): IR 084, CP 1015
ISO646-YU Serbocroatian and Slovenian: IR 141
Irish (Gaelic): IR 207
T.61 Teletext: IR 102
NATS Finland and Sweden: IR 008-1; – ■
NATS Denmark and Norway: IR 009-1; – ■
Viewdata and Teletext (UK): IR 047; ← → ↑ ⌗ ─ ║
HP German
HP Spanish
Unicode 1.0: ~˜
Unicode 2.0.0 and later

IR = Number in International Register of Coded Character Sets
CP = IBM codepage

ASCII-1963 is very different from all the other sets, see below.

ASCII-1963

NULL SOM EOA EOM EOT WRU BELL FE0 HT/SK VTAB DC0 DC1 DC2 DC3 DC4 ERR SYNC LEM ( ) ; [ ] ↑ ← ACK ESC DEL

ASA X3.4-1963

ASCII-1965

NUL SOH STX ETX EOT ENQ ACK BEL DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN ESC ( ) ; [ ] { } DEL

X3.4-1965

¬ is called overline. The hook appears to distinguish it from underline. X3.4-1965 was approved as a standard, but not published.

ASCII-1967 and ASCII-1968

NUL SOH STX ETX EOT ENQ ACK BEL DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN SUB ESC #£ ( ) ; [ ] ^¬ { } DEL

USAS X3.4-1967 and USAS X3.4-1968

Where "#" is not required, it can be replaced by "£". "!" could be stylized as "|" to represent logical OR, and "^" could be stylized as "¬" (logical NOT). "¦" appears in two parts to prevent confusion with logical OR "|". The character "˜" is called an overline when used as punctuation, and as a tilde when used as a diacritic. It can also be used for another accent.

ASCII-1977 and ASCII-1986

NUL SOH STX ETX EOT ENQ ACK BEL DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN SUB ESC ( ) ; [ ] { } DEL

ANSI X3.4-1977 and X3.4-1986.

US-ASCII. The tilde (˜) was meant to be an accent, so it should appear high rather than in the middle (~). ISO-IR 006 is similar to ASCII-1977 and ASCII-1986, despite it saying it was based on ASCII-1968.

ISO646 Invariant

NUL SOH STX ETX EOT ENQ ACK BEL DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN SUB ESC ( ) ; DEL

ISO/IEC 646:1992 (ISO-IR 170).

82 invariant graphic characters of all versions of ISO/IEC 646.
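The figure of 82 can be checked by counting. The sketch below starts from the 94 graphic positions and removes the positions that the ISO 646 revisions described earlier leave open to national or currency use; the variable names are assumptions made for this example.

```python
# All 94 graphic positions, 0x21..0x7E (space and DEL are excluded).
graphic = [chr(c) for c in range(0x21, 0x7F)]

# Variable positions, named by the characters US-ASCII has in them.
national = set("@[\\]{|}")   # primary national positions
extra = set("^`~")           # usable when more national characters are needed
currency = set("#$")         # positions 23 and 24

variable = national | extra | currency
invariant = [ch for ch in graphic if ch not in variable]

print(len(graphic), len(variable), len(invariant))
print("".join(invariant))
```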

ISO R / 646-1967

NUL SOH STX ETX EOT ENQ ACK BEL DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN SUB ESC £# ( ) ; (@) ([) (]) ¯˜ DEL

ISO R / 646-1967

Where "£" is not required, it can be replaced by "#". The empty slots and parenthesized slots are primarily for national characters.

ISO646 IRV (1973), IA5 IRV (1973, 1988)

NUL SOH STX ETX EOT ENQ ACK BEL DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN SUB ESC ( ) ; [ ] { } DEL

ISO 646-1973, CCITT V.3-1973, ITU-T T.50 (1988).

ISO646 IRV (1991), IA5 IRV (1992)

NUL SOH STX ETX EOT ENQ ACK BEL DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN SUB ESC ( ) ; [ ] { } DEL

ISO/IEC 646:1991, ITU-T T.50 (1992).

Similar to US-ASCII and also ISO-IR 006.

ISO646-CA Canada

NUL SOH STX ETX EOT ENQ ACK BEL DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN SUB ESC ( ) ; DEL

CSA Z243.4-1985 (ISO-IR 121).

Alternate Primary Graphic Set Nr. 1

ISO646-CA2 Canada

NUL SOH STX ETX EOT ENQ ACK BEL DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN SUB ESC ( ) ; DEL

CSA Z243.4-1985 (ISO-IR 122).

Alternate Primary Graphic Set Nr. 2

ISO646-CN China

NUL SOH STX ETX EOT ENQ ACK BEL DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN SUB ESC ( ) ; [ ] { } DEL

GB 1988-80 (ISO-IR 057).

ISO646-CU Cuba

NUL SOH STX ETX EOT ENQ ACK BEL DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN SUB ESC ( ) ; ] [ DEL

NC 99-10:81 (ISO-IR 151).

ISO646-DE German

NUL SOH STX ETX EOT ENQ ACK BEL DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN SUB ESC ( ) ; DEL

DIN 66 003 (ISO-IR 021).

ISO646-ES Spanish (Olivetti)

NUL SOH STX ETX EOT ENQ ACK BEL DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN SUB ESC ( ) ; DEL

Variant of ISO 646 for the Spanish language (ISO-IR 017).

ISO646-ES2 Spanish languages

NUL SOH STX ETX EOT ENQ ACK BEL DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN SUB ESC ( ) ; DEL

A version of ISO 646 for the Spanish Languages (ISO-IR 085).

ISO646-FI Finland, ISO646-SE Swedish

NUL SOH STX ETX EOT ENQ ACK BEL DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN SUB ESC ( ) ; DEL

SFS 4017, SEN 85 02 00 Annex B (ISO-IR 010).

Finland: Basic version.

Finland Extended version

NUL SOH STX ETX EOT ENQ ACK BEL DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN SUB ESC #£ ¤$ ( ) ; ^\| `{[ ¯}] DEL

SFS 4017

The five positions allow an alternate symbol, if agreed on between sender and recipient.

ISO646-SE2 Swedish for official writing of names

NUL SOH STX ETX EOT ENQ ACK BEL DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN SUB ESC ( ) ; DEL

SEN 85 02 00 Annex C (ISO-IR 011).

ISO646-FR1 French (1973)

NUL SOH STX ETX EOT ENQ ACK BEL DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN SUB ESC ( ) ; DEL

NF Z 62-010 (1973) (ISO-IR 025).

Withdrawn.

ISO646-FR French (1982)

NUL SOH STX ETX EOT ENQ ACK BEL DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN SUB ESC ( ) ; DEL

NF Z 62-010 (1982) (ISO-IR 069).

Revised.

ISO646-GB UK

NUL SOH STX ETX EOT ENQ ACK BEL DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN SUB ESC ( ) ; [ ] { } DEL

BS 4730 (ISO-IR 004).

ISO646-HU Hungarian

NUL SOH STX ETX EOT ENQ ACK BEL DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN SUB ESC ( ) ; DEL

MSZ 7795/3 (ISO-IR 086).

ISO646-IT Italian (Olivetti)

NUL SOH STX ETX EOT ENQ ACK BEL DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN SUB ESC ( ) ; DEL

Variant of ISO-7 for Italian (ISO-IR 015).

ISO646-JP Japanese Roman

NUL SOH STX ETX EOT ENQ ACK BEL DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN SUB ESC ( ) ; [ ] { } DEL

JIS C 6220 1969 (ISO-IR 014).

ISO646-JP-OCR-B Japanese OCR-B

NUL SOH STX ETX EOT ENQ ACK BEL DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN SUB ESC ( ) ; [ ] { } DEL

JIS C 6229-1984 (ISO-IR 092).

ISO646-NO Norwegian

NUL SOH STX ETX EOT ENQ ACK BEL DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN SUB ESC ( ) ; DEL

NS 4551 Version 1 (ISO-IR 060).

ISO646-NO2 Norwegian v2 (withdrawn)

NUL SOH STX ETX EOT ENQ ACK BEL DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN SUB ESC ( ) ; DEL

NS 4551 Version 2 (ISO-IR 061).

ISO646-PT Portuguese (Olivetti)

NUL SOH STX ETX EOT ENQ ACK BEL DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN SUB ESC ( ) ; DEL

A version of ISO 646 for the Portuguese Language (ISO-IR 016).

ISO646-PT2 Portuguese (IBM)

NUL SOH STX ETX EOT ENQ ACK BEL DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN SUB ESC ( ) ; DEL

A version of ISO 646 for the Portuguese Language (ISO-IR 084).

ISO646-YU Serbocroatian and Slovenian

NUL SOH STX ETX EOT ENQ ACK BEL DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN SUB ESC ( ) ; DEL

JUS I.B1. 002 (ISO-IR 141).

Irish (Gaelic)

NUL SOH STX ETX EOT ENQ ACK BEL DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN SUB ESC ( ) ; DEL

Irish Standard 433:1996 (ISO-IR 207).

T.61 Teletext

NUL SOH STX ETX EOT ENQ ACK BEL DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN SUB ESC ( ) ; [ ] DEL

CCITT T.61 (ISO-IR 102).

NATS Finland and Sweden

NUL SOH STX ETX EOT ENQ ACK BEL DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN SUB ESC ( ) – ; ■ DEL

ISO-IR 008-1.

Newspaper text transmission. 45=long dash, minus sign. UA=Unit space A. UB=Unit space B. 94=solid.

NATS Denmark and Norway

NUL SOH STX ETX EOT ENQ ACK BEL DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN SUB ESC ( ) – ; ■ DEL

ISO-IR 009-1.

Newspaper text transmission. 45=long dash, minus sign. UA=Unit space A. UB=Unit space B. 94=solid.

Viewdata and Teletext (UK)

NUL SOH STX ETX EOT ENQ ACK BEL DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN SUB ESC ( ) ; ← → ↑ ⌗ ─ ║ DEL

ISO-IR 047.

Alphanumerics for viewdata and broadcast teletext.

HP German

NUL SOH STX ETX EOT ENQ ACK BEL DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN SUB ESC ( ) ; DEL

HP PCL5

HP Spanish

NUL SOH STX ETX EOT ENQ ACK BEL DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN SUB ESC ( ) ; { } DEL

HP PCL5

Unicode 1.0

NUL SOH STX ETX EOT ENQ ACK BEL DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN SUB ESC ( ) ; [ ] { } ~˜ DEL

Unicode 1.0.

Two alternative representations (~|˜) exist for the TILDE character. Similarly, the DOLLAR SIGN ($) has two representations, with one or two vertical bars.

Unicode 2.0.0 and later

NUL SOH STX ETX EOT ENQ ACK BEL DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN SUB ESC ( ) ; [ ] { } DEL

Unicode 2.0.0.

Alternative representations are no longer given. The TILDE character has a mid-line representation (~).

Sources

Registers

IANA: Character sets. Last updated 2011-10-30.
ISO/IEC: International Register of Coded Character Sets. Referenced August 2012.

Individual standards

ASA standard X3.4-1963. American Standard Code for Information Interchange.
Note: ASCII-1963.
Proposed Revised American Standard Code for Information Interchange. Communications of the ACM Vol 8 Nr 4 (April 1965), p. 207–214.
Note: ASCII-1965.
USAS X3.4-1967: USA Standard Code for Information Interchange. United States of America Standards Institute, New York, USA, 1967.
Note: ASCII-1967.
USAS X3.4-1968: USA Standard Code for Information Interchange. Reprinted as NIC 11246 in Feinler & Postel (ed.): Arpanet Protocol Handbook. NIC 7104 Rev. Jan 1978. ADA-052 594. Network Information Center, Menlo Park, California, USA.
Note: ASCII-1968.
ANSI X3.4-1977: American National Standard Code for Information Interchange. American National Standards Institute, Inc, New York, USA, 1977. Also reprinted in McGraw Hill's Compilation of Data Communication Standards, edition II, McGraw-Hill, 1982.
Note: ASCII-1977.
ANSI X3.4-1986: Coded Character Sets – 7-bit American National Standard Code for Information Interchange. American National Standards Institute, Inc, New York, USA, 1986.
Note: ASCII-1986.
CCITT Recommendation V.3 (1972): International Alphabet No. 5. Reprinted in McGraw Hill's Compilation of Data Communication Standards, edition II, McGraw-Hill, 1982.
CCITT Recommendation T.50 (11/1988): International Alphabet No. 5. Reedition of CCITT Recommendation T.50 published in the Blue Book, Fascicle VII.3 (1988). International Telecommunication Union 2008.
CCITT Recommendation T.50 (09/1992): International Reference Alphabet (IRA) (Formerly International Alphabet No. 5 or IA5). International Telecommunication Union 1993.
ECMA-6: 7-bit Coded Character Set, 5th edition 1985.
ISO / R 646-1967 (E): 6 and 7-bit coded character sets for information processing interchange. 1st edition December 1967. International Organization for Standardization, Switzerland.
ISO 646-1973 (E): 7-bit coded character set for information processing interchange. ISO Standards Handbook 1: Information transfer, 1st edition, 1977. Also reprinted in McGraw Hill's Compilation of Data Communication Standards, edition II, McGraw-Hill, 1982.
ISO 646:1991: Information technology – 7-bit coded character set for information processing interchange.
SFS 4017: 7-bit coded character set for information processing interchange. Suomen standardisoimisliitto, 1977. (Finnish Standards Association SFS)
The Unicode Standard, Version 1.0.0. ASCII, p. 172–175. The Unicode Consortium, 1991. ISBN 0-201-56788-1.
The Unicode Standard, Version 2.0. C0 Controls and Basic Latin, p. 7-6. The Unicode Consortium, 1996. ISBN 0-201-48345-9.
The Unicode Standard, Version 6.1.0. Archived Code Charts, C0 Controls and Basic Latin. The Unicode Consortium, 2012. ISBN 978-1-936213-02-3.

Vendor material

Hewlett-Packard: PCL 5 Comparison Guide. Publication Number: 5021-0378. Edition 2, 6/2003.
IBM: Code page identifiers. Referenced August 2012.
IBM i Globalization. Referenced August 2012.

Last updated in January 2016: tilde, sterling.

7-bit character sets
URN:NBN:fi-fe201201011004

©Aivosto Oy -
