A Comparative Study of UTF-8, UTF-16, and UTF-32 of Unicode Code Point
The IUP Journal of Telecommunications, Vol. IV, No. 2, pp. 50-59, May 2012
Posted: 15 Oct 2012
Date Written: October 15, 2012
Abstract
Unicode is a critical enabling technology for developers who want to internationalize applications for global environments. Unicode assigns a unique number (a code point) to every character, regardless of platform, program, or language. The Unicode Standard has been adopted in the industry by Apple, HP, IBM, Microsoft, Oracle, SAP, Sun, Sybase, and many others. Unicode is required by modern standards such as XML, Java, and WML, and is the official way to implement ISO/IEC 10646. It is supported in many operating systems, all modern browsers, and many other products. The emergence of the Unicode Standard, and the availability of tools supporting it, are among the most significant recent global software technology advances. Each of the three encoding formats, UTF-8, UTF-16, and UTF-32, has its own pros and cons. This paper compares the three formats.
Keywords: UTF-8, UTF-16, UTF-32, UCS, ASCII
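As a rough illustration of the trade-off the abstract refers to, the following minimal Python sketch counts how many bytes a single character occupies in each format. It is not taken from the paper; the helper name `encoded_sizes` and the sample characters are assumptions chosen for illustration, and the little-endian codec variants are used only so that no byte order mark is counted.

```python
def encoded_sizes(text: str) -> dict:
    """Return the byte length of `text` in UTF-8, UTF-16, and UTF-32.

    Hypothetical helper for illustration only. The "-le" codecs fix the
    byte order, so Python does not prepend a byte order mark (BOM).
    """
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }


# Sample characters written as escapes so the source file's own encoding
# does not matter: U+0041 (A), U+00E9 (e with acute), U+20AC (euro sign),
# U+1F600 (an emoji outside the Basic Multilingual Plane).
samples = ["\u0041", "\u00e9", "\u20ac", "\U0001F600"]

for ch in samples:
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))
```

For these samples, UTF-8 uses 1, 2, 3, and 4 bytes respectively; UTF-16 uses 2 bytes for the first three and 4 bytes (a surrogate pair) for the last; UTF-32 always uses 4 bytes. UTF-8 is therefore the most compact for ASCII-heavy text, while UTF-32 trades space for a fixed width per code point.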
