
What's New in Java 1.5 Tiger?
By: O'Reilly Media
    2005-05-19

    Table of Contents:
  • What's New in Java 1.5 Tiger?
  • Using Queues
  • Ordering Queues Using Comparators
  • Overriding Return Types
  • Taking Advantage of Better Unicode
  • Adding StringBuilder to the Mix



    What's New in Java 1.5 Tiger? - Taking Advantage of Better Unicode


    (Page 5 of 6 )

    While most of this chapter and the rest of the book cover entirely new features, in some areas Tiger has simply evolved. The most significant of these is Unicode support. Pre-Tiger versions of Java supported Unicode 3.0, and every Unicode 3.0 character fits into 16 bits (and therefore into a char). Things are different now, so you’ll need to understand a bit more.

    How do I do that?

    In Tiger, Java has moved to support Unicode 4.0, which defines several characters that don’t fit into 16 bits. This means that they won’t fit into a char, and that has some far-reaching consequences. You’ll have to use int to represent these characters, and as a result methods like Character.isUpperCase() and Character.isWhitespace() now have variants that accept int arguments. So if you need characters in Unicode 4.0 that are not available in Unicode 3.0, you’ll need to use these new methods.
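As a minimal sketch of the int-accepting variants described above, the following compares the char and int overloads of Character.isUpperCase(). The class name IntVariantDemo, the helper isUpper(), and the choice of U+10400 (DESERET CAPITAL LETTER LONG I) as the sample character are assumptions made for illustration; they are not from the book.

```java
// Illustrative only: class and method names are hypothetical.
class IntVariantDemo {
    // U+10400 DESERET CAPITAL LETTER LONG I lies outside the 16-bit range,
    // so it has to be held in an int rather than a char.
    static final int DESERET_CAPITAL_LONG_I = 0x10400;

    // Delegates to the int overload of Character.isUpperCase() added in Tiger.
    static boolean isUpper(int codePoint) {
        return Character.isUpperCase(codePoint);
    }

    public static void main(String[] args) {
        // The original char overload still works for 16-bit characters.
        System.out.println(Character.isUpperCase('A'));

        // The int overload handles characters that do not fit into a char.
        System.out.println(isUpper(DESERET_CAPITAL_LONG_I));

        // Casting to char silently drops the upper bits and yields a
        // different character (U+0400), which is why int is required.
        char truncated = (char) DESERET_CAPITAL_LONG_I;
        System.out.println(Integer.toHexString(truncated));
    }
}
```

The cast at the end is the pitfall to watch for: the compiler accepts it, but the result no longer refers to the intended character.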

    What just happened?

    To really grasp all this, you have to understand a few basic terms:

    codepoint

    A codepoint is a number that represents a specific character. As an example, 0x3C0 is the codepoint for the symbol π.

    Basic Multilingual Plane (BMP)

    The BMP is all Unicode codepoints from \u0000 through \uFFFF. All of these codepoints fit into a Java char.

    supplementary characters

    These are the Unicode codepoints that fall outside of the BMP. They are 21-bit codepoints, with hex values from 0x010000 through 0x10FFFF, and must be represented by an int.

    A char, then, represents a BMP Unicode codepoint. To get all the supplementary characters in addition to the BMP, you need to use an int. Of course, only the lowest 21 bits are used, as that’s all that is needed; the upper 11 bits of the 32-bit int are zeroed out.
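The three terms above can be checked directly with methods on java.lang.Character. This is a rough sketch: the class name CodePointRanges and the helper describe() are hypothetical, while Character.isValidCodePoint(), Character.isSupplementaryCodePoint(), and Character.charCount() are standard methods introduced in Tiger.

```java
// Illustrative only: class and method names are hypothetical.
class CodePointRanges {
    // Classifies an int as a BMP codepoint, a supplementary codepoint,
    // or a value outside the Unicode range 0x0000 through 0x10FFFF.
    static String describe(int codePoint) {
        if (!Character.isValidCodePoint(codePoint)) {
            return "invalid";
        }
        return Character.isSupplementaryCodePoint(codePoint)
                ? "supplementary"
                : "BMP";
    }

    public static void main(String[] args) {
        int[] samples = { 0x3C0, 0xFFFF, 0x10000, 0x10FFFF, 0x110000 };
        for (int cp : samples) {
            String hex = Integer.toHexString(cp).toUpperCase();
            if (Character.isValidCodePoint(cp)) {
                // charCount() reports how many char values the codepoint needs.
                System.out.println("0x" + hex + ": " + describe(cp)
                        + ", chars needed: " + Character.charCount(cp));
            } else {
                System.out.println("0x" + hex + ": " + describe(cp));
            }
        }
        // The largest codepoint uses only the low 21 bits of the int.
        System.out.println((0x10FFFF >>> 21) == 0);
    }
}
```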

    All this assumes that you’re dealing with these characters in isolation, though, and that’s hardly the only use case. More often, you’ve got to use these characters within the context of a larger String. A String is a sequence of char values, so an int doesn’t fit; instead, a supplementary character is encoded as two char values, called a surrogate pair. The first char is from the high-surrogates range (\uD800-\uDBFF), and the second char is from the low-surrogates range (\uDC00-\uDFFF). The net effect is that the number of chars in a String is not guaranteed to be the number of codepoints. Sometimes two chars represent a single codepoint (a surrogate pair encoding a supplementary character), and sometimes they represent two codepoints (two ordinary BMP characters, as in Unicode 3.0).
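To make the difference between chars and codepoints concrete, here is a small sketch that builds a String containing one supplementary character and walks it by codepoint. The class name SurrogatePairDemo, the helper codePointsOf(), and the sample character U+1D11E (MUSICAL SYMBOL G CLEF) are assumptions chosen for illustration. The methods it calls (Character.toChars(), String.codePointAt(), String.codePointCount(), Character.charCount(), Character.isHighSurrogate(), Character.isLowSurrogate()) are all part of the Tiger API.

```java
// Illustrative only: class and method names are hypothetical.
class SurrogatePairDemo {
    // Returns the codepoints of s, advancing by one or two chars per step.
    static int[] codePointsOf(String s) {
        int[] result = new int[s.codePointCount(0, s.length())];
        int n = 0;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);      // combines a surrogate pair if present
            result[n++] = cp;
            i += Character.charCount(cp);   // 1 for BMP, 2 for supplementary
        }
        return result;
    }

    public static void main(String[] args) {
        // Encode the supplementary codepoint as a surrogate pair.
        char[] pair = Character.toChars(0x1D11E);
        System.out.println(Character.isHighSurrogate(pair[0]));
        System.out.println(Character.isLowSurrogate(pair[1]));

        String s = "G" + new String(pair) + "!";
        // length() counts chars; codePointCount() counts codepoints.
        System.out.println("chars: " + s.length());
        System.out.println("codepoints: " + s.codePointCount(0, s.length()));

        for (int cp : codePointsOf(s)) {
            System.out.println("U+" + Integer.toHexString(cp).toUpperCase());
        }
    }
}
```

Looping with charAt() and a plain index increment would instead visit the two halves of the surrogate pair separately, which is rarely what you want once supplementary characters can appear in your input.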



     

    This article was taken from chapter one of Java 1.5 Tiger: A Developer's Notebook, written by Brett McLaughlin and David Flanagan (O'Reilly, 2004; ISBN: 0596007388). Check it out at your favorite bookstore.







