
Formatters and Java Print Streams

Last week, we discussed Java print streams, concluding with the format method and formatter objects. This week, we pick up from where we left off. This is the second part of a three-part series. It is excerpted from chapter seven of Java I/O, Second Edition, written by Elliotte Rusty Harold (O'Reilly, 2006; ISBN: 0596527500). Copyright © 2006 O'Reilly Media, Inc. All rights reserved. Used with permission from the publisher. Available from booksellers or direct from O'Reilly Media.

Author Info:
By: O'Reilly Media
June 21, 2007
  1. Formatters and Java Print Streams
  2. Character Sets
  3. Locales
  4. Format Specifiers
  5. Floating-point conversions


Formatters and Java Print Streams - Character Sets
(Page 2 of 5 )


So far, I haven’t paid a lot of attention to character set issues. As long as you stick to the ASCII character set, a single computer, and System.out, character sets aren’t likely to be a problem. However, as data begins to move between different systems, it becomes important to consider what happens when the other systems use different character sets. For example, suppose I use a Formatter or a PrintStream on a typical U.S. or Western European PC to write the sentence “Au cours des dernières années, XML a été adapte dans des domaines aussi diverse que l’aéronautique, le multimédia, la gestion de hôpitaux, les télécommunications, la théologie, la vente au détail, et la littérature médiévale” in a file. Say I then send this file to a Macintosh user, who opens it up and sees “Au cours des derniËres annÈes, XML a ÈtÈ adapte dans des domaines aussi diverse que l’aÈronautique, le multimÈdia, la gestion de hÙpitaux, les tÈlÈcommunications, la thÈologie, la vente au dÈtail, et la littÈrature mÈdiÈvale.” This is not the same thing at all! The bytes in the file are unchanged; the Mac has simply interpreted them in its own character set. The confusion is even worse if you go in the other direction.
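The garbling described above can be reproduced in a few lines. The following is only a minimal sketch: the class name MojibakeDemo and the helper misread are hypothetical, and it assumes the PC wrote the file as windows-1252 and the Mac read it as x-MacRoman, which the JDK may not ship in a trimmed-down runtime. Under those assumptions, the first line printed should show the same “derniËres annÈes” damage as the example sentence.

```java
import java.nio.charset.Charset;

public class MojibakeDemo {

    // Encodes text with one character set, then decodes the same bytes with another.
    static String misread(String text, String writtenAs, String readAs) {
        byte[] bytes = text.getBytes(Charset.forName(writtenAs));
        return new String(bytes, Charset.forName(readAs));
    }

    public static void main(String[] args) {
        // Unicode escapes keep this source file independent of its own encoding.
        String original = "derni\u00e8res ann\u00e9es"; // "dernières années"

        // Assumption: the PC wrote windows-1252 and the Mac read x-MacRoman.
        // x-MacRoman is an extended charset, so check that it is available.
        if (Charset.isSupported("windows-1252") && Charset.isSupported("x-MacRoman")) {
            System.out.println(misread(original, "windows-1252", "x-MacRoman"));
        }

        // UTF-8 and ISO-8859-1 are guaranteed on every Java platform,
        // and mismatching them garbles the accented letters in a different way.
        System.out.println(misread(original, "UTF-8", "ISO-8859-1"));
    }
}
```

Plain ASCII letters survive both round trips because all of these character sets agree on the ASCII range; only the accented letters are damaged.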

If you’re writing to the console (i.e., System.out), you don’t really need to worry about character set issues. The default character set Java writes in is usually the same one the console uses.

Actually, you may need to worry a little. On Windows, the console encoding is usually not the same as the system encoding found in the file.encoding system property. In particular, the console uses a DOS character set such as Cp850 that includes box drawing characters such as ╚ and ╬, while the rest of the system uses an encoding such as Cp1252 that maps these same code points to alphabetic characters like È and Î. To be honest, the console is reliable enough for ASCII, but anything beyond that requires a GUI.
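To see how one byte value can mean two different things, here is a small sketch that decodes the same two bytes with a DOS code page and with a Windows code page. The class name ConsoleCodePoints, the helper decode, and the choice of the bytes 0xC8 and 0xCE are my own assumptions for illustration. It prints code points rather than the characters themselves, since the console may not be able to display them.

```java
import java.nio.charset.Charset;

public class ConsoleCodePoints {

    // Decodes raw bytes using the named character set.
    static String decode(byte[] bytes, String charsetName) {
        return new String(bytes, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // What this JVM considers its default encoding.
        System.out.println("file.encoding   = " + System.getProperty("file.encoding"));
        System.out.println("default charset = " + Charset.defaultCharset());

        // Two byte values chosen for illustration (an assumption, not from the text).
        byte[] bytes = { (byte) 0xC8, (byte) 0xCE };

        // IBM850 is Java's canonical name for Cp850; windows-1252 for Cp1252.
        for (String name : new String[] { "IBM850", "windows-1252" }) {
            String decoded = decode(bytes, name);
            System.out.printf("%-12s -> U+%04X U+%04X%n",
                    name, (int) decoded.charAt(0), (int) decoded.charAt(1));
        }
    }
}
```

The IBM850 line reports code points in the box drawing block, while the windows-1252 line reports accented Latin capitals for the very same bytes.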

However, there’s more than one character set, and when transmitting files between systems and programs, it pays to be specific. In the previous example, if we knew the file was going to be read on a Macintosh, we might have specified that it be written with the MacRoman encoding:

  Formatter formatter = new Formatter("data.txt", "MacRoman");

More likely, we’d just agree on both the sending and receiving ends to use some neutral format such as ISO-8859-1 or UTF-8. In some cases, encoding details can be embedded in the file you write (HTML, XML) or sent as out-of-band metadata (HTTP, SMTP). However, you do need some way of specifying and communicating the character set in which any given document is written. When you’re writing to anything other than the console or a string, you should almost always specify an encoding explicitly. Three of the Formatter constructors take character set names as their second argument:

  public Formatter(String fileName, String characterSet)
    throws FileNotFoundException, UnsupportedEncodingException
  public Formatter(File file, String characterSet)
    throws FileNotFoundException, UnsupportedEncodingException
  public Formatter(OutputStream out, String characterSet)
    throws UnsupportedEncodingException
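As a rough illustration of why the choice matters, the sketch below uses the OutputStream constructor to format the same word into an in-memory buffer with two different character sets and compares the byte counts. The class name ExplicitEncodingDemo and the helper formatToBytes are hypothetical names of my own; a real program would more likely pass a file name or a FileOutputStream.

```java
import java.io.ByteArrayOutputStream;
import java.io.UnsupportedEncodingException;
import java.util.Formatter;

public class ExplicitEncodingDemo {

    // Formats text through a Formatter that encodes with the named character set
    // and returns the bytes that would have been written to the file or socket.
    static byte[] formatToBytes(String text, String characterSet) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Closing the Formatter flushes its buffered output into the stream.
        try (Formatter formatter = new Formatter(out, characterSet)) {
            formatter.format("%s", text);
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalArgumentException("Unknown character set: " + characterSet, ex);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String text = "th\u00e9ologie"; // "théologie", nine characters

        // The accented letter is one byte in ISO-8859-1 but two bytes in UTF-8.
        System.out.println("ISO-8859-1: " + formatToBytes(text, "ISO-8859-1").length + " bytes");
        System.out.println("UTF-8:      " + formatToBytes(text, "UTF-8").length + " bytes");
    }
}
```

The two byte arrays differ even though the text is identical, which is why the reader has to be told which character set the writer used.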

I’ll have more to say about character sets in Chapter 19.
