Last updated 2004-06-28 by Roedy Green ©1996-2004 Canadian Mind Products
In a binary file, there are no separators between fields. The files are in binary, not readable ASCII.
What do you do if you want to read data that is not in Java's standard big-endian format, usually because it was prepared by some non-Java program?
You have four options:
The only time endianness becomes a concern is in communicating with legacy little-endian C/C++ applications.
The following code will produce the same result on either a big or little endian machine:
    // Take a 16-bit short apart into two 8-bit bytes.
    // The (short) cast is required: 0xabcd is an int literal outside the short range.
    short x = (short) 0xabcd;
    byte high = (byte) ( x >>> 8 );
    byte low = (byte) x; /* cast implies & 0xff */
    System.out.println( "x=" + x + " high=" + high + " low=" + low );
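The same machine independence holds for Java's binary streams. As a rough sketch (the class and method names `BigEndianWriteDemo` and `writeShort` are made up for illustration), this writes a short with java.io.DataOutputStream and inspects the raw bytes. The DataOutput contract says the high byte goes out first, whatever CPU the JVM runs on:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class BigEndianWriteDemo {
    /** Writes one short with DataOutputStream and returns the raw bytes produced. */
    static byte[] writeShort(int value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeShort(value); // high byte first, then low byte
        } catch (IOException e) {
            // An in-memory stream should never fail; convert just in case.
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        byte[] raw = writeShort(0xabcd);
        // Mask with 0xff so each byte prints as unsigned hex.
        System.out.println(Integer.toHexString(raw[0] & 0xff) + " "
                + Integer.toHexString(raw[1] & 0xff)); // prints "ab cd"
    }
}
```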
The most common problem is dealing with files stored in little-endian format.
I had to implement routines parallel to those in java.io.DataInputStream which reads raw binary, in my LEDataInputStream and LEDataOutputStream classes. Don't confuse this with the io.DataInput human-readable character-based file-interchange format.
If you wanted to do it yourself, without the overhead of the full LEDataInputStream and LEDataOutputStream classes, here is the basic technique:
Presuming your integers are in 2's complement little-endian format, shorts are pretty easy to handle:
    short readShortLittleEndian( ) {
        // 2 bytes
        int low = readByte() & 0xff;
        int high = readByte() & 0xff;
        return (short)( high << 8 | low );
    }
Or if you want to get clever and puzzle your readers, you can avoid one mask since the high bits will later be shaved off by conversion back to short.
    short readShortLittleEndian( ) {
        // 2 bytes
        int low = readByte() & 0xff;
        int high = readByte(); // avoid masking here
        return (short)( high << 8 | low );
    }

    long readLongLittleEndian( ) {
        // 8 bytes
        long accum = 0;
        for ( int shiftBy = 0; shiftBy < 64; shiftBy += 8 ) {
            // must cast to long or shift done modulo 32
            accum |= (long)( readByte() & 0xff ) << shiftBy;
        }
        return accum;
    }

    char readCharLittleEndian( ) {
        // 2 bytes
        int low = readByte() & 0xff;
        int high = readByte();
        return (char)( high << 8 | low );
    }

    int readIntLittleEndian( ) {
        // 4 bytes
        int accum = 0;
        for ( int shiftBy = 0; shiftBy < 32; shiftBy += 8 ) {
            accum |= ( readByte() & 0xff ) << shiftBy;
        }
        return accum;
    }

    double readDoubleLittleEndian( ) {
        long accum = 0;
        for ( int shiftBy = 0; shiftBy < 64; shiftBy += 8 ) {
            // must cast to long or shift done modulo 32
            accum |= ( (long)( readByte() & 0xff ) ) << shiftBy;
        }
        return Double.longBitsToDouble( accum );
    }

    float readFloatLittleEndian( ) {
        int accum = 0;
        for ( int shiftBy = 0; shiftBy < 32; shiftBy += 8 ) {
            accum |= ( readByte() & 0xff ) << shiftBy;
        }
        return Float.intBitsToFloat( accum );
    }

    byte readByteLittleEndian( ) {
        // 1 byte
        return readByte();
    }
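The fragments above assume a readByte() method is in scope. Here is one way to wire them up, as a minimal sketch only: `LittleEndianReader` and `intFrom` are hypothetical names, not the LEDataInputStream class, and the sketch assumes the bytes come from an in-memory array wrapped in a java.io.DataInputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

/** Hypothetical helper for illustration; not the author's LEDataInputStream. */
class LittleEndianReader {
    private final DataInputStream in;

    LittleEndianReader(byte[] data) {
        this.in = new DataInputStream(new ByteArrayInputStream(data));
    }

    /** Reads 4 bytes, least significant byte first. */
    int readIntLittleEndian() throws IOException {
        int accum = 0;
        for (int shiftBy = 0; shiftBy < 32; shiftBy += 8) {
            accum |= (in.readByte() & 0xff) << shiftBy;
        }
        return accum;
    }

    /** Reads 2 bytes, least significant byte first. */
    short readShortLittleEndian() throws IOException {
        int low = in.readByte() & 0xff;
        int high = in.readByte() & 0xff;
        return (short) (high << 8 | low);
    }

    /** Convenience wrapper for in-memory data; converts the checked exception. */
    static int intFrom(byte[] data) {
        try {
            return new LittleEndianReader(data).readIntLittleEndian();
        } catch (IOException e) {
            throw new IllegalArgumentException("need at least 4 bytes", e);
        }
    }

    public static void main(String[] args) {
        byte[] data = {0x78, 0x56, 0x34, 0x12}; // little-endian image of 0x12345678
        System.out.println(Integer.toHexString(intFrom(data))); // prints "12345678"
    }
}
```

For a real file you would hand the constructor's DataInputStream a buffered FileInputStream instead of a byte array.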
With java.nio (JDK 1.4 and later), ByteBuffer will do the byte swapping for you:

    float f = ByteBuffer.wrap( array ).order( ByteOrder.LITTLE_ENDIAN ).getFloat();
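The ByteBuffer one-liner extends naturally to a whole record. The sketch below assumes a made-up 14-byte layout (an int id, a short count and a double weight, all little-endian); the class name `LittleEndianRecordDemo` and the field names are illustrative only.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class LittleEndianRecordDemo {
    /** Packs the hypothetical record: int id, short count, double weight. */
    static byte[] encode(int id, short count, double weight) {
        ByteBuffer buf = ByteBuffer.allocate(14).order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt(id).putShort(count).putDouble(weight);
        return buf.array();
    }

    /** Unpacks the same record; each get advances the buffer position. */
    static String decode(byte[] record) {
        ByteBuffer buf = ByteBuffer.wrap(record).order(ByteOrder.LITTLE_ENDIAN);
        int id = buf.getInt();
        short count = buf.getShort();
        double weight = buf.getDouble();
        return id + " " + count + " " + weight;
    }

    public static void main(String[] args) {
        byte[] record = encode(1, (short) 2, 3.5);
        System.out.println(decode(record)); // prints "1 2 3.5"
    }
}
```

The order is set once on the buffer, so every later get or put honours it without any explicit shifting.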
In Gulliver's Travels the Lilliputians liked to break their eggs on the small end and the Blefuscudians on the big end. They fought wars over this. There is a computer analogy: should numbers be stored most or least significant byte first? This is sometimes referred to as byte sex.
Those in the big-endian camp (most significant byte stored first) include the Java VM virtual computer, the Java binary file format, the IBM 360 and follow-on mainframes such as the 390, the Motorola 68K, and most other mainframes. The Power PC is endian-agnostic.
Blefuscudians (big-endians) assert this is the way God intended integers to be stored, most important part first. At an assembler level, fields of mixed positive integers and text can be sorted as if they were one big text field key. Real programmers read hex dumps, and big-endian is a lot easier to comprehend.
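The sorting claim is easy to check. In this sketch (the names `SortKeyDemo`, `toBytes`, `compareAsText` and `sortsCorrectly` are invented for the illustration), two positive ints are laid out as bytes and compared one unsigned byte at a time, the way a text sort compares characters:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class SortKeyDemo {
    /** Lays out a 32-bit int as 4 bytes in the requested byte order. */
    static byte[] toBytes(int value, ByteOrder order) {
        return ByteBuffer.allocate(4).order(order).putInt(value).array();
    }

    /** Compares equal-length keys byte by byte, treating each byte as unsigned. */
    static int compareAsText(byte[] a, byte[] b) {
        for (int i = 0; i < a.length; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return 0;
    }

    /** True if the byte-wise comparison agrees that smaller comes before larger. */
    static boolean sortsCorrectly(int smaller, int larger, ByteOrder order) {
        return compareAsText(toBytes(smaller, order), toBytes(larger, order)) < 0;
    }

    public static void main(String[] args) {
        System.out.println(sortsCorrectly(255, 256, ByteOrder.BIG_ENDIAN));    // prints "true"
        System.out.println(sortsCorrectly(255, 256, ByteOrder.LITTLE_ENDIAN)); // prints "false"
    }
}
```

Big-endian 255 is 00 00 00 ff and 256 is 00 00 01 00, so the first differing byte settles the order correctly. In little-endian the leading ff of 255 makes it look larger than 256.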
In the little-endian camp (least significant byte first) are the Intel 8080, 8086, 80286, Pentium and follow ons and the MOS 6502 popularised by the Apple ][.
Lilliputians (little-endians) assert that putting the low order part first is more natural because when you do arithmetic manually, you start at the least significant part and work toward the most significant part. This ordering makes writing multi-precision arithmetic easier since you work up, not down. It made implementing 8-bit microprocessors easier. At the assembler level (not in Java) it also lets you cheat and pass the address of a 32-bit positive int to a routine expecting only a 16-bit parameter and still have it work. Real programmers read hex dumps, and little-endian is more of a stimulating challenge.
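Java will not let you play the address trick directly, but a ByteBuffer can stand in for raw memory. In this rough simulation (`NarrowReadDemo` and `readAsShort` are hypothetical names), a 32-bit int is stored at offset 0 and then only 16 bits are read back from the same offset:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class NarrowReadDemo {
    /** Stores a 32-bit int at offset 0, then reads a 16-bit short from that same offset. */
    static short readAsShort(int value, ByteOrder order) {
        ByteBuffer memory = ByteBuffer.allocate(4).order(order);
        memory.putInt(0, value);
        return memory.getShort(0);
    }

    public static void main(String[] args) {
        System.out.println(readAsShort(1234, ByteOrder.LITTLE_ENDIAN)); // prints "1234"
        System.out.println(readAsShort(1234, ByteOrder.BIG_ENDIAN));    // prints "0"
    }
}
```

In little-endian order the low 16 bits sit at the lowest address, so the narrow read still sees 1234. In big-endian order the same address holds the high half, which is zero for any small positive value.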
If a machine is word addressable, with no finer addressing supported, the concept of endianness means nothing since words are fetched from RAM in parallel, both ends first.
Byte Sex Endianness of CPUs

| CPU | Endianness | Notes |
|---|---|---|
| AMD Opteron | little | 64-bit |
| AMD Duron, Athlon, Thunderbird | little | 32-bit, used in PCs running Windows 95/98/ME/NT/2000/XP |
| Apple ][ 6502 | little | |
| Apple Mac 68000 | big | Uses Motorola 68000 |
| Apple Power PC | big | CPU is bisexual but stays big in the Mac OS. |
| Burroughs 1700, 1800, 1900 | ? | bit addressable. Used different interpreter firmware instruction sets for each language. |
| Burroughs B5000 | word addressable | 48-bits, Algol stack machine, first virtual memory. |
| Burroughs 7800 | word addressable | 48-bits, Algol stack machine |
| CDC LGP-30 | word-addressable only, hence no endianness | 31½ bit words. Low order bit must be 0 on the drum, but can be 1 in the accumulator. |
| CDC 3300, 6600, Cyber | word-addressable, so no endianness | 60 bits |
| Compaq (née DEC) Alpha Servers | little | |
| Cray | big endian | 64-bit |
| DEC PDP | little | 16-bit |
| DEC Vax | little | 32-bit |
| IBM 360, 370, 380, 390, eSeries, zSeries | big | 32-bit |
| IBM 7044, 7090 | word addressable | 36-bit |
| IBM AS-400 | big | 64-bit |
| Power PC | either | The endian-agnostic Power PCs have a foot in both camps. They are bisexual, but the OS usually imposes one convention or the other, e.g. Mac Power PCs are big-endian. |
| IBM Power PC G5 | big endian | The endian-agnostic pseudo-little-endian mode has been dropped. This caused Microsoft Virtual PC a major headache in emulating the Pentium on a Mac Power PC G5. |
| Intel 8080, 8088, 8086, 80286 | little | 16-bit chips used in PCs |
| Intel 80386, 80486, Pentium I, II, III, IV | little | 32-bit, chips used in PCs |
| Intel 8051 | big | |
| Intel Xeon | little | 32-bit, used in Unisys Clearpath servers, like a Pentium designed to be used in groups, with 144 extra SIMD instructions for web servers. |
| Intel Itanium | either | 64-bit |
| MIPS R4000, R5000, R10000 | big | Used in Silicon Graphics IRIX. |
| MOS 6502 | little | MOS 6502 was used in the Apple ][ |
| Motorola 6800, 6809, 680x0, 68HC11 | big | Early Macs used the 68000. Amiga. |
| NCR 8500 | big | |
| NCR Century | big | |
| SGI MIPS | both | Machines with Cray ancestry are big; those with SGI ancestry are little. |
| Sun Sparc and UltraSparc | big | Sun's Solaris. Normally used as big-endian, but also supports operating in little-endian mode, including being able to switch endianness under program control for particular loads and stores. |
| Univac 1100 | word-addressable | 36-bit words. |
| Univac 90/30 | big | IBM 370 clone |
| Zilog Z80 | little | Used in CP/M machines. |
In theory data can have two different byte sexes but CPUs can have four. Let us give thanks, in this world of mixed left and right hand drive, that there are not real CPUs with all four sexes to contend with.
The Four Possible Byte Sexes for CPUs

| Which Byte Is Stored in the Lower-Numbered Address? | Which Byte Is Addressed? | Used In |
|---|---|---|
| LSB | LSB | Intel, AMD, Power PC, DEC. |
| LSB | MSB | none that I know of. |
| MSB | LSB | Perhaps one of the old word mark architecture machines. |
| MSB | MSB | Mac, IBM 390, Power PC |