Little and Big Endian conversion


Problem: Convert an integer from a given endian to its opposite endian

Endian

In computing, endianness refers to the ordering of bytes within a single 16-bit, 32-bit, or 64-bit word. A 16-bit word contains 2 bytes. Say 0x12AB is a 16-bit hexadecimal integer. Its most significant byte is 12 and its least significant byte is AB. When the most significant byte 12 is stored first, at the lower memory address, and the least significant byte AB is stored next to it, at the higher memory address, the storage format is called big-endian. If the least significant byte AB is stored first, at the lower memory address, and the most significant byte 12 is stored next to it, at the higher memory address, the format is known as little-endian. Similarly, for a 32-bit word, say 0x1A2B3C4D, the little-endian format stores the least significant byte 4D first, at the lowest memory address, then 3C, then 2B, and finally the most significant byte 1A at the highest address.

In little-endian the least significant byte comes first, at the lowest memory address. In big-endian the most significant byte comes first, at the lowest memory address. Note that in these number representation schemes the bit ordering within a byte is not changed; only the order of the bytes differs.

Different Representations of 0x1A2B3C4D

Big-Endian Representation:

            <----- byte address increase                byte address increase ------>
Byte No   :     3      2      1       0     Byte No   :     0      1      2       3   
             +------+------+------+------+               +------+------+------+------+
Stored No :  |  4D  |  3C  |  2B  |  1A  |  Stored No :  |  1A  |  2B  |  3C  |  4D  |
             +------+------+------+------+               +------+------+------+------+
	     

Little-Endian Representation:

            <----- byte address increase                byte address increase ------>
Byte No   :     3      2      1       0     Byte No   :     0      1      2       3   
             +------+------+------+------+               +------+------+------+------+
Stored No :  |  1A  |  2B  |  3C  |  4D  |  Stored No :  |  4D  |  3C  |  2B  |  1A  |
             +------+------+------+------+               +------+------+------+------+

The big-endian and little-endian representations are visualized above. Note which byte is stored at which byte number, and the direction of “byte address increase”. Each representation is shown in two ways: one with the byte number/address increasing from right to left, and another with it increasing from left to right.

Whether a word is stored in big-endian or little-endian cannot be told from the word itself; it depends on the interpretation the programmer used to store the words, or on the byte order the machine uses. The reverse of little-endian is big-endian and vice versa, so the code developed here toggles between the two.

To know more about endianness, see the Wikipedia page on the topic.

The Idea

Below we consider the rightmost byte to be the least significant. The idea for converting between the two endians is simple. In each iteration another variable “t” is first shifted left by 8 bits, which sends its contents one byte position up and makes room in the rightmost position. We then fetch the rightmost byte of the integer “x” by masking it with 0xff and store it in the rightmost byte of “t”. Finally “x” is shifted right by 8 bits, so its rightmost byte is lost and the next byte becomes the rightmost. The iteration terminates when “x” becomes zero, which happens after the last byte, the most significant one, has been taken out and “x” has been shifted right. After this, “t” holds the value of “x” in the opposite endian format.

More

In some cases the units ordered within a word are 2 bytes (16 bits) wide instead of 1 byte (8 bits). The process remains the same, but 2 bytes are treated as a single unit. Changing the mask value to 0xffff and the shift amount to 16 achieves this. This feature can also be incorporated into the function by passing an additional parameter.

Source Code

/*
 * Code: Endian conversion, little and big.
 * 
 * 
 */
#include <stdio.h>

unsigned int toggle_endian (unsigned int x, int atomicity);

/* Function main. Drives the toggle_endian() function */
int
main (void)
{
  unsigned int hex;

  printf ("\nEnter a Hex Number : ");
  if (scanf ("%x", &hex) != 1)
    {
      fprintf (stderr, "Invalid hexadecimal input\n");
      return 1;
    }
  printf ("\nEntered Number  : %X", hex);
  printf ("\nOpposite Endian With 8bit atomicity : %X",
	  toggle_endian (hex, 8));
  printf ("\nOpposite Endian With 16bit atomicity : %X\n",
	  toggle_endian (hex, 16));

  return 0;
}

/* 
 *  File Name     : endian.c
 *  Function Name : toggle_endian
 *  Parameters    : 
 *                @ (unsigned int) x
 *                @ (int) atomicity
 *                    # Possible values : 8, 16
 *  Return Type   : (unsigned int)
 *                    # return toggled endian value.
 *                    # return 0x0 if atomicity value is invalid
 *  Globals       : none
 *  Description   : Receives an unsigned integer and the atomicity value with which
 *                  the endian should be converted. If the atomicity value is not
 *                  one of the possible values, returns 0x0; otherwise sets mask and
 *                  sft_amt to the proper values and performs the endian conversion.
 */

unsigned int
toggle_endian (unsigned int x, int atomicity)
{
  unsigned int t = 0;
  unsigned int mask, sft_amt;

  switch (atomicity)
    {
    case 8:
      mask = 0xff;
      sft_amt = 8;
      break;

    case 16:
      mask = 0xffff;
      sft_amt = 16;
      break;

    default:
      /* Invalid atomicity value, return 0x0 */
      return 0x0;
    }

  while (x)
    {
      t <<= sft_amt;
      t |= (x & mask);
      x >>= sft_amt;
    }
  return t;
}

Enter a hexadecimal number at the prompt to get the opposite endian. The second parameter of toggle_endian sets the atomicity described above, which selects the proper mask and shift amount values inside the function.

When atomicity=8, the rightmost byte is acquired by ANDing “x” with mask=0xff. This masked-out value is placed into the rightmost byte of “t”, which was shifted left by one byte beforehand to make room. Then “x” is shifted right by 1 byte, so its rightmost byte is lost and the next byte is ready to be masked out from the right side. If “x” is zero after this shift, there are no more bytes to read from it in the next iteration, so there is no need to make room for another byte in “t”, and the loop terminates immediately.

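If atomicity!=8 and atomicity!=16 then 0x0 is returned to report an error. Note that 0x0 is also the correct result for an input of 0, so the caller cannot tell the two cases apart from the return value alone.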

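This code is written for a 32-bit word (it assumes that unsigned int is 32 bits wide), but note that it also works with 16-bit words. After the two bytes of a 16-bit value are converted, “x” is 0 and the loop terminates. “t” is an unsigned int, i.e. a 32-bit word whose leftmost two bytes are 0, so when it is assigned to an unsigned short int the rightmost two bytes are transferred. Because the loop stops as soon as “x” becomes zero, the width of the word is effectively decided by its highest non-zero byte, and zero bytes at the most significant end are not moved. For example, toggle_endian (0x00001A2B, 8) returns 0x00002B1A, not 0x2B1A0000, so a 32-bit value with leading zero bytes is not reversed over its full width.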

The endian conversion includes the sign bit of the word. The passed word, be it signed or unsigned, is treated as unsigned; the conversion is done and the result is returned, where it can again be interpreted as signed or unsigned.

Dry Run

Here is a dry run of the above code with the input 0x1A2B3C4D


Function call:  toggle_endian (0x1A2B3C4D, 8  )

First iteration:
  x is not 0, enter loop

  current values: x = 0x1A2B3C4D, t = 0x00000000

  make space for next byte in t
    t <<= 8   now t = 0x00000000

    (x & 0xff) = 0x4D
    t |= (x & 0xff) , now t = 0x0000004D
    x >>= 8  makes x = 0x001A2B3C


Second iteration:
  x is not 0, enter loop

  current values: x = 0x001A2B3C, t = 0x0000004D

  make space for next byte in t
    t <<= 8   now t = 0x00004D00

    (x & 0xff) = 0x3C
    t |= (x & 0xff) , now t = 0x00004D3C
    x >>= 8  makes x = 0x00001A2B


Third iteration:
  x is not 0, enter loop

  current values: x = 0x00001A2B, t = 0x00004D3C

  make space for next byte in t
    t <<= 8   now t = 0x004D3C00

    (x & 0xff) = 0x2B
    t |= (x & 0xff) , now t = 0x004D3C2B
    x >>= 8  makes x = 0x0000001A

Fourth iteration:
  x is not 0, enter loop

  current values: x = 0x0000001A, t = 0x004D3C2B

  make space for next byte in t
    t <<= 8   now t = 0x4D3C2B00

    (x & 0xff) = 0x1A
    t |= (x & 0xff) , now t = 0x4D3C2B1A
    x >>= 8  makes x = 0x00000000

Fifth iteration:
  x is 0, terminate loop
  return t

On the go conversion with Character Array

Although this source code reverses the endian of one 4-byte integer, it might not be the best choice for your application. For example, say you are writing a parser for the BMP file format. The colour data are stored LSB first, that is, in little-endian format. Instead of loading the words into an integer array and applying the above code to each word, it would be easier to load the data into a character array, then step through it one word at a time, reading the bytes of each word in the required order. This saves a lot of bit shifting and makes the on-the-go conversion efficient.
I will be adding code here demonstrating this in some time.

Related Posts

Updates
  • 07.07.10 : Section added “On the go conversion with Character Array”
  • 01.10.12 : Added “Related Posts” section
About phoxis

Homo-sapiens
This entry was posted in Coding Discussions, Computer Science.

5 Responses to Little and Big Endian conversion

  3. ackernar says:

    Hi, I had this problem with a file (float data). I resolved it with the cpio command:
    1) ls name_file | cpio -o > tmp
    2) rm name_file
    3) cpio -ib < tmp
    name_file now contains the right values!

    • phoxis says:

      Yes, this is a good way with files. When you need to do such conversions but do not have such tools or builtin functions, you need to go through manual translation, as with the BMP image file format. The byte arrangement could be done in different ways, though.
