Synchronization Safe Integer


Synchronization safe (synch-safe) integers are used in ID3v2 tags. An ID3v2 tag has a dynamic structure: the length of the tag is variable, and the tag can contain many frames, each of which can also vary in size. The size of the tag is stored in the ID3v2 tag header in a 4-byte field. These 4 bytes use a special format called a synchronization safe integer. From ID3v2.4 onwards, the frame size stored in each frame header, which describes the frame's length, is also a 4-byte synchronization safe integer.

Read this post to know more about ID3 tags: What are ID3 Tags all about?

Description

A 4-byte synchronization safe integer is stored such that the MSB of each byte, that is bit 7 (counting from 0), is always zero. Note that the bytes are ordered big-endian, that is, the most significant byte is stored first, at the lowest address. Check this blog post: Little and Big Endian conversion, and the Wikipedia link, for more information on endianness.

For example, the 4-byte integer 0x000000FF, whose binary representation is 00000000 00000000 00000000 11111111, is represented as 00000000 00000000 00000001 01111111 in the synchronization safe format. Only the low 7 bits of each byte carry data, so any bit that would fall on the MSB of a byte is carried up into the next more significant byte. Check the examples below.

0x0000FFFF =              : 00000000 00000000 11111111 11111111
synch-safe representation : 00000000 00000011 01111111 01111111

0x04ADD3AC =              : 00000100 10101101 11010011 10101100
synch-safe representation : 00100101 00110111 00100111 00101100

Note that a 4-byte synchronization safe integer can use only 28 of its 32 bits, as the MSB of each of the 4 bytes is 0. The largest value it can encode is therefore 2^28 - 1 = 268435455 = 0x0FFFFFFF. So in ID3v2.x tags the size field can describe a tag of at most 268435455 bytes, just under 256 MB.

An integer that fits in 28 bits can be converted to a synchronization safe integer by splitting it into four 7-bit groups, starting from the LSB, and storing each group in the low 7 bits of one byte, with the MSB of that byte set to 0. This is equivalent to inserting a 0 bit after every 7 bits of the value and keeping the resulting 32 bits.

Similarly, a synchronization safe integer can be converted back to the normal representation by removing the MSB (the 0 bit) of each byte of the 4-byte word and joining the remaining 7-bit groups. The resulting 28-bit value is the decoded value.

Encoding and decoding synchronization safe integers is essential when parsing ID3v2.x tags, especially ID3v2.4, because there the frame sizes are stored as synch-safe integers in addition to the tag size.
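When parsing a real tag, the size usually arrives as four raw bytes read from the header rather than as one integer word. The sketch below shows the same conversion on a byte buffer, most significant byte first. The function names synchsafe_decode and synchsafe_encode are my own choice for illustration and are not taken from any ID3 library.

```c
#include <assert.h>
#include <stdint.h>

/* Decode four synch-safe bytes (most significant byte first)
 * into an ordinary 28-bit value. Hypothetical helper name. */
uint32_t
synchsafe_decode (const unsigned char in[4])
{
  /* a valid synch-safe byte never has bit 7 set */
  assert (((in[0] | in[1] | in[2] | in[3]) & 0x80) == 0);

  return ((uint32_t) in[0] << 21) | ((uint32_t) in[1] << 14)
       | ((uint32_t) in[2] << 7) | (uint32_t) in[3];
}

/* Encode a value that fits in 28 bits into four synch-safe bytes
 * (most significant byte first). Hypothetical helper name. */
void
synchsafe_encode (uint32_t value, unsigned char out[4])
{
  /* only 28 bits are available */
  assert (value <= 0x0FFFFFFFu);

  out[0] = (value >> 21) & 0x7F;
  out[1] = (value >> 14) & 0x7F;
  out[2] = (value >> 7) & 0x7F;
  out[3] = value & 0x7F;
}
```

Encoding 0x04ADD3AC this way gives the bytes 0x25 0x37 0x27 0x2C, which is the bit pattern shown in the example above. Working on bytes instead of a host integer also keeps the code independent of the machine's own endianness.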

Why “synchronization safe”

It is used to avoid false synchronization signals. MPEG audio files are divided into small, independent frames containing music data. Each frame has a 32-bit descriptive header, the upper 11 or 12 bits of which are always set to '1' (i.e. the header starts with 0xFFF or 0xFFE), and this bit pattern is called the frame sync. Decoders use it to detect an audio frame for playing. Now if some part of an ID3 tag contained 0xFFE or 0xFFF (the top 11 or 12 bits set to '1'), old media players, which cannot read ID3v2 tags, could wrongly detect that portion of the tag as an audio frame's frame sync. A number constructed as described above (a synch-safe integer) can never contain a 0xFF byte, because the MSB of every byte is 0, and hence can never cause a false sync. Such integers are called synchronization safe integers because they are safe from false synchronization.

Not all of the tag is stored as synch-safe integers, however. The rest of the tag is protected by a separate scheme named "unsynchronization", which is applied to avoid false synchronization in the other parts of the tag. The fields where this unsynchronization scheme is not used, and which could otherwise contain a 0xFF pattern, are stored as synchronization safe integers: the tag size bytes of the tag and, in ID3v2.4, the frame size bytes.
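To make the problem concrete, here is a small sketch of the kind of check a naive decoder effectively performs when it hunts for a frame sync: a 0xFF byte followed by a byte whose top three bits are set, which gives 11 consecutive one-bits. The function name contains_frame_sync is hypothetical and only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Return 1 if buf contains a byte-aligned frame sync pattern:
 * 0xFF followed by a byte whose top three bits are all 1. */
int
contains_frame_sync (const unsigned char *buf, size_t len)
{
  size_t i;

  assert (buf != NULL || len == 0);

  for (i = 0; i + 1 < len; i++)
    if (buf[i] == 0xFF && (buf[i + 1] & 0xE0) == 0xE0)
      return 1;

  return 0;
}
```

A size of 0x0000FFE0 stored as a plain big-endian integer is the byte sequence 00 00 FF E0, for which this check returns 1. The same size stored as a synch-safe integer is 00 03 7F 60, for which it returns 0, because no byte can ever be 0xFF.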
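For comparison, the unsynchronization scheme works on arbitrary tag data rather than on a fixed 4-byte field. As I understand the ID3v2 specification, a zero byte is inserted after every 0xFF that is followed by a byte with its top three bits set, and also after every 0xFF that is followed by 0x00, so that the process can be reversed. The sketch below follows that rule. The function name and the buffer convention are my own assumptions, and it does not handle a trailing 0xFF at the very end of the data.

```c
#include <assert.h>
#include <stddef.h>

/* Write an unsynchronized copy of in[0..len) to out and return
 * the number of bytes written. out must have room for 2 * len
 * bytes (worst case). Hypothetical helper, not from any library. */
size_t
unsynchronize (const unsigned char *in, size_t len, unsigned char *out)
{
  size_t i, n = 0;

  assert (in != NULL && out != NULL);

  for (i = 0; i < len; i++)
    {
      out[n++] = in[i];

      /* break up 0xFF 111xxxxx, and protect an existing 0xFF 0x00 pair */
      if (in[i] == 0xFF && i + 1 < len
          && ((in[i + 1] & 0xE0) == 0xE0 || in[i + 1] == 0x00))
        out[n++] = 0x00;
    }

  return n;
}
```

With this sketch the input 41 FF E0 FF 00 FF 10 becomes 41 FF 00 E0 FF 00 00 FF 10. The size fields cannot be treated this way because their length is fixed at 4 bytes, which is why they use the synch-safe format instead.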

Check the Wikipedia link for Synch-safe integers: http://en.wikipedia.org/wiki/Synchsafe

Sourcecode

The code below converts 4-byte integers between the normal and the synchronization safe representation. It accepts a hexadecimal integer on the command line. There are two command-line options: -e encodes the given integer into a synch-safe integer, and -d decodes the given synch-safe integer into a normal integer.

I encourage you to read the bit-shifting part to understand how it works.

/* This code is a part of http://phoxis.org/2010/05/08/synch-safe/  */
/* Code: To encode and decode from and to synchronization safe integers
 * as per ID3v2 specifications. To accompany with LFY ID3 tag article
 */

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>

#ifndef NUL
#define NUL '\0'
#endif

#define DECODE 0
#define ENCODE 1
#define UNSET -1

void show_help (void);

/*
 * Depending on the value of 'todo' the operand integer 'x' will be operated
 * if todo = DECODE then
 *      'x' will be decoded to a normal representation
 * if todo = ENCODE then
 *      'x' will be encoded into synch-safe integer representation
 */
unsigned int sizecodec (unsigned int x, int todo);

int
main (int argc, char *argv[])
{
  int todo = UNSET, opt;
  unsigned int x, x_final;
  char *endptr;

  while ((opt = getopt (argc, argv, "ed")) != -1)
    {
      switch (opt)
	{
	case 'e':
	  todo = ENCODE;
	  break;
	case 'd':
	  todo = DECODE;
	  break;
	default:
	  show_help ();
	  exit (0);
	}
    }

  if (todo == UNSET)
    {
      printf ("\nOperation Unset");
      show_help ();
      exit (0);
    }

  if ((argc - optind) != 1)
    {
      printf ("\nNumber of operands should be 1\n");
      show_help ();
      exit (0);
    }

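  /* strtoul, not strtol: the operand is unsigned and may exceed LONG_MAX */
  x = strtoul (argv[optind], &endptr, 16);
  if (endptr == argv[optind] || *endptr != NUL)
    {
      printf ("\nInvalid symbol in input \"%s\"\n", argv[optind]);
      exit (0);
    }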

  if (todo == DECODE)
    {
      /* MSB of each of the 4 bytes is ignored: clear it if set */
      if (x & 0x80808080)
          x = x & 0x7f7f7f7f;
    }

  if (todo == ENCODE)
    {
      /* only 28 bits used */
      if (x & 0xf0000000)
	{
	  printf ("\nInvalid decoded size hex.\n");
	  exit (0);
	}
    }

  x_final = sizecodec (x, todo);

  printf ("\nInput                        : (Hex : %x) | (Decimal : %u)", x, x);
  printf ("\n%s    : (Hex : %x) | (Decimal : %u)\n",
	  (todo == DECODE ? "Normal representation    " : "Synch-Safe representation "), x_final, x_final);

  return 0;
}

unsigned int
sizecodec (unsigned int x, int todo)
{
  unsigned int a, b, c, d, x_final = 0x0;

  /* Decode to normal representation */
  if (todo == DECODE)
    {
      a = x & 0xff;
      b = (x >> 8) & 0xff;
      c = (x >> 16) & 0xff;
      d = (x >> 24) & 0xff;

      x_final = x_final | a;
      x_final = x_final | (b << 7);
      x_final = x_final | (c << 14);
      x_final = x_final | (d << 21);
    }
  /* Encode to synch-safe integer */
  else if (todo == ENCODE)
    {
      a = x & 0x7f;
      b = (x >> 7) & 0x7f;
      c = (x >> 14) & 0x7f;
      d = (x >> 21) & 0x7f;

      x_final = x_final | a;
      x_final = x_final | (b << 8);
      x_final = x_final | (c << 16);
      x_final = x_final | (d << 24);
    }

  return x_final;
}

void
show_help (void)
{
  printf ("\nCode to convert to and from synchronization safe integers");
  printf ("\nUsage:");
  printf ("\nTo decode : sizecodec -d <operand>");
  printf ("\nTo encode : sizecodec -e <operand>\n");
  printf
    ("\nOperand should be unsigned hex integer (base 16) \nOperand to encode should be less than or equal to 0x0fffffff");
  printf ("\nIf more than one option is provided, the last one will be used\n");
}



5 Responses to Synchronization Safe Integer

  1. Pingback: What are ID3 Tags all about? | Phoxis

  2. Hello Phoxis,

    It would be good if you could explain more in detail the bit operations given on the source code, because some readers may not have the “skills” to understand them correctly (including me), that would be a plus+ for the article. I’m currently working on a ID3v2.3 library, and since this is one of the most trickiest parts of reading an ID3v2.3 header, your explanation could “cut” this.

    By the way, the blog and the article, are totally awesome!.


    Carlos

    • phoxis says:

      I think you should have a look at how the bitshift operators work in any standard C book or site, and you may also have a look at section 6.5.7 of the ISO/IEC standard for C, which describes the bitwise shift operators. Try manually tracing what actually happens when performing the shifts and the bitwise AND operations (section 6.5.10). Then it will be clear, I guess.

  3. danny says:

    hello

    am trying to reproduce the decoding part of your code in java but am having problems with line 77. it returns an int value where a boolean value is required for the if structure.
    Please can you kindly help me out here. I’ve been stuck on this project for a while now.

    thanks

    • phoxis says:

      In C,

      if (x & 0xf0000000)

      simply means: if x & 0xf0000000 is non-zero, then enter the if block. Java requires a boolean there, so the comparison has to be written out.

      Therefore fix the code with

      if ((x & 0xf0000000) != 0)

      and the other one with

      if ((x & 0x80808080) != 0)

      Also note for Java you need to apply the unsigned right shift operator >>> for proper operation.
