Almost everybody has used the GNU or UNIX cat program on the command line. It concatenates files and writes them to standard output, which can in turn be redirected to another file. Long ago I started to write my own version of the cat program, named wcat. I have implemented the short options that cat supports (long options are still on the TODO list) and made the output look identical, apart from some messages. It is not a clone of the GNU code, though: it has no connection with the source code of GNU cat and was written by inspecting the output behaviour of GNU cat. I started with this because it was the simplest program to write, and it was intended for the Whitix OS project run by Mathew (http://www.whitix.org/), which is a very good OS development project for beginners to start with. At present I cannot actively participate in that project because of time limitations at my end.

The preliminary code was written very quickly, but it took time to polish it and replicate the behaviour of GNU cat. The most interesting part was implementing the different options, such as line numbering for the -n and -b options and printing special characters with the -v and -T options. The line number generation uses the method described in the r-Permutations With Repetitions post. The line number counter is 20 digits long and counts in decimal, so there is no practical risk of the line number overflowing. You will notice that the line_number array is initialized mostly with blank spaces, and that some locations are initialized with \r, \t and 0. This keeps the line_number array pre-formatted, so that it can be copied directly to the output buffer without any further processing. Have a look at the code; I have tried to keep it as clean as possible.
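
To make the counter idea concrete, here is a minimal standalone sketch of a pre-formatted ASCII decimal counter. It is not the code used in wcat: the names ascii_counter, counter_init, counter_increment and COUNTER_WIDTH are hypothetical and chosen only for this illustration, the width is fixed at 6 instead of 20, and plain spaces are used for padding rather than the \r trick.

```c
#include <assert.h>
#include <string.h>

#define COUNTER_WIDTH 6		/* hypothetical width, chosen for the demo */

/* Right-aligned ASCII decimal counter: blanks on the left, digits on the right. */
typedef struct
{
  char digits[COUNTER_WIDTH + 1];	/* +1 for the terminating NUL */
} ascii_counter;

/* Start at zero: all blanks, with a single '0' in the last position. */
void
counter_init (ascii_counter *c)
{
  assert (c != NULL);
  memset (c->digits, ' ', COUNTER_WIDTH);
  c->digits[COUNTER_WIDTH - 1] = '0';
  c->digits[COUNTER_WIDTH] = '\0';
}

/* Add one, propagating the carry leftwards through the characters.
 * Returns 0 on success, or -1 if the count no longer fits in
 * COUNTER_WIDTH digits (the counter has then wrapped to all zeros).
 */
int
counter_increment (ascii_counter *c)
{
  int i;

  assert (c != NULL);
  for (i = COUNTER_WIDTH - 1; i >= 0; i--)
    {
      if (c->digits[i] == ' ')
	{
	  c->digits[i] = '1';	/* carry into a position not used so far */
	  return 0;
	}
      if (c->digits[i] != '9')
	{
	  c->digits[i]++;	/* no carry needed, done */
	  return 0;
	}
      c->digits[i] = '0';	/* 9 becomes 0, the carry moves one place left */
    }
  return -1;
}
```

Because the digits are already characters, the counter can be copied into an output buffer with memcpy () and never needs a number-to-string conversion, which is the property the line_number array relies on.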
After a good amount of testing I found no bugs and made a final release, which I am presenting here. I will try to keep the code on this page updated if it changes.

Source code

The source code is presented below. A link to download a zipped file of the code is at the bottom of the page.

/*
 * Program  : wcat
 * Version  : 1.0
 * Revision : 1
 * Status   : Stable
 * 
 */

/*
 * Version Update 1.0:
 *   Now accepts input from stdin.
 */
/* Version Update 0.6:
 *   change in 'wcat()' function parameter
 *   using bit fields for flags
 *   comments added
 *   Decimal counter extended from 18 digits to 20 digits
 */

/* Features: 
 * -E	Show end line with '$'
 * -T	Show tab character with ^I
 * -n	Number all lines
 * -b	Number only nonempty lines
 * -v 	Show non-printing characters with M- or ^ prefix
 * -s	Squeeze consecutive empty lines into one
 * -e	Same as -vE
 * -t	Same as -vT
 * -A	Same as -vET
 * -h	Help
 */

/* TODO:
 * Primary:
 *	Testing [DONE] and feedback
 * 	accept input from stdin [DONE (v1.0)]
 * 	cleanup code, and make the main output loop better
 * Secondary:
 * 	write long options
 * 	brief comment the option flags and the operation location
 */

/*
 * Author: Arjun Pakrashi (phoxis)
 *         http://phoxis.org
 */

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <ctype.h>
#include <error.h>
#include <errno.h>

#define VERSION "1.0"
#define REVISION "1"
#define STATUS "Stable"

#define TRUE 1
#define FALSE 0

#ifndef NUL
#define NUL '\0'
#endif

#define NEW_LINE '\n'
#define TAB '\t'

#define SUCCESS     1
#define FAIL       -1
#define READ_ERROR  2
#define WRITE_ERROR 3

#define STDOUT_FILE 1
#define STDIN_FILE 0
#define BLK_SIZE BUFSIZ

#define LSB 20			/* pre calculated value of (22 - 2) */
#define ARRAY_LENGTH 22

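/* Bit fields are declared unsigned so that a 1-bit field can hold TRUE (1)
 * on every compiler; a signed 1-bit field can only hold 0 and -1.
 */
typedef struct flag_bits
{
  unsigned int showend:1;
  unsigned int showtab:1;
  unsigned int linenum_all:1;
  unsigned int linenum_nonempty:1;
  unsigned int sqeeze_bl:1;
  unsigned int showspchar:1;
  unsigned int help:1;
} option_flags;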

int wcat (const char *file, option_flags flag);
void print_help (void);
void generate_line_number (void);

/* line_number is used to store the decimal count. The last but one location is
 * initialized with 0 so that the count starts from 0. The others are initialized with
 * blank spaces and carriage returns to suppress printing of unused digits and to give
 * each count the same formatting. As the count grows and spans multiple digits, the
 * lower positions are used. This array is updated by 'generate_line_number ()'; each
 * call to that function generates the next count. The function is called from, and
 * 'line_number' is used in, 'wcat ()'.
 */
/* array length is 22: 20 positions for decimal digits, last two positions for 0 and \t */
char line_number[ARRAY_LENGTH] =
  { ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ',
  '\r', ' ', ' ', ' ', '\r', ' ', ' ', ' ', ' ', ' ',
  '0', '\t'
};

/* Function Name : main
 * Parameters    :
 *               @ (int) argc
 *               @ (char *) argv
 * Return Value  : (int)
 * Globals       : None
 * Description   : Parses the command line options and calls 'wcat()'
 */
int
main (int argc, char *argv[])
{
  int current_file;
  int status;
  int opt;
  option_flags flag = { FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE };

  /* 't' must be listed here, otherwise getopt () rejects the -t option */
  while ((opt = getopt (argc, argv, "ETnbsvetAh")) != -1)
    {
      switch (opt)
	{
	case 'E':
	  flag.showend = TRUE;
	  break;
	case 'T':
	  flag.showtab = TRUE;
	  break;
	case 'n':
	  flag.linenum_all = TRUE;
	  break;
	case 'b':
	  flag.linenum_nonempty = TRUE;
	  break;
	case 's':
	  flag.sqeeze_bl = TRUE;
	  break;
	case 'v':
	  flag.showspchar = TRUE;
	  break;
	case 'e':
	  flag.showspchar = TRUE;
	  flag.showend = TRUE;
	  break;
	case 't':
	  flag.showspchar = TRUE;
	  flag.showtab = TRUE;
	  break;
	case 'A':
	  flag.showspchar = TRUE;
	  flag.showend = TRUE;
	  flag.showtab = TRUE;
	  break;
	case 'h':
	  flag.help = TRUE;
	  break;
	default:
	  error (0, 0, "Execute %s -h for help.\n", argv[0]);
	  return 0;
	}
    }

  if (flag.help)
    {
      print_help ();
      exit (0);
    }

/* If both -b and -n are given, -b overrides -n */
  if (flag.linenum_nonempty && flag.linenum_all)
    {
      flag.linenum_all = FALSE;
    }

/* If no file is supplied (only options, or nothing at all), take stdin as input */
  if (optind == argc)
    {
      status = wcat ("-", flag);
    }

  for (current_file = optind; current_file < argc; current_file++)
    {
      /* here we also pass the '-' parameter, stands for stdin */
      status = wcat (argv[current_file], flag);
    }
  return 0;
}

/* Function Name : wcat
 * Parameters    :
 *               @ (const char *) file : File path
 *               @ (option_flags) flag : Flag bits
 * Return Value  : (int) Success or Failure
 * Globals       : (char []) line_number
 * Description   : Dumps a file contents described by 'file' to stdout.
 *                 Output can be modified by flags
 */

/* FIXME: Should we think also about windows \r\n new lines here? */
int
wcat (const char *file, option_flags flag)
{
  int fd;
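  char inbuf[BLK_SIZE], outbuf[BLK_SIZE * (ARRAY_LENGTH + 4)],
    *outpt, *currch, prevch = NEW_LINE, *endbuf;
  long int bytes_read = 0, bytes_to_write = 0, bytes_written = 0;
  int status, nl_lock = FALSE;

  /* NOTE: With -v a non-printing character is represented by at most 4
   * characters (for example M-^A). With -n or -b a character that starts a
   * line is also preceded by the ARRAY_LENGTH bytes of 'line_number'. In the
   * worst case every input byte starts a line, so 'outbuf[]' is declared
   * with a size of BLK_SIZE * (ARRAY_LENGTH + 4).
   */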

  if (file[0] == '-' && file[1] == NUL)
    {
      /* read from standard input if file is "-" */
      fd = STDIN_FILE;
    }
  else
    {
      /* else open the supplied file */
      fd = open (file, O_RDONLY);
      if (fd < 0)
	{
	  error (0, errno, "Cannot Open File \"%s\"", file);
	  return 0;
	}
    }
  while (TRUE)
    {
      bytes_read = read (fd, inbuf, BLK_SIZE);
      if (bytes_read == -1)
	{
	  status = READ_ERROR;
	  break;
	}
      if (bytes_read == 0)
	{
	  status = SUCCESS;
	  break;
	}

      bytes_to_write = 0;
      currch = inbuf;
      endbuf = inbuf + bytes_read;
      outpt = outbuf;

      while (currch < endbuf)
	{
	  /* With -s, drop every blank line after the first one of a run.
	   * This is decided before numbering, so that a dropped line
	   * neither prints nor consumes a line number.
	   */
	  if (flag.sqeeze_bl && *currch == NEW_LINE && prevch == NEW_LINE)
	    {
	      if (nl_lock)
		{
		  currch++;
		  continue;
		}
	      nl_lock = TRUE;
	    }
	  else if (*currch == NEW_LINE)
	    {
	      nl_lock = FALSE;
	    }

	  /* -n numbers every line, -b only the nonempty ones */
	  if (prevch == NEW_LINE
	      && (flag.linenum_all
		  || (flag.linenum_nonempty && *currch != NEW_LINE)))
	    {
	      generate_line_number ();
	      memcpy (outpt, line_number, ARRAY_LENGTH);
	      outpt += ARRAY_LENGTH;
	    }

	  if (*currch == NEW_LINE)
	    {
	      if (flag.showend)
		{
		  *outpt++ = '$';
		}
	      *outpt++ = NEW_LINE;
	    }

	  else if ((*currch == TAB) && (flag.showtab))
	    {
	      *outpt++ = '^';
	      *outpt++ = 'I';
	    }

	  else if ((flag.showspchar) && (!isprint ((unsigned char) *currch)))
	    {
	      /* NOTE: This check is comparatively slow. The if-else nesting
	       * could be reworked to avoid unnecessary checks.
	       */
	      if (*currch == NEW_LINE || *currch == TAB)
		{
		  *outpt++ = *currch;
		}
	      else
		{
		  unsigned char c = *currch;
		  if (c > 127)
		    {
		      *outpt++ = 'M';
		      *outpt++ = '-';
		      c = c - 128;
		    }
		  if (c == 127)
		    {
		      *outpt++ = '^';
		      c = c - 64;
		    }
		  if (c < 32)	/* 32 is a space: byte 160 is shown as "M- " */
		    {
		      *outpt++ = '^';
		      c = c + 64;
		    }
		  *outpt++ = c;
		}
	    }
	  else
	    {
	      *outpt++ = *currch;
	    }
	  prevch = *currch++;
	}

      bytes_to_write = outpt - outbuf;
      bytes_written = write (STDOUT_FILE, outbuf, bytes_to_write);

      if (bytes_written == -1)
	{
	  status = WRITE_ERROR;
	  break;
	}
      if (bytes_written != bytes_to_write)
	{
	  status = WRITE_ERROR;
	  break;
	}
    }
  /* close the file only if it is not stdin */
  /* NOTE: Closing stdin will cause no more input
   * accepted by the current running program, so it
   * will not accept the next read from stdin for the
   * other '-' parameters if supplied
   */
  if (fd != STDIN_FILE)
    close (fd);

  return status;
}

/* Function Name : generate_line_number
 * Parameters    : (void)
 * Return Value  : (void)
 * Globals       : 
 *               @ (char []) line_number
 * Description   : Each call of this function generates next decimal count
 *                 described by 'line_number' char array.
 */
void
generate_line_number (void)
{
  int i;

  line_number[LSB]++;
  if (line_number[LSB] == ':')
    {
      for (i = LSB; i > 0; i--)	/* stop at 1: the body touches line_number[i - 1] */
	{
	  if (line_number[i] == ':')
	    {
	      line_number[i] = '0';
	      if (line_number[i - 1] <= ' ')
		line_number[i - 1] = '1';
	      else
		line_number[i - 1]++;
	    }
	  else
	    break;
	}
    }
}

/* Function Name : print_help
 * Parameters    : (void)
 * Return Value  : (void)
 * Description   : Prints the help text of this program to stdout
 */
void
print_help (void)
{
  fprintf (stdout, "Usage: wcat [OPTION] FILE_1 [FILE_2] ... [FILE_n]\n");
  fprintf (stdout, "Concatenate FILE(s) to standard output\n");
  fprintf (stdout, "\nOptions:\n");
  fprintf (stdout, "\t-A\t\tSame as -vET\n");
  fprintf (stdout, "\t-b\t\tNumber only nonempty lines\n");
  fprintf (stdout, "\t-e\t\tSame as -vE\n");
  fprintf (stdout, "\t-E\t\tMark endline with \'$\'\n");
  fprintf (stdout, "\t-n\t\tNumber all lines\n");
  fprintf (stdout, "\t-s\t\tSqueeze consecutive empty lines into one\n");
  fprintf (stdout, "\t-t\t\tSame as -vT\n");
  fprintf (stdout, "\t-T\t\tShow tab character as ^I\n");
  fprintf (stdout,
	   "\t-v\t\tShow non-printing characters with M- or ^ prefix except TAB and LFD (new line)\n");
  fprintf (stdout, "\t-h\t\tShow this help\n");
  fprintf (stdout,
	   "\n\nWith no FILE given, or when FILE is - (a hyphen), reads from standard input");
  fprintf (stdout,
	   "\n\nExamples with standard input:\n\twcat file1 - file2 : Output file1's content, then standard input, then file2's content\n\twcat \t\t   : Copy standard input to standard output");
  fprintf (stdout, "\n\nVersion: %s\tRevision: %s\tStatus: %s\n", VERSION,
	   REVISION, STATUS);
}
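
The -v notation used in wcat () can also be tried in isolation. The following is a minimal sketch that assumes ASCIIness of the character set; the helper name show_nonprinting is hypothetical and does not appear in wcat itself. It follows the same rules as the listing: bytes of 128 and above get an M- prefix, control characters and DEL get a ^ prefix, and TAB and new line are passed through unchanged.

```c
#include <assert.h>
#include <string.h>

/* Write the visible form of byte 'c' into 'out', which must have room for
 * at least 5 bytes (4 characters plus the terminating NUL). Returns the
 * number of characters written, not counting the NUL.
 */
size_t
show_nonprinting (unsigned char c, char *out)
{
  size_t n = 0;

  assert (out != NULL);
  if (c == '\n' || c == '\t')
    {
      out[n++] = (char) c;	/* TAB and new line are left alone by -v */
    }
  else
    {
      if (c >= 128)
	{
	  out[n++] = 'M';	/* high bit set: meta prefix */
	  out[n++] = '-';
	  c = c - 128;
	}
      if (c == 127)
	{
	  out[n++] = '^';	/* DEL is shown as ^? */
	  out[n++] = '?';
	}
      else if (c < 32)
	{
	  out[n++] = '^';	/* control character: ^@ to ^_ */
	  out[n++] = (char) (c + 64);
	}
      else
	{
	  out[n++] = (char) c;	/* printable ASCII is copied as is */
	}
    }
  out[n] = '\0';
  return n;
}
```

For example, byte 1 becomes ^A, byte 200 becomes M-H and byte 255 becomes M-^?, which is the longest form at 4 characters.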

Download this code here: wcat_v1_0.c.zip

To compile the code, use the following command:

gcc wcat.c -o wcat

3 thoughts on “wcat : A GNU cat implementation”

    1. The code is not hard, and that’s why I started with it. The main goal was to try to organize my coding practices.

      As for the scrolling portion of the page, it is easy: I have used div tags with inline CSS options, as shown below:

      <div style="max-height:1000px;overflow:auto;">
      Write what ever here, with long long length of page scroll. 
      If it exceeds 1000px a vertical scroll will appear
      </div>
      

      The style attribute defines inline CSS options, which override the options specified in the current blog theme’s CSS file. The maximum height of the block is defined, and if the content exceeds it a vertical scrollbar is added automatically. You can similarly define a max-width, which may add a horizontal scrollbar.
      This is very useful for posting long code listings while avoiding a long scroll of the main page.
      Check out any good CSS book or documentation to get started with CSS.
