This is a quick post on how to generate a process tree on Linux (and *nix) operating systems.

The idea is the same as in the previous posts, Finding overall and per core CPU utilization and Find process IDs of a running process by name: read the information present in the /proc/ directory. To find out which processes are running, we read the directories in /proc/ whose names are numbers. To generate a process tree we need to establish the parent-child relationships among the running processes. Each process has a parent (the first process created is an exception), and the parent is recorded in that process's process table entry. We need to fetch the parent process ID of each running process in order to build the tree. Here’s the plan.

I will not go through any background information about *nix processes and will get straight to the point.
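The first step mentioned above, finding the running processes by their numeric directory names, can be sketched roughly as follows. The helper name list_pids and its optional directory argument are my own choices for illustration; the argument exists only so the function can be pointed at a directory other than /proc.

```perl
use strict;
use warnings;

# Return the numerically sorted PIDs found as all-digit entries in $dir.
# list_pids is a hypothetical helper; $dir defaults to /proc.
sub list_pids {
    my ($dir) = @_;
    $dir = '/proc' unless defined $dir;

    # An unreadable or missing directory simply yields an empty list.
    opendir(my $dh, $dir) or return;
    my @pids = sort { $a <=> $b } grep { /^[0-9]+$/ } readdir($dh);
    closedir($dh);
    return @pids;
}

my @pids = list_pids();
print "Processes running: ", scalar(@pids), "\n";
```

The pattern accepts any run of digits instead of assuming a maximum PID length, because the upper limit for PIDs is configurable on Linux.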

What I have done is first read the information for each process and store it in a hash keyed by PID. With this approach, getting the information about a process only requires looking up the hash with that process's PID as the key. A lot of information about a process can be stored here, read from the different files in /proc/. For the tree, the essential information is which processes are the children of a given process. This structure holds the process information only in a flat way, and there is no easy way to extract a parent-child relationship from it.
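As a small illustration of the flat structure, here is a sketch with made-up sample values. The helper describe_process is hypothetical and is there only to show the lookup by PID.

```perl
use strict;
use warnings;

# Flat process table: PID => hash of attributes (sample values, not real processes).
my %pinfo = (
    1   => { name => 'init', procstate => 'S', threads => 1 },
    789 => { name => 'bash', procstate => 'S', threads => 1 },
);

# Look up one process by PID; returns a printable summary, or undef if the PID is unknown.
sub describe_process {
    my ($table, $pid) = @_;
    my $info = $table->{$pid} or return;
    return "$pid: $info->{name} ($info->{procstate})";
}

print describe_process(\%pinfo, 789), "\n";
```

Nothing in this table says that one process is the parent of another, which is why a second structure is needed for the tree.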

To establish the parent-child relationship, this implementation uses a separate structure: a hash of arrays. The hash is keyed by process ID, and the value for each key is an array of that process's child process IDs. I load this hash as follows.

The /proc/ directory is iterated for processes. For each process, the files under /proc/PID/ are opened for the PID-related information. The /proc/PID/status file, along with other information, also contains the parent process ID (PPID). At this point we know one parent-child relationship, so we store it: we append the PID to the array stored in the hash under the key PPID.
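A minimal sketch of the hash of arrays on its own, with invented PIDs. The helper names add_child and children_of are assumptions made for this example and do not appear in the full listing.

```perl
use strict;
use warnings;

my %proctree;

# Record that $pid is a child of $ppid. Perl creates the array on the first push.
sub add_child {
    my ($tree, $ppid, $pid) = @_;
    push @{ $tree->{$ppid} }, $pid;
}

# Return the child PIDs of $pid, or an empty list if it has none.
sub children_of {
    my ($tree, $pid) = @_;
    return exists $tree->{$pid} ? @{ $tree->{$pid} } : ();
}

add_child(\%proctree, 0,   1);
add_child(\%proctree, 1,   789);
add_child(\%proctree, 789, 12345);
add_child(\%proctree, 789, 12400);

print "Children of 789: ", join(", ", children_of(\%proctree, 789)), "\n";
```

Checking with exists before dereferencing means that asking for the children of a leaf process does not add an empty entry to the hash.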

Let me give an example.

Say we have opened /proc/12345/status (PID = 12345) and found that the PPID of this process is 789. We then append 12345 to the array stored in the hash under the key 789. This records that “one of the children of 789 is 12345”.
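The same example, written as a sketch that works on a string instead of the real file. The status text here is an abbreviated, made-up sample, and ppid_from_status is a hypothetical helper name.

```perl
use strict;
use warnings;

# Extract the parent PID from the text of a status file; undef if there is no PPid line.
sub ppid_from_status {
    my ($text) = @_;
    return $text =~ /^PPid:\s*(\d+)/m ? $1 : undef;
}

# Abbreviated sample of what /proc/12345/status might contain.
my $status = "Name:\tbash\nState:\tS (sleeping)\nPid:\t12345\nPPid:\t789\n";

my %proctree;
my $pid  = 12345;
my $ppid = ppid_from_status($status);

# One parent-child pair is now known, so append the child under its parent.
push @{ $proctree{$ppid} }, $pid if defined $ppid;

print "Children of $ppid: @{ $proctree{$ppid} }\n";
```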

Once this structure is created, it is pretty simple to read out the process tree. Drawing a tree in the terminal is just a game of indentation.

I print the process tree recursively. The function __show_tree takes two arguments: the first is a process ID and the second is the depth of the recursive call. We call it with

__show_tree (0, 0);

This is because we know that 0 is the root of the process tree: PID 0 has no entry in /proc/, but the first processes (such as init, PID 1) report a PPID of 0. Using the depth of the recursive call we can give proper indentation in the terminal.

The __show_tree function first fetches the array of child PIDs of the process ID passed as its argument ($parent). Then it prints that process's information, with the indentation adjusted according to the depth ($depth). Next it loops over the array of child PIDs and calls __show_tree recursively for each of them, one by one, with the depth incremented by one.
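To see the recursion in isolation, here is a stripped-down sketch that works on a small hard-coded tree and prints only the PIDs. The function name render_tree and the sample PIDs are made up for this example; it returns the lines instead of printing them directly, which is a choice of this sketch and not how the full listing works.

```perl
use strict;
use warnings;

# Sample hash of arrays: parent PID => list of child PIDs (invented values).
my %tree = (
    0   => [1, 2],
    1   => [789],
    789 => [12345, 9000],
);

# Return the lines for the subtree rooted at $pid, indented two spaces per level.
sub render_tree {
    my ($tree, $pid, $depth) = @_;
    my @lines = (("  " x $depth) . "+-- [$pid]");
    my @children = exists $tree->{$pid} ? @{ $tree->{$pid} } : ();

    # Sort numerically; the default sort compares strings and would put 12345 before 9000.
    foreach my $child (sort { $a <=> $b } @children) {
        push @lines, render_tree($tree, $child, $depth + 1);
    }
    return @lines;
}

print "$_\n" for render_tree(\%tree, 0, 0);
```

The recursion ends by itself at processes that have no entry in the hash, since their list of children is empty.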

I wrote the code in Perl, which I am posting below. Pardon me for not posting C or C++ code. Perl is very good at string processing, and since I only needed to demonstrate building the tree hierarchy, C or C++ would have required a bit more work.

The code looks a bit messy. To get you started: the %pinfo hash stores the flat process information, and %proctree is the hash of arrays that holds the parent-child information. $proctree{$parent} gives a reference to the array of PIDs that are currently the children of the process with PID $parent. The implementation is pretty straightforward.

Here is the sample code:


#!/usr/bin/perl

use strict;
use warnings;

my @curthreads;
my @cur_open_files;
my $retval;
my @fields;

our %proctree;
our %pinfo;

sub __show_tree
{
  my $parent = shift;
  my $depth = shift;
  
  #Children of $parent; an empty list if it has none.
  my @children = defined ($proctree{$parent}) ? @{$proctree{$parent}} : ();
  if ($parent == 0)
  {
    print ("\t" x $depth,  "|\n", "\t" x $depth, "+-- [$parent]\n");
  }
  else
  {
    print ("\t" x $depth,  "|\n", "\t" x $depth, "+-- [PID: $parent] (Threads: ", $pinfo{$parent}{'threads'}, ") (Openfiles: ", $pinfo{$parent}{'openfiles'}, ") (State: ", $pinfo{$parent}{'procstate'}, ") (VSZ: ", $pinfo{$parent}{'vsz'}, ") (RSS: ", $pinfo{$parent}{'rss'}, ") (Name: ", $pinfo{$parent}{'name'} ,")\n");
  }
  #Sort PIDs numerically; the default sort compares them as strings.
  foreach my $curchild (sort { $a <=> $b } @children)
  {
    #Call for each child PID of the current $parent.
    __show_tree ($curchild, $depth + 1);
  }
}

sub show_tree
{
  __show_tree (0, 0);
}

#Check how many files the process has open.
sub read_open_files
{
  my $current = shift;
  $retval = opendir (SUBD2, "/proc/$current/fd");
  if ($retval)
  {
    @cur_open_files = readdir (SUBD2);
    $pinfo{$current}{'openfiles'} = scalar @cur_open_files - 2; #Two dirs are . and .. 
    closedir (SUBD2);
  }
  else
  {
    $pinfo{$current}{openfiles} = 'NA';
  }
}


######### Start ###########

#Get running PIDs in array @processes
opendir (DIR, "/proc");
#PIDs can have more than five digits, so match any run of digits.
my @processes = grep (m/^[0-9]+$/, readdir (DIR));
print "Processes running: ",  scalar @processes, "\n";
closedir (DIR);


foreach my $current (@processes)
{
  read_open_files ($current);
  
  #Things to do with /proc/PID/status file
  unless (open (STAT, "/proc/$current/status"))
  {
    #The process may have exited since /proc was listed. Skip it.
    warn ("Cannot open /proc/$current/status\n");
    next;
  }
  
  while (<STAT>)
  {
    if (m/^ppid/i)
    {
      #Establish parent child relationship
      @fields = split (":");
      #Strip all whitespace. (Brackets are literal characters inside tr///.)
      $fields[1] =~ tr/ \t\n\r//d;
      #We have got one parent child pair. Append it.
      push (@{$proctree{$fields[1]}}, $current);
    }
    elsif (m/^state/i)
    {
      @fields = split (" ");
      $pinfo{$current}{'procstate'} = $fields[1];
    }
    elsif (m/^vmsize/i)
    {
      @fields = split (":");
      $fields[1] =~ s/^\s+|\s+$//g;
      $pinfo{$current}{'vsz'} = $fields[1];
     }
    elsif (m/^vmrss/i)
    {
      @fields = split (":");
      $fields[1] =~ s/^\s+|\s+$//g;
      $pinfo{$current}{'rss'} = $fields[1];
    }
    elsif (m/^name/i)
    {
      @fields = split (':');
      $fields[1] =~ s/^\s+|\s+$//g;
      $pinfo{$current}{'name'} = $fields[1];
    }
    elsif (m/^threads/i)
    {
      @fields = split (':');
      $fields[1] =~ s/^\s+|\s+$//g;
      $pinfo{$current}{'threads'} = $fields[1];
    }
    #Here we can read more about the process from different files
  }
  close (STAT);

  #Fill in defaults for fields missing from the status file
  #(for example, kernel threads have no VmSize or VmRSS lines).
  foreach my $key ('rss', 'vsz', 'name', 'threads', 'procstate')
  {
    $pinfo{$current}{$key} = 'NA' unless defined $pinfo{$current}{$key};
  }
}

print "\n\nPrinting process tree:\n";
show_tree ();

If you have a better way to do this, feel free to comment about it.
