Find pairs of numbers in an array with difference ‘k’

The problem statement is, an long array is given with n elements, we need to find all the pairs of numbers in this long array which have a constant difference k. I will post three methods. The first method is the brute force method and has a runtime complexity O (n²), the next method has runtime complexity O (n*log (n)), and the last one will have the runtime complexity O (n).

The O (n²) solution

This is the brute force solution. Check each pair in the list for difference k. Fix the i^th, element of the list and iterate over the list from j = (i + 1) upto j = (n - 1). Where i = 0, 1, 3, ... , (n-1) and n is the number of elements in the list, and check if the numbers in the list in the location pair i and j has a difference of k. Here is the sourcecode.

#include <stdio.h>
#include <stdlib.h>

#define ABS(x) (((x)<0)?-(x):(x))

int main (void)
{
  int *arr, i, j, n, k, count = 0;

  scanf (" %d %d", &n, &k);

  arr = malloc (sizeof (int) * n);
  for (i=0; i<n; i++)
  {
    scanf (" %d", &arr[i]);
  }

  /* brute force */
  for (i=0; i<n; i++)
  {
    for (j=i + 1; j<n; j++)
    {
      if (ABS (arr[j] - arr[i]) == k)
      {
        count++;
      }
    }
  }

  printf ("%d\n", count);

  free (arr);
  return 0;
}

For each number fixed and the rest of the list starting from that index is scanned. Therefore growth of the execution time for this method is clearly O (n²). ABS finds the difference between two numbers, because if a and b is compared once only and a - b is -k, we will miss the pair as there will be no b - a comparison.

The O (n*log(n)) solution

In this method, we will first sort the list using any O (n * log (n)) sorting algorithm. I am assuming that the numbers are sorted in a non-decreasing order. After sorting, we know one fact that if j > i and arr[j] - arr[i] > k then that there exists no arr[k] for k > j and arr[k] - arr[i] = k. This will help us search.

The simple way is after we fix an element i and linearly search for the pair. We search if arr[i] + k is present in the array segment start from j = (i + 1) upto j = (n - 1). If present we have a pair, else we don’t. This won’t be O (n * log (n)) as it still will involve, for each element, to the end of the list in the worst case. But this will definitely be much better than the previous solution, as we immediately break at the point when the difference goes beyond k.

We can make improvement to the previous idea by replacing linear search with binary search. For this case, for a location i in the sorted array, find for arr[i] + k in the array range j = (i + 1) to j = (n - 1) . If found then we have a pair, else we don’t.

If k is low and n is large the linear search may get faster pair matches and failures. If k is large then binary search will be faster. But making the binary search will guarantee the search in O (log (n)) time each pass, therefore for n elements O (n * log (n)) which makes the searching for pairs of number with k difference in O (n * log (n)) time.

I will show the binary search implementation of this problem below. I have used the C standard library qsort and bsearch function to quicksort and binary search the list.

#include <stdio.h>
#include <stdlib.h>

static int compare (const void *a, const void *b)
{
  return *((int *) a) - *((int *) b);
}

int main (void)
{
  int i, n, k, *arr, count = 0, key;

  scanf (" %d", &n);
  scanf (" %d", &k);

  arr = malloc (sizeof (int) * n);

  for (i=0; i<n; i++)
  {
    scanf (" %d", &arr[i]);
  }

  qsort (arr, n, sizeof (int), compare);

  for (i=0; i<n; i++)
  { 
    key = arr[i] + k;
    if (bsearch (&key, arr + i + 1, n - i - 1, sizeof (int), compare) != NULL)
    {
      count++;
      printf ("(%d, %d)\n",arr[i], arr[i] + k);
    }
  }

  printf ("Total: %d\n", count);

  free (arr);

  return 0;
}

The O (n) solution

To find the pair of numbers in a list with difference k we have to scan the list at least once, selecting one element in each iteration and doing some processing with it. This process itself is O (n). To solve this problem in O (n), after we select an element during the scanning process, we need to do something which takes O (1) time. Nothing more than O (1) processing time growth per element in the list will let us make the entire process to find k difference pairs O (n).

Therefore the answer is clear, Hash Table comes to the rescue. The process is still the same as the last. We populate the list numbers into the hash table. Then we start iterating with each element in the hash table. For each element e in the hash table, we search if (e + k) is also present in the table. If yes, we have a pair, if no then we don’t. The insertion and search operation in the hash table is O (1) in average case.

Let’s code it.

I will use C++ to implement this, as the STL already has containers which uses hash tables.

The existing map container in C++ STL is implemented with some kind of self balancing binary search tree, most probably Red Black trees (that’s what the implementation in my system uses). Therefore using map will only get you O (log (n)) insertion and search.

I will use the unordered_map present in C++ STL. Note that the unordered_map is implemented using hash tables and is introduced in C++11 standard. Therefore to compile the code you need appropriate compiler switches, if your compiler by default does not compile with C++11 standards. For gcc you need to add the -std=c++11 or at least -std=c++0x switch option.

/* Point to be noted: 
 * map container insertion and search is not O(1), unordered_map is.
 * unordered_map container class is introduced from C++11 standard
 * therefore to compile you need -std=c++0x or -std=c++11 . 
 */

#include <iostream>
#include <unordered_map>

using namespace std;

int main (void)
{
  unordered_map<int, bool> hash;
  int n, k, val, count = 0;

  cin >> n;
  cin >> k;

  /* Load the values in hash 
   */
  for (int i=0; i<n; i++)
  {
    cin >> val;
    hash.insert (pair<int, bool> (val, true));
  }

  /* For each value inside the hash, add `k' with it and see if it exists,
   * in the hash. If yes, we have a pair, otherwise, we have no pair.
   */
  for (unordered_map<int, bool>::iterator it = hash.begin (); it != hash.end (); it++)
  {
    if (hash.find (it->first + k) != hash.end ())
    {
      cout << "(" << it->first << ", " << (it->first + k) << ")" << endl;
      count++;
    }
  }

  cout << "Pairs: " << count << endl;

  return 0;
}

And before end, I will write the same with Perl (learning Perl :) ).

#!/usr/bin/perl -w

my $n = <STDIN>;
my $k = <STDIN>;
my %hash;
my $count = 0;
my $val;
 
for (my $i = 0; $i < $n; $i++)
{
  chomp ($val = <STDIN>);
  $hash{$val} = 1;
}

foreach my $key (keys %hash)
{  
  if (exists ($hash{$key + $k}))
  {
    print ("($key, ", $key + $k, ")\n");
    $count++;
  }
}

print ("\nTotal pairs: $count\n");

For Perl implementation note that, when we enter from keyboard you should enter each number in each line and also should not input leading or trailing spaces or decimal points. I have not included processing to parse out numbers from a line.

One thing to note that, when you press enter after entering an integer it is stored in $val and then it is uses in the hash table. $val has a trailing newline character at this point and every key will go with the trailing newline character in the hash table. Later when we do $val + $k the answer will have no newline character at the end and we won’t find any match. Therefore it is important to remove the trailing newline, if any exists using the chomp function.

So, that was it.

Find pairs of numbers in an array with difference ‘k’

The O (n2) solution

The O (n*log(n)) solution

The O (n) solution

Share this:

Related

Leave a comment Cancel reply

The O (n²) solution