**Problem : **To generate all Combinations of a set of distinct elements

Before we start discussing about the implementation we will go through the basic definitions of Combinations. Then we discuss the method to generate all the Combinations with examples and descriptions. Then the source code to implement the method will be presented. Continue reading.

### Definitions

**Combination:**-
- A Combination of a set of distinct objects is an un-ordered arrangement of some or all of these objects.Let A = {1,2,3} , then {1}, {2}, {3}, {1,2}, {2,3} , {1,3}, {1,2,3} and {} are the all the combinations of set A

**r-Combination:**-
- An r-Combination of a set of distinct objects is an un-ordered arrangement of
*‘r’ elements*of these objects. Let there be`n`distinct elements in a set, then a combination of`r`elements selected from that set is an r-combination of the set.Let A = {1,2,3} then {1,2},{1,3},{2,3} are the 2-permutations of set A

- An r-Combination of a set of distinct objects is an un-ordered arrangement of

A combination of a certain set is simply a subset of it. Like the sets {1,2} and {2,1} are the same combinations, because they are the same subsets of the set {1,2,3}. If there are ‘n’ elements in a set, then combinations with length ‘r’ (where ) would be . The total number of combinations that can be generated is . No repetitions of a certain set element is allowed here.

### The Idea

The main idea is to group the bit positions with a ‘1’ of the binary count sequence and use that group to select a combination from a string. Each binary count is unique ie. the places where it has 1’s (or 0’s) are unique in each count. For each count we will get a unique set of bit positions where the bit is 1. For example, the 4 bit binary count starts from 0000 and ends in 1111 . When a bit position is ‘1’ we can think that bit position is selected. and such in 1010 bit position 1 and 3 are selected. Because each count has a unique set of positions which is ‘1’ , each count will select a unique set of positions, and as we count up different combinations of bit positions get selected, thus counting from 0000 to 1111 will select all possible combinations of bit positions. To select a combination from a source string this unique set of bit positions can be used to filter-out/select-from another string and make a combination.

Lets go through this with an example. Let there be a source string of length 4. Each count of the 4 bit binary number has 1’s and 0’s in a certain position and no other count has 1’s and 0’s in the same positions. Like in 1011 there are 1’s in the 0^{th} , 1^{st} and 3^{rd} position and a 0 in the 2nd position. No other binary count has the 1’s and 0’s in the same positions as 1011 . We group the bit position and form a set. Like in the previous example, 1011 will make a set {0,1,3} . For each count we can make such a set consisting of bit-positions where the bit is a 1. Each count will generate a unique set of bit positions. Let the source string of length 4 be `"abcd"`. For each such set generated from the binary count, we use the set elements (position numbers) to select from the string `"abcd"` and generate the combinations. Like the `{0,1,3}` selects the combination `"dca"`. As we count up the binary sequence different positions get selected getting new sets and because each set is unique it will select a unique set of characters from the string “abcd” and form combinations.

All the combinations of a 4 length string is shown below. Study the below table to get the idea.

+-------+------------+-----------+ | Binary | Position | Selected | | Count | Set | String | +-------+------------+-----------+ | 0000 | {} | "" | | 0001 | {0} | "a" | | 0010 | {1} | "b" | | 0011 | {0,1} | "ab" | | 0100 | {2} | "c" | | 0101 | {0,2} | "ac" | | 0110 | {1,2} | "bc" | | 0111 | {0,1,2} | "abc" | | 1000 | {3} | "d" | | 1001 | {0,3} | "ad" | | 1010 | {1,3} | "bd" | | 1011 | {0,1,3} | "abd" | | 1100 | {2,3} | "cd" | | 1101 | {0,2,3} | "acd" | | 1110 | {1,2,3} | "bcd" | | 1111 | {0,1,2,3} | "abcd" | +-------+------------+-----------+

**Note:** The 0^{th} binary bit is the rightmost bit of the binary number. The 0^{th} character is the leftmost character in the string.

The left side column denotes the binary counts, in the middle column we make the set of the bit positions which have a ‘1’ in the corresponding binary count. In the right most column, we use the bit positions from the set to select from the string. The character positions are selected from the source string, denoted by the corresponding bit position set. Like in the count 1001 the set is {0,3} so we select only the 0^{th} and 3^{rd} characters from the source string and thus the selection is “ad” . After all the binary count of a fixed bit length (4 bits here) is finished we have acquired all the unique sets and thus using the sets we have got all the combinations of the string.

For a 4 length string there is a total of 2^{4} = 16 combinations. The break up is as below:

Zero length combinations :

One length combinations :

Two length combinations :

Three length combinations :

Four length combinations :

Total = 16 combinations

The combinations generated in this method is not ordered according to string length. We cannot generate the r-combinations of a string for a certain value of ‘r’ with this method. To generate a fixed length say only select 4-combinations or 3-combinations from a set of n elements we need to generate the binary count in such a way so that it first counts all the binary numbers having four 1s . This is not discussed here.

The benefit of this method is we can generate combinations sorted according to weight. This means, if there is a certain weight (a value) assigned to each of the string characters, and we sort the characters of the string ascending (or descending) to their weights, then the combinations would be generated in ascending (or descending) weight order. In this way we can guarantee each combination generated has a lesser or an equal value than its previously generated combination. This can be used to select sub-strings in a scrabble solver application, as each letter has a weight there and so this combination method will generate high-scoring patterns to low-scoring patterns, and can be used to cut down the search space for the best solution by stopping as soon as the first scrabble match is found. Although special bonus scores are to be handled separately and will make this more complicated.

### Programming Approach

The main problem is to decide how we will generate the binary counts. The simplest way is to use a normal regular unsigned integer variable to count and then use bit masking to determine the ‘1’ bit positions and only allow that certain character position of the source string to be selected. This has some limitations as the length of unsigned integer is not a fixed value and will change in different platforms. Generally the length of integer is 4bytes (at least 2bytes), so we get a `4*8 = 32bits` length bit field. This restricts the combinations to be generated to 32 length. A long long int will have `8bytes = 64bits` length.

Another method is to use an r-permutation program to count a binary string sequence and use that string to select from the combination source string. A r-permutation engine can be found here. This can be easily modified to count binary sequence by just setting the source set to “10” and tell the length of the binary count to generate all the binary numbers up-to that length.

In the second method practically there is no combination length limitation so we can generate even 100 length combinations also. The main draw back is the extra calculations and time needed to generate the permutation string and filtering them out. So there is a combination length (range) and the execution time trade-off. We will present the first approach where we will generate combinations with the help of an integer.

In most applications we will not need such long combinations to generate. Like for a scrabble game we do not expect a word to have more than 32 characters. Also You get 7 chips per turn in a scrabble game. In a single player scrabble the combination length is limited to 7. The first method would be good enough for these applications.

### The Sourcecode

We now present the source code to generate the combinations of a source string input thriugh the keyboard. The code is discussed afterwards,

/* Program to generate all r-Combinations of a set with distinct element * This code is a part of https://phoxis.wordpress.com/2009/10/13/allcombgen/ */ #include <stdio.h> #include <string.h> #include <stdlib.h> #define MAX 100 #define TRUE 1 #define FALSE 0 int generate_combination (const char *source_string,char *combination_string); unsigned int comb_mask = 0x00000001; unsigned int max_mask_length; /* Main Function, driving generate_combination() */ int main (void) { char source_string[MAX], combination_string[MAX]; int len; printf ("Enter the Source String:"); scanf ("%s", source_string); len = strlen (source_string); max_mask_length = ~(0x00000001 << len); printf ("\nPrinting Combinatins:\n"); while (1) { if (generate_combination (source_string, combination_string) == FALSE) break; printf ("%s\n", combination_string); } printf ("\nfinished\n"); return 0; } /* * File Name : combination.c * Function Name : generate_combination * Parameters : * @ (const char *) source_string * @ (char *) combination_string * Return Type : (int) * # return FALSE if no more mermutations, else return TRUE * Globals : * @ (unsigned int) comb_mask * # A bitmask. Used to select unique combinations from source * string. This function increments and updates this mask * in each iteration. Each call results in a new permutation. * @ (unsigned int) max_mask_length * # A bitmask. Used to control the length of masking of * comb_mask variable. This is not modified in this function. * Initilized in function main() * Description : Each call to this function generated a new combination from * the source_string and then places into combination_string * The combination is done based upon comb_mask variable. The * positions, in which comb_mask has a '1' , are only selected * from the source_string to make a combination. * Note : The combination generation is comb_mask dependent. To generate * unique combinations in each call comb_mask should not be modified * any where else, if some explicit customized combinination is to be * generated. Each bit field has a fixed length (32bit) , the combintaion * can be generated is limited to this length, in this version of code. * This function does not return the null set. */ int generate_combination (const char *source_string, char *combination_string) { unsigned int mask = 0x00000001; int s_pos = 0, c_pos = 0; /* Main logic */ while ((mask & max_mask_length)) { if ((comb_mask & mask)) { combination_string[c_pos] = source_string[s_pos]; c_pos++; } s_pos++; mask <<= 1; } /*update permutation mask */ comb_mask++; /* Terminate the combination_string with NULL character */ combination_string[c_pos] = 0; /* If combination_string is empty, ie. c_pos == 0 , return FALSE else return TRUE */ return (c_pos == 0 ? FALSE : TRUE); }

Click Here To Download this Code

### Code Description

The variable `comb_mask` here is the bit field which we use to count up the binary sequence and make the selection set. The `comb_mask` variable is set to `0x00000001` , that is the upper 31 bits are set to ‘0’ and the LSB is set to ‘1’, and counted upwards. The `max_mask_length` is used to limit the masking of variable `mask` and `comb_mask` up-to the length of the source string. Its is set to `~(0x00000001 << len)` . If the length of the source string is `'n'` then the `'(n+1)'th` position of the `max_mask_length` will be ‘0’, which determines the loop’s termination. The `mask` variable is initialized to `0x00000001` . It is left bit shifted in each iteration to shift the ‘1’ from right most to left positions. In each iteration it is checked with the `max_mask_length` to see if the max bit field length is exceeded and if the iteration should stop as all combinations are generated.

The loop termination is as follows: The `mask` has only one ‘1’ in a certain position. It is AND-ed with `max_mask_length`. If the length is 4 then `max_mask_length` will have a ‘0’ in the 5^{th} position. The loop will be true till `mask` has the ‘1’ at positions less than 5^{th}. Whenever `mask` will have the ‘1’ in the 5^{th} position the bitwise operation `(mask & max_mask_length)` would be false , (both are complements of each other), and the loop will terminate. Thus length of the count is bound inside a limit.

For example if the length of the source string is 4 then `max_mask_length` is `~(0x00000001 << 4) = ~(0x00000010) = 0xffffffef = 11111111111111111111111111101111` in binary. in the while loop this value is masked with mask, when ever the mask’s 4th bit would get 1 it would mean that it has counted from `00000 to 01111` and generated all the combinations and now has become `10000` so we would stop. Thus the loop is terminated after all the combinations are generated.

In the loop the `if` statement tests each bit position of the `comb_mask` with the help of `mask` variable which scans from the rightmost bit to leftmost bit and checks if a certain bit position is a ‘1’, if it is then the character of the corresponding position of the source string is selected. The bit position being checked currently is tracked by s_pos counter. If a a bit position of `comb_mask` is ‘1’ then the position of that bit (recorded by s_pos) is used to select the corresponding character from the `source_string` and assign to `combination_string` . The c_pos variable points to the last character of the `combination_string`.

For example, `comb_mask` is `0x0000000a` whose binary equivalent is `00000000000000000000000000001010` . Now `mask` is `00000000000000000000000000000001` . `(mask & comb_mask)` evaluates to 0, so the 0^{th} character is *not* selected. next iteration `mask` is equal to `00000000000000000000000000000010` . `(mask & comb_mask)` evaluates to 1, so the 1^{st} character *is* selected, and so on.

The `mask` is left shifted by 1 bit and updated in each iteration, so it can check the next bit. The `comb_mask` is incremented in each iteration to make select new combinations.

### Notes

The null set is not selected by the above code.

The above code does not select the input string for repeat characters in it. So a string “aaaa” is allowed and each ‘a’ is considered as different entity.

This is not a process with which permutation without repetitions can be made, that permuting elements by changing their order within themselves cannot be made.

This is a good process to generate a bruteforce password generator, which can be used to input into an appropriate hash function and stored, or matched with the target hash.

Thanks for posting this up! Exactly what I needed for my project. It compiled and ran without any modifications.

thank you

Hello there, just became aware of your blog through Google, and found that it’s really informative. I’m going to watch out for brussels.

I will be grateful if you continue this in future. Many people will be benefited from your writing.

Cheers!