A wide character trie implementation

It’s been a long time since I posted here, so I decided to dig out some quick and straightforward stuff from my disk which I had previously decided shouldn’t be on this blog.

I have posted multiple trie and dictionary search based programs in C, C++ and Perl before: Jumble word solver and Jumble word solver again. This time (again!) it’s about a trie. This time, though, there was a specific requirement from a group who needed to implement trie-based word distance counting for the Bangla language, hence the Unicode support. It was supposed to be modified further and plugged into a spelling corrector for scanned OCR text in Bengali.

Initially I suggested that a ternary search tree would be more appropriate, as the memory cost of a standard (uncompressed) trie would be huge. In the end, though, the decision was to go with a plain and simple trie. I know it is mad, but this is what it is :D.

Also, before going into the implementation, I should note that there is an efficient radix tree implementation in the libcprops library, though we won’t be using it for this one. I recommend having a look at this library if you haven’t seen it already.

Let’s say we have the word “hello” and a node holding an array of 26 pointers, one for each letter of the English alphabet. The ith pointer is set if the ith letter follows the character represented by this node. For “hello”, location 7 of the head node’s pointer array will point to another node, whose location 4 will point to another node, whose location 11 will point to another node, and so on. Note that the indexing starts from zero. When the word ends, the next pointer can point to a special terminal marker node, which can be common to all words, and the node is marked as a terminal node. This is essential because a valid word can be a prefix of another valid word, in which case the shorter word still needs to be detected. When a new word like “help” comes in, we follow the same path up to “hel” created by the word “hello”, and as there is no pointer for “p” pointing to any node, a new node is created as explained before.

Continue reading “A wide character trie implementation”
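
The node layout described above can be sketched roughly as follows. This is only an illustration in Python, not the code from the full post: it covers the 26 lowercase English letters of the “hello” example rather than wide characters, it marks word ends with a boolean flag instead of a shared terminal node, and the names `TrieNode`, `slot`, `insert` and `contains` are made up for this sketch.

```python
ALPHABET_SIZE = 26  # assumption: lowercase English letters only


class TrieNode:
    def __init__(self):
        # children[i] is the node reached through the i-th letter, or None
        self.children = [None] * ALPHABET_SIZE
        # True when the path from the root to this node spells a whole word
        self.terminal = False


def slot(ch):
    """Zero-based position of a lowercase letter ('a' is 0, 'h' is 7)."""
    return ord(ch) - ord("a")


def insert(root, word):
    node = root
    for ch in word:
        i = slot(ch)
        if node.children[i] is None:
            # No path for this letter yet, so create a new node
            node.children[i] = TrieNode()
        node = node.children[i]
    node.terminal = True


def contains(root, word):
    node = root
    for ch in word:
        node = node.children[slot(ch)]
        if node is None:
            return False
    # "hel" is on the path of "hello" but is a word only if it was marked
    return node.terminal
```

After inserting “hello” and “help” into an empty root, `contains(root, "help")` is true while `contains(root, "hel")` is false, because the node reached by “hel” exists but is not marked as terminal.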

A generic Fisher-Yates Shuffle

It’s been a long time since I did anything on this blog. I was going through some old stuff and thought I would post something. Today I will post a generic implementation of the Fisher-Yates shuffle.

Although you can get the Fisher-Yates algorithm from the wiki, I will still explain it briefly.

Let us assume that there is an array arr of length n. We need to find a uniformly random permutation of the elements of the array. One variation of the algorithm is as follows.

arr is an array of length n, where indexing starts from 0
i=n-1
while i>0
{
  r = generate a random number between 0 and i (both inclusive)
  swap the array elements arr[r] and arr[i]
  i = i - 1
}
arr is now uniformly permuted

Continue reading “A generic Fisher-Yates Shuffle”
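
For readers who want something runnable, the pseudocode above translates almost line for line into Python. This is a minimal sketch, not the generic implementation from the full post; the function name `fisher_yates_shuffle` and the optional `rng` parameter are assumptions added here so that a seeded generator can be passed in.

```python
import random


def fisher_yates_shuffle(arr, rng=random):
    """Shuffle arr in place, walking i from n-1 down to 1."""
    i = len(arr) - 1
    while i > 0:
        r = rng.randint(0, i)  # random index between 0 and i, both inclusive
        arr[r], arr[i] = arr[i], arr[r]
        i -= 1
    return arr
```

Passing `random.Random(seed)` as `rng` makes the permutation reproducible, which helps when testing.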

Generate the process tree of a Linux system

This is a quick post on how to generate a process tree on Linux (and *nix) operating systems.

The idea is the same as in the previous posts, Finding overall and per core CPU utilization and Find process IDs of a running process by name: read the information present in the /proc/ directory. To find out which processes are running, we can read the directories in /proc/ whose names are numbers. To generate a process tree we need to establish the parent-child relationship between the running processes. Each process has a parent (the first generated process is an exception), and the parent is stored in the process table entry of that process. We need to fetch the parent process id of each running process in order to build the tree. Here’s the plan.

Continue reading “Generate the process tree of a Linux system”
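
One possible way to carry out the idea is sketched below in Python. It is not the plan from the full post: it assumes the parent pid is taken from /proc/<pid>/stat, where it is the field after the process state, and the helper names `parse_stat`, `read_processes` and `format_tree` are invented for this sketch.

```python
import os
from collections import defaultdict


def parse_stat(text):
    """Return (pid, name, ppid) from the contents of /proc/<pid>/stat."""
    # The name sits in parentheses and may itself contain spaces or ')',
    # so split around the first '(' and the last ')'.
    open_paren = text.index("(")
    close_paren = text.rindex(")")
    pid = int(text[:open_paren])
    name = text[open_paren + 1:close_paren]
    fields = text[close_paren + 1:].split()  # fields[0] state, fields[1] ppid
    return pid, name, int(fields[1])


def read_processes(proc_dir="/proc"):
    """Collect (pid, name, ppid) for every numeric directory in proc_dir."""
    procs = []
    for entry in os.listdir(proc_dir):
        if not entry.isdigit():
            continue
        try:
            path = os.path.join(proc_dir, entry, "stat")
            with open(path, errors="replace") as fh:
                procs.append(parse_stat(fh.read()))
        except OSError:
            pass  # the process exited while we were scanning
    return procs


def format_tree(procs, root=1):
    """Return the tree below root as a list of indented lines."""
    names = {pid: name for pid, name, _ppid in procs}
    children = defaultdict(list)
    for pid, _name, ppid in procs:
        children[ppid].append(pid)
    lines = []

    def walk(pid, depth):
        lines.append("  " * depth + f"{pid} {names.get(pid, '?')}")
        for child in sorted(children.get(pid, [])):
            walk(child, depth + 1)

    walk(root, 0)
    return lines
```

On a Linux machine, `print("\n".join(format_tree(read_processes())))` prints the subtree under pid 1. Kernel threads usually hang off pid 2 instead, so they need a second call with `root=2`.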

Find pairs of numbers in an array with difference ‘k’

The problem statement is: a long array with n elements is given, and we need to find all the pairs of numbers in this array which have a constant difference k. I will post three methods. The first is the brute force method, with runtime complexity O(n^2); the next has runtime complexity O(n*log(n)); and the last one has runtime complexity O(n).

Continue reading “Find pairs of numbers in an array with difference ‘k’”
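
To give a flavour of the O(n) approach, here is a minimal Python sketch based on a hash set. It may differ from the method in the full post; it assumes k > 0 and reports each pair of values once, and the name `pairs_with_difference` is made up for this example. The O(n) bound holds on average, since it relies on constant-time set lookups.

```python
def pairs_with_difference(values, k):
    """Return (a, b) pairs of values from the array with b - a == k."""
    if k <= 0:
        raise ValueError("this sketch assumes k > 0")
    present = set(values)  # average O(1) membership tests
    reported = set()       # avoids repeating a pair for duplicate values
    pairs = []
    for x in values:
        if x + k in present and x not in reported:
            reported.add(x)
            pairs.append((x, x + k))
    return pairs
```

For example, `pairs_with_difference([1, 5, 3, 4, 2], 2)` returns `[(1, 3), (3, 5), (2, 4)]`.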

Jumble word solver again

I have already posted a jumbled word solver written in C, although what I posted has become old, as I have since changed some things in the code. I will update that post with the new changes (I hope to!), but before that I would like to post the same stuff in other languages and with a different data structure. Recently I have been learning Perl and brushing up on C++, so in this post I will post jumble word solvers written in C++ and Perl.
Continue reading “Jumble word solver again”

Implement queue using stack

The puzzle is to implement basic queue operations using only basic stack operations. That is, a stack object is given, and we need to construct a wrapper for the queue functions, insert and remove, which will use only the stack object as its storage and naturally will have to use the stack operations. I have already posted the opposite task in the post Implement stack using a queue.

This can be done using two stack objects. We call these the first stack and the second stack. However, either the insert or the remove complexity will no longer be O(1).

I discuss the process step by step. I added the last solution when it occurred to me while reviewing this post.

Continue reading “Implement queue using stack”
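
As a rough illustration of the two-stack idea, here is a short Python sketch. It is one common variant and not necessarily any of the solutions in the full post: insert always pushes onto the first stack, and remove pours the first stack into the second only when the second is empty. Python lists stand in for the stacks, restricted to `append` (push) and `pop`; the class name `QueueFromStacks` is an assumption.

```python
class QueueFromStacks:
    def __init__(self):
        self._first = []   # first stack: receives every inserted item
        self._second = []  # second stack: holds items in removal order

    def insert(self, item):
        self._first.append(item)  # a plain push, O(1)

    def remove(self):
        if not self._second:
            # Pouring one stack into the other reverses the order,
            # so the oldest item ends up on top of the second stack.
            while self._first:
                self._second.append(self._first.pop())
        if not self._second:
            raise IndexError("remove from empty queue")
        return self._second.pop()
```

In this variant a single remove can take O(n) when the second stack is empty, but every item is moved at most once, so the cost averaged over many operations stays constant.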

Plot histogram in terminal

We had assignments to print “*”s in different formations in undergraduate class, which I never liked as they were pointless. Now I have a somewhat justifiable application: plotting a histogram in the terminal. In the last post, Generating random numbers from Normal distribution in C, I posted the C code to generate random numbers from the Normal distribution using the Polar method. In this post I am posting some simple code to plot the histogram of random numbers generated from this or any other distribution. Let me first post the code and then explain what is going on.

Continue reading “Plot histogram in terminal”
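
To show what such a plot can look like, here is a small Python sketch that draws one row of “*”s per bin. It is not the C code from the full post; the function name `histogram_lines` and the parameters `bins`, `width` and `mark` are assumptions made for this illustration, and the bars are scaled so that the fullest bin is `width` characters long.

```python
def histogram_lines(samples, bins=10, width=40, mark="*"):
    """Return one text line per bin: left bin edge, then a bar of marks."""
    if not samples:
        return []
    lo, hi = min(samples), max(samples)
    span = (hi - lo) or 1.0  # avoid dividing by zero when all values match
    counts = [0] * bins
    for x in samples:
        # Map x to a bin index; the maximum value goes into the last bin
        idx = min(int((x - lo) / span * bins), bins - 1)
        counts[idx] += 1
    peak = max(counts)
    lines = []
    for b, count in enumerate(counts):
        left_edge = lo + span * b / bins
        bar = mark * round(count * width / peak)
        lines.append(f"{left_edge:10.3f} | {bar}")
    return lines
```

Calling `print("\n".join(histogram_lines(samples)))` on a list of numbers drawn from a Normal distribution should show the familiar bell shape lying on its side.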

Generating random numbers from Normal distribution in C

I needed to write a random number generator in C which generates random numbers from the Normal distribution (Gaussian distribution). Without this component I couldn’t finish writing the C code for the Heuristic Kalman Algorithm by Lyonnet and Toscano for some experiments. I selected the Marsaglia and Bray method, also known as the Polar method, to generate Normal random variables. Here is how it is done.

Continue reading “Generating random numbers from Normal distribution in C”
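
The polar method itself is short enough to sketch here. The version below is in Python rather than C and is only an illustration, not the code from the full post; the names `polar_normal_pair` and `normal` and the `rng` parameter are assumptions. It draws a point uniformly in the square from -1 to 1, rejects it unless it falls strictly inside the unit circle, and scales the accepted point into two independent standard Normal values.

```python
import math
import random


def polar_normal_pair(rng=random):
    """Return two independent standard Normal variates."""
    while True:
        u = 2.0 * rng.random() - 1.0  # uniform in [-1, 1)
        v = 2.0 * rng.random() - 1.0
        s = u * u + v * v
        if 0.0 < s < 1.0:  # keep only points inside the unit circle
            break
    factor = math.sqrt(-2.0 * math.log(s) / s)
    return u * factor, v * factor


def normal(mean=0.0, stddev=1.0, rng=random):
    """Return one Normal variate with the given mean and deviation."""
    z, _unused = polar_normal_pair(rng)
    return mean + stddev * z
```

Each accepted point yields two values; `normal` throws the second one away for simplicity, whereas an implementation that cares about speed would typically cache it for the next call.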

Implement stack using a queue

The puzzle is to implement basic stack operations using only basic queue operations. That is, a queue object is given, and we need to construct a wrapper for the stack functions, push and pop, which will use only the queue object as its storage and naturally will have to use the queue operations.

This can be done using only one queue object. However, either the push or the pop complexity will no longer be O(1).
Continue reading “Implement stack using a queue”
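
A minimal Python sketch of the one-queue idea follows. It shows the variant where push is the expensive operation, which may or may not be the one chosen in the full post. A `collections.deque` stands in for the queue and is restricted to `append` (enqueue) and `popleft` (dequeue); the class name `StackFromQueue` is an assumption.

```python
from collections import deque


class StackFromQueue:
    def __init__(self):
        self._queue = deque()  # used strictly as a FIFO queue

    def push(self, item):
        self._queue.append(item)
        # Move every older item behind the new one, so the newest
        # item sits at the front of the queue. This costs O(n).
        for _ in range(len(self._queue) - 1):
            self._queue.append(self._queue.popleft())

    def pop(self):
        if not self._queue:
            raise IndexError("pop from empty stack")
        return self._queue.popleft()  # O(1): the newest item is in front
```

The opposite trade-off is also possible: keep push as a plain enqueue and do the rotation inside pop instead.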