It’s been a long time since I posted here. Therefore I decided to dig out some quick and straightforward stuffs from my disk which I previously decided should’t be in this blog.
I have posted multiple trie and dictionary search based programs in C, C++ and Perl before Jumble Work Solver and Jumble Work Solver Again. This time (again!) it’s about a trie. Although this time there was a specific requirement from a group who needed to implement a trie based word distance counting for bangla language, therefore Unicode support. This was supposed to be modified more and plugged into a spelling correction for scanned OCR text in the Bengali language.
Initially I suggested that a ternary tree would be more appropriate as the memory cost for standard trie (not compressed) would be huge. Although, finally the decision was to go with plain and simple trie. I know it is mad, but this is what it is :D.
Also, before going into the implementation, I should note that there is an efficient implementation of Radix Tree present in libcprops library. Though which we won’t be using for this one. I would recommend you people to have a look into this library if you already haven’t seen it yet.
Let’s say, we have a word “hello” and a node with an array of pointers of length 26 representing each character of the English language, each of which indicates that if the ith character follows the character represented by this node. Therefore for this example “hello”, the head node’s array of pointers will have the location 7 pointing to another node, which will have the 4th location of the pointer pointing to another node, whose pointer array’s 11 th location will point to another node and so on. Note here the indexing starts from zero. When the word ends, then the next pointer can be pointed to a special terminal marker node, which can be common to all, and the node is marked as a terminal node. This is essential as a valid word can be a substring of another valid word, in which case the shorter would need to be decided. When another new word like “help” comes in, we will follow the same path upto “hel” created by the word “hello”, and as we find there are no pointer for “p” pointing to any node, a new node will be created as explained before. Continue reading
It’s been a long time I have done any activity in this blog. I was going through some old stuffs and thought to post something. Today I will post a generic implementation for Fisher-Yates Shuffle.
Although you can get the Fisher-Yates algorithm from wiki, still I am briefly explaining it.
Let us assume that there is a array arr of length n. We need to find a uniformly random permutation of the elements of the array. One of the variations of the algorithm is as follows.
arr is an array of length n, where indexing starts from 0
r = generate a random number between 0 and i (both inclusive)
swap the array elements arr[r] and arr[i]
i = i - 1
arr is now uniformly permuted
This is pretty embarrassing. I promised to post several posts in last year and now ended up posting two posts in the last year, also none of those posts were actually a post. The worst year ever. Let’s see this year.
The WordPress.com stats helper monkeys prepared a 2014 annual report for this blog.
Here’s an excerpt:
The Louvre Museum has 8.5 million visitors per year. This blog was viewed about 92,000 times in 2014. If it were an exhibit at the Louvre Museum, it would take about 4 days for that many people to see it.
Click here to see the complete report.
I promised (myself and you all) to post atleast 2 posts per month this year. Which clearly did not work at all. This is because this post is the second post after the “promise post”. What is going on then. The fact is I am not that much busy that I cannot put up a new post on the blog. The fact is that right now there are at least three different things I need to handle and this context switch every day or multiple times a week does not let me actually click the publish button (you all know how hard it is to click that button).
Recently (more than a year now) I was working on a cluster analysis algorithm and previously I was learning while working with some basic feed forward neural networks. Along with this I also keep in touch with *nix system programming and Linux OS kernel architecture (and a bit of Solaris recently). Therefore things are diverse. The job I do in the daytime (for which I get paid) is another domain (though similar to the system programming thing, but not exactly). Therefore the contexts are pretty diverse for me therefore the focus keeps shifting.
There are quite a few good ideas, for which there are drafts written, but as usual, it stays at on the disk. Why I am posting this? I am posting this because I need to press the “Publish” button and try again start the posting.
Therefore let’s see.
It has been a long time since a post in this blog. I thought to break this silence with a short post.
Several text editors in Linux environment like KWrite and GEdit that will store automatic temporary files ending with a ~ (tilde) character. Generally it is a default setting in most of the major distributions. Initially this feature annoyed me, as now you have twice the number of files in your directory, and often used rm *~ to clear out the files. Although it was annoying at the beginning, there are several cases it saved me. Continue reading
Like last year here are some stats of this blog for this year from wordpress.com. Worse this year, especially post count is 16, not enough compared to last year post count 24. Lesser visits this year, but some new relatively new posts have caught attention. Hoping to keep the post count at least 2 posts per month on an average on 2014.
The WordPress.com stats helper monkeys prepared a 2013 annual report for this blog.
Here’s an excerpt:
The Louvre Museum has 8.5 million visitors per year. This blog was viewed about 75,000 times in 2013. If it were an exhibit at the Louvre Museum, it would take about 3 days for that many people to see it.
Click here to see the complete report.
Posted in Writings