JavaScript STL (Standard Template Library), Part 4 | WebReference

JavaScript STL (Standard Template Library), Part 4

JavaScript STL (Standard Template Library), Part 4

In previous articles, I've presented a programming library based on the C++ Standard Template Library. In this article I will introduce four new collections from the stdext namespace and show how easily many of the collections may be interchanged within a simple JavaScript unit test.

The stdext Namespace

As explained in part 1, the C++ STL is designed to be enclosed within a namespace called std. This namespace contains most of the STL code and allows it to use programming names independently of other libraries. There were some parts of the library deemed not suitable to be included in the std namespace, but too useful to throw away, so the stdext namespace was created as a kind of code orphanage for all of these misfits. In this article I present the hash-type collections, namely hash_map, hash_multimap, hash_set and hash_multiset.

A New Code File

Since the code in stdext is held in a separate namespace, it's only proper that the stdext namespace should be contained within a separate file. The advantage is that the extra code need not be downloaded if it's not necessary. Since the main std namespace is contained within a file called JSstl.js, a new file JSstlext.js will contain the stdext namespace.

function STDEXT()

{

   // namespace class STDEXT

}

STDEXT.prototype.hash_map = function(hash,less)

{

   return new STDEXTHashMap(hash,less);

}

STDEXT.prototype.hash_multimap = function(hash,less)

{

   return new STDEXTHashMultiMap(hash,less);

}

STDEXT.prototype.hash_set = function(hash,less)

{

   return new STDEXTHashSet(hash,less);

}

STDEXT.prototype.hash_multiset = function(hash,less)

{

   return new STDEXTHashMultiSet(hash,less);

}

STDEXT.prototype.hash =

{

      // standard hash behaviour is to assume value is already an integer

      fn:function(a)

      {

            return a;

      }

};

 

// declare namespace stdext

var stdext = new STDEXT();

Like the std namespace defined in the JSstl.js file, the stdext namespace is defined using a class called STDEXT. The four new collections can be accessed by calling member functions on the stdext instance. A new standard functor is introduced to provide a default hash implementation. Along with std.less, this functor may be used in any of the hash-type collections where the key is already an integer.

A Bit About Hashing

Hashed collections provide an extremely efficient means of storing and retrieving data. So long as the hash function is well designed, inserts and retrievals should execute in a consistent time frame, regardless of the number of elements in the collection. Hashed collections work by first converting the keys into integers by using a hash function. These integers are then used to index into an array of buckets where the items are stored. The hashed collection works well when the hash function returns a different index for every item. When two different keys hash to the same integer, then the two different items will be stored within the same bucket. This is called a collision. A hashed collection must have a strategy for coping with collisions; the usual solution is to store the items in a linked list.

Imagine that we need to store a number of names in a hashed collection. Since the key-type, string, is not an integer, it must be converted into one using a supplied hashing function. For the sake of this example, let's use the string length; "John" would hash to 4, "Jan" would hash to 3 and so on. Adding the names John, Jan and Jeremy will result in a collection similar to the one in figure 1...

A hash collection of Jan, John and Jeremy
Figure 1. A hashed collection. Each element is accessible in constant time.

Notice the links connecting the elements will allow the elements to be iterated through even though they're not all adjacent within the buckets. An important point to understand that distinguishes hashed collections from other indexed collections is that the list of items is not stored in any predictable order. Typically new elements will be added to the list at the tail, but if a collision occurs, they will be inserted somewhere in the middle. If speed of insertion and retrieval is the only concern then a hashed collection is a good choice. If ordering of elements is important, then a map or a set would be better.

Adding Jane to the collection exercises the collision logic as both Jane and John hash to a value of 4. The result looks like this...


Figure 2. Jane has been added to the hash collection and now heads bucket 4. John now takes longer to find.

With the addition of Jane to the hash collection, Jane becomes the new head of bucket 4 as alphabetically, Jane comes before John. To find John now requires two steps; first to Jane, and then to John.

Created: March 27, 2003
Revised: December 16, 2005

URL: http://webreference.com/programming/javascript/gr/column17/1