File MIES_Hashmap.ipf

Pure Igor Pro implementation of a hashmap using separate chaining.

Features:

  • Supports all sizes which are a power of two

  • Constant time access O(1) with small load factors

  • Handles collisions correctly

  • Avoids implicit rehashing but supports explicit rehashing

Indizes into HM_CreateStatsWave()

static const double HM_TOTAL_ENTRIES_ROW = 0

Indizes into HM_CreateManagementWave()

static const double HM_USED_ROWS_ROW = 0
static const double HM_STATS_ROW = 1

Indizes into HM_CreateHashmap()

static const double HM_KEYS_COLUMN = 0
static const double HM_VALUES_COLUMN = 1

Indizes into HM_Create()

static const double HM_MGMT_ROW = 0
static const double HM_HASHMAP_ROW = 1

Functions

static std::tuple<uint64> HM_DJBHash(string str)

Implementation of djb2 in plain Igor Pro.

The implementation here does support embedded nulls, see DisplayHelpTopic “Embedded Nulls in Literal Strings”.

See also https://github.com/dim13/djb2/blob/master/docs/hash.md#djb2.

static variable HM_HashKey(WaveRefWave hashmap, string key)
static variable HM_GetKeyIndex(WaveText keys, string key, variable numFilledEntries)
static wave HM_FetchUsedRows(WaveRefWave hashmap)
static wave HM_FetchStats(WaveRefWave hashmap)
static variable HM_GetSize(WaveRefWave hashmap)
static wave HM_FetchKeys(WaveRefWave hashmap, variable idx)
static wave HM_FetchValues(WaveRefWave hashmap, variable idx)
static std::tuple<WAVE, WaveText, WAVE> HM_FetchWaves(WaveRefWave hashmap, variable idx)
static wave HM_CreateStatsWave()

Statistics wave.

Rows:

  • 0: Total number of filled entries

static wave HM_CreateUsedRows(variable size)

List of stored key/value pairs per hash prefix.

Rows:

  • Number of key/value pairs per hashmap row

static wave HM_CreateManagementWave(variable size)

Internal management wave.

Rows:

  • 0: 32-bit unsigned int wave with size rows denoting the number of key/value pairs per line

  • 1: wave ref wave with stats entries, see HM_GetStatsWave()

static wave HM_CreateHashmap(variable size, variable valueType)

Return a wave reference wave resembling a hashmap (implementation)

Rows:

  • first bits of the hash

Columns:

  • 0: 1D text wave with all keys for this hash

  • 1: 1D wave with all values for this hash, type depends on valueType parameter

Complexity: O(n)

Parameters:
  • size – size of the hashmap, needs to be a power of two

  • valueType – wave type of the values, defaults to text wave and can be one of IgorTypes

wave HM_Create(variable size = defaultValue, variable valueType = defaultValue)

Return a wave reference wave resembling a hashmap.

Rows:

Complexity: O(n)

Parameters:
  • size – size of the hashmap, needs to be a power of two

  • valueType – wave type of the values, defaults to text wave and all non-complex numeric types from IgorTypes are supported

static variable HM_StoreValue(wave values, variable idx, string *str = defaultValue, variable *var = defaultValue)
variable HM_AddEntry(WaveRefWave hashmap, string key, string str = defaultValue, variable var = defaultValue)

Add an entry into the hashmap.

Complexity: Amortized O(1)

Returns:

1 when adding a new value and 0 when overwriting

std::tuple<string, variable> HM_GetEntryAsString(WaveRefWave hashmap, string key)

Get a string entry from the hashmap.

Complexity: Amortized O(1)

Return values:
  • value – string value found

  • found – 1 if something was found, 0 if not

std::tuple<variable, variable> HM_GetEntryAsNumber(WaveRefWave hashmap, string key)

Get a numeric entry from the hashmap.

Complexity: Amortized O(1)

Return values:
  • value – value found

  • found – 1 if something was found, 0 if not

variable HM_DeleteEntry(WaveRefWave hashmap, string key)

Delete the entry with the given key.

Complexity: Amortized O(1)

Returns:

0 on success, 1 if the key could not be found

wave HM_GetAllKeys(WaveRefWave hashmap)

Return all keys in the hashmap.

Complexity: O(n)

static wave HM_GetAllKeysPerRow(WaveRefWave hashmap, variable index, variable entriesWithHash)
static variable HM_CalculateLoadFactor(WaveRefWave hashmap)

Calculate the load factor from the hashmap and the filled entries.

Complexity: O(1)

variable HM_RehashIfRequired(WaveRefWave *hashmap)

Rehashes if required and returns a modified hashmap pass-by-reference.

The load factor (number of available entries vs filled entries) is determined. And if that is above HM_MAX_LOAD_FACTOR we create a new hashmap with the doubled size and add all existing entries to it.

Complexity: Usually amortized O(1) but in the worst case O(n)

Returns:

0 if nothing needed to be done, 1 if the hashmap was resized and rehashed

Variables

static const double HM_SMALL_WAVE_OPTIMIZATION_ROWS = 5
static const double HM_MAX_LOAD_FACTOR = 0.7