Hermes - My Need For An Alternate Key-Value Store

Hermes


Yet another concurrent, lightweight, fast and efficient in-memory key-value store alternative.
It started as a sub-component of a pet IR system in Go.
I thought it was useful enough to stand on its own, and that it could help reduce disk I/O (database, file, etc.) in systems that are not exactly small but still run on a single machine.

What makes Hermes unique is its use of the following:


  • LRFU cache (which allows a smooth transition between LRU and LFU).
  • A Bloom-filter-style admission filter (implemented as a Cuckoo filter, so that items can be deleted on every eviction done by the policy) to avoid caching one-hit wonders and save memory. It is disabled by default; see the toml file to enable it.
  • HAMT data structure (which makes it memory efficient with nearly O(1) operations).


Use it in your app like so (embedded):
package main

import (
 "fmt"
 "github.com/jtejido/hermes/hermes"
 "github.com/jtejido/hermes/config"
 "github.com/BurntSushi/toml"
 "sync"
 "strconv"
)

func main() {
 var wg sync.WaitGroup
 var ctx hermes.Context
 var con config.Config

 if _, err := toml.DecodeFile("src/github.com/jtejido/hermes/config.toml", &con); err != nil {
  fmt.Println(err)
  return
 }
 
 c := hermes.NewCache(&con)
 for i := 0; i <= 100000; i++ {
  wg.Add(1)
  go func(i int) {
   defer wg.Done()
   key := strconv.Itoa(i)

   c.Set(key, []byte(key))

   v, err := c.Get(ctx, key)
   if err == nil {
    fmt.Printf("key: %s, value: %s is still here\n", key, v)
   } else {
    fmt.Printf("%s not found\n", key)
   }
  }(i)
 }

 // Wait once, after the loop, so the goroutines actually run concurrently.
 wg.Wait()
}

It depends on a .toml file for its settings (hence the third-party dependency). You can roll your own if you wish, as long as you can marshal/unmarshal it into the right config type (see config/config.go for these types).

It is also meant to be a complete app: after modifying the default configuration, it can be run to accept keys and values via a REST API right away.
If you got it via go get github.com/jtejido/hermes, a compiled binary was already placed in your designated bin folder.
To list the options, type:
./hermes -h
The options for running it are as follows:
  -lambda float
     Lambda used for LRFU. (default 0.65)
  -filter
     Bloom Filter enabled?. (default false)
  -filter-items uint
     Maximum number of items to be stored in filter. (default 1000000)
  -logfile string
     Location of the logfile. (default "access.log")
  -maxmemory int
     Maximum amount of data in the cache in  MB. (default 256)
  -port int
     The public port to use. (default 9090) -- note: this is the port client apps should use
  -address int
     The port for peers to listen on. (default 10050)
  -shards int
     Number of shards for the cache. (default 1024)
  -version
     Hermes version.

Aside from the options above and the settings in the .toml file, be aware of one more thing if you plan to use Hermes as a distributed cache: the .toml file has a setting called peers, an array that should include this instance's own address by default. If you set up another instance of Hermes, it needs a different address (e.g. 10051), and that address should also be added here. You can add as many instances as you wish, as long as they are all listed in the peers array.


Before using it (from any language), you should know that values are stored as []byte, which means you'll have to do the serialization/deserialization on your end (as any key-value store requires).
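
As a rough illustration of that serialization step in Go, the sketch below turns a struct into []byte and back using encoding/gob from the standard library. `Session`, `encode` and `decode` are hypothetical names for this example only; any encoding that produces []byte (JSON, protobuf, etc.) would do just as well.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// Session is a hypothetical application type we want to cache.
type Session struct {
	UserID int
	Token  string
}

// encode serializes a Session into the []byte form the cache stores.
func encode(s Session) ([]byte, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(s); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decode turns bytes read back from the cache into a Session again.
func decode(b []byte) (Session, error) {
	var s Session
	err := gob.NewDecoder(bytes.NewReader(b)).Decode(&s)
	return s, err
}

func main() {
	raw, err := encode(Session{UserID: 42, Token: "abc"})
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	// raw is what you would hand to the cache as the value.

	back, err := decode(raw)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Printf("%+v\n", back)
}
```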

To put something in, use curl to send a PUT request:
curl -v -XPUT localhost:9090/hermes/api/v1/cache/example3 -d "yey"
*   Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 9090 (#0)
> PUT /hermes/api/v1/cache/example3 HTTP/1.1
> Host: localhost:9090
> User-Agent: curl/7.43.0
> Accept: */*
> Content-Length: 4
> Content-Type: application/x-www-form-urlencoded
> 
* upload completely sent off: 4 out of 4 bytes
< HTTP/1.1 201 Created
< Date: Sat, 27 Oct 2018 21:38:52 GMT
< Content-Length: 0
< 
* Connection #0 to host localhost left intact
An access.log file will be created in the same folder.
To get an item back, send a GET request:
curl -v -XGET localhost:9090/hermes/api/v1/cache/example
*   Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 9090 (#0)
> GET /hermes/api/v1/cache/example HTTP/1.1
> Host: localhost:9090
> User-Agent: curl/7.43.0
> Accept: */*
> 
< HTTP/1.1 200 OK
< Content-Type: application/json
< Date: Sat, 27 Oct 2018 21:43:12 GMT
< Content-Length: 16
< 
* Connection #0 to host localhost left intact
{"Value":"yey"}
The values come back in JSON form (more stuff will be added soon), which means you'll have to decode the JSON first before deserializing the value for your own app.
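
A minimal sketch of that decoding step in Go, assuming the response body always looks like the {"Value":"yey"} sample above, with Value as a JSON string. `getResponse` and `parseValue` are hypothetical names, and how Hermes encodes binary values is not shown in the sample, so treat the field type as an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// getResponse mirrors the sample GET body {"Value":"yey"}.
// The string type for Value is an assumption based on that sample.
type getResponse struct {
	Value string `json:"Value"`
}

// parseValue extracts the stored value from a GET response body.
func parseValue(body []byte) (string, error) {
	var r getResponse
	if err := json.Unmarshal(body, &r); err != nil {
		return "", err
	}
	return r.Value, nil
}

func main() {
	v, err := parseValue([]byte(`{"Value":"yey"}`))
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	// v now holds the value; deserialize it further if your app needs to.
	fmt.Println(v)
}
```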

Other endpoints are:
curl -v -XDELETE localhost:9090/hermes/api/v1/cache/example
curl -v -XGET localhost:9090/hermes/api/v1/stats  // also json {hit, miss, collission, delhit and delmiss}
curl -v -XGET localhost:9090/hermes/api/v1/clear

If you wish to run it as a service or daemon, use daemonize; or, if you'd rather use takama's library, modify the source to use it and rebuild.
