By and large, if you need a database, you can reach for one of the big names (MySQL/MariaDB, PostgreSQL, SQLite, MongoDB) and get to work. But sometimes the one-size-fits-all approach doesn’t fit all. Every now and then your use case falls between two stools, and you need to reach for something more specialized. Here are nine offbeat databases that run the gamut from in-memory analytics to key-value stores and time-series systems.
The phrase “SQL OLAP system” generally conjures images of data-crunching monoliths or sprawling data warehouse clusters. DuckDB is to analytical databases what SQLite is to MySQL and PostgreSQL. It isn’t designed to run at the same scale as full-blown OLAP solutions, but to provide fast, in-memory analytical processing for local datasets.
Many of DuckDB’s features are counterparts to what’s found in bigger OLAP products, even if smaller in scale. Data is stored as columns rather than rows, and query processing is vectorized to make the best use of CPU caching. You won’t find much in the way of native connectivity to reporting solutions like Tableau, but it shouldn’t be difficult to build such a connection by hand. Aside from bindings for C++, DuckDB also connects natively to two of the most common programming environments for analytics, Python and R.
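To make the row-versus-column distinction concrete, here is a minimal sketch in plain Python. It does not use DuckDB or its API; the sample data, the variable names, and the `sum_in_batches` helper are all hypothetical, and the batching only loosely imitates what vectorized execution does.

```python
from array import array

# Hypothetical sample data: (order_id, region, amount) rows.
rows = [
    (1, "east", 20.0),
    (2, "west", 35.5),
    (3, "east", 12.25),
    (4, "north", 8.0),
]

# Row layout: each record is kept together, so summing one field
# still walks every whole record.
row_total = sum(r[2] for r in rows)

# Column layout: each field is stored contiguously on its own.
order_ids = array("q", (r[0] for r in rows))
regions = [r[1] for r in rows]
amounts = array("d", (r[2] for r in rows))

# An aggregate over one column touches only that column's values.
col_total = sum(amounts)


def sum_in_batches(values, batch_size=2):
    """Sum a column one fixed-size batch at a time (illustrative only)."""
    total = 0.0
    for start in range(0, len(values), batch_size):
        total += sum(values[start:start + batch_size])
    return total


batched_total = sum_in_batches(amounts)
```

All three totals agree. The difference lies in how much unrelated data each approach has to step over, which is what matters for CPU caches on large datasets.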
“Edge” is a term used in graph databases to refer to the connection or relationship between two entities or nodes (such as between a customer and an order, or between an order and a product) in a highly connected dataset. EdgeDB uses the PostgreSQL core and all of the properties it provides (like ACID transactions and industrial-strength reliability) to build what its makers call an “object-relational database” with strong field types and a SQL-like query language.
Thus EdgeDB combines NoSQL-like ease of use and immediacy, the relational modeling power of a graph database, and the guarantees and consistency of SQL. Although EdgeDB is not formally a document database, you can use it to store data that way. And you can use the GraphQL query language to easily retrieve data from EdgeDB, just as you can with native graph databases such as Neo4j.
An open source project spearheaded by Apple, FoundationDB is a “multi-model” database that stores data internally as key-value pairs (essentially the NoSQL model), but that data can be organized into relational tables, graphs, documents, and many other data structures. ACID transactions guarantee data integrity, and horizontal scaling and replication are both available out of the box. FoundationDB’s design comes with some stiff restrictions, though: keys, values, and transactions all have hard size limits, and transactions have hard time limits as well.
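The idea of layering other data models on top of ordered key-value pairs can be sketched in a few lines of Python. This is an illustration of the general technique only, not FoundationDB’s client API; the `OrderedKV` class and the `insert_row` and `read_row` helpers are hypothetical names invented for the example.

```python
import bisect


class OrderedKV:
    """Tiny in-memory ordered key-value store (illustrative, not FoundationDB)."""

    def __init__(self):
        self._keys = []   # sorted list of tuple keys
        self._data = {}   # key -> value

    def set(self, key, value):
        if key not in self._data:
            bisect.insort(self._keys, key)
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def scan_prefix(self, prefix):
        """Yield (key, value) pairs whose tuple key starts with prefix, in key order."""
        start = bisect.bisect_left(self._keys, prefix)
        for key in self._keys[start:]:
            if key[:len(prefix)] != prefix:
                break
            yield key, self._data[key]


def insert_row(kv, table, row_id, row):
    """Relational 'layer': one key-value pair per (table, row id, column)."""
    for column, value in row.items():
        kv.set((table, row_id, column), value)


def read_row(kv, table, row_id):
    """Rebuild a row by scanning every key under the (table, row id) prefix."""
    return {key[2]: value for key, value in kv.scan_prefix((table, row_id))}


store = OrderedKV()
insert_row(store, "users", 1, {"name": "Ada", "city": "London"})
insert_row(store, "users", 2, {"name": "Lin", "city": "Taipei"})
insert_row(store, "orders", 7, {"user_id": 1, "total": 42})

user_1 = read_row(store, "users", 1)
```

Because keys are kept in sorted order, everything belonging to one table or one row sits next to each other, so a “table scan” is just a range read over a key prefix.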
The purpose behind HarperDB is to provide a single database for handling structured and unstructured data in an enterprise—somewhere between a multi-model database like FoundationDB and a data warehouse or OLAP solution. Ingested data is deduplicated and made available for queries through the interface of your choice: SQL, NoSQL, Excel, etc. BI solutions like Tableau or Power BI can integrate directly with HarperDB without the data needing to be extracted or processed. Both enterprise and community editions are available.
As popular and powerful as Redis is, the in-memory key-value store has been criticized for falling short in threaded performance and ease of use. KeyDB is protocol-compatible with Redis, so it can be used as a drop-in replacement. But KeyDB adds some nifty under-the-hood improvements, chiefly multithreading for network I/O operations and query parsing. Plans for the next edition of Redis, Redis 6, include threaded I/O as well, but KeyDB is available now.
A product of Uber’s internal engineering team, M3DB is a distributed time-series database that is used in Uber’s metrics platform (essentially as a data store for Prometheus). Borrowing ideas from Apache Cassandra and a Facebook project named “Gorilla,” M3DB allows arbitrary time precision, out-of-order insertions, and configurable levels of replication and read consistency. However, the creators note that M3DB might not be suitable for all time-series database use cases. For instance, M3DB can’t insert data out of order beyond a given time window (the default is two hours), and it is mainly optimized for storing and retrieving 64-bit floats rather than other kinds of data.
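The out-of-order window is easier to picture with a toy model. The sketch below is not M3DB code and makes a simplifying assumption: lateness is measured against a `now` value supplied by the caller. The `WindowedSeries` class and its methods are hypothetical names.

```python
from datetime import datetime, timedelta


class WindowedSeries:
    """Accepts late points only within `window` of the current time (hypothetical sketch)."""

    def __init__(self, window=timedelta(hours=2)):
        self.window = window
        self.points = {}  # timestamp -> float value

    def write(self, timestamp, value, now):
        """Store the point and return True, or return False if it is too old."""
        if now - timestamp > self.window:
            return False
        self.points[timestamp] = float(value)
        return True

    def read(self):
        """Return points sorted by timestamp, regardless of arrival order."""
        return sorted(self.points.items())


now = datetime(2020, 1, 1, 12, 0)
series = WindowedSeries()
accepted_recent = series.write(now - timedelta(minutes=5), 1.5, now)
accepted_late = series.write(now - timedelta(minutes=90), 0.5, now)  # out of order, inside window
rejected_old = series.write(now - timedelta(hours=3), 9.9, now)      # outside the two-hour window
```

The 90-minute-old point arrives after a newer one and is still accepted, while the three-hour-old point is turned away.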
The name implies a fusion of the Redis in-memory key-value store and SQL query capabilities, and that’s exactly what RediSQL is — specifically, a Redis module that embeds a SQLite database. Data is stored transparently in Redis, so Redis handles persistency and in-memory processing. Each database is associated with a Redis key, so you can have multiple SQL databases on a single Redis instance. Queries to those databases are standard SQL, passed via the standard Redis API. You can also create and precompile statements (essentially stored procedures) in RediSQL to speed up query execution. Both commercial and open source editions are available.
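The “one SQL database per key” arrangement can be imitated with Python’s built-in `sqlite3` module. This sketch uses neither Redis nor RediSQL’s actual commands; an ordinary dictionary stands in for the Redis keyspace, and `KeyedSqlStore` is a hypothetical class made up for illustration.

```python
import sqlite3


class KeyedSqlStore:
    """Maps string keys to independent in-memory SQLite databases (illustrative only)."""

    def __init__(self):
        self._dbs = {}

    def create_db(self, key):
        self._dbs[key] = sqlite3.connect(":memory:")

    def execute(self, key, sql, params=()):
        conn = self._dbs[key]
        with conn:  # commit on success, roll back on error
            return conn.execute(sql, params).fetchall()


store = KeyedSqlStore()
store.create_db("sales")
store.create_db("hr")

# Each key has its own schema and data, isolated from the others.
store.execute("sales", "CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
store.execute("sales", "INSERT INTO orders (total) VALUES (?)", (19.5,))
store.execute("hr", "CREATE TABLE staff (name TEXT)")

sales_rows = store.execute("sales", "SELECT id, total FROM orders")
hr_tables = store.execute("hr", "SELECT name FROM sqlite_master WHERE type = 'table'")
```

The `orders` table exists only under the `sales` key, and the `hr` database knows nothing about it.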
SQLite is a little miracle: an embeddable open source database that is lightning-fast and ultra-reliable. SQLite makes a great default choice whenever you need a database in a single-user application, but SQLite instances are limited to a single node.
RQLite builds on SQLite to create a distributed database system. Setting up multiple nodes is easy, and data automatically replicates across those nodes using the Raft consensus algorithm. RQLite also provides encryption between nodes and a discovery service that makes it easy to add nodes automatically. But RQLite also has a few drawbacks: Write speeds are slower than in SQLite, and only deterministic SQL functions—i.e., those guaranteed to produce the same result on every node—are safe to use.
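Why nondeterministic functions are a problem becomes clear if you imagine each node re-executing the same SQL statements. The sketch below makes that assumption and simulates two “replicas” as separate in-memory SQLite databases; it does not use RQLite or Raft, and the `make_replica` and `apply_log` helpers are hypothetical.

```python
import sqlite3


def make_replica():
    """Create an independent in-memory database standing in for one node."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, label TEXT, token INTEGER)")
    return conn


def apply_log(conn, statements):
    """Replay a list of SQL statements and return the resulting table contents."""
    for sql in statements:
        conn.execute(sql)
    conn.commit()
    return conn.execute("SELECT id, label, token FROM events ORDER BY id").fetchall()


deterministic_log = [
    "INSERT INTO events (label, token) VALUES (upper('start'), abs(-7))",
    "INSERT INTO events (label, token) VALUES ('stop', 2 + 3)",
]
nondeterministic_log = [
    "INSERT INTO events (label, token) VALUES ('start', random())",
]

# Deterministic statements leave every replica in the same state.
same_a = apply_log(make_replica(), deterministic_log)
same_b = apply_log(make_replica(), deterministic_log)

# random() is evaluated separately on each replica, so the copies drift apart.
diff_a = apply_log(make_replica(), nondeterministic_log)
diff_b = apply_log(make_replica(), nondeterministic_log)
```

Replaying the deterministic log yields identical rows on both replicas, whereas the `random()` insert will almost certainly store a different token on each one.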
Most high-end databases these days have some kind of in-memory functionality, even if it involves something like table pinning (e.g., SQL Server). UmbraDB, an analytics database that can run as a drop-in replacement for PostgreSQL, is designed to use in-memory processing whenever it can. When it can’t, it uses a novel variable-size page mechanism for paging data from storage. Long-running queries are optimized for execution with LLVM.
This story, “9 offbeat databases worth a look” was originally published by