CozoDB

CozoDB is a general-purpose, transactional, relational database that uses Datalog as its query language. It is embeddable, which makes it suitable for a wide range of applications, from embedded systems to large-scale server environments. It places particular emphasis on graph data and graph algorithms, which are essential for deriving insights from interconnected data.

Embeddability

CozoDB is designed to run in the same process as your main program, which makes it an embedded database. Unlike client-server databases such as MySQL or PostgreSQL, it can operate in environments without network connectivity, which suits applications running on devices such as mobile phones. It can also run in client-server mode when that allows better resource utilization and higher concurrency.

Graph Data Focus

CozoDB treats data as interconnected, which makes it well suited to graph-based queries. Traditional graph databases often require data to fit the labelled-property graph model; CozoDB uses the relational model instead, which is more general and easier to manage. Graph structures that are only implicit in the data can therefore be represented and queried directly, yielding insights that are harder to reach with other models.
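
As a rough illustration of an implicit graph inside relational data, the sketch below keeps a hypothetical `follows` relation as plain rows and answers a two-hop question with an ordinary self-join. It is plain Python rather than CozoDB's API; the relation name, the data, and the `two_hop` helper are assumptions made for the example.

```python
# Hypothetical relation follows(follower, followed), stored as plain rows.
follows = {
    ("alice", "bob"),
    ("bob", "carol"),
    ("bob", "dave"),
    ("carol", "erin"),
}


def two_hop(rows, start):
    """Return the nodes reachable from `start` in exactly two steps."""
    # First hop: everyone `start` points at.
    first_hop = {dst for src, dst in rows if src == start}
    # Second hop: join the relation with itself on the intermediate node.
    return {dst for src, dst in rows if src in first_hop}


print(sorted(two_hop(follows, "alice")))  # ['carol', 'dave']
```

No graph-specific schema is needed here: the edges are just rows, and the traversal is a join.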

Datalog for Querying

Datalog is at the core of CozoDB's query capabilities. Compared with SQL, Datalog expresses recursion far more naturally, which makes it highly effective for graph queries. CozoDB's Datalog implementation goes further by supporting recursive aggregations and by providing efficient built-in algorithms for common graph operations such as PageRank. Datalog is also composable: a complex query can be broken down into small rules, which promotes maintainability without sacrificing performance.
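
To make the idea of recursive rules concrete, here is a minimal sketch of how two Datalog-style rules for reachability can be evaluated bottom-up until nothing new is derived. It uses the textbook naive evaluation strategy in plain Python; it is not CozoDB's evaluator or query syntax, and the function and relation names are assumptions.

```python
# Rules being evaluated (written informally):
#   reachable(x, y) :- edge(x, y).
#   reachable(x, z) :- reachable(x, y), edge(y, z).


def transitive_closure(edges):
    """Apply the two rules repeatedly until no new facts appear."""
    reachable = set(edges)  # first rule: every edge is a reachable pair
    while True:
        # Second rule: extend each known pair by one more edge.
        derived = {
            (x, z)
            for (x, y) in reachable
            for (y2, z) in edges
            if y == y2
        }
        new_facts = derived - reachable
        if not new_facts:
            return reachable  # fixpoint reached
        reachable |= new_facts


edges = {("a", "b"), ("b", "c"), ("c", "d")}
closure = transitive_closure(edges)
print(sorted(closure))
```

Because facts are kept in a set and the loop stops at a fixpoint, the evaluation also terminates on cyclic graphs. Production engines typically use more efficient strategies than this naive loop.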

Advanced Features

CozoDB integrates several advanced features to enhance its utility and performance:

  • HNSW Vector Search: Starting from version 0.6, CozoDB includes vector search capabilities using HNSW (Hierarchical Navigable Small World) indices. This feature enables high-performance similarity searches within the database.
  • MinHash-LSH for Near-Duplicate Search: Version 0.7 adds MinHash-LSH indices for efficient near-duplicate detection, alongside full-text search.
  • Time Travel: CozoDB supports time travel queries, allowing users to execute queries at historical points in time, providing a view of the data as it existed at any given moment.
  • Multi-Version Concurrency Control (MVCC): Ensures data consistency and integrity during concurrent operations.
  • Efficient Storage Engines: CozoDB supports multiple storage backends, including in-memory storage for non-persistent data, and persistent storage engines like RocksDB and SQLite.
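
The time travel feature in the list above can be pictured with a small sketch: every write is kept together with its timestamp, and a query "as of" some moment picks the latest version at or before that moment. This is a toy model in plain Python, not CozoDB's storage format or API; the `VersionedStore` class and its methods are hypothetical.

```python
from bisect import bisect_right


class VersionedStore:
    """Toy append-only store that keeps every version of every key."""

    def __init__(self):
        # key -> list of (timestamp, value), kept sorted by timestamp;
        # a value of None marks a retraction.
        self._history = {}

    def put(self, key, timestamp, value):
        versions = self._history.setdefault(key, [])
        versions.append((timestamp, value))
        versions.sort(key=lambda version: version[0])

    def retract(self, key, timestamp):
        self.put(key, timestamp, None)

    def get_as_of(self, key, timestamp):
        """Return the value of `key` as it was at `timestamp`, or None."""
        versions = self._history.get(key, [])
        times = [t for t, _ in versions]
        idx = bisect_right(times, timestamp)  # versions with t <= timestamp
        if idx == 0:
            return None  # the key did not exist yet
        return versions[idx - 1][1]


store = VersionedStore()
store.put("mood", 10, "curious")
store.put("mood", 20, "happy")
store.retract("mood", 30)

print(store.get_as_of("mood", 15))  # curious
print(store.get_as_of("mood", 25))  # happy
print(store.get_as_of("mood", 35))  # None
```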

Performance

According to the project's published benchmarks, CozoDB performs well enough to suit both OLTP and OLAP workloads:

  • Transactional Queries: Handles around 100K queries per second (QPS) for mixed transactional queries and over 250K QPS for read-only queries.
  • Graph Algorithms: Completes two-hop graph traversals in under 1ms for large graphs and performs the PageRank algorithm efficiently even on large datasets.
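
For readers unfamiliar with the PageRank algorithm mentioned above, the following is a minimal power-iteration sketch in plain Python. It only illustrates what the algorithm computes; it is not CozoDB's implementation and says nothing about its performance. The damping factor and iteration count are conventional defaults chosen for the example.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Compute PageRank scores by repeated redistribution of rank."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    count = len(nodes)
    rank = {node: 1.0 / count for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing edges is spread evenly.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1.0 - damping) / count + damping * dangling / count
        new_rank = {node: base for node in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


ranks = pagerank([("a", "c"), ("b", "c"), ("c", "a")])
print(max(ranks, key=ranks.get))  # c
```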

Getting Started

CozoDB is easy to try because it is so embeddable. You can run it directly in your browser through the WebAssembly (WASM) demo available on the CozoDB website. The demo runs at near-native speed and provides a hands-on introduction to CozoDB's capabilities.

For those looking to integrate CozoDB into their own environments, detailed installation instructions and tutorials are available in the official documentation.

Architecture

The architecture of CozoDB is layered to ensure modularity and ease of maintenance:

  1. Storage Engine: Supports various backends, ensuring flexibility in data storage.
  2. Query Engine: Handles query compilation, execution, and transaction management.
  3. Language/Environment Wrappers: Provides bindings for multiple programming languages and environments, including Python, NodeJS, Java, Clojure, and more.
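
The three layers above can be sketched in miniature as follows. This is a hypothetical illustration of the layering idea only: the `StorageBackend` protocol, the two backends, the `QueryEngine` class, and the `open_db` wrapper are invented for the example and do not reflect CozoDB's internal interfaces. The SQLite backend uses Python's standard `sqlite3` module.

```python
import sqlite3
from typing import Iterator, Protocol, Tuple


class StorageBackend(Protocol):
    """Layer 1: what the engine needs from any storage backend."""

    def put(self, key: str, value: str) -> None: ...
    def scan(self) -> Iterator[Tuple[str, str]]: ...


class MemoryBackend:
    """Non-persistent backend backed by a dict."""

    def __init__(self) -> None:
        self._rows = {}

    def put(self, key: str, value: str) -> None:
        self._rows[key] = value

    def scan(self) -> Iterator[Tuple[str, str]]:
        return iter(sorted(self._rows.items()))


class SqliteBackend:
    """Persistent backend backed by SQLite (in-memory by default here)."""

    def __init__(self, path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT)"
        )

    def put(self, key: str, value: str) -> None:
        self._conn.execute(
            "INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (key, value)
        )
        self._conn.commit()

    def scan(self) -> Iterator[Tuple[str, str]]:
        rows = self._conn.execute("SELECT k, v FROM kv ORDER BY k").fetchall()
        return iter(rows)


class QueryEngine:
    """Layer 2: query logic written once, independent of the backend."""

    def __init__(self, backend: StorageBackend) -> None:
        self._backend = backend

    def insert(self, key: str, value: str) -> None:
        self._backend.put(key, value)

    def keys_with_prefix(self, prefix: str) -> list:
        return [k for k, _ in self._backend.scan() if k.startswith(prefix)]


def open_db(kind: str = "mem") -> QueryEngine:
    """Layer 3: a thin wrapper that picks a backend for the caller."""
    backend = MemoryBackend() if kind == "mem" else SqliteBackend()
    return QueryEngine(backend)


for kind in ("mem", "sqlite"):
    db = open_db(kind)
    db.insert("user:1", "alice")
    db.insert("user:2", "bob")
    db.insert("post:1", "hello")
    print(kind, db.keys_with_prefix("user:"))
```

The point of the layering is visible in the loop at the end: the same engine code produces the same answers regardless of which backend sits underneath.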

Project Status

CozoDB is actively developed, with frequent updates bringing new features and improvements. Although still young, it is stable enough for use in various applications. Users are encouraged to try it out and provide feedback to help shape its future development.

Useful Links

Similar Projects