03 July, 2008
 
 

 

HyperGraphDB is primarily what its carefully chosen name implies: a database for storing hypergraphs. It was originally designed from a set of requirements for the implementation of an AI engine (more about HyperGraphDB's design history here). The original ideas proved strong and generative enough to grow into a general-purpose storage mechanism capable of accommodating different styles of data management under one umbrella.

It is hard to categorize HyperGraphDB as yet another database because much of its design revolves around providing the means to manage structure-rich information with arbitrary layers of complexity. For instance, both a relational and an object-oriented style of data management can be emulated. The design is minimalistic at its core, and the end goal is to evolve a set of concepts and practices, combining structure and interpretation in such a way as to allow future software to meet the complexities of the real world better than it does now.

Get It

HyperGraphDB is licensed under LGPL and is now hosted at Google Code:

http://code.google.com/p/hypergraphdb

Use your favorite Subversion client to access it. Let us know if you are using it!

Documentation

API Javadocs can be viewed online here.

Support

HyperGraphDB support, including general questions, bug reports, etc., is provided through the HyperGraphDB Google Group.

Key Facts

  • The mathematical definition of a hypergraph is an extension to the standard graph concept that allows an edge to point to more than two nodes. HyperGraphDB extends this even further by allowing edges to point to other edges as well and making every node or edge carry an arbitrary value as payload.
  • The basic unit of storage in HyperGraphDB is called an atom. Each atom is typed, has a value and can point to one or more other atoms.
  • Data types are managed by a general, extensible type system that is itself embedded as a hypergraph structure. Types are themselves atoms like everything else, but with a particular role (well, like everything else too).
  • The storage scheme is platform independent and can thus be accessed by any programming language from any platform. Low-level storage is currently based on BerkeleyDB from Sleepycat Software.
  • Size limitations are virtually non-existent. There is no software limit on the size of the graph managed by a HyperGraphDB instance. Each individual value's size is limited by the underlying storage, i.e. by BerkeleyDB's 2GB limit. However, the architecture allows bypassing BerkeleyDB for particular types of atoms if one so desires.
  • The current implementation is solely Java-based. A C++ implementation will soon follow and make HyperGraphDB accessible to native platforms as well. Note that because the storage scheme is open and precisely specified, all languages and platforms are able to share the same data.
  • The major features currently missing are transactionality, distribution, and a comprehensive HyperGraph manipulation language that would allow working with HyperGraph data at a higher, more convenient level than conventional language APIs. All of those features are planned and currently at the design stage.
  • The open-source license that we've chosen is similar to the dual licensing scheme of BerkeleyDB (to which you would be bound anyway, since we are using BerkeleyDB). Essentially this licensing scheme makes the software open-source and free for all uses, except when
  • The Java implementation offers an automatic mapping of idiomatic Java types to a HyperGraphDB data schema which makes HyperGraphDB into an object-oriented database suitable for regular business applications. Examples of this are provided in some of the application extensions (see below).
  • Semantic Web and NLP projects can benefit from the provided mappings to HyperGraphDB from the RDF/OWL family of languages, as well as from existing semantic networks such as WordNet, ConceptNet, etc.
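
To make the atom model described above more concrete, here is a minimal in-memory sketch in Java. It is not the HyperGraphDB API: the names ToyHypergraph, Atom, add and pointingAt are hypothetical and exist only for this illustration. As a further simplification, an atom's type is represented by the Java class of its value rather than by a type atom, and nothing is persisted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy in-memory model for illustration only; this is NOT the HyperGraphDB API.
final class ToyHypergraph {

    /** An atom: a typed value plus an ordered list of target atoms (empty for a plain node). */
    static final class Atom {
        final Class<?> type;       // simplification: the value's Java class stands in for a type atom
        final Object value;        // arbitrary payload
        final List<Atom> targets;  // atoms this atom points to; may include other links

        Atom(Object value, List<Atom> targets) {
            this.type = value.getClass();
            this.value = value;
            this.targets = Collections.unmodifiableList(new ArrayList<>(targets));
        }

        boolean isLink() { return !targets.isEmpty(); }

        int arity() { return targets.size(); }
    }

    private final List<Atom> atoms = new ArrayList<>();

    /** Stores a value as a new atom, optionally pointing at existing atoms. */
    Atom add(Object value, Atom... targets) {
        Atom atom = new Atom(value, Arrays.asList(targets));
        atoms.add(atom);
        return atom;
    }

    /** Returns every stored atom that points at the given atom. */
    List<Atom> pointingAt(Atom target) {
        List<Atom> result = new ArrayList<>();
        for (Atom candidate : atoms) {
            if (candidate.targets.contains(target)) {
                result.add(candidate);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ToyHypergraph graph = new ToyHypergraph();
        Atom alice = graph.add("Alice");
        Atom bob = graph.add("Bob");
        Atom carol = graph.add("Carol");

        // One edge connecting three nodes, which an ordinary graph edge cannot do.
        Atom meeting = graph.add("meeting", alice, bob, carol);

        // An edge whose target is itself an edge.
        Atom note = graph.add("rescheduled", meeting);

        System.out.println(meeting.arity());                  // 3
        System.out.println(graph.pointingAt(meeting).size()); // 1
        System.out.println(note.targets.get(0).value);        // meeting
    }
}
```

Because nodes and edges share a single Atom class, an edge over edges needs no special case, which is the property the first two bullets describe.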

Usage Scenarios

In a server-side Java application, the standard setup relies on an RDBMS together with a set of business components and a presentation tier. If you've kept up with the latest industry advances, you have a good O/R mapping tool such as Hibernate to transparently and non-intrusively convert your object structure to and from database tables. Recently, there has been a noticeable trend to replace RDBMSs, especially for smaller applications, with embedded in-memory databases that offer less sophisticated but typically much faster querying capabilities.

In a desktop Java application, programmers frequently rely on a large set of configuration files to store user preferences and other persistent application state. A large amount of time is devoted to managing configuration data, and end users are frequently unable to configure even simple application behavior because programmers don't have the time to make "everything" configurable and must instead predict which parameters are most likely to interest users. With HyperGraphDB, all beans that have to do with configuration can simply be added as atoms and they will be managed from there on. A handful of generic UI forms would expose any configuration bean to the user without further programming. For a much more ambitious project in that direction, see our own Scriba project.
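
The "generic form" idea can be sketched with the standard java.beans introspection API alone. The example below is an assumption about how such a form might discover and update the fields of a configuration bean. The BeanFormSketch and EditorPrefs classes and the describe and update helpers are hypothetical names invented for this illustration, and storing the bean in HyperGraphDB is deliberately left out.

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only; none of these names come from HyperGraphDB.
final class BeanFormSketch {

    /** Hypothetical configuration bean following the usual getter/setter convention. */
    public static final class EditorPrefs {
        private String fontName = "Monospaced";
        private int tabWidth = 4;

        public String getFontName() { return fontName; }
        public void setFontName(String fontName) { this.fontName = fontName; }
        public int getTabWidth() { return tabWidth; }
        public void setTabWidth(int tabWidth) { this.tabWidth = tabWidth; }
    }

    /** Finds the properties that have both a getter and a setter. */
    private static List<PropertyDescriptor> editableProperties(Object bean) {
        List<PropertyDescriptor> editable = new ArrayList<>();
        try {
            // Stop at Object.class so the synthetic "class" property is skipped.
            for (PropertyDescriptor property
                    : Introspector.getBeanInfo(bean.getClass(), Object.class).getPropertyDescriptors()) {
                if (property.getReadMethod() != null && property.getWriteMethod() != null) {
                    editable.add(property);
                }
            }
        } catch (IntrospectionException e) {
            throw new IllegalStateException("Cannot introspect " + bean.getClass(), e);
        }
        return editable;
    }

    /** Returns property name -> current value: the rows a generic form would display. */
    static Map<String, Object> describe(Object bean) {
        Map<String, Object> fields = new TreeMap<>();
        for (PropertyDescriptor property : editableProperties(bean)) {
            try {
                fields.put(property.getName(), property.getReadMethod().invoke(bean));
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot read " + property.getName(), e);
            }
        }
        return fields;
    }

    /** Writes one property by name, as a form would when the user edits a field. */
    static void update(Object bean, String name, Object value) {
        for (PropertyDescriptor property : editableProperties(bean)) {
            if (property.getName().equals(name)) {
                try {
                    property.getWriteMethod().invoke(bean, value);
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException("Cannot write " + name, e);
                }
                return;
            }
        }
        throw new IllegalArgumentException("No editable property named " + name);
    }

    public static void main(String[] args) {
        EditorPrefs prefs = new EditorPrefs();
        System.out.println(describe(prefs)); // {fontName=Monospaced, tabWidth=4}
        update(prefs, "tabWidth", 8);
        System.out.println(describe(prefs)); // {fontName=Monospaced, tabWidth=8}
    }
}
```

Nothing in describe or update refers to EditorPrefs, so the same two helpers would work for any other bean that follows the getter/setter convention.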

Bioinformatics projects form a category of fairly complex software that not only can benefit from a data management piece like HyperGraphDB, but also constitutes a very natural fit for it. Frequently, such projects need to manage highly complex descriptive information based on structured taxonomies (or ontologies), together with large sets of experimental data. In addition, sophisticated algorithms operate on both experimental and ontological data in order to infer interaction networks at various levels of biological organization. HyperGraphDB is designed to accommodate all those activities.

Semantic Web projects are an obvious domain of application for HyperGraphDB. The so-called "conceptual graphs" or RDF graphs, and even the more advanced modelling practices utilizing higher-order relationships, have a straightforward and natural expression within the HyperGraphDB framework.

Networks research can benefit from the capacity of HyperGraphDB to store very large, distributed graphs and to run computationally intensive pattern-mining algorithms on them.


YourKit is kindly supporting open source projects with its full-featured Java Profiler. YourKit, LLC is creator of innovative and intelligent tools for profiling Java and .NET applications. Take a look at YourKit's leading software products: YourKit Java Profiler and YourKit .NET Profiler.