HyperGraphDB is primarily what its carefully chosen name implies: a database for storing hypergraphs.
It was originally designed out of a set of requirements for the implementation
of an AI engine (more about HyperGraphDB's design history
here). The original ideas turned out to be strong, with enough
generative potential to stimulate the growth of a general-purpose storage mechanism capable
of accommodating different styles of data management under one umbrella.
It is hard to categorize HyperGraphDB as yet another database because much of its design revolves
around providing the means to manage structure-rich information with arbitrary
layers of complexity. For instance, a relational as well as an object-oriented style
of data management can be emulated. The design is minimalistic at its core and
the end-goal is to evolve a set of concepts and practices, combining structure and
interpretation in such a way as to allow future software to meet the complexities
of the real world better than it does now.
Get It
HyperGraphDB is licensed under LGPL and is now hosted at Google Code:
http://code.google.com/p/hypergraphdb
Use your favorite Subversion client to access it. Let us know if you are using it!
Documentation
API Javadocs can be viewed online here.
Support
HyperGraphDB support, including general questions, bug reports, etc., is provided through
the HyperGraphDB Google Group.
Key Facts
- In the mathematical sense, a hypergraph extends the standard graph concept by allowing
an edge to connect more than two nodes. HyperGraphDB extends this even
further by allowing edges to point to other edges as well and by letting every node or edge carry
an arbitrary value as payload.
- The basic unit of storage in HyperGraphDB is called an atom. Each atom is typed, has a value
and can point to one or more other atoms.
- Data types are managed by a general, extensible type system that is itself embedded as a hypergraph
structure. Types are themselves atoms like everything else, but with a particular role (well, like everything
else too).
- The storage scheme is platform independent and can thus be accessed by any programming
language from any platform. Low-level storage is currently based on BerkeleyDB from
Sleepycat Software.
- Size limitations are virtually non-existent. There is no software limit on the size of the
graph managed by a HyperGraphDB instance. Each individual value's size is limited by the underlying
storage, i.e. by BerkeleyDB's 2GB limit. However, the architecture allows bypassing BerkeleyDB
for particular types of atoms if one so desires.
- The current implementation is solely Java based. A C++ implementation will soon follow and
make HyperGraphDB accessible to native platforms as well. Note that because the storage scheme is open and
precisely specified, all languages and platforms are able to share the same data.
- The major features currently missing are transactionality, distribution and a comprehensive
HyperGraph manipulation language that would allow working with HyperGraph data at a higher, more convenient
level than conventional language APIs. All those features are planned and currently at a design stage.
- The open-source license that we've chosen is similar to the dual licensing scheme of BerkeleyDB
(to which you would be bound anyway, since we are using BerkeleyDB). Essentially this licensing scheme makes
the software open-source and free for all uses, except when
- The Java implementation offers an automatic mapping of idiomatic Java types to a HyperGraphDB data
schema which makes HyperGraphDB into an object-oriented database suitable for regular business applications.
Examples of this are provided in some of the application extensions (see below).
- Semantic Web and NLP projects can benefit from the provided mapping from the RDF/OWL family of
languages to HyperGraphDB, as well as from existing semantic networks such as WordNet, ConceptNet, etc.
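The atom model described in the key facts above can be sketched in a few lines of plain Java. This is a minimal in-memory illustration only, not HyperGraphDB's API or storage scheme: the class names Atom, AtomStore and AtomDemo, the use of integer indices as handles, and the use of strings as type names are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical model of an atom: a type, a value and a list of target atoms. */
final class Atom {
    final String type;            // every atom is typed (a plain string in this sketch)
    final Object value;           // arbitrary payload
    final List<Integer> targets;  // handles of the atoms this atom points to

    Atom(String type, Object value, List<Integer> targets) {
        this.type = type;
        this.value = value;
        this.targets = Collections.unmodifiableList(new ArrayList<>(targets));
    }

    /** An atom with at least one target acts as an edge (link). */
    boolean isLink() {
        return !targets.isEmpty();
    }
}

/** Hypothetical in-memory store; a handle is simply the atom's index. */
final class AtomStore {
    private final List<Atom> atoms = new ArrayList<>();

    /** Adds an atom pointing to the given existing atoms and returns its handle. */
    int add(String type, Object value, Integer... targets) {
        for (int t : targets) {
            if (t < 0 || t >= atoms.size()) {
                throw new IllegalArgumentException("Unknown handle: " + t);
            }
        }
        atoms.add(new Atom(type, value, Arrays.asList(targets)));
        return atoms.size() - 1;
    }

    Atom get(int handle) {
        return atoms.get(handle);
    }

    /** Returns the handles of all links that point to the given atom. */
    List<Integer> incidentLinks(int handle) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < atoms.size(); i++) {
            if (atoms.get(i).targets.contains(handle)) {
                result.add(i);
            }
        }
        return result;
    }
}

final class AtomDemo {
    public static void main(String[] args) {
        AtomStore store = new AtomStore();
        int alice = store.add("Person", "Alice");
        int bob = store.add("Person", "Bob");
        int carol = store.add("Person", "Carol");
        // A single edge connecting three nodes.
        int meeting = store.add("Meeting", "weekly sync", alice, bob, carol);
        // An edge pointing to another edge.
        int note = store.add("Annotation", "recurring", meeting);
        System.out.println(store.get(meeting).targets);    // prints [0, 1, 2]
        System.out.println(store.incidentLinks(meeting));  // prints [4]
    }
}
```

Because nodes and edges share one representation, an annotation on a relationship needs no special case: it is just another atom whose target happens to be a link.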
Usage Scenarios
In a server-side Java application, the standard setup relies on an RDBMS together with a set of business
components and a presentation tier. If you've kept up with the latest industry advances, you have a good
O/R mapping tool such as Hibernate to transparently and non-intrusively convert your object structure to/from
database tables. Recently, there has been a noticeable trend to replace RDBMSs, especially for smaller applications,
with embedded in-memory databases that offer less sophisticated, but typically much faster, querying capabilities.
In a desktop Java application, programmers frequently rely on a large set of configuration files to store
user preferences and other persistent application state. A large amount of time is devoted to the management of
configuration data, and end users are frequently unable to configure even simple application behavior because
programmers don't have the time to make "everything" configurable and need to selectively predict the most important
parameters of potential interest to users. With HyperGraphDB, all beans that have to do with configuration can simply
be added as atoms and they will be managed from there on. A handful of generic UI forms would expose any configuration
bean to the user without further programming. For a much more ambitious project in that direction see our own
Scriba project.
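The idea of a generic form that can expose any configuration bean can be sketched with plain Java reflection. This is only an illustration of the form side under stated assumptions: EditorSettings is a hypothetical bean, BeanFormModel is a hypothetical helper, and nothing here uses HyperGraphDB or persists anything. It assumes the bean follows the usual getX/isX getter convention and only looks at getters declared directly on the bean's class.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical configuration bean, used only for illustration. */
class EditorSettings {
    private int fontSize = 12;
    private boolean wordWrap = true;
    private String theme = "light";

    public int getFontSize() { return fontSize; }
    public void setFontSize(int fontSize) { this.fontSize = fontSize; }
    public boolean isWordWrap() { return wordWrap; }
    public void setWordWrap(boolean wordWrap) { this.wordWrap = wordWrap; }
    public String getTheme() { return theme; }
    public void setTheme(String theme) { this.theme = theme; }
}

/** Hypothetical helper that collects what a generic form would display. */
final class BeanFormModel {
    private BeanFormModel() {}

    /** Returns property name to current value, sorted by property name. */
    static Map<String, Object> readProperties(Object bean) {
        Map<String, Object> fields = new TreeMap<>();
        for (Method m : bean.getClass().getDeclaredMethods()) {
            int mods = m.getModifiers();
            if (!Modifier.isPublic(mods) || Modifier.isStatic(mods) || m.getParameterCount() != 0) {
                continue; // not a getter candidate
            }
            String name = m.getName();
            String property;
            if (name.startsWith("get") && name.length() > 3 && m.getReturnType() != void.class) {
                property = name.substring(3);
            } else if (name.startsWith("is") && name.length() > 2 && m.getReturnType() == boolean.class) {
                property = name.substring(2);
            } else {
                continue;
            }
            property = Character.toLowerCase(property.charAt(0)) + property.substring(1);
            try {
                fields.put(property, m.invoke(bean));
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot read property " + property, e);
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        // prints {fontSize=12, theme=light, wordWrap=true}
        System.out.println(readProperties(new EditorSettings()));
    }
}
```

A real form would also need to write values back through the setters and validate input; the sketch covers only the read side, which is enough to show that no per-bean UI code is required.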
Bioinformatics projects form a category of fairly complex software that not only can benefit from a data
management piece like HyperGraphDB, but also constitute a very natural fit for it. Frequently, such projects need to manage
highly complex descriptive information based on structured taxonomies (or ontologies), together with large sets of
experimental data. In addition, sophisticated algorithms operate on both experimental and ontological data in order
to infer interaction networks at various levels of biological organization. HyperGraphDB is designed to accommodate all
those activities.
Semantic Web projects are an obvious domain of application of HyperGraphDB. The so-called "conceptual graphs" or RDF
graphs and even the more advanced modelling practices utilizing higher-order relationships have a straightforward and natural
expression within the HyperGraphDB framework.
Networks research can benefit from the capacity of HyperGraphDB to store very large, distributed graphs and
to have computationally intensive pattern-mining algorithms operate on them.
YourKit is kindly supporting open source projects with its full-featured Java Profiler.
YourKit, LLC is creator of innovative and intelligent tools for profiling
Java and .NET applications. Take a look at YourKit's leading software products:
YourKit Java Profiler and YourKit .NET Profiler.