To build a stock exchange and trade investors’ money, you really need to be sure of the transaction engine underlying the platform. gTrade utilises the next generation of transaction engines to ensure not only performance and reliability, but also compliance with the stringent regulations imposed by investor protection bodies such as the SEC – this includes things such as the retention of data and its auditability. You simply can’t do this with your run-of-the-mill MySQL database and Ruby on Rails application – no matter how well you implement the system. Don’t get me wrong, the LAMP (Linux + Apache + MySQL + PHP) and MARS (MySQL + Apache + Ruby + Solaris/Linux) stacks are great. They are better than that, they’re excellent – they totally revolutionised web development. However, those types of stacks were never designed from the ground up for an enterprise implementation – this is evident from the great efforts needed to cluster MySQL and make RoR highly available, which often end in failure or in extremely complex and rigid implementations.
This is where gTrade differs – we use Obsidian Dynamics’ DTS/S1 ‘Pitch Black’. DTS/S1 is a next generation distributed transaction server and native object database. Pitch Black is a clustered solution that embraces high levels of parallelism to achieve absolute reliability, fault tolerance and organic scalability resulting in ultra high performance transaction processing. Parallel to this, Pitch Black achieves the lowest cost per transaction of any commercial transaction processor or transactional database management system.
Pitch Black is all about simplicity and speed. Where relational databases bear the overhead of ‘unpacking’ complex object-oriented structures to fit a 2D table model, and then ‘repacking’ these structures to present back to the application, a native object database persists business structures as-is. The net effect: where peer products require million-dollar computer hardware to achieve speeds in the order of tens of thousands of transactions per second (TPS), Pitch Black can achieve similar figures on hardware that would take up one unit of rack space. Given clusters of high-end server hardware, coupled with clustered network attached storage (NAS) devices, speeds in excess of 100,000 TPS are achievable.
DTS/S1: features
Clustered deployment
While Pitch Black has been designed to extract blistering performance from a minimal deployment, it has been engineered for growth. Pitch Black assumes an N x M (N-by-M) compute and storage model, where N clustered DTS/S1 nodes operate on a shared storage facility that may internally be clustered to M nodes. Although not novel, a modular shared-disk approach is far more applicable to systems such as DTS/S1 which advocate the proximity of business logic to storage. Its elegance is in allowing an organisation to scale either its computational power, or storage capacity, or both, whichever proves to be the bottleneck. This level of scaling is a physical impossibility in a shared-nothing architecture, where each cohort is both a computational and a storage node.
Microcontainer architecture
Pitch Black employs a microcontainer architecture, which provides additional flexibility in the interaction between business logic and the storage subsystem. The microcontainer provides a number of services to the application, which include (predominantly) a transaction service, resource locking and deadlock avoidance, querying (of the overall state, the historical state and the transitional data) as well as an external interface for exercising queries and invoking transactions remotely. A significant advantage of the DTS/S1 microcontainer is its ability to host and execute the full application business logic suite within DTS, and distribute the processing load throughout the nodes that comprise the Pitch Black cluster. Unlike stored procedures employed by most commercial databases, DTS does not lock the developers into working with proprietary languages that only the database understands. Because the DTS transaction and object broker services are native, the application developers are not required to learn any other scripting language. This paradigm extends to querying DTS – like any other aspect of the system, the queries are also written in the developers’ native language. Presently, DTS/S1 supports the Java(TM) programming language, including Java ME and Java EE technologies.
The DTS/S1 transaction service is subdivided into two layers – the (high level) transaction broker and the (low level) storage and interconnect fabric. The higher level facilitates the principles of the DTS model, including object revision control, decoupled transactions, recovery protocols, remote transaction and query brokerage, an API to interface with the business logic, resource locking and others. It is unaware of the interconnect and storage protocols that are provided by the pluggable interconnect fabric (PIF). This enables Pitch Black to support a variety of physical storage platforms, including storage area networks (SANs), network attached storage (NAS) serving a POSIX-compliant file system, non-volatile RAM (NVRAM) and flash technologies – any storage implement ranging from enterprise-grade storage farms to embedded systems.
The next level up in the DTS stack is occupied by transactions and queries. These are always defined on the DTS node in the form of conventional Java bytecode. They may be either a part of the business logic that resides and executes on the DTS node, or “named” parametric transactions and queries that can be invoked remotely via the Remote Object Model Broker (ROMB) API through the external interface.
As in many other areas, the DTS microcontainer does not impose structure on the external interface, which may be a pathway for communication with other components that form part of the system, or a B2B portal for external 3rd party systems.
Floating objects
We have christened the Pitch Black design model a “floating objects” architecture. All objects in DTS undergo separate life cycles; however, operations may atomically combine fragments of certain objects’ life cycles in time. Although independently versioned, objects may reference others, forming arbitrarily complex graphs. Because Pitch Black is clustered and each DTS node is inherently multithreaded, there are avenues allowing data to be viewed and manipulated in parallel. Recently accessed and modified objects are cached on each node, and cache coherency is maintained by each node.
The floating objects architecture assumes the transparent migration of objects from one node to another, where objects may simultaneously co-exist on multiple nodes for read-only operations. In order to accommodate this, DTS provides a re-entrant read/write lock service and deadlock avoidance mechanisms which are always employed by DTS/S1 but are also available to the application level. By comparison, relational databases are unaware of the individual objects that form the data tables. The only way serialisable transactions (the highest level of transaction isolation) are achieved is through row-level locking, which in all but the most trivial cases has to acquire a series of locks, as multiple rows across several tables may constitute a single object. This is largely uncontended synchronisation, where locks are acquired out of need. Being aware of the higher-level object context, Pitch Black contends for the minimal number of locks to achieve the same level of transaction isolation.
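To make the locking idea concrete, here is a minimal sketch using the standard `java.util.concurrent.locks.ReentrantReadWriteLock`. It is not the DTS/S1 lock service: `LockedObject`, `LockDemo` and the "always lock in ascending id order" rule are assumptions chosen purely for illustration, the latter being one common way to avoid deadlock when an operation spans several objects.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a persistent object guarded by a single lock.
final class LockedObject {
    final long id;
    final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    long value;

    LockedObject(long id, long value) {
        this.id = id;
        this.value = value;
    }
}

final class LockDemo {
    // Many readers may hold the read lock on the same object at once.
    static long read(LockedObject o) {
        o.lock.readLock().lock();
        try {
            return o.value;
        } finally {
            o.lock.readLock().unlock();
        }
    }

    // Moves `amount` from one object to another under both write locks.
    // Locks are always taken in ascending id order, so two threads that
    // transfer in opposite directions cannot deadlock on each other.
    static void transfer(LockedObject from, LockedObject to, long amount) {
        LockedObject first = from.id <= to.id ? from : to;
        LockedObject second = first == from ? to : from;
        first.lock.writeLock().lock();
        try {
            second.lock.writeLock().lock();
            try {
                from.value -= amount;
                to.value += amount;
            } finally {
                second.lock.writeLock().unlock();
            }
        } finally {
            first.lock.writeLock().unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        LockedObject a = new LockedObject(1, 100);
        LockedObject b = new LockedObject(2, 100);
        Thread t1 = new Thread(() -> { for (int i = 0; i < 10_000; i++) transfer(a, b, 1); });
        Thread t2 = new Thread(() -> { for (int i = 0; i < 10_000; i++) transfer(b, a, 1); });
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println("total = " + (read(a) + read(b)));
    }
}
```

Only two locks are contended per transfer here, one per object, which is the point the paragraph above makes: the lock granularity follows the object, not the rows it might be spread across.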
Decoupled transactions
The next design fundamental behind Pitch Black is decoupled transactions. Pitch Black records all data manipulations as transaction objects. Each transaction object depicts the change that one or a group of objects must undergo atomically. Should a transaction fail to update all objects in its working set, it will roll back all changes, returning the system to the precise state it was in prior to the commencement of the transaction. Pitch Black maintains a coherent cache that ensures that all nodes within the cluster are exposed to a consistent overall data state. The fundamental distinction between Pitch Black and peer transaction processors and databases is that the latter tend to have monolithic data structures that encompass the overall system state and get persisted periodically.
In order to illustrate the difference between monolithic and floating object models, consider the following diagram that illustrates the evolution of an object in a relational data model. A relational data model depicts all data as a set of tables. Most enterprise business objects map to more than one table and, conversely, a single table depicts fragments of more than one object. Relational databases operate at row level, and being unaware of the object schema cannot exercise revision control on a per-object basis. What we are able to observe then, is the evolution of monolithic structures in time, where each monolith is assigned a revision number following a transaction.
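The all-or-nothing behaviour of a transaction object can be sketched in a few lines of Java. This is an illustration only, not the DTS/S1 API: `AtomicUpdate`, its methods and the in-memory `Map` standing in for the object store are all hypothetical names.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongUnaryOperator;

// Hypothetical transaction object: a group of changes that must be
// applied to its working set atomically.
final class AtomicUpdate {
    // Object key -> change to apply, kept in insertion order.
    private final Map<String, LongUnaryOperator> changes = new LinkedHashMap<>();

    AtomicUpdate change(String key, LongUnaryOperator op) {
        changes.put(key, op);
        return this;
    }

    static LongUnaryOperator credit(long amount) {
        return balance -> balance + amount;
    }

    static LongUnaryOperator debit(long amount) {
        return balance -> {
            if (balance < amount) throw new IllegalStateException("insufficient balance");
            return balance - amount;
        };
    }

    // Applies every change, or none of them. Returns true on commit.
    boolean commit(Map<String, Long> store) {
        Map<String, Long> before = new HashMap<>();
        try {
            for (Map.Entry<String, LongUnaryOperator> e : changes.entrySet()) {
                Long current = store.get(e.getKey());
                if (current == null) throw new IllegalStateException("missing " + e.getKey());
                before.put(e.getKey(), current);          // remember the prior state
                store.put(e.getKey(), e.getValue().applyAsLong(current));
            }
            return true;
        } catch (RuntimeException failure) {
            store.putAll(before);                          // roll back what was touched
            return false;
        }
    }

    public static void main(String[] args) {
        Map<String, Long> store = new HashMap<>();
        store.put("buyer", 100L);
        store.put("seller", 50L);

        boolean first = new AtomicUpdate()
                .change("buyer", debit(30))
                .change("seller", credit(30))
                .commit(store);
        System.out.println(first + " " + store.get("buyer") + " " + store.get("seller"));

        // The credit is applied first, then the debit fails, so the credit is undone.
        boolean second = new AtomicUpdate()
                .change("seller", credit(500))
                .change("buyer", debit(500))
                .commit(store);
        System.out.println(second + " " + store.get("buyer") + " " + store.get("seller"));
    }
}
```

A real transaction processor would also journal the change and hold locks on the working set; both are left out to keep the rollback logic visible.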
All transactions ultimately relate to the same monolithic data structure and must be replayed from the journal in strict sequence to rebuild a transaction-consistent view of the overall state in the event of a failure. During recovery, the system will first inspect its stable storage to locate the most recent image taken prior to failure. The image will then be loaded into memory and the system will re-evaluate each transaction recorded after that image up to the point of failure. Transactions that have not been marked as committed will be rolled back. Theoretically, after this process is complete the system will assume the same state as it was in just prior to failure.
Re-aligning a single object invariably involves a complete replay of all pending transactions – a process that takes minutes if the system was heavily loaded prior to failure. Pitch Black operates on a finite set of lightweight objects that can undergo independent versioning and may form completely isolated chains of transactions that may be replayed in parallel and out of relative sequence. This is illustrated in the figure below:
At first glance, we may be misled into assuming that this diagram depicts a set of wholly interrelated objects. In fact, we can trace out two completely decoupled transition paths with no overlapping intermediate dependencies (red and green on the diagram). It follows that to recover object D from the initial set of objects, we only need to replay two of the five transactions depicted in this graph.
Pitch Black employs a complex algorithm that builds a dependency graph and isolates non-overlapping transaction chains that are replayed on demand – depending on which objects need to be re-aligned with their true state. Compared to the traditional roll-forward replay model this is blazingly fast!
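The general idea of isolating a chain can be sketched as a backward walk over the journal. To be clear, this is not Pitch Black's actual algorithm, which is not described here; `JournalEntry`, `ReplayPlanner` and the five-entry journal in `main` are made up for illustration, arranged so that recovering D needs two of five transactions.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical journal record: which objects a transaction read and wrote.
final class JournalEntry {
    final int seq;
    final Set<String> reads;
    final Set<String> writes;

    JournalEntry(int seq, Set<String> reads, Set<String> writes) {
        this.seq = seq;
        this.reads = reads;
        this.writes = writes;
    }
}

final class ReplayPlanner {
    // Returns the sequence numbers, oldest first, of the entries that must
    // be replayed to rebuild `target`. `journal` must be ordered oldest first.
    static List<Integer> plan(List<JournalEntry> journal, String target) {
        Set<String> needed = new HashSet<>();
        needed.add(target);
        Deque<Integer> selected = new ArrayDeque<>();
        // Walk backwards: an entry matters only if it wrote an object we need,
        // in which case everything it read becomes needed as well.
        for (int i = journal.size() - 1; i >= 0; i--) {
            JournalEntry e = journal.get(i);
            if (!Collections.disjoint(e.writes, needed)) {
                selected.addFirst(e.seq);
                needed.addAll(e.reads);
            }
        }
        return new ArrayList<>(selected);
    }

    // An invented journal with two independent chains.
    static List<JournalEntry> sampleJournal() {
        return List.of(
                new JournalEntry(1, Set.of("A"), Set.of("D")),
                new JournalEntry(2, Set.of("B"), Set.of("E")),
                new JournalEntry(3, Set.of("E"), Set.of("F")),
                new JournalEntry(4, Set.of("D"), Set.of("D")),
                new JournalEntry(5, Set.of("F", "C"), Set.of("G")));
    }

    public static void main(String[] args) {
        System.out.println(plan(sampleJournal(), "D")); // [1, 4]
        System.out.println(plan(sampleJournal(), "G")); // [2, 3, 5]
    }
}
```

Because the two plans share no entries, they could be replayed in parallel, or one of them not at all until somebody asks for G.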
JIT recovery
The concept of decoupled transactions incidentally lends itself to the same observable behaviour as that of JIT compilation. Every other transactional storage facility employs an ahead-of-time recovery protocol, where the first post-failure action is to completely re-align the overall state with the journal. A system concurrently logging at a rate of 10,000 TPS, for example, falls behind the transaction-consistent state by precisely 10,000 transactions for every second that it delays synchronising the data images with the journal. When a failure occurs, the entire log is replayed. Because transactions are strictly ordered, the replay is performed sequentially, which forbids the system from operating until its state is completely re-aligned. Consider Pitch Black: although it can in principle, DTS does not materialise every single transaction from the journal into the object images. It is, in fact, flexible in how it synchronises the journal with the object images. The maximum allowable journal backlog can be tuned for higher throughput, which would ordinarily incur a significant recovery time penalty on every other system. Not for Pitch Black, where a decoupled transaction model enables it to operate on small fragments of the overall journal backlog to instantaneously recover only those objects that are needed at the time of the recovery. Remaining objects that are not required immediately following a failure can be recovered later, either on demand, or at the discretion of Pitch Black.
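The on-demand flavour of recovery can be illustrated with a deliberately simplified sketch. It assumes that every journal entry touches exactly one object, so each object's backlog is trivially independent of the others; `LazyRecoveryStore` and its methods are hypothetical and say nothing about how DTS/S1 is actually implemented.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical store that replays an object's journal backlog only when
// that object is first requested after a restart.
final class LazyRecoveryStore {
    private final Map<String, Long> images = new HashMap<>();        // last persisted image per object
    private final Map<String, List<Long>> backlog = new HashMap<>(); // journalled deltas not yet applied
    private int replayed;

    void putImage(String key, long value) {
        images.put(key, value);
    }

    void logDelta(String key, long delta) {
        backlog.computeIfAbsent(key, k -> new ArrayList<>()).add(delta);
    }

    // Recovers `key` just in time, leaving every other backlog untouched.
    long get(String key) {
        List<Long> pending = backlog.remove(key);
        if (pending != null) {
            long value = images.getOrDefault(key, 0L);
            for (long delta : pending) {
                value += delta;
                replayed++;
            }
            images.put(key, value);
        }
        return images.getOrDefault(key, 0L);
    }

    int replayedCount() {
        return replayed;
    }

    int pendingCount() {
        int total = 0;
        for (List<Long> deltas : backlog.values()) total += deltas.size();
        return total;
    }

    public static void main(String[] args) {
        LazyRecoveryStore store = new LazyRecoveryStore();
        store.putImage("order-1", 10);
        store.logDelta("order-1", 5);
        store.logDelta("order-1", 7);
        store.logDelta("order-2", 1);
        System.out.println(store.get("order-1"));   // 22
        System.out.println(store.replayedCount());  // 2
        System.out.println(store.pendingCount());   // 1
    }
}
```

With an ahead-of-time protocol, all three journalled deltas would have to be replayed before the first read could be served; here only the two belonging to the requested object are.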
Apart from achieving unmatched throughput and near-zero recovery times, decoupled transactions have overturned the classic parallel computing problem: cache coherency. A high-throughput, single-node database has one problem to solve, that is, when to synchronise between the journal and the images. A clustered database is challenged by a second problem: when and how to update the object caches on its peer nodes (cohorts). Typically, this would involve some use of the communication fabric that provides the interconnects between the cohorts. Given a sufficiently high throughput, this places an immense load not only on the interconnect fabric, but also on the processing capacity of the cohorts. The JIT-recovery nature of decoupled transactions provides any Pitch Black node with an opportunity to generate a transaction-consistent image of an object without the participation of its previous owner. Faster-than-light transaction processing is here, and it’s black – Pitch Black.
DTS/S1: concepts
Pitch Black has been designed to unify all enterprise data storage requirements into a single, cohesive solution. We have achieved this by closely mimicking the heap allocation model of memory-managed development environments such as Java(TM) and similar programming languages, and depicting similar object-lifecycle semantics in a persistent storage model. In plain terms, what this amounts to is the ability for application developers to define classes as they normally would, instantiate those into objects, manipulate various fields and have those changes transparently committed to stable storage. The objects themselves are plain Java objects, potentially existing objects that already depict the business data, that have been enhanced by implementing a lightweight interface defined by the DTS/S1 container.
Pitch Black persists all data on an object-by-object basis, whereby each persistent entity is subject to separate revision control and system-wide locking. Because an individual object is so small by comparison to the entire organisational data store, DTS/S1 can afford to make near-instant, permanent modifications to objects soon after it has committed transitional data to its recovery journal. Should one or more nodes in a DTS cluster fail at any point, the remaining nodes will absorb the increased load and provide immediate fail-over. The most recent transaction-consistent data state will immediately become available to the remaining nodes and the objects can be utilised immediately without incurring the penalty of a roll-forward replay.
Modified objects undergo revisions, where all previous revisions are retained and can be queried directly without reverse-replaying transactions to gain a historical perspective on the data. This is yet another distinguishing factor that cannot be attained by competing products since they only store the most recent data image and a change log. Pitch Black can be configured to store checkpoint object images for each and every alteration of the data. Since data is checkpointed on a per-object basis, it is trivial to isolate areas of the system that may require frequent historical trend inspection, such as an organisation’s monthly sales/performance figures. These snapshots can be consulted directly, without the need to backtrack the journal or set up a separate data repository.
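As a rough picture of what "query a previous revision directly" means, here is a small sketch built on the standard `java.util.TreeMap`. `VersionedObject`, `checkpoint` and `asOf` are hypothetical names rather than DTS/S1 calls, and the revision numbers are assumed to be handed out by some system-wide counter.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical per-object revision history: every checkpointed image is
// kept, keyed by the revision at which it was written.
final class VersionedObject<T> {
    private final TreeMap<Long, T> revisions = new TreeMap<>();

    void checkpoint(long revision, T state) {
        revisions.put(revision, state);
    }

    T latest() {
        return revisions.isEmpty() ? null : revisions.lastEntry().getValue();
    }

    // State as of `revision`: the newest image written at or before it,
    // or null if the object did not exist yet. No journal replay involved.
    T asOf(long revision) {
        Map.Entry<Long, T> entry = revisions.floorEntry(revision);
        return entry == null ? null : entry.getValue();
    }

    public static void main(String[] args) {
        VersionedObject<String> sales = new VersionedObject<>();
        sales.checkpoint(10, "Jan: 120 units");
        sales.checkpoint(20, "Feb: 135 units");
        sales.checkpoint(35, "Mar: 150 units");
        System.out.println(sales.asOf(25));  // Feb: 135 units
        System.out.println(sales.latest());  // Mar: 150 units
    }
}
```

The lookup is a single ordered-map search per object, which is why historical queries of this kind need neither a reverse replay nor a separate reporting repository.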
Two areas of computing have served as the primary influence for the design of Pitch Black. The first is managed-heap programming environments and adjacent technologies such as type safety, pointer arithmetic and garbage collection. Even JIT (Just In Time) compilation was not overlooked. This is closely approximated in how DTS stores and manipulates its objects internally. This also reflects how the application interacts with DTS.
The second area is revision control systems (RCS). A modern RCS, such as the popular Subversion, is highly transactional in nature. While such systems offer per-resource revision control, they also support bulk operations and an all-or-nothing atomic commit. These operations are fundamental in transactional object databases. But apart from these primitives, revision control systems expose historical views, differences and change sets. This is not the norm for transactional systems, but it is in Pitch Black.
Benefits of DTS/S1:
- Scalability in both the compute and storage planes: upgrade one or the other – whichever is the bottleneck in your business context.
- No-single-point-of-failure architecture: the failure of one node is mitigated by the rest of the cluster. Active nodes simply absorb the increased load up to their designated capacity.
- Load balancing: the business logic and transaction processing load can be distributed fairly among all nodes in a DTS cluster.
- All-in-one solution: Pitch Black accommodates arbitrarily complex object structures and large object collections. When financially committing to a new technology, an organisation must be sure that no hidden costs follow. Why buy three products when one is all it takes?
- Zero downtime: when other nodes take over from a failed node, the data is accessible immediately, without incurring a roll-forward replay. Thanks to floating objects and decoupled transactions, failover is seamless and immediate.
- Zero maintenance: no database administration, no tablespace management, no schemas, ever! Add or remove nodes on-the-fly, all while the system is online.
- Cross-platform and cross-filesystem compatibility: a Java virtual machine and a POSIX-compliant filesystem are all that’s required to run a Pitch Black cluster.
- Embeddable: micro edition ideal for embedded devices, desktop applications, games, CAD and other desktop and mobile applications with internal storage requirements.
- Performance: faster than greased lightning! 10 KTPS on entry-level server hardware. MTPS-scale at the higher end!
- Cost per transaction: plenty of headroom for expansion to tens of thousands of transactions per second on the same hardware that you started with.
- Tools: DTS Night Vision for developing complete back office solutions.
- Zero impedance mismatch: no abstraction layers to mediate between the business logic and the storage platform. Existing business objects can be persisted right out of the box: no ORM middleware required.
- Security and reliability: DTS/S1 is a fully-featured transaction processor. Pitch Black supports serialisable ACID transactions – the highest level of transaction isolation possible. All objects undergo independent revision control and can be audited by appropriate personnel.
- Standards compliance: Pitch Black is compliant with the Java Transaction API (JTA) specification for transaction demarcation.
- Time to market: concentrate on developing business logic and user interfaces – DTS takes care of the rest. DTS/S1 supports agile software development right from the word go.
- Sustainability: decreased rack storage requirements, reduced power consumption and heat dissipation requirements.
Check out DTS/S1 at: http://obsidiandynamics.com/dts/
– Guy