Countdown to Corda Open Source

R3 will soon be open-sourcing Corda. Here’s what to expect.

As I confirmed a few months back, R3’s Corda platform will be open-sourced, under the Apache 2 licence, on November 30.

Corda is a distributed ledger platform designed and built from the ground up for the recording and automation of legal agreements between identifiable parties. It is heavily influenced by the requirements of the financial industry but we believe the community will find the underlying architecture will lend itself to a broad range of applications.

We’ve built Corda because we see requirements – especially in finance – that need a distributed ledger but which cannot be met by existing platforms.

  • Corda is the only Distributed Ledger platform designed by the world’s largest financial institutions to manage legal agreements on an automatable and enforceable basis.
  • Corda only shares data with those with a need to view or validate it; there is no global broadcasting of data across the network.
  • Corda is the only Distributed Ledger platform to support multiple consensus providers employing different consensus algorithms on the same network, enabling compliance with local regulations.
  • Corda is designed to provide a great developer experience and to make integration and interoperability easy: query the ledger with SQL, join to external databases, perform bulk imports, and code contracts in a range of modern, standard languages.

We designed it with the members of R3, the world’s largest financial services DLT consortium, but we think its applicability is far broader.  You can find out more in our introductory whitepaper and my blog post on why we’re building Corda and what makes it different. If you prefer videos, here’s a short interview I did with Simon Taylor of 11:FS that explains the thought process behind Corda.

What we’ll release on November 30 is pretty much the full codebase as it exists today and we will be improving it actively and openly from then on. In fact, the only code we’ve held back pertains to laboratory projects we’re working on with our members and work on our own commercial business products that will run on top of Corda.

So do take a look around when the code is released: there’s a lot in there that is still work-in-progress and not yet integrated. For example, you’ll find a fascinating approach to writing financial contracts in the experimental branch and ongoing work on our deterministic sandbox for the JVM.   We will, of course, also be developing a commercial version of Corda for those who need specific enterprise features and support, but the open source codebase is the foundation of everything we do.

This is a really important point: distributed ledger technologies will have such phenomenally powerful network effects that it is unthinkable that serious institutions would deploy base-layer ledger software that is anything other than fully and wholeheartedly open. And it’s why we’ve been committed all along to releasing Corda just as soon as we were sure it was heading in the right direction.  It is and so we are.

We will also be publishing a draft of our technical whitepaper.  This whitepaper outlines our roadmap to version 1.0 of Corda and production-readiness.

What to expect on November 30

We’re really proud of Corda and its progress to date. But, that said, Corda is far from finished. Mike Hearn will soon be publishing a “warts and all” description of quite how much work we still have to do. This is true for all other platforms in this space, of course, but I feel a particular responsibility to be transparent given the ambitions we have for Corda and the uses to which it will be put.

By way of example, perhaps a good way to help you figure out what we still have to do is to look at some items on the list of work we’ve set for the months ahead of us:

  • Functional completeness: Corda still has gaps in its functional capabilities. The technical whitepaper outlines the full vision and you’ll see us working on and merging a lot of functional enhancements in the coming months to implement the full vision in the paper.
  • Non-functional characteristics: We focused first on design and then on implementation of Corda’s core functionality. The work to ensure we meet our non-functional requirements, such as performance, is still ahead of us but we have a clear roadmap and have designed the platform with these needs firmly in mind.
  • Security hardening: There are lots of areas where we need to tighten up security. Much of this we know about and we have called it out in the code or associated docs. But there will, of course, be others. So just as you shouldn’t be using other enterprise DLT platforms in production just yet, please don’t download Corda and put it straight into production just yet either!
  • API Stability: Corda’s development is iterative and organic – and it is heavily influenced by the range of projects and applications to which our members are choosing to put it. As we learn about common patterns and discover assumptions that prove to be wrong, we adapt. In particular, this means that we do not commit to API stability or backwards-compatibility until version 1.0.  Expect parts of the implementation to change in the coming months, perhaps quite significantly!

But these things are transient: we know how to fix them and we’ll knock the issues off one-by-one in the coming months as we head towards version 1.0.  But we want you to be fully aware of them.

Why are we open-sourcing Corda now?

We had a vigorous internal debate about when was the right time to release Corda: wait until it was more mature, when we were confident we’d ironed out the bugs and made it fly?  Or wait only until the design roadmap was clear and then share it immediately with the world for comment, criticism, contribution and collaboration?

We’ve wholeheartedly chosen the latter path: to release early and to work openly.

We’re serious about inviting the community to critique, collaborate and contribute. To take one example, our friends at Digital Asset recently published an excellent paper describing a set of requirements for what they call a “Global Synchronisation Log” (GSL), encouraging those in the community to incorporate these requirements into their platforms. We think that Corda’s vision is extremely well aligned to the GSL concept and by open-sourcing our work whilst there is still time to tweak our design it means we maximise the opportunities for firms such as ours to collaborate.

But open-sourcing Corda when it is still fairly young is not without its risks!  In fact, I’m a little apprehensive. I’m a completer-finisher and I obsess over every detail. So the idea of releasing something before it’s perfect makes me feel uncomfortable.  You will find gaps, issues, problems. But that’s fine: please do share what you find.  Even better, submit a fix…!

In fact, I also have a hope that some of those who come to critique will find that they nonetheless like much of what they see, and may even join the community.

What happens next?

I performed a thought experiment a while back… I asked: what will the enterprise distributed ledger world look like when everything settles down in a few years? How many independent enterprise DLT platforms will the world need and which ones will they be?

My conclusion was that there will probably be at most three such platforms, each carefully designed and adapted for a specific set of requirements. They will all be fully open source. And they will be surrounded by thriving, inclusive communities.

And we firmly intend to ensure Corda is one of them.

Our open-source release next week is a key step on that journey.

How to get Corda on November 30

Corda’s home will be corda.net.

Head over to corda.net on November 30 for links to the codebase, simple sample applications and a tutorial to get started writing your own CorDapps.

 

 

On Distributed Databases and Distributed Ledgers

Why can’t companies wanting to share business logic and data just install a distributed database? What is the essential difference between a distributed database and a distributed ledger?

Last month, I shared the thinking that led to the design of Corda, which we at R3 will be open sourcing on November 30; and Mike Hearn and I were interviewed by Brian and Meher of Epicenter last week. We’ve been delighted by the response and are looking forward to working with those seek to build on Corda, help influence its direction or contribute to its development and maturation;  there’s a lot of work ahead of us!

But one or two observers have asked a really good question. They asked me: “Aren’t you just reimplementing a distributed database?!”

The question is legitimate: if you strip away the key assumptions underpinning systems like Bitcoin and Ethereum, are you actually left with anything? What is actually different between a distributed ledger platform such as Corda and a traditional distributed database?

The answer lies in the definition I gave in my last blogpost and it is utterly crucial since it defines an entire new category of data management system:

“Distributed ledgers – or decentralised databases – are systems that enable parties who don’t fully trust each other to form and maintain consensus about the existence, status and evolution of a set of shared facts”

“Parties who don’t fully trust each other” is at the heart of this. To see why, let’s compare distributed databases and Corda.

Comparing Corda to a distributed database

In a distributed database, we often have multiple nodes that cooperate to maintain a consistent view for their users.   The nodes may cooperate to maintain partitions of the overall dataset or they may cooperate to maintain consistent replicas but the principle is the same:  a group of computers, invariably under the control of a single organisation, cooperate to maintain their state.  These nodes trust each other.   The trust boundary is between the distributed database system as a whole and its users.    Each node in the system trusts the data that it receives from its peers and nodes are trusted to look after the data they have received from their peers.  You can think of the threat model as all the nodes shouting in unison: “it’s us against the world!”

This diagram is a stylised representation of a distributed database:

 distributed-database

In a distributed database, nodes cooperate to maintain a consistent view that they present to the outside world; they cooperate to maintain rigorous access control and they validate information they receive from the outside world.

So it’s no surprise that distributed databases are invariably operated by a single entity: the nodes of the system assume the other nodes are “just as diligent” as them: they freely share information with each other and take information from each other on trust. A distributed database operated by mutually distrusting entities is almost a contradiction in terms.

And, of course, if you have a business problem where you are happy to rely on a central operator to maintain your records – as you sometimes can in finance it should be said – then a distributed database will do just fine: let the central operator run it for you.  But if you need to maintain your own records, in synchrony with your peers, this architecture simply won’t do.

And there are huge numbers of situations where we need to maintain accurate, shared records with our counterparts. Indeed, a vast amount of the cost and inefficiency in today’s financial markets stems from the fact that it has been so difficult to achieve this. Until now.

Corda helps parties collaborate to maintain shared data without fully trusting each other

Corda is designed to allow parties to collaborate with their peers to maintain shared records, without having to trust each other fully. So Corda faces a very different world to a distributed database.

A Corda node can not assume the data it receives from a peer is valid: the peer is probably operated by a completely different entity and even if they know who that entity is, it’s still extremely prudent to verify the information.   Moreover, if a Corda node sends data to another node, it must assume that node might print it all in an advert on the front page of the New York Times.

The trust boundaries – the red curves in the diagram- are drawn in a completely different place!

decentralised-database

In Corda, nodes are operated by different organisations and do NOT trust each other; but the outcome is still a consistent view of data.

To repeat, because this distinction is utterly fundamental:  nodes of a distributed database trust each other and collaborate with each other to present a consistent, secure face to the rest of the world.   By contrast, Corda nodes can not trust each other and so must independently verify data they receive from each other and only share data they are happy to be broadly shared.

And so we call Corda a distributed ledger, to distinguish it from distributed databases. A distributed ledger that is designed painstakingly for the needs of commercial entities.

Put more simply: you simply can’t build the applications we envisage for Corda with traditional database technology.  And that’s what makes this new field so exciting.