Smart Contracts using a Turing complete language

As we continue to add support for the DokChain, enabling more business related transactions, we are also scrutinizing the mechanics and performance of the frameworks and methodologies.  As mentioned in the last blog, DokChain - Deep Dive Part One, we will now be diving deeper into the mechanics of smart contracts, particularly how the choice of programming language affects the design.

 

The first version of Bitcoin, released in January 2009 by Satoshi Nakamoto, contained a scripting language, named Script, designed to add flexibility to the conditions under which bitcoins can be transferred.  If you think about the transfer of funds as a financial contract, then you can consider Script to be a language for specifying very specific contracts.  As these contracts are completely determined by an explicit algorithm, with no need for any human intervention, we call them smart contracts.  Script was intentionally designed to restrict the kinds of contracts that could be created.  By limiting its scope, stronger promises can be made that the scripts will execute as intended and that malicious scripts cannot be created.

 

At the opposite end of the spectrum, programming languages capable of producing anything corresponding to our intuitive notion of algorithm are called Turing-complete (physical limitations like memory notwithstanding). The assertion that our current definition of algorithm matches our intuition is called the Church-Turing Thesis.  This group of languages includes nearly all of the languages that people tend to think of: C, Python, Fortran, Lisp, Javascript, etc, etc.  Despite the extraordinary utility of these languages, they can be difficult to reason about, both by humans and by computers.  You can’t, for instance, write a python program which accepts, as input, an arbitrary python program and tells you if that program is just the function f(x) = 2x (the function that doubles numbers).  At the heart of the matter, at least from a theoretical point of view, there is no algorithm which can decide which algorithms on which inputs actually terminate.  This is known as the halting problem, which was further generalized by Rice’s theorem to show that there is no non-trivial property describing a program’s behavior that can be decided algorithmically.  There are therefore severe limitations when using computers to detect software bugs in programs written in Turing complete languages.  This is not to say that people do not attempt to perform static analysis on computer programs, it’s just that success will always be limited.

 

Despite the uncontrolled nature of such expressive languages, there is great appeal in having the ability to turn any algorithm into a smart contract.  What bitcoin has solved so well for currency becomes a solution for any deterministic process.  Platforms such as Ethereum, Hyperledger, and Enigma have risen to take advantage of this powerful combination.  Ethereum, currently the most successful group to implement these ideas, mitigates some of the unpredictable nature of its smart contracts by imposing a predefined, per-contract limit to the amount of computation each contract can perform.  This, however, does not solve all the problems with such a system.  There is still the problem of detecting exploits, and in general reasoning about what a contract is capable of doing.  Another issue is the verifier’s dilemma, whereby an honest member of the group is disincentivized from verifying every computation, thus preparing the way for malicious attacks.

 

Malicious scripting was at the core of the DAO hack, not the low-level machine language of the Ethereum virtual machine.  Smart contracts are intended to be stand alone agreements, devoid of human wetware.  In other words, the code itself is meant to be the ultimate arbiter of the transaction method and business deal.  What went wrong with the DAO contract involved subtleties in the high level programming language used to create the contract.  A human can easily recognize that the contract behaved in an unintended way.  The attacker found a nonobvious way to get the payment portion of the contract stuck in a loop, recursively calling itself over and over until it had drained a very large amount of currency.  It is with this in mind that we are focused on proof validation methods for ensuring that contracts on the DokChain execute as expected.

 

Interestingly, there are collections of algorithms that are general enough to create any smart contract that would be needed in practice, but that do not suffer from things like the halting problem.  One such example is the collection of primitive recursive functions.  Two exceptionally nice features of these functions are that they are defined for all inputs (i.e. computations always halt), and computational bounds can always be determined. The primitive recursive functions are defined as all the functions that can be made from a collection of very basic functions (constant functions, the add one function, etc.) using composition and primitive recursion.  Primitive recursion is a very controlled type of recursion, whereby a function determines its value on input n+1 by applying a rule to the previous values f(n), f(n-1), …, f(0).  While there are algorithms which are not primitive recursive, such examples are impractical for use in a real world smart contract.  The Ackermann function, one of the first examples of an algorithm which is not primitive recursive, grows so fast that Ackermann(4,4) = 2^(2^(2^(65536))) and Ackermann(5,5) requires special notation to denote it in a reasonable amount of physical space.  The Ackermann function is defined using a kind of double recursion that allows it to grow so fast.  It clearly conforms to our intuitive notion of algorithm, thus preventing us from saying that the primitive recursive functions completely capture the notion of algorithm. Here is the definition:

 

screen-shot-2016-09-20-at-11-51-38-am

 

If you fix m=2 you get a linear function.  If you fix m=3 you get an exponential function.  At m=4 you get tetration. You can see why the diagonal function Ackermann(m, m) grows so fast, and why we would never want to use it in practice (except for maybe things like testing your compiler’s ability to handle recursion).  In general, the kinds of algorithms that we cannot capture with a primitive recursive function are either a fast growing function like the Ackermann function, or something like a python interpreter (i.e. a universal Turing machine) which does not make sense to have as a smart contract.

 

There are programming languages designed to take advantage of well-behaved collections of algorithms (and hence not Turing complete), usually falling under the category of total functional programming.  We mentioned earlier that the primitive recursive functions are defined on all inputs.  We say that a function is total if it is defined on all inputs. Programs in these languages are proved to be total by the compiler.  When combined with a descriptive type system, such as dependent types, very strong guarantees about the program’s behavior can be formally proved true by the compiler.  The majority of such languages are used as proof assistants, tools used by mathematicians to produce formal mathematical proofs.  Examples include Coq, Agda, and Isabelle.  More general purpose languages, such as Idris, are now emerging with the same guarantees of totality but designed to be friendlier to software developers.  It should be noted that all the languages mentioned above contain a strong enough logical apparatus to actually be able to prove that the Ackermann function is total.  It is well understood what it takes to prove that the primitive recursive functions are total (see Parson’s Theorem).  Alternative to using a programming language like those mentioned above, there are tools like Why3 which can take C code (or Java or others) and attempt to formally verify that a program is correct.  Success is not guaranteed, however, since C is a turing complete language.  

 

In an orthogonal direction, the company Tezos is approaching the verification problem from the purview of updates to a constitutional system based on a rolling governance model (similar to Nomic).  They are implementing a Proof Of Stake system written in OCaml.  While the DokChain is not incentivized by cryptocurrencies, we also believe that Proof Of Work has several shortcomings and that Proof Of Stake yields a more robust system, especially for private chain systems.

 

In light of all this, we are actively considering the multitude of paths that PokitDok can take toward becoming the defacto service provider for smart contract creation, issuance, and arbitrage in the health industry.  In future blogs we will be addressing how formal verification and methods such as Proof of Stake and Proof Of Authority will be provided on DokChain.

About jaredcorduan

Ph.D. From Dartmouth in mathematical logic with a passion for all things Charleston.

View All Posts

Leave a Reply

Your email address will not be published.