
Orcfax February technical update

  1. Introduction to the Orcfax vision
  2. Orcfax roadmap status
  3. The web3 data permanence problem
  4. The Cardano eUTXO concurrency problem

Introduction to the Orcfax vision

Since launching in December 2021, the Orcfax project has been carrying out innovative research and development into the oracle problem: how can real-world data be introduced into a blockchain like Cardano while verifying that it represents authentic and accurate facts?

The real-world data published by an oracle must be trustworthy as it has significant financial and social consequences for on-chain smart contracts and their stakeholders. Existing oracle providers generally break down their answer to the oracle problem into three components.

  1. collecting off-chain data
  2. validating the data
  3. making the data available on the blockchain

Traditional oracle providers are asking their users to trust the authenticity of their data based on the security measures of their technical infrastructure as well as the reputation of their corporate brand and that of their data provider partners, i.e. "if our data were false, that would make us look bad, so trust us."

One of the unique characteristics of the Orcfax project is that it is adding a fourth component to address the oracle problem. Orcfax permanently archives the data that was published on-chain along with contextual information about the off-chain data collection and validation.

This empowers stakeholders to "trust but verify" the actual data that was used as smart contract inputs rather than trusting the brand and IT infrastructure of data providers.

Orcfax is developing "proofs-of-fact" as trustworthy provenance information about the chain of custody for its source data. To guarantee the authenticity of these proofs, it is applying industry quality standards for record-keeping along with an innovative new web3 application of the traditional archival science concept of the archival bond.

The focus on more sophisticated data use cases is another characteristic that distinguishes Orcfax from the first generation of oracle providers. These tend to be focused on providing currency price data to decentralized exchanges or sports data to gambling dApps. Orcfax is interested in more complex requirements for the authenticity, publication, and archiving of data used in RealFi scenarios such as supply chain traceability, ecosystem conservation, regenerative farming, and carbon sequestration.

Orcfax roadmap status

To deliver on the Orcfax vision it is important that our technical architecture is based on sound, research-driven requirements and design choices. We believe this will introduce a higher-quality oracle product into the web3 space.

The original project plan was to present a minimum viable product (MVP) for an Orcfax oracle after three months of R&D. However, during our initial software prototyping we ran into two technical hurdles. Rather than rush through workarounds, we have decided to revise our MVP delivery schedule. The two primary technical hurdles we encountered were the web3 data permanence problem and Cardano's concurrency problem. We provide more context on these below.

The web3 data permanence problem

Orcfax originally proposed IPFS as the platform to provide permanent storage for the archival information packages that contain its source data and corresponding provenance information. IPFS is currently the leading decentralized storage platform used in the web3 space. However, during our R&D into applying IPFS to Orcfax system requirements we ran into a few roadblocks, which have led us to reconsider its implementation.

IPFS is a data access technology, on par with HTTP. Used in isolation, it does not guarantee data permanence. Files posted to IPFS can disappear if there are no IPFS nodes retaining a copy. This is no different than a website at a particular HTTP address going offline if the host deletes its files on the webserver or fails to renew their domain registration. This is also similar to a torrent file that disappears from the BitTorrent network once it is no longer popular and there are no peers left with a copy on their computer to seed it. To address this issue, IPFS users may use a third-party "pinning" service like Pinata or maintain copies on their own IPFS cluster nodes. The recommended option is to pay Filecoin miners to store IPFS files.
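The availability model described above can be sketched in a few lines. This is a toy simulation, not the real IPFS protocol: content is addressed by a hash stand-in for a CID, and it remains retrievable only while at least one node on the network still pins a copy.

```python
# Toy model of IPFS availability: content is addressed by hash and
# survives only while at least one node retains (pins) a copy.
import hashlib

def cid(data: bytes) -> str:
    # Stand-in for a real IPFS CID: a plain SHA-256 digest.
    return hashlib.sha256(data).hexdigest()

class Node:
    def __init__(self, name: str):
        self.name = name
        self.store = {}              # cid -> data
    def pin(self, data: bytes) -> str:
        c = cid(data)
        self.store[c] = data
        return c
    def unpin(self, c: str):
        self.store.pop(c, None)      # data becomes garbage-collectable

def fetch(network, c):
    # Resolve a CID against every node; None means the content is gone.
    for node in network:
        if c in node.store:
            return node.store[c]
    return None

network = [Node("orcfax"), Node("pinning-service")]
c = network[0].pin(b"price feed datum")
network[1].pin(b"price feed datum")   # redundant third-party pin

network[0].unpin(c)
assert fetch(network, c) is not None  # still served by the pinning service
network[1].unpin(c)
assert fetch(network, c) is None      # no pins left: content is unreachable
```

The final assertion is the permanence problem in miniature: the address remains valid forever, but nothing obliges anyone to keep serving the content behind it.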

Each of these workarounds requires constant availability monitoring and payment for these resources by a trusted third party. These external dependencies are a risk to data permanence and reliability. Furthermore, Filecoin has a number of technical complications related to securing storage deals with miners, pricing predictability, "sealing" archived files, and long delays retrieving these cold storage files from miners when they need to be accessed.

The explicit goal of Orcfax is to eliminate these types of externalities and work in a fully decentralized web3 architecture while also maintaining the usability and performance that end-users expect from a web 2.0 experience. Therefore, mid-way through its R&D phase, Orcfax made the decision to shift the design of its permanent archival storage component from an IPFS/Filecoin based platform to the Arweave network.

Arweave is an innovative new web3 storage platform that is based on a one-time, up-front endowment payment to fund permanent storage. This solves the perpetual availability monitoring and payment issue. It also introduces a unique, next-generation proof model that guarantees data storage and fast retrievability.
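The intuition behind the endowment model can be shown with back-of-the-envelope arithmetic. The figures below are illustrative assumptions, not Arweave's actual pricing: if storing a gigabyte for one year costs some amount today, and that cost falls by a fixed fraction each year, then the cost of storing it forever is a convergent geometric series that a single up-front payment can cover.

```python
# Illustrative endowment arithmetic (hypothetical figures, not Arweave's
# actual pricing model): a one-time payment covers perpetual storage if
# per-GB storage costs decline at a steady rate.
def perpetual_storage_cost(annual_cost: float, decline: float) -> float:
    # sum over t >= 0 of annual_cost * (1 - decline)**t == annual_cost / decline
    assert 0 < decline < 1
    return annual_cost / decline

# e.g. $0.005/GB-year falling 10% per year -> roughly a $0.05/GB
# one-time endowment
endowment = perpetual_storage_cost(0.005, 0.10)
```

If costs decline more slowly than assumed, the required endowment grows; Arweave's design hedges this with conservative assumptions about the rate of decline.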

Arweave began to gain prominence in mid-2021 when most Solana NFT projects began to adopt it as their platform of choice to solve the "NFT storage problem". It has quickly emerged as a more reliable alternative to IPFS/Filecoin and is now being introduced to a variety of innovative new web3 use cases. A notable one is hot backup and indexing for the full history of Layer 1 blockchain transactions using the KYVE network.

KYVE also provides the ability to develop use-case specific validator networks for Arweave data. Therefore, Orcfax has stopped its testing of an IPFS/Cosmos based solution for this component of its technical stack to investigate an Arweave/KYVE based alternative.

This mid-schedule switch from an IPFS and Cosmos based architecture to Arweave and KYVE is one of the main reasons there is now some delay in delivering the Orcfax MVP. We feel confident that this change is justified and worth the wait. Furthermore, the revised schedule now syncs with the expected delivery of a solution to the second major technical hurdle we ran into in our first round of R&D.

The Cardano eUTXO concurrency problem

The Cardano blockchain has adapted and improved upon Bitcoin's Unspent Transaction Output (UTXO) design. In the UTXO model, a transaction has inputs and outputs, where the inputs are unspent outputs from previous transactions. Assets are stored on the ledger in unspent outputs, rather than in accounts as is done on the Ethereum blockchain.

Cardano's Extended UTXO (eUTXO) model extends Bitcoin's UTXO design in two fundamental ways. Firstly, it allows blockchain addresses to hold arbitrary software logic, i.e. "smart contract scripts". Secondly, eUTXO outputs can carry arbitrary data in addition to blockchain addresses and token balances. This means scripts can carry state information.
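The two extensions above can be sketched as a minimal data model. This is an illustration, not the real ledger types: each output carries an address, a value, and, the eUTXO addition, an optional datum holding arbitrary state.

```python
# A minimal sketch of an eUTXO-style output (illustrative, not the real
# Cardano ledger types): outputs carry an optional datum, so a script
# address can hold state between transactions.
from dataclasses import dataclass
from typing import Any, Optional

@dataclass(frozen=True)
class Output:
    address: str                 # payment key hash or script hash
    value: int                   # lovelace / token balance
    datum: Optional[Any] = None  # arbitrary state carried by the output

@dataclass
class Transaction:
    inputs: list                 # unspent Outputs being consumed
    outputs: list                # new unspent Outputs being created

# An oracle "feed" is simply an output at the oracle's script address
# whose datum holds the published fact.
feed = Output(address="oracle_script", value=2_000_000,
              datum={"pair": "ADA/USD", "price": 1.08})
assert feed.datum["pair"] == "ADA/USD"
```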

These state-carrying outputs are the obvious mechanism for delivering off-chain data to Cardano smart contracts. In fact, they are the only way that an on-chain Cardano script can consume external data. However, these are single-use UTXOs, and herein lies the concurrency problem. The outputs of an oracle should be usable simultaneously by multiple, independent scripts, each of which may submit a transaction to consume it at the same time. Only one of these will be successful and, after it consumes the output, that UTXO will no longer be available to other scripts until it is re-posted.
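The single-use problem can be demonstrated with a toy ledger. In this sketch (hypothetical names, not real chain semantics), two independent dApps race to spend the same oracle UTXO; the ledger accepts only the first transaction.

```python
# Toy ledger illustrating the single-use UTXO race: once an input is
# spent, every other transaction naming it is rejected.
class Ledger:
    def __init__(self, utxos):
        self.utxos = set(utxos)       # ids of unspent outputs
    def submit(self, tx_inputs) -> bool:
        if not set(tx_inputs) <= self.utxos:
            return False              # an input is already spent: rejected
        self.utxos -= set(tx_inputs)  # consume the inputs
        return True

ledger = Ledger({"oracle_feed#0"})
dapp_a = ledger.submit(["oracle_feed#0"])   # wins the race
dapp_b = ledger.submit(["oracle_feed#0"])   # same input, now spent
assert (dapp_a, dapp_b) == (True, False)
```

Whichever dApp lands first consumes the feed; everyone else must wait for it to be re-posted in a later block.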

During our prototyping we realized that, for smart contracts dependent on oracle data, this issue creates a serious arbitrage risk. Namely, if one smart contract dApp is able to read oracle data (e.g. a change in the exchange rate between ADA and BTC) then it has an information advantage over other smart contracts that have to wait for that output data to become available to them in a subsequent block.

There are a number of technical workarounds to try to resolve this issue. Most of them involve off-chain coordination or posting an arbitrary number of identical data output transactions. None of these options are ideal or sustainable. All of them are moot in light of IOG's upcoming fix for this problem by way of Cardano Improvement Proposals 31, 32, and 33.

CIP-31 "Reference inputs" is of particular interest to oracle data providers and consumers. It will allow oracle providers to publish off-chain data to a single Cardano eUTXO and allow on-chain scripts to read it concurrently without consuming it.
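The difference CIP-31 makes can be sketched by extending the toy-ledger idea: a reference input is read, not consumed, so any number of transactions can see the oracle datum without a race. Again, this is an illustrative model, not the actual CIP-31 ledger rules.

```python
# Toy ledger sketching CIP-31 semantics: `reference` inputs expose their
# datum to the transaction but are not removed from the UTXO set.
class Ledger:
    def __init__(self, utxos):
        self.utxos = dict(utxos)                  # utxo id -> datum
    def submit(self, spend=(), reference=()):
        if any(u not in self.utxos for u in list(spend) + list(reference)):
            return None                           # missing input: rejected
        seen = {u: self.utxos[u] for u in reference}
        for u in spend:
            del self.utxos[u]                     # only spend inputs are consumed
        return seen

ledger = Ledger({"oracle_feed#0": {"ADA/USD": 1.08}})
a = ledger.submit(reference=["oracle_feed#0"])    # reads the datum
b = ledger.submit(reference=["oracle_feed#0"])    # still available: not consumed
assert a == b == {"oracle_feed#0": {"ADA/USD": 1.08}}
```

Both transactions read the same published price in the same block, which removes the information-advantage arbitrage risk described above.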

Spending development time and funds to implement a temporary fix to the concurrency problem is wasteful in light of this upcoming enhancement. We had hoped that IOG would introduce it in the February Babbage hard fork, but a recent IOG development update revealed that we should expect to see it in the June hard fork instead. Therefore, we have paused the R&D on the Orcfax blockchain publication component until CIP-31 becomes available to developers on the Cardano Testnet.

In the meantime, we are actively working on implementing Orcfax's data collection, validation, and archiving components on Arweave and KYVE. We are also in active discussions with RealFi projects to learn more about their specific needs for off-chain data inputs and validation. Please sign up for our blog newsletter or follow us on Twitter if you'd like to be kept up to date on our progress.