This essay is about a topic that doesn’t get enough attention in the public debate about blockchains - the accessibility of user data. What data should be public and what data should be private? Should summaries or snapshots of user data be public? How do we prevent fraud? Blockchain technology is still in its infancy. We can still actively decide in what direction we want to go since all players in the field are still small. For now the most promising use cases for blockchains are in the financial space. In the future we could trade, borrow & lend, receive our salary and make payments on-chain. That’s why I will limit the scope of this essay to financial data.

Types of financial data

Some financial data is tied to individuals and other data belongs to organisations. Here is a list of the different types of data and a short description of who has access to each.

Individuals

  • Salary: known by employer, government and individual
  • Payments: known by user, business and payment processors such as Visa/Mastercard who sell the data in anonymized datasets
  • Tax/wealth: only known by the government and the citizens themselves

Organisations

  • Revenue: known by the business owner, the government, the public if the company is public, and financial organisations such as hedge funds or payment processors, which often have good estimates
  • Tax: known by the business owner, the government and the public if the company is public

Current state of the debate about accessibility of financial data

In my opinion we are stuck in a local optimum in the debate about how transparent or private our financial data should be. For years now the EU has been pushing strongly for more privacy of personal data (the right to be forgotten & GDPR). This direction dominates the debate and makes it difficult to start experiments around sharing data more transparently. As a result, we still rely heavily on whistleblowers to uncover tax fraud. Why are we pushing back against transparency so much? Let’s go through a few cases where transparency has a positive impact.

Salary data

Traditionally, employers don’t want employees to share their salary info with others as it makes salary negotiations more difficult for them. With public salary data employers have a hard time tricking employees into accepting a lower salary than the average. Some professions have figured out a system to share salary data and put the power back into the hands of employees. In tech there is a website called levels.fyi where people constantly post their salaries anonymously. For example at Google for an L3 (junior engineer) the average starting salary is $183,329, which is shown in the image below. When you enter a salary negotiation at Google you know roughly what a reasonable salary is. That’s especially valuable for women and minorities, who still lag significantly behind men in terms of their salary. This level of transparency is ideal but only works if the group of people who are willing to share their data is big enough.

I think sharing your salary data anonymously is strictly positive for society. It would be valuable for two reasons. First, other people can see which professions pay more and can change positions, so public salary data would create a more efficient labor market. Second, it would especially help minorities with few professional connections to better judge what their work is worth.

Public tax records

For more than 200 years everyone’s personal tax return has been public in Norway. This allows media to easily create lists of the wealthiest people in the country and display their tax records. Sverre Solberg, manager of a co-working space in Norway, says: “We want the press to be able to tell us if someone makes more money than they’re trying to communicate through their own channels. It’s also a question of equality, as a social democracy we don’t want there to be a huge gap between the rich and the poor. An open tax return policy shows everyone how big that gap is, making it easier to discuss and address.” Right now, most countries don’t publish tax data on an individual level, which makes it difficult to have public debates about it. As the saying goes, you can’t improve what you don’t measure.

I think making tax records public is again strictly positive for society.

Public revenue info

The Securities Exchange Act of 1934 established the SEC as a response to the stock market crash of 1929. The act was intended to increase transparency in the financial markets. It gave the SEC the authority to require publicly traded companies to disclose certain financial and other information, such as executive salaries, to the public. This led to better informed investors and consequently more efficient markets. As soon as one company finds a product which generates significant revenue and has healthy margins, other companies can start offering competing products and drive prices down. That’s exactly what happened in the cloud computing market. Google, Microsoft and other companies saw how profitable AWS was and entered the space.

I think making revenue data of large corporations public is again strictly positive for society.

Blockchain data is public by default

The three cases I just described won’t be implemented anytime soon in the real world. But they are more or less the reality on most blockchains. Today, on Ethereum all transaction data is public. When you send money from one wallet to another wallet all metadata such as the address of the sender, the address of the recipient and the amount is publicly available. However, public data doesn’t imply transparency. Let’s look at an example. The wallet with the address 0x220866B1A2219f40e72f5c628B65D54268cA3A9D currently holds around $320 million worth of Ether. Just by looking at the address we don’t know who this wallet belongs to. But since we know that the wallet received $320 million from an address that belongs to Vitalik Buterin (see image below) we can be pretty sure that 0x220866B1A2219f40e72f5c628B65D54268cA3A9D also belongs to him. Otherwise why would he transfer all his assets to that address?

Vitalik Transaction
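To make "public by default" concrete, here is a minimal sketch of how anyone could ask an Ethereum node for the balance of the wallet above. It uses the standard JSON-RPC method eth_getBalance. The sketch only builds the request body and decodes a response. It does not contact a node, and the example result value is made up for illustration.

```python
import json

WEI_PER_ETHER = 10**18  # 1 Ether = 10^18 wei


def balance_request(address: str, request_id: int = 1) -> str:
    """Build the JSON-RPC body for the standard eth_getBalance method."""
    return json.dumps({
        "jsonrpc": "2.0",
        "method": "eth_getBalance",
        "params": [address, "latest"],  # balance as of the latest block
        "id": request_id,
    })


def wei_hex_to_ether(result_hex: str) -> float:
    """Convert the hex-encoded wei amount returned by a node to Ether."""
    return int(result_hex, 16) / WEI_PER_ETHER


body = balance_request("0x220866B1A2219f40e72f5c628B65D54268cA3A9D")
print(body)

# A node answers with a hex string in the "result" field.
# This value is invented for the example, it is not the real balance.
example_result = hex(5 * WEI_PER_ETHER // 2)
print(wei_hex_to_ether(example_result), "ETH")
```

No login, API contract or permission from the wallet owner is involved. Any public node endpoint that speaks JSON-RPC would accept this request.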

You can use the public transaction data to check how much a person received in on-chain salary payments and how much they gained from trading. The same is true for businesses: you can see at every moment in time how much revenue an on-chain business has generated.
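As a rough illustration of that kind of analysis, the sketch below sums all incoming transfers of one wallet, grouped by sender. The Transfer type, the addresses and the amounts are hypothetical. Real data would come from a node or a block explorer and would need extra work, such as labelling which sender is an employer.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Transfer:
    sender: str
    recipient: str
    amount_usd: int  # hypothetical: value already converted to whole dollars


def income_by_sender(transfers: list[Transfer], wallet: str) -> dict[str, int]:
    """Sum everything `wallet` received, grouped by the sending address."""
    totals: dict[str, int] = defaultdict(int)
    for t in transfers:
        if t.recipient == wallet:
            totals[t.sender] += t.amount_usd
    return dict(totals)


# Invented transaction history for a wallet called 0xAlice.
history = [
    Transfer("0xEmployer", "0xAlice", 4500),
    Transfer("0xEmployer", "0xAlice", 4500),
    Transfer("0xExchange", "0xAlice", 1200),
    Transfer("0xAlice", "0xShop", 300),  # outgoing, ignored
]

print(income_by_sender(history, "0xAlice"))
# {'0xEmployer': 9000, '0xExchange': 1200}
```

The same grouping applied to a business address gives a running estimate of its on-chain revenue.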

What are the consequences of public by default data?

Let’s look at a couple of situations where you use your wallet and see what consequences arise.

Customized user experience

Whenever you use a website in web3 you have to connect your wallet to get access. Below you can see an example from the website of Aave, which is one of the most popular DeFi applications.

Connect Wallet

As soon as you connect your wallet the website knows your entire transaction history. Based on your transaction history the website knows stuff like:

  • Your balance
  • How many and which NFTs you have
  • How much you typically spend when using similar services

In the traditional world this data is only known by credit card companies. They anonymize your data and then sell it in packages for billions as described here. This type of data is very valuable since it’s used to make accurate predictions about how much money people are going to spend on what services. Blockchain data is free. There is no gatekeeping. In theory, the website can use the data to tailor the user experience to each user. Imagine you log into a trading website and the website knows that you have already done 1000 trades. The website can then show you a much more advanced trading interface. Similarly, ads in web3 can be targeted a lot better. A downside is that the same information can be used against you, e.g. by showing you a higher price since your balance shows that you are wealthy and can afford it. Companies like Spindl are betting on that future. They help companies get as much information as possible about their users. The founder Antonio worked at Facebook before and built out the Facebook ads marketplace.

Fraud

When tax records were made public for one year in 1924 the Nebraska Senator Robert Howell said “Secrecy is of the greatest aid to corruption”. At that time making tax records public made it easier to find corruption cases. The same is true on blockchains. Corruption and fraud happen, but traces of them are stored on the blockchain forever. At some point people start looking closer into the transaction history and reveal the fraud. Here is one example where a Twitter user figured out that FTX was creating artificial trading volume on the Serum exchange to make it look bigger than it actually was.

Zero Knowledge Proofs open up the data sharing design space

Zero Knowledge Proofs (ZKPs) are a mathematical construct which allows one party to prove to another party that they know a certain piece of information, without revealing any information about what that piece of information is. ZKPs are at the core of how blockchains will scale to tens of thousands of transactions per second. Besides scalability, they also open up the design space of how data can be shared on blockchains. Let’s go through a few applications of ZKPs.
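To give a feeling for what "proving knowledge without revealing it" means, here is a toy sketch of the classic Schnorr protocol, made non-interactive by deriving the challenge from a hash. This is my choice of textbook example, not the proof system blockchains use for scaling (those rely on SNARKs and STARKs). The prover shows that they know a secret x behind a public value y = g^x mod p without sending x. The parameters P and G are assumptions chosen for readability and are far too weak for real use.

```python
import hashlib
import secrets

P = 2**127 - 1  # a Mersenne prime, toy-sized and NOT secure
G = 3           # base of the exponentiation
ORDER = P - 1   # exponents can be reduced mod P - 1 (Fermat's little theorem)


def challenge(*values: int) -> int:
    """Derive the challenge from a hash instead of asking a verifier for it."""
    data = b"|".join(str(v).encode() for v in values)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % ORDER


def prove(secret: int) -> tuple[int, int, int]:
    """Return (public value, commitment, response). The secret is never sent."""
    public = pow(G, secret, P)
    nonce = secrets.randbelow(ORDER)      # fresh randomness hides the secret
    commitment = pow(G, nonce, P)
    c = challenge(G, public, commitment)
    response = (nonce + c * secret) % ORDER
    return public, commitment, response


def verify(public: int, commitment: int, response: int) -> bool:
    """Check G^response == commitment * public^c (mod P)."""
    c = challenge(G, public, commitment)
    return pow(G, response, P) == (commitment * pow(public, c, P)) % P


my_secret = secrets.randbelow(ORDER)
public, commitment, response = prove(my_secret)
print(verify(public, commitment, response))                 # True
print(verify(public, commitment, (response + 1) % ORDER))   # False
```

The verifier only ever sees the public value, the commitment and the response. Because the response is masked by a random nonce, it does not reveal the secret on its own.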

Proof of solvency

After the centralized exchange FTX went bankrupt Vitalik published an article on the concept of proof of solvency. Proof of solvency describes how one party, e.g. an exchange, can prove that it holds sufficiently large funds to cover all user deposits. The concept can be generalised to a proof of claim. An organisation or a party can make a claim and verify that claim with a ZKP without revealing its whole transaction history. For example, you can prove with a ZKP that your income last year was between $30k and $40k and then show that proof to the IRS. The IRS can verify the proof and then assign you a tax bracket. The issue with ZKP attestations is that they are only as good and as specific as the proof itself. When you see all transactions of a person you know exactly what happened, whereas when you only know the result of the ZKP attestation you only know that that particular claim is correct. The user might have found a way to do fraudulent activity which is not detected by the claim.
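A commonly described building block for the liabilities side of such a proof is a Merkle sum tree: every node stores a hash and the sum of the balances below it, so the root commits to the total the exchange owes. The sketch below is a simplified illustration under my own assumptions (plain user names, SHA-256, invented balances and reserves). It is not zero-knowledge by itself, since a user who checks their branch learns the sibling sums. ZKPs can be layered on top to hide those.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    digest: bytes
    total: int  # sum of all balances below this node


def leaf(user_id: str, balance: int) -> Node:
    return Node(hashlib.sha256(f"{user_id}:{balance}".encode()).digest(), balance)


def parent(left: Node, right: Node) -> Node:
    total = left.total + right.total
    payload = left.digest + right.digest + total.to_bytes(16, "big")
    return Node(hashlib.sha256(payload).digest(), total)


def build_levels(leaves: list[Node]) -> list[list[Node]]:
    """Return all tree levels, from the leaves up to the single root."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        current = levels[-1]
        if len(current) % 2:  # pad odd levels with an empty zero-balance node
            current.append(Node(b"\x00" * 32, 0))
        levels.append([parent(current[i], current[i + 1])
                       for i in range(0, len(current), 2)])
    return levels


def inclusion_proof(levels: list[list[Node]], index: int) -> list[tuple[Node, bool]]:
    """Collect (sibling, sibling_is_left) pairs on the path from leaf to root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify_inclusion(node: Node, proof: list[tuple[Node, bool]], root: Node) -> bool:
    for sibling, sibling_is_left in proof:
        if sibling.total < 0:  # negative sums could hide liabilities
            return False
        node = parent(sibling, node) if sibling_is_left else parent(node, sibling)
    return node == root


balances = [("alice", 120), ("bob", 30), ("carol", 50)]  # invented data
levels = build_levels([leaf(u, b) for u, b in balances])
root = levels[-1][0]
reserves = 250  # hypothetical on-chain reserves of the exchange

print(root.total, reserves >= root.total)                            # 200 True
print(verify_inclusion(leaf("bob", 30), inclusion_proof(levels, 1), root))  # True
```

Each user checks that their own balance is included in the published root, and anyone can compare the root total with the reserves the exchange controls on-chain. This also illustrates the limitation described above: the proof covers exactly this claim and nothing else.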

Privacy Mixers

Privacy Mixers are ZKP-based smart contracts which allow users to send money from one wallet to another wallet without creating a connection between these two wallets. The most popular privacy mixer is Tornado Cash, which was sanctioned by the US government because it limits the ability of the government to oversee transactions and track criminals on-chain.

Aztec Network: private DeFi

Aztec is a DeFi rollup which allows you to shield all transaction metadata. This means you can make payments, borrow and lend money without another person knowing it’s you. This solution gives you full privacy and makes it impossible for others to track what you are doing. This comes closest to the initial idea of blockchains: a way to send money around without government interference. It’s not clear to me yet how authorities can exercise oversight in such a system to find bad guys. The compliance tools by the Aztec Network are rather minimal. There is a block and a daily deposit cap which makes it impossible to transfer large amounts of money into Aztec. This is probably not satisfactory for the government in the long run.

No real world adoption by ignoring compliance

I think it will be difficult to get any large-scale real-world adoption without proper compliance tools since not many governments will accept it. Honestly, I don’t fully understand why many in the blockchain ecosystem are obsessed with privacy and are willing to sacrifice governmental oversight for it. Our economy, and therefore our lives, would look a lot different without transparent, well-functioning markets.

Let’s start a debate

This marks the end of the essay. I hope it came across that transparency can also be a value worth fighting for. Now, I invite you to contribute to the debate on the future of blockchain data. Here are a few questions that can be used as a starting point:

  • What financial information do we want to be public and transparent and why?
  • What financial information do we want to be private and why?
  • In what cases can ZKP attestations be used as a compliance tool and where do they fail?
  • What other compliance tools are necessary for web3 to achieve mainstream adoption?