Coinbase Vice President of Data Reveals the Road to Data 3.0 in the Blockchain Industry (www.blockcast.cc)
The complexity of cryptocurrency data creates a new opportunity: companies that reach Data 3.0 can create value at scale through systematic intelligence and automation.
Written by: Michael Li, Vice President of Data at Coinbase. The original text was published on TechCrunch, and the author authorized Chain News to publish the Chinese version.
Data is a company’s gold mine. If properly managed, it can be used as an important tool to hold everyone accountable. It also provides clear information and insights that can allow companies to improve decision-making on a large scale.
However, most companies are stuck in the Data 1.0 era, in which data work is treated only as a manual, reactive service. Some companies have begun to move toward Data 2.0, which uses simple automation to improve team productivity. The complexity of cryptocurrency data creates a new opportunity: companies that reach Data 3.0 can create value at scale through systematic intelligence and automation. This is the journey to Data 3.0.
Coinbase is neither a financial company nor a technology company, but a cryptocurrency company. This difference has a major impact on how we handle data. As a cryptocurrency company, we work with three major types of data (unlike conventional companies, which handle only one or two), each with its own variety and complexity.
Our work has always focused on making these different kinds of data work together, eliminating data silos, solving problems before they occur, and creating opportunities for Coinbase that might not otherwise have existed, so that we can create value at scale.
I have worked at high-tech companies such as LinkedIn and eBay, as well as financial institutions including Capital One, and I have seen first-hand the evolution from Data 1.0 to Data 3.0. In Data 1.0, data is regarded as a reactive function that provides ad hoc manual services or steps in during emergencies.
In Data 2.0, simple tools and third-party solutions automate some of the manual, repetitive tasks to improve team productivity. In most cases, however, the data team still has to add headcount to create more value. In Data 3.0, open source and in-house technology work together as a data system that fundamentally allows value creation to scale.
The Road to Data 3.0 Nirvana
The biggest benefit of Data 3.0 is efficiency and consistency across all data streams. It lets a company build a comprehensive data foundation that serves its long-term success while still meeting immediate needs with limited resources. When a company is small and changing quickly, this may not be obvious, but as the company scales through rapid growth, inconsistency between data streams can become a huge pain point. It is often difficult to correct unless you plan for it early.
Even at the best technology companies in the world, different engineering teams may build bespoke data products and services to solve specific pain points, and bad habits result. This can leave large gaps in the standardized workflow of an end-to-end data system, making it difficult to build and operate data at scale. To make matters worse, these one-off efforts can grow large enough to become independent systems that take considerable time to integrate and migrate. They usually end up as legacy systems, and over time those legacy systems saddle the company with heavy technical debt.
Given the continuing evolution of blockchain technology and data use cases, our Data 3.0 work is far from complete. That said, I am very proud of the progress we have made. The figure below outlines our work and systems so far.
Data storage and processing
Regardless of the specific technology chosen, a clear strategy is needed for three main components: decoupled storage, decoupled compute, and “single source of truth” semantics. Decoupling these components and setting a clear technical strategy for each avoids performance and scaling bottlenecks as the company grows.
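As a rough illustration of that decoupling, here is a minimal Python sketch. It is not Coinbase's implementation; every name in it (Storage, InMemoryStorage, MetricCatalog, ComputeEngine, the "trades" table, the "total_volume" metric) is hypothetical and chosen only for the example. Storage sits behind an interface, compute depends only on that interface, and each metric is defined exactly once in a shared catalog.

```python
from typing import Callable, Dict, List, Protocol

Row = Dict[str, float]


class Storage(Protocol):
    """Storage interface: compute code depends on this, not on a concrete backend."""

    def read(self, table: str) -> List[Row]: ...

    def write(self, table: str, rows: List[Row]) -> None: ...


class InMemoryStorage:
    """Toy backend for the sketch; a real system would swap in its own store."""

    def __init__(self) -> None:
        self._tables: Dict[str, List[Row]] = {}

    def read(self, table: str) -> List[Row]:
        return list(self._tables.get(table, []))

    def write(self, table: str, rows: List[Row]) -> None:
        self._tables[table] = list(rows)


class MetricCatalog:
    """Single source of truth: each metric name has exactly one definition."""

    def __init__(self) -> None:
        self._definitions: Dict[str, Callable[[List[Row]], float]] = {}

    def register(self, name: str, fn: Callable[[List[Row]], float]) -> None:
        # Refuse a second definition so two teams cannot diverge on one metric.
        if name in self._definitions:
            raise ValueError(f"metric {name!r} is already defined")
        self._definitions[name] = fn

    def definition(self, name: str) -> Callable[[List[Row]], float]:
        return self._definitions[name]


class ComputeEngine:
    """Compute layer: holds no data itself, only a storage handle and the catalog."""

    def __init__(self, storage: Storage, catalog: MetricCatalog) -> None:
        self._storage = storage
        self._catalog = catalog

    def evaluate(self, metric: str, table: str) -> float:
        rows = self._storage.read(table)
        return self._catalog.definition(metric)(rows)


if __name__ == "__main__":
    storage = InMemoryStorage()
    storage.write("trades", [{"amount": 10.0}, {"amount": 5.5}])
    catalog = MetricCatalog()
    catalog.register("total_volume", lambda rows: sum(r["amount"] for r in rows))
    print(ComputeEngine(storage, catalog).evaluate("total_volume", "trades"))
```

Because ComputeEngine sees only the Storage interface, either side can be replaced or scaled independently, and the catalog's refusal to accept a second definition is one simple way to keep a metric from meaning different things to different teams.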
Data platform and application
Although we use a combination of in-house technology, open source tools, and vendor solutions to meet various needs, we have made clear trade-offs when choosing a specific solution in each category, so that there is no duplication or ambiguity down the road. Our event management system, data orchestration workflows, business intelligence layer, and experimentation platform all follow this principle. The result is a highly decoupled and scalable architecture.
Machine learning and platforms
Although machine learning has been the most eye-catching part of the hype surrounding artificial intelligence in recent years, it is also the most cross-functional component of the data team. Our truly end-to-end machine learning platform, Nostradamus, supports all of Coinbase’s machine learning models, covering data pipelines, training, deployment, serving, and experimentation. Because the platform was built with every other part of the data ecosystem in mind, it is designed not only to let machine learning solve immediate problems, but also to grow and scale with the business.
Data Science and Data Products
These two areas are probably the most end-user-facing parts of the data team, because they are essentially the presentation layer for refined data insights, designed to satisfy our customers and create value for them. They are also the most direct beneficiaries of all the work described above.
The most important thing for a data team is that data scientists take themselves out of the loop and focus on enabling the machine to deliver data at scale and create value for consumers, rather than acting as an intermediary between the machine and the data consumer.
Disclaimer: As a blockchain information platform, the articles published on this site only represent the author’s personal views, and have nothing to do with the position of ChainNews. The information, opinions, etc. in the article are for reference only, and are not intended as or regarded as actual investment advice.