
Growing Business with Web2 and Web3 Data Integration: WEPIN Wallet Stats

2024-05-27

Author: Jinseok Kim, Data Engineer (https://www.linkedin.com/in/joshuajsk/)

[TL;DR]

  • Blockchain-based services need to integrate and analyze Web2 and Web3 data to grow, but each data type has limitations that make it hard to use on its own.
  • Wepin Workspace addresses this with a user statistics feature that connects Web2 users to Web3 accounts, backed by a data warehouse built to deliver accurate and reliable statistics.
  • With these statistics, blockchain-based services can build effective product growth strategies such as user acquisition and retargeting.
  • Wepin Wallet plans to keep advancing the product so that customers can manage their services and execute actions from it.

1. Introduction

Data has become more important than ever, and putting it to work is now a precondition for business growth. Beyond classic uses such as A/B testing and performance reporting, data now feeds causal inference, machine learning, and even generative AI built into products.

With the rise of blockchain-based services, the nature of the data we need to work with is changing fundamentally. Web2 data is the user data collected by traditional services, including user behavior patterns, location information, and session information. Web3 data, on the other hand, is collected from blockchain networks and is transparently available to everyone; it includes transaction history, virtual asset holdings, and contract usage.

Web2 and Web3 data each have their pros and cons, and for a blockchain-based service, analyzing only one type on its own has clear limits. Services need an environment that can integrate and analyze both types of data in order to gain deeper insights, optimize user experience, and build more efficient product growth strategies.

In this article, we examine the characteristics and limitations of Web2 and Web3 data, then discuss how to integrate the two data types and what benefits that integration brings. Finally, we explain the value provided by the user statistics feature of Wepin Workspace, which implements this integration.

2. Characteristics of Web2 and Web3 Data

2.1 Characteristics of Web2 Data

Web2 data refers to the traditional web data we commonly know, and includes the attributes and behavior information of users who visit or log in to a service. As shown in the figure below, there are numerous users within the service, and each user generates numerous events.

Web2 data centered on users and event logs

Such Web2 data is collected and typically used for the following purposes:

  1. Business Strategy: General business reporting, such as DAU, MAU, and total cumulative users.
  2. Performance Marketing: Measuring Customer Lifetime Value (LTV) in the service, comparing advertising performance through ROAS (Return on Ad Spend), or selecting user segments for retargeting.
  3. UX: Finding aha moments, measuring conversion rates, and running A/B tests to design a smooth user journey.
  4. Development: Improving product quality or providing personalized services through machine learning, recommendation systems, generative AI, etc.
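
As a rough illustration of the reporting metrics in item 1, DAU and MAU can be derived from an event log by counting distinct users per day and per month. The event records and their layout below are made up for this sketch and are not an actual Wepin schema.

```python
from collections import defaultdict
from datetime import date

# Hypothetical event log: (user_id, event_date). Illustrative data only.
events = [
    ("u1", date(2024, 5, 1)),
    ("u2", date(2024, 5, 1)),
    ("u1", date(2024, 5, 1)),   # same user, same day: counted once
    ("u1", date(2024, 5, 2)),
    ("u3", date(2024, 5, 20)),
]

def active_users(events, key):
    """Count distinct users per bucket, where `key` maps a date to its bucket."""
    buckets = defaultdict(set)
    for user_id, day in events:
        buckets[key(day)].add(user_id)
    return {bucket: len(users) for bucket, users in buckets.items()}

dau = active_users(events, key=lambda d: d)                  # bucket = calendar day
mau = active_users(events, key=lambda d: (d.year, d.month))  # bucket = calendar month

print(dau[date(2024, 5, 1)])  # 2
print(mau[(2024, 5)])         # 3
```

Using a set per bucket is what keeps repeated events from the same user from inflating the counts.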

2.2 Characteristics of Web3 Data

Web3 data is data recorded on the blockchain, including the virtual asset holdings of accounts active within each network, transaction execution history, and contract information. In other words, as shown in the figure below, there are numerous accounts within the blockchain, and each account executes numerous transactions. (Externally owned accounts (EOAs), the accounts of general users, initiate transactions, while contract accounts (CAs), the accounts used by services, run internal transactions triggered by EOAs; all of these details can be inspected on-chain.)

Web3 data centered on accounts and transaction logs

Anyone who can operate or access a node that stores the chain's data from the genesis block to the most recent block, such as a full node or an archive node, can use Web3 data transparently. Unlike Web2 data, it is not monopolized by a specific service; it is open to everyone.

For example, in the case of EVM-compatible networks including Ethereum, Web3 data consists of the following raw data:

  1. blocks: Creation information for each block
  2. transactions: History of transactions recorded on the protocol
  3. traces: History of function calls made to the protocol or to contracts
  4. logs: History of events emitted by the protocol or by contracts during EVM execution
  5. creation_traces: History of contract deployments
  6. balances: Native coin (ETH) holdings of each account, as returned by the protocol, or current token holdings of each account, as returned by the contract (no raw data exists; balances are requested from the node in real time through RPC methods)
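
To make item 6 concrete, the sketch below builds a request for the standard Ethereum JSON-RPC method `eth_getBalance` and decodes the hex-encoded wei value that a node returns. It does not contact a node: the address is a placeholder and the response is hand-written for illustration, and the helper names are hypothetical.

```python
import json

WEI_PER_ETH = 10**18

def build_get_balance_request(address, block="latest", request_id=1):
    """Build the JSON-RPC 2.0 payload for the standard eth_getBalance method."""
    return {
        "jsonrpc": "2.0",
        "method": "eth_getBalance",
        "params": [address, block],
        "id": request_id,
    }

def parse_balance(response):
    """Convert the hex-encoded wei quantity in a response to (wei, eth)."""
    wei = int(response["result"], 16)
    return wei, wei / WEI_PER_ETH

# Placeholder address, for illustration only.
payload = build_get_balance_request("0x0000000000000000000000000000000000000000")
print(json.dumps(payload))

# Hand-written sample response; a real node would return this shape over HTTP.
sample_response = {"jsonrpc": "2.0", "id": 1, "result": "0xde0b6b3a7640000"}
wei, eth = parse_balance(sample_response)
print(wei, eth)  # 1000000000000000000 1.0
```

Token holdings are not covered by this method; they would typically be read through a contract call such as ERC-20 `balanceOf`, which is not shown here.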

This Web3 data can ultimately be used for the following purposes:

  1. Business Strategy: General business reporting for blockchain-based services, such as DAA (Daily Active Addresses) and MAA (Monthly Active Addresses).
  2. Performance Marketing: Measuring each account's virtual asset holdings and transaction history, and running overlap analysis to understand the performance of airdrop events or the usage of other services.
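
The overlap analysis in item 2 can be sketched as a set intersection over sender addresses. The transaction records, the contract labels `airdrop` and `dex`, and the helper name are all hypothetical; real data would come from the `transactions` or `logs` tables described above.

```python
# Hypothetical transaction records: (day, from_address, target_contract).
txs = [
    ("2024-05-01", "0xaaa", "airdrop"),
    ("2024-05-01", "0xbbb", "airdrop"),
    ("2024-05-01", "0xaaa", "dex"),
    ("2024-05-02", "0xccc", "dex"),
    ("2024-05-02", "0xddd", "dex"),
]

def senders_of(txs, contract):
    """Set of addresses that sent at least one transaction to `contract`."""
    return {sender for _, sender, target in txs if target == contract}

airdrop_addrs = senders_of(txs, "airdrop")
dex_addrs = senders_of(txs, "dex")

# Addresses that claimed the airdrop and also used the other service.
overlap = airdrop_addrs & dex_addrs
overlap_rate = len(overlap) / len(airdrop_addrs)

print(sorted(overlap))  # ['0xaaa']
print(overlap_rate)     # 0.5
```

Note that the unit here is the address, not the person, so the rate describes accounts rather than users.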

3. Limitations of Web2 and Web3 Data

3.1 Limitations of Web2 Data

From the perspective of companies operating blockchain-based services, Web2 data has the following limitations:

3.1.1 Blockchain data is out of reach.

Blockchain-based services need to look beyond data generated on the web and also examine transaction history and virtual asset holdings on the linked blockchain. This shows how active the service's users actually are on-chain and how much economic power they hold. Web2 data, however, does not provide this information.

3.1.2 Cross-service data cannot be collected.

Users generally use several services at once rather than just one. However, because Web2 data is structurally not disclosed, a service company can see only a user's activity within its own service, not the user's activity across the whole ecosystem. For example, a data manager at Google cannot access Meta's Web2 data.

3.1.3 Personal data protection rules restrict collection.

Regulations that restrict how companies collect user data such as cookies, including the EU's GDPR (General Data Protection Regulation), keep tightening worldwide, making Web2 data increasingly difficult to collect. The resulting gaps in collected user data put its integrity at significant risk.

3.2 Limitations of Web3 Data

Of course, Web3 data also has the following limitations:

3.2.1 In-service data is out of reach.

Web3 data essentially contains only the execution history of transactions sent on the blockchain. It says nothing about how actively users engage with individual features inside the service.

3.2.2 User attribute data is very sparse.

Web2 data generally includes demographic attributes such as country, device information, gender, and age. In Web3 data, by contrast, the subject of activity is not the user but the account, and each account is identified only by a pseudonymous hexadecimal address. As a result, it provides no demographic data at all.

In particular, when one real user operates multiple accounts, Web3 data cannot aggregate them as a single user, so even basic metrics such as DAU and MAU cannot be computed reliably.

4. Integration of Web2 and Web3 Data

4.1 Why Integrate?

As discussed above, Web2 and Web3 data each have limitations, so companies operating blockchain-based services cannot get far by using either type on its own.

Only when all user attributes and behavior history, both within the service and on the blockchain, can be examined together does it become possible to deliver comprehensive user analysis, CRM, retargeting, and personalized services, and thereby achieve fast and effective product growth.

4.2 However, the Reality Is...

Web2-based data analysis platforms: GA4, Mixpanel, Snowflake, etc.
Web3-based data analysis platforms: Dune Analytics, Flipside, Etherscan, etc.

There are many platforms that can analyze Web2 or Web3 data independently. Unfortunately, platforms that can analyze both together are very rare worldwide, and where they do exist, their data integrity and practical value are relatively low.

5. Overcoming This with Wepin Wallet's User Statistics

The user statistics feature of Wepin Workspace was born from this background and problem awareness. It provides integrated analysis of Web2 and Web3 data so that each company using the enterprise Web3 wallet solution can manage its end users effectively.

5.1 Key Integration Point: "Connecting Users and Accounts"

Integrated data connecting users (Web2) and accounts (Web3)

Wepin user statistics starts by linking the identities of Web2 users to their Web3 accounts, as discussed earlier. This makes it possible to analyze freely across Web2-based and Web3-based data items. It helps companies operating blockchain-based services identify their core user groups precisely and build more effective strategies, such as localization.
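
A minimal sketch of why this link matters, assuming a simple mapping from addresses to users: once each account resolves to a user, active addresses can be deduplicated into active users and grouped by a Web2 attribute such as country. The table layouts, identifiers, and function name below are hypothetical, not Wepin's actual data model.

```python
from collections import defaultdict

# Hypothetical Web2 user attributes.
users = {
    "u1": {"country": "KR"},
    "u2": {"country": "US"},
}

# Hypothetical link table: Web3 address -> Web2 user. User u1 owns two accounts.
account_owner = {
    "0xaaa": "u1",
    "0xbbb": "u1",
    "0xccc": "u2",
}

# Hypothetical addresses that sent transactions during the period.
active_addresses = ["0xaaa", "0xbbb", "0xccc"]

def active_users_by_country(addresses, account_owner, users):
    """Resolve addresses to users, count each user once, group by country."""
    seen = set()
    counts = defaultdict(int)
    for addr in addresses:
        user_id = account_owner.get(addr)
        if user_id is None or user_id in seen:
            continue  # unlinked address, or user already counted
        seen.add(user_id)
        counts[users[user_id]["country"]] += 1
    return dict(counts)

by_country = active_users_by_country(active_addresses, account_owner, users)

print(len(set(active_addresses)))  # 3 active addresses
print(by_country)                  # {'KR': 1, 'US': 1}
```

Three active addresses collapse into two active users, and the country breakdown is something neither data source could produce on its own.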

5.2 Technical Implementation: "Data Warehouse"

Data pipeline for user statistics feature service

To provide the user statistics feature accurately and reliably, we built a business data warehouse. First, the Wepin app production DB collects both the Web2 and Web3 data of end users. A data pipeline built on ELT tools then loads that data, in real time or on a schedule, into a dedicated data warehouse DB that is physically separate from production.

The overall design of the data warehouse follows Kimball's dimensional modeling, a widely used approach for delivering accurate and reliable statistics. As a result, the structure can be extended flexibly and quickly when the user statistics feature is enhanced or updated in the future.
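
For readers unfamiliar with dimensional modeling, the sketch below shows a tiny star schema in an in-memory SQLite database: one fact table of token holdings surrounded by user, token, and date dimensions. All table names, column names, and rows are assumptions made for illustration; the actual Wepin warehouse schema is not described in this article.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables hold descriptive attributes (hypothetical names).
CREATE TABLE dim_user  (user_key  INTEGER PRIMARY KEY, country   TEXT);
CREATE TABLE dim_token (token_key INTEGER PRIMARY KEY, symbol    TEXT);
CREATE TABLE dim_date  (date_key  INTEGER PRIMARY KEY, full_date TEXT);

-- Fact table: one row per user, token, and day (the grain).
CREATE TABLE fact_token_holding (
    user_key  INTEGER REFERENCES dim_user(user_key),
    token_key INTEGER REFERENCES dim_token(token_key),
    date_key  INTEGER REFERENCES dim_date(date_key),
    amount    REAL
);

INSERT INTO dim_user  VALUES (1, 'KR'), (2, 'KR'), (3, 'US');
INSERT INTO dim_token VALUES (1, 'ETH');
INSERT INTO dim_date  VALUES (20240501, '2024-05-01');
INSERT INTO fact_token_holding VALUES
    (1, 1, 20240501, 1.5),
    (2, 1, 20240501, 0.5),
    (3, 1, 20240501, 3.0);
""")

# Holdings and holder counts by country: join the fact table to its dimensions.
rows = conn.execute("""
    SELECT u.country,
           SUM(f.amount)              AS total_amount,
           COUNT(DISTINCT f.user_key) AS holders
    FROM fact_token_holding f
    JOIN dim_user  u ON u.user_key  = f.user_key
    JOIN dim_token t ON t.token_key = f.token_key
    JOIN dim_date  d ON d.date_key  = f.date_key
    WHERE t.symbol = 'ETH' AND d.full_date = '2024-05-01'
    GROUP BY u.country
    ORDER BY u.country
""").fetchall()

print(rows)  # [('KR', 2.0, 2), ('US', 3.0, 1)]
```

Because new attributes are added as dimension columns and new metrics as fact columns or fact tables, existing queries keep working as the statistics grow, which is the kind of extensibility described above.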

5.3 User Statistics Preview (1) Token Holdings

Token holdings data of Wepin wallet users during a specific period

You can check the trend of token holdings by country. High token holdings in a country may indicate high potential to contribute to the growth of a blockchain-based service. This helps in running marketing campaigns or airdrop events that target specific countries effectively.

5.4 User Statistics Preview (2) Number of Holders

Data on the number of all users who received tokens in the app during a specific period

You can check, by country, the number of users holding specific tokens linked to a blockchain-based service. This is an important metric when building retargeting strategies.

6. Conclusion

Blockchain-based services should build their product growth strategies on integrated Web2 and Web3 data. To execute strategies such as acquiring users through viral events or deepening business with existing users, they need data that makes it possible to target high-quality users. A data-driven product growth strategy puts a service in an advantageous position to compete in the market.

Wepin's user statistics feature was born to meet these customer needs, going beyond simply providing a wallet solution to contribute to efficient service growth. Beyond the user statistics feature, Wepin Workspace plans to keep advancing its product and introduce features that let customers comprehensively manage blockchain-based services and execute various actions.
