Methodology

Data mining

Data mining is a crucial step in our process. We’ll source data primarily from GitHub and Etherscan, focusing on popular ERC20 tokens, NFTs, token marketplaces, AMMs, and simple wallets, which together cover most DeFi and Web3 topics. To ensure quality, we prioritize contracts with over 1,000 transactions per week.
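
As a rough illustration of the activity filter described above, the sketch below keeps only contracts with more than 1,000 transactions per week. The record fields (address, category, weekly_tx) and the sample values are hypothetical placeholders, not the actual schema or data; in practice the counts would come from whatever Etherscan export or API the project settles on.

```python
from dataclasses import dataclass

# Threshold taken from the text: more than 1,000 transactions per week.
MIN_WEEKLY_TX = 1_000


@dataclass(frozen=True)
class ContractRecord:
    """Hypothetical record for one collected contract (field names are assumptions)."""
    address: str    # contract address as listed on Etherscan
    category: str   # e.g. "erc20", "nft", "marketplace", "amm", "wallet"
    weekly_tx: int  # transactions observed over the last week


def select_active(records, min_weekly_tx=MIN_WEEKLY_TX):
    """Return the records whose weekly transaction count exceeds the threshold."""
    return [r for r in records if r.weekly_tx > min_weekly_tx]


# Made-up sample data, for illustration only.
sample = [
    ContractRecord("0xTOKEN", "erc20", 25_000),
    ContractRecord("0xWALLET", "wallet", 1_000),
    ContractRecord("0xAMM", "amm", 4_200),
]
active = select_active(sample)
print([r.address for r in active])
```

The comparison is strict, so a contract with exactly 1,000 weekly transactions is excluded; whether the boundary should be inclusive is a choice the project would still need to make.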

Model Selection

The model used in this project will be modeled on GPT, or more specifically, GPT-1. We expect to stack several Transformer blocks in the model to handle both natural language understanding and smart contract generation.
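
To give a sense of scale for a GPT-1-style model, the sketch below estimates the parameter count of a stack of Transformer blocks. The layer count, hidden size, head count, feed-forward size, and context length follow the published GPT-1 configuration; the vocabulary size of 40,000 is an approximation, and the assumption that the output layer shares weights with the token embedding is ours. The project's final model may differ on all of these.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GPTConfig:
    """Hyperparameters of a GPT-1-style decoder stack (names are illustrative)."""
    n_layer: int = 12         # number of stacked Transformer blocks
    d_model: int = 768        # hidden state size
    n_head: int = 12          # attention heads per block (does not affect the count)
    d_ff: int = 3072          # inner size of the feed-forward sublayer
    n_ctx: int = 512          # maximum context length in tokens
    vocab_size: int = 40_000  # assumption: approximate BPE vocabulary size


def count_parameters(cfg: GPTConfig) -> int:
    """Estimate trainable parameters, assuming tied input/output embeddings."""
    # Token embedding plus learned position embedding.
    embeddings = cfg.vocab_size * cfg.d_model + cfg.n_ctx * cfg.d_model
    # Query, key, value, and output projections, each with a bias.
    attention = 4 * (cfg.d_model * cfg.d_model + cfg.d_model)
    # Two linear layers with biases.
    feed_forward = (cfg.d_model * cfg.d_ff + cfg.d_ff) + (cfg.d_ff * cfg.d_model + cfg.d_model)
    # Two layer norms per block, each with a scale and a shift vector.
    layer_norms = 2 * 2 * cfg.d_model
    return embeddings + cfg.n_layer * (attention + feed_forward + layer_norms)


total = count_parameters(GPTConfig())
print(f"{total / 1e6:.1f}M parameters")
```

Under these assumptions the estimate lands in the neighborhood of the roughly 117M parameters usually quoted for GPT-1, which is a useful sanity check when sizing compute for training.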

Model Evaluation

The project will use the BLEU score to compare the model’s generated code against human-written code, and we’ll also consider the chrF metric for character-level evaluation. Detailed scoring methods are still pending, but we’ll employ the pass@k metric (the probability that at least one of k generated samples passes a problem’s tests) to assess the model’s task performance, and we may tailor the problem set for Web3 specialization.
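
Since the scoring details are still open, the following is only a sketch of how pass@k could be computed. It uses the standard unbiased estimator: for a problem with n generated samples of which c pass the tests, pass@k = 1 - C(n - c, k) / C(n, k). The function name and the example numbers are illustrative, not project results.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: number of samples generated for the problem
    c: number of those samples that pass the problem's tests
    k: number of samples the metric is allowed to draw
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # Fewer than k failing samples: every draw of k contains a passing one.
    if n - c < k:
        return 1.0
    # Probability that a random draw of k samples contains no passing one,
    # subtracted from 1.
    return 1.0 - comb(n - c, k) / comb(n, k)


# Illustrative numbers only: 10 samples per problem, 3 of which pass.
print(pass_at_k(n=10, c=3, k=1))
print(pass_at_k(n=10, c=3, k=5))
```

The benchmark-level score would then be the mean of this value over all problems, including any Web3-specific problems added later.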