
Briefing
Sybil attacks are a persistent problem in blockchain ecosystems, particularly during token airdrops, where they undermine fair resource distribution and system integrity. This research introduces Subgraph-based lightGBM, a supervised machine learning method that constructs two-layer transaction subgraphs and extracts temporal, amount, and network structural features through propagation and fusion. The authors report that this mechanism improves both the precision and recall of Sybil address identification, an advance that matters for securing decentralized applications and keeping participation equitable in future blockchain architectures.

Context
Before this research, identifying Sybil addresses in blockchain airdrops primarily relied on unsupervised methods, which, while universally applicable, demanded substantial manual effort and struggled to adapt to evolving attacker strategies. These approaches often faced challenges in constructing manageable transaction graphs from vast blockchain data and in setting reliable clustering thresholds, leading to suboptimal accuracy and scalability limitations in real-world applications.

Analysis
The paper’s core mechanism, Subgraph-based lightGBM, operates by first constructing a localized “transaction subgraph” for each address, capturing its immediate transactional neighborhood up to two layers deep. This avoids processing the entire blockchain graph, significantly reducing computational load. Within these subgraphs, the model extracts three categories of features: temporal patterns (e.g., the exact timing of an address’s first transaction, gas acquisition, and airdrop participation), financial metrics (transaction amounts and balances), and network structural characteristics (in-degree, out-degree, neighbor information).
These features are then propagated and fused across the subgraph layers, creating a consolidated representation of an address’s behavior. A supervised lightGBM model is then trained on this feature set, using labeled datasets, to classify addresses as Sybil or legitimate. This reliance on labels is the key difference from prior unsupervised clustering methods, which the paper describes as less precise.
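The subgraph construction and feature fusion described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the toy transfers, the function names (`build_index`, `local_features`, `fused_features`), the chosen feature keys, and the use of a simple per-layer mean as the fusion step are all assumptions made for illustration. The resulting dictionary is the kind of per-address feature vector that could then be passed to a gradient-boosted classifier such as LightGBM.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy transfers: (sender, receiver, amount, unix_timestamp).
TRANSFERS = [
    ("funder", "a1", 0.05, 1000),
    ("funder", "a2", 0.05, 1010),
    ("funder", "a3", 0.05, 1020),
    ("a1", "airdrop", 0.0, 1100),
    ("a2", "airdrop", 0.0, 1105),
    ("a3", "airdrop", 0.0, 1110),
    ("exchange", "user", 2.0, 500),
    ("user", "airdrop", 0.0, 2000),
]

FEATURE_KEYS = ("in_degree", "out_degree", "amount_in", "amount_out", "first_ts")


def build_index(transfers):
    """Map each address to the transfers it sends and the transfers it receives."""
    out_edges, in_edges = defaultdict(list), defaultdict(list)
    for sender, receiver, amount, ts in transfers:
        out_edges[sender].append((receiver, amount, ts))
        in_edges[receiver].append((sender, amount, ts))
    return out_edges, in_edges


def neighbors(addr, out_edges, in_edges):
    """Addresses that directly sent to or received from addr."""
    return {r for r, _, _ in out_edges[addr]} | {s for s, _, _ in in_edges[addr]}


def local_features(addr, out_edges, in_edges):
    """Structural, amount, and temporal features of a single address."""
    sent, received = out_edges[addr], in_edges[addr]
    times = [ts for _, _, ts in sent + received]
    return {
        "in_degree": len(received),
        "out_degree": len(sent),
        "amount_in": sum(a for _, a, _ in received),
        "amount_out": sum(a for _, a, _ in sent),
        "first_ts": min(times, default=0),
    }


def fused_features(addr, out_edges, in_edges):
    """Combine an address's own features with per-layer means over its
    two-layer subgraph (assumed fusion rule: arithmetic mean per layer)."""
    layer1 = neighbors(addr, out_edges, in_edges) - {addr}
    layer2 = set()
    for n in layer1:
        layer2 |= neighbors(n, out_edges, in_edges)
    layer2 -= layer1 | {addr}

    own = local_features(addr, out_edges, in_edges)
    fused = {f"self_{k}": v for k, v in own.items()}
    for name, layer in (("l1", layer1), ("l2", layer2)):
        rows = [local_features(n, out_edges, in_edges) for n in sorted(layer)]
        fused[f"{name}_size"] = len(rows)
        for key in FEATURE_KEYS:
            fused[f"{name}_mean_{key}"] = mean([r[key] for r in rows]) if rows else 0.0
    return fused


if __name__ == "__main__":
    out_e, in_e = build_index(TRANSFERS)
    for address in ("a1", "user"):
        print(address, fused_features(address, out_e, in_e))
```

In this toy data, `a1`, `a2`, and `a3` are funded by the same address within seconds of each other and then all interact with the airdrop contract, which is the kind of pattern the propagated neighbor features are meant to surface. A real pipeline would likely need to handle high-degree hubs (such as the airdrop contract itself), which this sketch does not filter.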

Parameters
- Core Concept: Subgraph-based lightGBM
- Problem Addressed: Sybil Attacks in Airdrops
- Key Features Extracted: Temporal, Amount, Network Structure
- Evaluation Metrics: Precision, Recall, F1 Score, AUC
- Dataset Origin: BAB (Binance Account Bound) Airdrop
- Performance Improvement: AUC of 0.9806, outperforming Trusta’s 0.8642
- Applicable Blockchain Type: Account-based (EVM-compatible)
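
For readers less familiar with the evaluation metrics listed above, the following Python sketch shows how precision, recall, F1, and AUC are conventionally computed for a binary Sybil classifier. The labels, scores, and 0.5 threshold are invented for illustration and have no connection to the paper's reported results; the function names are likewise hypothetical.

```python
def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for binary labels (1 = Sybil, 0 = legitimate)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def auc(y_true, scores):
    """AUC as the probability that a random Sybil address receives a higher
    score than a random legitimate one (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Invented example data, not taken from the paper.
    labels = [1, 1, 1, 0, 0, 0]
    scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
    predictions = [1 if s >= 0.5 else 0 for s in scores]
    print(precision_recall_f1(labels, predictions))
    print(auc(labels, scores))
```

Precision, recall, and F1 depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why it is a common headline figure when comparing detectors.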

Outlook
This research establishes a robust framework for Sybil detection, paving the way for future advancements in blockchain security. Next steps involve extending the methodology to a broader range of networks and airdrop events, cultivating a comprehensive Sybil address database with an iterative improvement cycle for model updates. This foundational work could unlock real-world applications in identifying transaction manipulation and assessing token liquidity risks within 3-5 years, thereby fostering a more secure and equitable decentralized financial landscape.

Verdict
This research fundamentally advances blockchain security by providing a highly effective, data-driven mechanism to counter Sybil attacks, reinforcing the integrity and fairness of decentralized economic systems.