This project calculates a risk score (0–1000) for Ethereum wallets using their on-chain transaction history.
You are given wallet addresses. The objective is to:
- Fetch transactions from Etherscan.
- Create features based on transaction behavior.
- Apply both rule-based and model-based scoring strategies.
- Transactions are fetched using the Etherscan API.
- Feature set includes:
  - Total transactions
  - ETH sent / received
  - Number of failed transactions
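As an illustration of how these features might be derived, here is a minimal sketch. It assumes each transaction is a dict shaped like a record in Etherscan's account transaction list response (`from`, `to`, `value` in wei as a string, `isError` as `"0"` or `"1"`). The helper name `build_features` and the constant `WEI_PER_ETH` are hypothetical, and skipping the value of failed transactions is an assumption, not something the project states.

```python
from typing import Dict, List

WEI_PER_ETH = 10**18  # 1 ETH = 10^18 wei


def build_features(wallet: str, txs: List[Dict[str, str]]) -> Dict[str, float]:
    """Aggregate one wallet's transactions into the four features."""
    wallet = wallet.lower()
    total_in = 0.0
    total_out = 0.0
    failed_tx = 0
    for tx in txs:
        if tx.get("isError") == "1":
            failed_tx += 1
            continue  # assumption: a failed transaction moves no ETH
        value_eth = int(tx.get("value", "0")) / WEI_PER_ETH
        if (tx.get("from") or "").lower() == wallet:
            total_out += value_eth
        elif (tx.get("to") or "").lower() == wallet:
            total_in += value_eth
    return {
        "tx_count": len(txs),
        "total_in": total_in,
        "total_out": total_out,
        "failed_tx": failed_tx,
    }


# Hand-written sample records, not real chain data.
sample_txs = [
    {"from": "0xAAA", "to": "0xBBB", "value": "2000000000000000000", "isError": "0"},
    {"from": "0xCCC", "to": "0xAAA", "value": "500000000000000000", "isError": "0"},
    {"from": "0xAAA", "to": "0xDDD", "value": "1000000000000000000", "isError": "1"},
]
features = build_features("0xaaa", sample_txs)
```

Addresses are lowercased before comparison because the same address can appear with different letter casing.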
| Rule | Penalty |
|---|---|
| tx_count < 5 | -200 |
| failed_tx > 2 | -150 |
| total_out > total_in | -100 |
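Score Formula:
score = 1000 - penalties, where penalties is the sum of the absolute penalty values of every rule that applies. For example, a wallet that triggers only tx_count < 5 scores 1000 - 200 = 800.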
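The rule table and formula can be sketched as below. The thresholds and penalty sizes come from the table; the names `RULES` and `rule_based_score` are hypothetical, and penalties are stored as positive magnitudes so that subtracting them lowers the score.

```python
from typing import Callable, Dict, List, Tuple

Features = Dict[str, float]

# (rule description, predicate, penalty magnitude) from the rule table.
RULES: List[Tuple[str, Callable[[Features], bool], int]] = [
    ("tx_count < 5", lambda f: f["tx_count"] < 5, 200),
    ("failed_tx > 2", lambda f: f["failed_tx"] > 2, 150),
    ("total_out > total_in", lambda f: f["total_out"] > f["total_in"], 100),
]


def rule_based_score(features: Features) -> int:
    """Start from 1000 and subtract the penalty of every rule that applies."""
    penalties = sum(penalty for _, applies, penalty in RULES if applies(features))
    return 1000 - penalties


# Illustrative feature values, not real wallets.
sparse_wallet = {"tx_count": 3, "failed_tx": 4, "total_in": 1.0, "total_out": 2.0}
active_wallet = {"tx_count": 50, "failed_tx": 0, "total_in": 10.0, "total_out": 5.0}

sparse_score = rule_based_score(sparse_wallet)  # all three rules apply
active_score = rule_based_score(active_wallet)  # no rule applies
```

Keeping the rules in a list makes it straightforward to add or retune a threshold without touching the scoring function.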
Features Used:
- tx_count
- total_in
- total_out
- failed_tx
Steps:
- Normalize features
- Cluster wallets using KMeans (2 clusters)
- Score based on distance to “safe” cluster centroid
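The steps above might look roughly like the following sketch using scikit-learn. Two choices here are assumptions, since the text does not specify them: the larger cluster is treated as the "safe" one, and distance is mapped linearly onto 0–1000 (the wallet farthest from the safe centroid gets 0). The function name `model_based_scores` and the sample matrix are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import MinMaxScaler


def model_based_scores(X: np.ndarray, random_state: int = 0) -> np.ndarray:
    """Score each wallet by its distance to the "safe" cluster centroid."""
    # Step 1: scale every feature to [0, 1].
    X_scaled = MinMaxScaler().fit_transform(X)
    # Step 2: cluster the wallets into two groups.
    kmeans = KMeans(n_clusters=2, n_init=10, random_state=random_state).fit(X_scaled)
    # Assumption: the cluster with more members is the "safe" one.
    safe_label = np.bincount(kmeans.labels_, minlength=2).argmax()
    safe_centroid = kmeans.cluster_centers_[safe_label]
    # Step 3: Euclidean distance to the safe centroid, mapped to 0-1000.
    distances = np.linalg.norm(X_scaled - safe_centroid, axis=1)
    max_dist = distances.max()
    if max_dist == 0:
        return np.full(len(X), 1000.0)
    # Assumption: linear mapping; the farthest wallet scores 0.
    return 1000.0 * (1.0 - distances / max_dist)


# Columns: tx_count, total_in, total_out, failed_tx (illustrative values).
X = np.array(
    [
        [120, 40.0, 35.0, 0],
        [95, 22.0, 20.0, 1],
        [110, 31.0, 30.0, 0],
        [3, 0.1, 2.5, 4],
        [2, 0.0, 1.0, 3],
    ],
    dtype=float,
)
scores = model_based_scores(X)
```

Fixing `random_state` keeps the cluster assignment, and therefore the scores, reproducible between runs.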
Visualization:
Wallets are projected to 2D using PCA and colored by cluster.
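A plot of this kind could be produced along the following lines. This is a sketch only: the function name `plot_clusters`, the colormap, and the synthetic demo data are assumptions, while the output filename matches the one listed among the project outputs.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so no display is required
import matplotlib.pyplot as plt
import numpy as np
from sklearn.decomposition import PCA


def plot_clusters(X_scaled: np.ndarray, labels: np.ndarray, out_path: str) -> np.ndarray:
    """Project features to 2D with PCA and save a scatter plot colored by cluster."""
    coords = PCA(n_components=2).fit_transform(X_scaled)
    fig, ax = plt.subplots(figsize=(6, 5))
    scatter = ax.scatter(coords[:, 0], coords[:, 1], c=labels, cmap="coolwarm")
    ax.set_xlabel("PC 1")
    ax.set_ylabel("PC 2")
    ax.set_title("Wallet clusters (PCA projection)")
    fig.colorbar(scatter, ax=ax, label="cluster")
    fig.savefig(out_path, dpi=150, bbox_inches="tight")
    plt.close(fig)
    return coords


# Synthetic demo data standing in for normalized wallet features.
rng = np.random.default_rng(0)
X_demo = rng.random((20, 4))
labels_demo = (X_demo[:, 0] > 0.5).astype(int)
coords = plot_clusters(X_demo, labels_demo, "kmeans_cluster_visualization.png")
```

In practice the labels would come from the fitted KMeans model rather than from a threshold on one column.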
Input: an Excel file with wallet addresses in a wallet_id column:
wallet_id
0xabc...
0x123...
- wallet_risk_scores_combined.csv: Contains both rule-based and model-based scores
- kmeans_cluster_visualization.png: Cluster visualization
pip install pandas requests scikit-learn openpyxl matplotlib

Edit wallet_risk_scoring_combined.py:

ETHERSCAN_API_KEY = "YOUR_KEY_HERE"

python wallet_risk_scoring_combined.py

Normalization was applied only for model-based scoring using MinMaxScaler.
Rule-based scoring relies on fixed thresholds and does not require scaling.
Vanga Jai Prakash
Creator of ScoreChain2 – Wallet Risk Scoring (+ LLM add-on)
- Email: vangajaiprakash@gmail.com
- GitHub: Vanga-Jai-Prakash
