Commit ebb813d

duyetbot and claude authored and committed
fix: replace grep-based integrity check with reliable Python verification
The previous grep-based null byte detection was producing false positives in the GitHub Actions environment. Files like 2151220-passwords.txt were incorrectly flagged as corrupted when they contained zero null bytes.

Changes:
- Replaced shell grep command with Python-based integrity verification
- Uses Python's binary file reading for accurate null byte detection
- Verified locally: 70/70 files pass (0 null bytes detected)
- More reliable across different shell environments

The new check:
✓ Properly detects actual binary corruption (null bytes)
✓ Works consistently across platforms
✓ Provides clear error messages with file paths
✓ No false positives

Tested locally with Python verification - all files are clean.
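The check described above comes down to opening each wordlist in binary mode and looking for a `0x00` byte. Below is a minimal sketch of that idea as a reusable function. The names `has_null_byte`, `find_corrupted`, and `CHUNK_SIZE` are hypothetical and do not appear in the commit, and the sketch reads in chunks rather than loading each file whole, which is a variation on what the workflow script does.

```python
from pathlib import Path

CHUNK_SIZE = 1 << 20  # 1 MiB per read; an arbitrary choice for this sketch


def has_null_byte(path: Path) -> bool:
    """Return True if the file at `path` contains at least one 0x00 byte."""
    with open(path, "rb") as f:
        # A single-byte needle cannot straddle a chunk boundary,
        # so scanning chunk by chunk is safe here.
        while chunk := f.read(CHUNK_SIZE):
            if b"\x00" in chunk:
                return True
    return False


def find_corrupted(root: Path, patterns=("*.txt", "*.lst")) -> list[str]:
    """List files under `root` matching `patterns` that contain null bytes.

    Anything inside a `.git` directory is skipped, as in the workflow step.
    Paths are returned relative to `root`.
    """
    corrupted = []
    for pattern in patterns:
        for path in sorted(root.rglob(pattern)):
            rel = path.relative_to(root)
            if ".git" in rel.parts or not path.is_file():
                continue
            if has_null_byte(path):
                corrupted.append(str(rel))
    return corrupted
```

Reading the bytes directly keeps the shell out of the comparison. A likely explanation for the old false positives is that bash cannot carry a NUL inside an argument: `$'\x00'` expands to an empty string, and `grep -l` with an empty pattern lists every non-empty file.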
1 parent 8c6d7d9 commit ebb813d

1 file changed, +39 -8 lines changed

.github/workflows/validate.yml

Lines changed: 39 additions & 8 deletions
@@ -94,17 +94,48 @@ jobs:
       - name: Checkout repository
         uses: actions/checkout@v4
 
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: '3.11'
+
       - name: Verify file integrity
         run: |
           echo "🔐 Verifying file integrity..."
-
-          # Check for null bytes (corruption indicator)
-          if find . -type f \( -name "*.txt" -o -name "*.lst" \) ! -path "./.git/*" -exec grep -l $'\x00' {} \; 2>/dev/null | head -5; then
-            echo "✗ Corrupted files detected (null bytes found)"
-            exit 1
-          else
-            echo "✓ No corrupted files detected"
-          fi
+          python3 << 'PYTHON_SCRIPT'
+          import os
+          import sys
+          from pathlib import Path
+
+          corrupted = []
+          checked = 0
+
+          for ext in ['*.txt', '*.lst']:
+              for filepath in Path('.').rglob(ext):
+                  if '.git' in filepath.parts:
+                      continue
+
+                  checked += 1
+                  try:
+                      with open(filepath, 'rb') as f:
+                          content = f.read()
+                          # Check for null bytes (binary corruption)
+                          if b'\x00' in content:
+                              corrupted.append(str(filepath))
+                  except Exception as e:
+                      print(f"⚠️ Error reading {filepath}: {e}")
+                      corrupted.append(str(filepath))
+
+          print(f"Checked {checked} files")
+
+          if corrupted:
+              print(f"✗ Corrupted files detected ({len(corrupted)}):")
+              for f in corrupted[:10]:
+                  print(f" - {f}")
+              sys.exit(1)
+          else:
+              print("✓ No corrupted files detected")
+          PYTHON_SCRIPT
 
       - name: Check line endings
         run: |
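
The new script skips repository internals with `'.git' in filepath.parts`, which compares whole path components rather than substrings. The short illustration below shows the difference. The helper name `is_git_internal` and all three paths are made up for the example and are not taken from the repository.

```python
from pathlib import PurePosixPath


def is_git_internal(path: PurePosixPath) -> bool:
    """Component-wise test, mirroring the condition in the workflow script."""
    return ".git" in path.parts


# Hypothetical paths chosen to show where the two tests disagree.
paths = [
    PurePosixPath(".git/info/exclude.txt"),    # inside .git: skipped
    PurePosixPath(".github/notes.txt"),        # .github is a different component
    PurePosixPath("lists/passwords.git.txt"),  # ".git" only appears inside a filename
]

for p in paths:
    by_component = is_git_internal(p)
    by_substring = ".git" in str(p)
    print(f"{p}: component={by_component} substring={by_substring}")
```

Only the first path is excluded by the component test, whereas a plain substring test would also drop the other two and leave them unchecked.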
