Commit f35e4c5

feat: implement signal overrides for phone number reputation checks
1 parent c31a85f commit f35e4c5

10 files changed

+303
-143
lines changed

.gitignore

Lines changed: 0 additions & 1 deletion
@@ -16,4 +16,3 @@ venv/
1616
ENV/
1717
.env
1818
.cache/
19-

README.md

Lines changed: 123 additions & 138 deletions
@@ -1,16 +1,35 @@
1-
# file: README.md
21
# phoneint (Phone Number OSINT)
32

4-
`phoneint` parses and enriches phone numbers offline (via `phonenumbers`) and runs optional, pluggable async reputation checks to produce an auditable risk score and report. It is designed to be transparent, explainable, and safe-by-default.
3+
`phoneint` parses and enriches phone numbers offline (via `phonenumbers`) and runs optional, pluggable async reputation checks to produce an auditable risk score and report. The goal is transparent, explainable OSINT with clear inputs, visible signals, and reproducible output.
54

65
## Disclaimer (Read First)
76

87
**This tool is for lawful, ethical OSINT research only.**
98
Do not use it to harass, stalk, dox, or violate privacy. Always comply with applicable laws and third-party Terms of Service.
109

11-
## What This Tool Does (At a Glance)
12-
13-
- Normalizes numbers to E.164 and common formats
10+
## Table of Contents
11+
12+
- What This Tool Does
13+
- How It Works
14+
- Features
15+
- Install
16+
- Quick Start (CLI)
17+
- GUI Usage
18+
- Signals and Scoring
19+
- Signal Overrides (Testing)
20+
- Configuration
21+
- Adapters
22+
- Owner Intelligence
23+
- Reports
24+
- Example Numbers
25+
- Troubleshooting
26+
- Development and CI
27+
- Docker
28+
- License
29+
30+
## What This Tool Does
31+
32+
- Normalizes numbers to E.164 and common display formats
1433
- Enriches with deterministic metadata (carrier when available, region, time zones, number type)
1534
- Optionally queries public OSINT sources via adapters
1635
- Produces explainable risk scoring and owner intelligence
@@ -22,17 +41,17 @@ Do not use it to harass, stalk, dox, or violate privacy. Always comply with appl
2241
1. **Parse + Normalize**: `phonenumbers` parses the input to E.164 and standard formats.
2342
2. **Deterministic Enrichment**: carrier, time zone, number type, and region are derived offline.
2443
3. **Adapter Checks (Optional)**: adapters query public sources and return evidence items.
25-
4. **Scoring**: risk score is calculated from explicit signals with a breakdown.
44+
4. **Signals + Scoring**: boolean signals are derived and a risk score is computed with a breakdown.
2645
5. **Owner Intelligence (Optional)**: evidence is converted into associations and confidence scores.
2746
6. **Reporting**: a full JSON report is built and can be exported to CSV/PDF.
2847

2948
## Features
3049

31-
- E.164 parsing + normalization (`phonenumbers`)
32-
- Deterministic enrichment: carrier (when available), region/country name, time zones, number type, ISO code
50+
- E.164 parsing and normalization (`phonenumbers`)
51+
- Deterministic enrichment: carrier, region, time zones, number type, ISO code
3352
- Async adapters (`httpx`): DuckDuckGo Instant Answer, Google Custom Search, public dataset checks
3453
- Explainable risk scoring with configurable weights (YAML/JSON)
35-
- Owner Intelligence with audit trail for PII-capable adapters
54+
- Owner intelligence with audit trail for PII-capable adapters
3655
- Reports: JSON + CSV; optional PDF (extra dependency)
3756
- Optional SQLite TTL caching
3857
- CLI and GUI (PySide6 + qasync)
@@ -43,28 +62,28 @@ Python 3.11+ recommended.
4362

4463
```bash
4564
python -m venv .venv
46-
.\.venv\Scripts\Activate.ps1
65+
\.\.venv\Scripts\Activate.ps1
4766

48-
pip install -U pip
49-
pip install .
67+
python -m pip install -U pip
68+
python -m pip install -e .
5069
```
5170

5271
Dev tools:
5372

5473
```bash
55-
pip install ".[dev]"
74+
python -m pip install ".[dev]"
5675
```
5776

5877
GUI:
5978

6079
```bash
61-
pip install ".[gui]"
80+
python -m pip install ".[gui]"
6281
```
6382

6483
PDF export:
6584

6685
```bash
67-
pip install ".[pdf]"
86+
python -m pip install ".[pdf]"
6887
```
6988

7089
## Quick Start (CLI)
@@ -94,143 +113,57 @@ phoneint report report.json --format csv --output evidence.csv
94113
phoneint report report.json --format pdf --output report.pdf
95114
```
96115

97-
Launch GUI:
116+
If you have a national-format number without `+CC`, provide a default region:
98117

99118
```bash
100-
phoneint serve-gui
101-
```
102-
103-
## GUI Highlights
104-
105-
- **Download report**: choose `json`, `csv`, or `pdf` and click Save.
106-
- **Owner Intelligence**: consent checkbox gates PII-capable lookups.
107-
- **Evidence list**: populated as adapters complete.
108-
- **Non-blocking**: UI remains responsive during async checks.
109-
110-
## Example Numbers
111-
These example numbers are reserved test ranges, fictional examples, or public-format demonstrations intended for documentation and testing only. Do not use them to query private services or to target real individuals.
112-
113-
### 1. Reserved Test Numbers (RFC / NANP)
114-
115-
These numbers are explicitly reserved for testing and documentation and are never assigned to real users.
116-
117-
```text
118-
+1 202-555-0100
119-
+1 202-555-0101
120-
+1 202-555-0147
121-
+1 202-555-0199
122-
```
123-
124-
Expected behavior (when running in example/test mode or when marked in harnesses):
125-
126-
- `number_classification`: `reserved_test_number`
127-
- `example_mode`: `true`
128-
- `risk_score`: `0`
129-
- No live OSINT checks performed; adapters should be mocked or skipped.
130-
131-
### 2. Toll-Free Number Examples
132-
133-
Useful for testing toll-free detection, multi-timezone handling, and business vs scam ambiguity.
134-
135-
```text
136-
+1 800-356-9377
137-
+1 888-555-0000
138-
+1 877-555-1212
139-
```
140-
141-
Expected behavior:
142-
143-
- `line_type`: `toll_free`
144-
- Timezone coverage may be broad or absent depending on enrichment metadata
145-
- Neutral or low risk in example mode unless synthetic signals are injected
146-
147-
### 3. Fictional Bangladesh Numbers (Example Mode)
148-
149-
These demonstrate country-specific parsing and carrier detection. Treat as fictional/demo data.
150-
151-
```text
152-
+8801700000000
153-
+8801800000000
154-
+8801900000000
119+
phoneint lookup 6502530000 --region US
155120
```
156121

157-
Expected behavior:
122+
## GUI Usage
158123

159-
- `number_classification`: `fictional_example`
160-
- `example_mode`: `true`
161-
- `risk_score`: `0`
162-
- Carrier may be mocked for demo purposes
124+
Launch:
163125

164-
### 4. International Fictional Examples
165-
166-
Useful for international formatting, country & timezone extraction, and multi-region validation.
167-
168-
```text
169-
+44 7000 000000 # UK-style fictional number
170-
+61 400 000 000 # Australia-style fictional mobile
171-
+49 151 00000000 # Germany-style fictional mobile
126+
```bash
127+
phoneint serve-gui
172128
```
173129

174-
Expected behavior:
130+
Highlights:
175131

176-
- Correct country detection
177-
- Valid international formatting (E.164 and international display)
178-
- No real OSINT signals in example mode
132+
- Download report: choose `json`, `csv`, or `pdf` and click Save.
133+
- Owner Intelligence: consent checkbox gates PII-capable lookups.
134+
- Evidence list: populated as adapters complete.
135+
- Non-blocking: UI remains responsive during async checks.
179136

180-
### 5. Spam / Risk Logic Demonstration (Mocked)
137+
## Signals and Scoring
181138

182-
These are NOT real scam numbers; use them to exercise scoring and signal detection in example/test harnesses.
139+
`phoneint` calculates a small set of transparent, boolean signals and then computes a risk score with a breakdown. Default signals include:
183140

184-
```text
185-
+1 202-555-0147 # used to simulate scam_db match in tests
186-
+1 800-555-9999 # used to simulate classifieds exposure
187-
```
188-
189-
Expected behavior (example mode only):
141+
- `found_in_scam_db`: true when the number is matched in the public scam dataset
142+
- `voip`: true when libphonenumber classifies the number as VoIP
143+
- `found_in_classifieds`: true when evidence URLs match classifieds domains
144+
- `business_listing`: true when evidence URLs match business directory domains
190145

191-
- Deterministic mocked signals (e.g., `found_in_scam_db: true` for the first item)
192-
- Scoring logic exercised without querying live data
146+
Scoring weights are configurable; see Configuration below.
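
The signal-to-score step can be sketched as below. This is a minimal illustration, not the project's actual implementation: the function name `score_signals`, the weight values, and the clamping to 0..100 are assumptions made for the example. Only the four signal names come from the list above.

```python
from typing import Dict, Tuple

# Hypothetical weights for illustration only; the real defaults may differ.
# A negative weight lets a signal (e.g. a business listing) lower the score.
EXAMPLE_WEIGHTS: Dict[str, float] = {
    "found_in_scam_db": 70.0,
    "voip": 10.0,
    "found_in_classifieds": 15.0,
    "business_listing": -20.0,
}


def score_signals(
    signals: Dict[str, bool],
    weights: Dict[str, float] = EXAMPLE_WEIGHTS,
) -> Tuple[int, Dict[str, float]]:
    """Return (score, breakdown) for a set of boolean signals.

    Each active signal contributes its weight; inactive signals contribute 0.
    The total is clamped to the 0..100 range (an assumption of this sketch).
    """
    breakdown = {
        name: (weights.get(name, 0.0) if active else 0.0)
        for name, active in signals.items()
    }
    total = sum(breakdown.values())
    score = int(round(max(0.0, min(100.0, total))))
    return score, breakdown
```

Keeping the per-signal contributions in the returned breakdown is what makes the score auditable: a reader can see which signals moved the total and by how much.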
193147

194-
How to use in tests or demos:
148+
## Signal Overrides (Testing)
195149

196-
- CLI: run with adapters mocked or with `--no-cache` and a local test adapter.
197-
- GUI: use a test harness that injects `example_mode` or pre-populates adapter results.
198-
- Always mark runs that use these numbers as `example_mode=true` in logs/reports so audits can distinguish synthetic data from live OSINT.
150+
When you need deterministic tests (or you do not have access to provider-specific VoIP numbers), use signal overrides. The override file lets you force signals for specific E.164 numbers.
199151

200-
---
152+
Default override file:
201153

154+
- `phoneint/data/signal_overrides.json`
202155

156+
Example:
203157

204-
If you have a national-format number without `+CC`, provide a default region:
205-
206-
```bash
207-
phoneint lookup 6502530000 --region US
158+
```json
159+
{
160+
"voip": ["+14155552671"],
161+
"found_in_classifieds": ["+12025550199"],
162+
"business_listing": ["+18005550100"]
163+
}
208164
```
209165

210-
## Sample Output (Human Summary)
211-
212-
```text
213-
E.164: +88027111234
214-
International: +880 2-7111234
215-
National: 02-7111234
216-
Region (ISO): BD
217-
Country code: 880
218-
219-
Carrier:
220-
Region: Dhaka
221-
Time zones: Asia/Dhaka
222-
Type: fixed_line
223-
ISO country code: BD
224-
Dialing prefix: 880
225-
226-
Risk score: 0/100
227-
228-
Breakdown:
229-
- found_in_scam_db: 0.0 (Matched a public scam dataset)
230-
- voip: 0.0 (libphonenumber classified the number as VOIP)
231-
- found_in_classifieds: 0.0 (Evidence URL matched a classifieds domain heuristic)
232-
- business_listing: -0.0 (Evidence URL matched a business listing domain heuristic)
233-
```
166+
You can also set a custom path via `PHONEINT_SIGNAL_OVERRIDES_PATH`.
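
A rough sketch of how an override file in this shape could be loaded and applied is shown below. The helper names `load_overrides` and `apply_overrides` are hypothetical, and the behaviour (a listed number forces the signal to `true`; a missing file means no overrides) is an assumption based on the JSON example, not a description of the code in this commit.

```python
import json
import os
from pathlib import Path
from typing import Dict, Optional, Set


def load_overrides(path: Optional[str] = None) -> Dict[str, Set[str]]:
    """Load a {signal: [e164, ...]} JSON file into {signal: set(e164)}.

    Falls back to PHONEINT_SIGNAL_OVERRIDES_PATH when no path is given.
    Returns an empty mapping if no path is set or the file does not exist
    (an assumption of this sketch).
    """
    resolved = path or os.environ.get("PHONEINT_SIGNAL_OVERRIDES_PATH")
    if not resolved:
        return {}
    file = Path(resolved)
    if not file.is_file():
        return {}
    raw = json.loads(file.read_text(encoding="utf-8"))
    return {signal: set(numbers) for signal, numbers in raw.items()}


def apply_overrides(
    e164: str,
    signals: Dict[str, bool],
    overrides: Dict[str, Set[str]],
) -> Dict[str, bool]:
    """Return a copy of `signals` with overridden signals forced to True."""
    merged = dict(signals)
    for signal, numbers in overrides.items():
        if e164 in numbers:
            merged[signal] = True
    return merged
```

Matching on the exact E.164 string keeps the override deterministic, so the same number should be normalized before the lookup.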
234167

235168
## Configuration
236169

@@ -241,6 +174,7 @@ Environment variables:
241174
- `GCS_API_KEY` / `GCS_CX`: required only for the Google Custom Search adapter
242175
- `PHONEINT_*`: HTTP, cache, logging, default region, scoring weights (JSON)
243176
- `ENABLE_TRUECALLER` / `TRUECALLER_API_KEY`: required only for PII-capable Truecaller adapter
177+
- `PHONEINT_SIGNAL_OVERRIDES_PATH`: path to the signal override JSON file
244178

245179
YAML config (recommended for score weights and non-secret defaults):
246180

@@ -266,7 +200,7 @@ Or set `PHONEINT_CONFIG=config.yaml` in `.env`.
266200

267201
## Adapters
268202

269-
- `public`: checks a JSON dataset (ships with `phoneint/data/scam_list.json` as a demo)
203+
- `public`: checks a JSON dataset (ships with a demo list in `phoneint/data/scam_list.json`)
270204
- `duckduckgo`: DuckDuckGo Instant Answer API (not full web search)
271205
- `google`: Google Custom Search (requires your API key + CX)
272206

@@ -301,7 +235,7 @@ What it does not do:
301235
PII-capable adapters (e.g., Truecaller) are disabled by default and will only run when:
302236

303237
1. You provide official credentials via `.env` (never commit secrets)
304-
2. You explicitly enable the adapter in config/environment
238+
2. You explicitly enable the adapter in config or environment
305239
3. You explicitly confirm lawful purpose plus explicit consent (`--enable-pii` in CLI, checkbox in GUI)
306240
4. An audit trail is recorded in the report (`owner_audit_trail`)
307241

@@ -348,24 +282,75 @@ If you do not have official credentials or consent, leave PII disabled. Public e
348282
## Reports
349283

350284
- JSON: full report, includes owner intelligence and audit trail
351-
- CSV: evidence + owner associations + audit trail in a single long table
285+
- CSV: evidence and owner associations in a single long table
352286
- PDF: summary pages plus legal disclaimer (requires `reportlab`)
353287

354-
## Docker
288+
## Example Numbers
289+
290+
These example numbers are reserved test ranges or fictional examples intended for documentation and testing only. Do not use them to query private services or target real individuals.
291+
292+
Reserved test range (NANP):
293+
294+
```text
295+
+1 202-555-0100
296+
+1 202-555-0101
297+
+1 202-555-0147
298+
+1 202-555-0199
299+
```
300+
301+
Toll-free examples:
302+
303+
```text
304+
+1 800-356-9377
305+
+1 888-555-0000
306+
+1 877-555-1212
307+
```
308+
309+
International format examples:
310+
311+
```text
312+
+44 7000 000000
313+
+61 400 000 000
314+
+49 151 00000000
315+
```
316+
317+
If you want deterministic testing for scoring signals, use the signal overrides file described above.
318+
319+
## Troubleshooting
320+
321+
**Error: GUI dependencies not installed**
322+
323+
- Ensure you installed GUI extras into the active venv:
355324

356325
```bash
357-
docker build -t phoneint .
358-
docker run --rm phoneint lookup +16502530000
326+
python -m pip install ".[gui]"
327+
```
328+
329+
- If you still see the error, run the GUI directly to surface the real traceback:
330+
331+
```bash
332+
python run_gui_direct.py
359333
```
360334

361-
## Development
335+
## Development and CI
336+
337+
Local checks (the same checks you would typically run in CI):
362338

363339
```bash
364340
black phoneint tests
365341
mypy phoneint
366342
pytest
367343
```
368344

345+
If you add GitHub Actions, use these commands in your workflow to keep the build green.
346+
347+
## Docker
348+
349+
```bash
350+
docker build -t phoneint .
351+
docker run --rm phoneint lookup +16502530000
352+
```
353+
369354
## License
370355

371356
MIT. See `LICENSE`.
