refactor: improve Market Health Reporter prompts for structured, evidence-based analysis by sealfe · Pull Request #523 · 1712n/dn-institute

sealfe · 2026-02-14T12:14:33Z

Problem

Fixes #427

After a systematic comparison of the current prompts against the Huobi reference article, the contribution guidelines, the project metric documentation, and the visualization tool code, I identified several structural and analytical gaps between what the prompts instruct and what the published standards require.

Key Gaps Identified

No severity framework — The original prompt treats all anomalies equally. A single metric deviation (potentially noise) gets the same weight as three metrics deviating simultaneously (strong evidence). This leads to reports that either cry wolf on minor fluctuations or understate genuine manipulation.
Missing metric: Average Transaction Size — The avgtransactionsize field is plotted by the visualization tool in crypto_metrics.png and is the primary finding in the Huobi reference article ("Abnormal activity indicator - Average transaction size"), yet the prompt never instructs the model to analyze it.
No quantitative thresholds for key metrics — The prompt mentions Benford's Law but doesn't provide the expected digit frequencies (30.1%, 17.6%, 12.5%...). It mentions the K-S test but doesn't include the p-value interpretation tiers from the project's own documentation. Without concrete numbers, the model has no reference for quantifying deviations.
Broken output format — The original prompt specifies date: 'YYYY-MM-DD — YYYY-MM-DD' (range format) and entities: 'Huobi, HT, TRX, DOGE' (comma-separated string). Every published article uses a single date: YYYY-MM-DD and a YAML list for entities. This produces Hugo front matter that doesn't match the site's schema.
No article structure guidance — The prompt says "create an article" but doesn't specify the ## Summary + ## Metrics used structure or the descriptive subsection headings that the Huobi article uses (e.g., "Order printing bots - Volume distribution tail and skewness").
Missing analytical patterns from the reference article — Buy/sell ratio stability anomaly (HT's abnormally narrow fluctuation range), volume distribution skewness analysis (below-zero skewness indicating order-printing bots), and time-of-trade periodicity detection (5-second interval bot patterns from the OKEx article) are all absent.
Illustration files undocumented — The prompt lists four allowed illustration filenames but doesn't describe what each chart contains, making it impossible for the model to place them at the correct point in the narrative.

Changes Made

`system_prompt.txt`

Expanded from a single generic sentence to a focused role definition with five core analytical principles
Added explicit instructions to distinguish confirmed patterns (multi-metric) from isolated anomalies (single-metric)
Added constraint to report clearly when data is inconclusive rather than speculating
Specified the output format requirement (Hugo markdown in <article> tags)

`prompt1.txt`

Analytical Methodology:

Restructured the 4-step procedure into a severity-based framework with explicit evidence thresholds:
- Strong evidence: 3+ metrics deviating simultaneously, or 2 metrics matching a documented cross-metric pattern
- Moderate evidence: 2 metrics deviating in overlapping windows without a documented correlation
- Weak evidence: Single-metric deviations (report as observation, not conclusion)
Added guidance to prioritize sustained multi-window deviations over isolated single-point spikes
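
The three evidence tiers above can be sketched as a small classifier. This is a hedged illustration, not code from the PR: the function name, the metric identifiers, and the `DOCUMENTED_PATTERNS` set are hypothetical, and the two patterns shown are taken only from the cross-metric pairs this description names.

```python
from typing import FrozenSet, Iterable, Set

# Hypothetical identifiers for two documented cross-metric patterns.
DOCUMENTED_PATTERNS = [
    frozenset({"avg_transaction_size", "volume_distribution"}),
    frozenset({"avg_transaction_size", "buy_sell_ratio_stability"}),
]


def classify_evidence(
    deviating: Set[str],
    documented_patterns: Iterable[FrozenSet[str]] = DOCUMENTED_PATTERNS,
) -> str:
    """Grade a set of metrics that deviate in overlapping windows."""
    count = len(deviating)
    if count >= 3:
        return "strong"  # 3+ metrics deviating simultaneously
    if count == 2:
        # Two metrics are strong only if they match a documented pattern.
        if any(pattern <= deviating for pattern in documented_patterns):
            return "strong"
        return "moderate"
    if count == 1:
        return "weak"  # report as an observation, not a conclusion
    return "none"
```

Keeping the documented patterns as data rather than branching logic means the list can grow without touching the tier rules.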

Metric Definitions:

Added Average Transaction Size as an explicit metric with anomaly patterns (spikes with low variance, declining trade count) and cross-exchange comparison benchmark
Added Benford's Law expected digit frequencies (30.1%, 17.6%, 12.5%, 9.7%, 7.9%, 6.7%, 5.8%, 5.1%, 4.6%) with a 5-percentage-point deviation threshold
Added K-S test p-value interpretation tiers from the project's own Benford documentation (p > 0.01 good fit; 0.005 < p ≤ 0.01 moderate; p ≤ 0.005 high concern)
Added K-S test persistence requirement: flag violations only when the K-S statistic exceeds the critical value across multiple consecutive windows
Added volume distribution skewness analysis with thresholds (>1 healthy, near-zero suspicious, below-zero wash trading indicator)
Added time-of-trade coefficient of variation (CV < 0.3 suspicious uniformity, 0.3–1.0 normal, >1.0 organic burstiness) and periodic pattern detection (every-5th-position spikes)
Added buy/sell ratio stability anomaly type — unnatural steadiness (0.48–0.52 band) during volatile periods, especially on exchange-native tokens, per the Huobi reference article's key finding
Fixed typo: "benchmarkes" → "benchmarks"
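
As a rough sketch of how the numeric thresholds listed above could be applied outside the prompt, the helpers below encode the Benford frequencies, the p-value tiers, the persistence rule, and the coefficient-of-variation bands. All function names are hypothetical, `min_consecutive=3` is an assumed value for "multiple consecutive windows", and using the population standard deviation for the CV is also an assumption.

```python
import math
import statistics

# Benford's Law: expected share of leading digit d is log10(1 + 1/d).
BENFORD_EXPECTED = {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def benford_deviations(observed, threshold_pp=5.0):
    """Return {digit: deviation in percentage points} beyond the threshold.

    `observed` maps each leading digit (1-9) to its observed share (0.0-1.0).
    """
    flagged = {}
    for digit, expected in BENFORD_EXPECTED.items():
        diff_pp = (observed.get(digit, 0.0) - expected) * 100
        if abs(diff_pp) > threshold_pp:
            flagged[digit] = round(diff_pp, 1)
    return flagged


def ks_p_value_tier(p_value):
    """Map a K-S p-value to the three tiers quoted in the PR description."""
    if p_value > 0.01:
        return "good fit"
    if p_value > 0.005:
        return "moderate"
    return "high concern"


def persistent_ks_violation(ks_stats, critical_value, min_consecutive=3):
    """True if the statistic exceeds the critical value in a long enough run."""
    run = 0
    for stat in ks_stats:
        run = run + 1 if stat > critical_value else 0
        if run >= min_consecutive:
            return True
    return False


def time_of_trade_cv_label(intervals):
    """Label inter-trade intervals by coefficient of variation (std / mean)."""
    cv = statistics.pstdev(intervals) / statistics.mean(intervals)
    if cv < 0.3:
        return "suspicious uniformity"
    if cv <= 1.0:
        return "normal"
    return "organic burstiness"
```

For example, `time_of_trade_cv_label([5, 5, 5, 5])` falls in the "suspicious uniformity" band because perfectly regular intervals have a CV of zero.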

Cross-Metric Correlations:

Expanded from 4 to 6 documented cross-metric patterns, adding:
- Average Transaction Size + Volume Distribution (order-printing bot detection)
- Average Transaction Size + Buy/Sell Ratio Stability (coordinated algorithmic operation)
Numbered all patterns for clear reference

Output Format:

Fixed YAML front matter to match published articles: single date: YYYY-MM-DD, YAML list for entities
Added explicit ## Summary and ## Metrics used section structure
Specified descriptive ### subsection headings matching the reference article style (e.g., "Abnormal activity indicator - Average transaction size", not "avgtransactionsize")
Added per-subsection requirements: explain metric, present data with timestamps/values, place illustration, state conclusion
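
To make the front matter fix concrete, here is a minimal sketch that renders only the two fields discussed above. The function name, the example date, and the two-space list indentation are assumptions; the rest of the site's schema is not shown in this description and is left out.

```python
import datetime


def render_date_and_entities(date, entities):
    """Render a single ISO date and a YAML list of entities."""
    lines = [f"date: {date.isoformat()}", "entities:"]
    lines += [f"  - {name}" for name in entities]
    return "\n".join(lines)


# Placeholder date; the entity names are the ones quoted in the old format.
print(render_date_and_entities(
    datetime.date(2024, 1, 15),
    ["Huobi", "HT", "TRX", "DOGE"],
))
```

The old format would have emitted a date range and a single comma-separated string instead of these two structures.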

Illustration Documentation:

Mapped each filename to its visualization tool output:
- volume_hist.png → transaction volume histogram (volume distribution analysis)
- crypto_metrics.png → 4-panel time series of volume, trade count, avg tx size, buy/sell ratio
- benford_law.png → K-S test score vs critical value over time
- vv_correlation.png → volume-volatility correlation over time
Provided illustration syntax template with alt text and caption guidance
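
The filename mapping above can also be expressed as data, which would allow a simple check that a generated article references only the four allowed charts. This is an illustrative sketch: the dictionary name and the `unknown_illustrations` helper are hypothetical and are not part of the PR.

```python
import re

# Filename-to-chart mapping as listed in this PR description.
ILLUSTRATIONS = {
    "volume_hist.png": "transaction volume histogram",
    "crypto_metrics.png": "4-panel time series: volume, trade count, "
                          "avg tx size, buy/sell ratio",
    "benford_law.png": "K-S test score vs critical value over time",
    "vv_correlation.png": "volume-volatility correlation over time",
}


def unknown_illustrations(article_text):
    """Return sorted .png filenames in the text that are not in the mapping."""
    referenced = re.findall(r"[\w-]+\.png", article_text)
    return sorted(set(referenced) - set(ILLUSTRATIONS))
```

An empty result means every referenced image is one of the documented charts.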

Writing Style:

Added guidelines matching the Huobi reference article's narrative analytical flow
Specified to connect findings across subsections to build a coherent investigative case
Instructed to avoid generic hedging language in favor of evidence-based conclusions

What This PR Does NOT Do

Does not modify the Python code, visualization tool, or any non-prompt files
Does not change the fundamental metric definitions or their anomaly patterns
Does not add speculative or ungrounded analytical criteria
Preserves the original prompt's logical structure while extending it

Methodology

Each change maps directly to a specific gap between the current prompt output and the published article standards:

| Gap | Source of Ground Truth | Change |
| --- | --- | --- |
| No severity framework | Reference article leads with multi-metric findings | Added 3-tier evidence assessment |
| Missing avg tx size | Visualization tool plots it; reference article's primary finding | Added as explicit metric |
| No digit frequencies | Project's Benford docs list exact percentages | Added frequency table |
| No p-value tiers | Project's Benford docs define 3 tiers | Added p-value interpretation |
| Wrong date format | All published articles use single YYYY-MM-DD | Fixed format specification |
| Wrong entities format | All published articles use YAML list | Fixed to YAML list |
| No article structure | Reference article uses Summary + Metrics used | Added section requirements |
| Generic headings | Reference article uses descriptive insight headings | Added heading style guidance |
| Undocumented illustrations | Visualization tool source code | Added filename-to-chart mapping |

AI Disclosure

AI (Claude) was used as a coding assistant. All analytical decisions — the severity framework design, threshold selections, cross-metric patterns, and gap analysis — are based on systematic comparison of the existing codebase, published articles, and project documentation.

…ence-based analysis
- Enhance system prompt with clear analytical principles and role definition
- Restructure prompt1 with severity-based analytical methodology
- Add cross-metric corroboration framework (strong/moderate/weak evidence)
- Add quantitative thresholds: Benford's expected frequencies, CV for time-of-trade, K-S p-value tiers
- Add average transaction size as explicit metric with cross-exchange benchmark
- Add skewness analysis and buy/sell ratio stability anomaly pattern
- Fix output format to match example article: YAML list entities, single date, descriptive subsection headings
- Map illustration filenames to their visualization tool outputs
- Add writing style guidelines matching the Huobi reference article's narrative tone

sealfe · 2026-02-14T12:14:48Z

@sofiasedlova Requesting review for this submission to issue #427. This PR refactors both prompts to improve the reporter's output alignment with the published article format and contribution guidelines.

sealfe wants to merge 1 commit into 1712n:main from sealfe:improve/mh-reporter-prompt-refinement