refactor: improve Market Health Reporter prompts for structured, evidence-based analysis #523

Open
sealfe wants to merge 1 commit into 1712n:main from sealfe:improve/mh-reporter-prompt-refinement

sealfe commented Feb 14, 2026

Problem

Fixes #427

After a systematic comparison of the current prompts against the Huobi reference article, the contribution guidelines, the project metric documentation, and the visualization tool code, I identified several structural and analytical gaps between what the prompts instruct and what the published standards require.

Key Gaps Identified

  1. No severity framework — The original prompt treats all anomalies equally. A single metric deviation (potentially noise) gets the same weight as three metrics deviating simultaneously (strong evidence). This leads to reports that either cry wolf on minor fluctuations or understate genuine manipulation.

  2. Missing metric: Average Transaction Size — The avgtransactionsize field is plotted by the visualization tool in crypto_metrics.png and is the primary finding in the Huobi reference article ("Abnormal activity indicator - Average transaction size"), yet the prompt never instructs the model to analyze it.

  3. No quantitative thresholds for key metrics — The prompt mentions Benford's Law but doesn't provide the expected digit frequencies (30.1%, 17.6%, 12.5%...). It mentions the K-S test but doesn't include the p-value interpretation tiers from the project's own documentation. Without concrete numbers, the model has no reference for quantifying deviations.

  4. Broken output format — The original prompt specifies date: 'YYYY-MM-DD — YYYY-MM-DD' (range format) and entities: 'Huobi, HT, TRX, DOGE' (comma-separated string). Every published article uses a single date: YYYY-MM-DD and a YAML list for entities. This produces Hugo front matter that doesn't match the site's schema.

  5. No article structure guidance — The prompt says "create an article" but doesn't specify the ## Summary + ## Metrics used structure or the descriptive subsection headings that the Huobi article uses (e.g., "Order printing bots - Volume distribution tail and skewness").

  6. Missing analytical patterns from the reference article — Buy/sell ratio stability anomaly (HT's abnormally narrow fluctuation range), volume distribution skewness analysis (below-zero skewness indicating order-printing bots), and time-of-trade periodicity detection (5-second interval bot patterns from the OKEx article) are all absent.

  7. Illustration files undocumented — The prompt lists four allowed illustration filenames but doesn't describe what each chart contains, making it impossible for the model to place them at the correct point in the narrative.
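For reference, the digit frequencies cited in gap 3 follow from Benford's formula P(d) = log10(1 + 1/d) for leading digits 1 through 9. The sketch below reproduces the percentages; the function name is illustrative and does not come from the project code.

```python
import math


def benford_expected_percentages():
    """Expected first-digit frequencies (in percent) under Benford's Law."""
    # P(d) = log10(1 + 1/d) for leading digits d = 1..9, rounded to one decimal.
    return {d: round(100 * math.log10(1 + 1 / d), 1) for d in range(1, 10)}


if __name__ == "__main__":
    for digit, pct in benford_expected_percentages().items():
        print(f"{digit}: {pct}%")
```

Rounded to one decimal place, these values match the list added to prompt1.txt.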

Changes Made

system_prompt.txt

  • Expanded from a single generic sentence to a focused role definition with five core analytical principles
  • Added explicit instructions to distinguish confirmed patterns (multi-metric) from isolated anomalies (single-metric)
  • Added constraint to report clearly when data is inconclusive rather than speculating
  • Specified the output format requirement (Hugo markdown in <article> tags)

prompt1.txt

Analytical Methodology:

  • Restructured the 4-step procedure into a severity-based framework with explicit evidence thresholds:
    • Strong evidence: 3+ metrics deviating simultaneously, or 2 metrics matching a documented cross-metric pattern
    • Moderate evidence: 2 metrics deviating in overlapping windows without a documented correlation
    • Weak evidence: Single-metric deviations (report as observation, not conclusion)
  • Added guidance to prioritize sustained multi-window deviations over isolated single-point spikes
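The three tiers above can be sketched as a small classifier. This is only an illustration of the rule as written in the prompt, not code from this PR: the metric keys are hypothetical, the input is assumed to be the set of metrics deviating in the same or overlapping windows, and the "none" tier for zero deviations is an added assumption.

```python
from __future__ import annotations

from enum import Enum


class Evidence(Enum):
    NONE = "none"          # assumption: no deviating metrics at all
    WEAK = "weak"          # single-metric deviation: observation, not conclusion
    MODERATE = "moderate"  # two metrics, no documented correlation
    STRONG = "strong"      # 3+ metrics, or a documented two-metric pattern


# Hypothetical metric keys. The two pairs mirror the cross-metric patterns
# this PR adds; the remaining documented patterns are omitted here.
DOCUMENTED_PATTERNS = {
    frozenset({"avg_transaction_size", "volume_distribution"}),
    frozenset({"avg_transaction_size", "buy_sell_ratio_stability"}),
}


def classify_evidence(deviating_metrics, documented_patterns=DOCUMENTED_PATTERNS):
    """Map metrics deviating in overlapping windows to an evidence tier."""
    metrics = frozenset(deviating_metrics)
    if len(metrics) >= 3:
        return Evidence.STRONG
    if len(metrics) == 2:
        if metrics in documented_patterns:
            return Evidence.STRONG
        return Evidence.MODERATE
    if len(metrics) == 1:
        return Evidence.WEAK
    return Evidence.NONE
```

For example, `classify_evidence({"avg_transaction_size", "volume_distribution"})` returns `Evidence.STRONG`, while two metrics with no documented pairing return `Evidence.MODERATE`.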

Metric Definitions:

  • Added Average Transaction Size as an explicit metric with anomaly patterns (spikes with low variance, declining trade count) and cross-exchange comparison benchmark
  • Added Benford's Law expected digit frequencies (30.1%, 17.6%, 12.5%, 9.7%, 7.9%, 6.7%, 5.8%, 5.1%, 4.6%) with a 5-percentage-point deviation threshold
  • Added K-S test p-value interpretation tiers from the project's own Benford documentation (p > 0.01 good fit; 0.005 < p ≤ 0.01 moderate; p ≤ 0.005 high concern)
  • Added K-S test persistence requirement: flag violations only when the K-S statistic exceeds the critical value across multiple consecutive windows
  • Added volume distribution skewness analysis with thresholds (>1 healthy, near-zero suspicious, below-zero wash trading indicator)
  • Added time-of-trade coefficient of variation (CV < 0.3 suspicious uniformity, 0.3–1.0 normal, >1.0 organic burstiness) and periodic pattern detection (every-5th-position spikes)
  • Added buy/sell ratio stability anomaly type — unnatural steadiness (0.48–0.52 band) during volatile periods, especially on exchange-native tokens, per the Huobi reference article's key finding
  • Fixed typo: "benchmarkes" → "benchmarks"
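To make the numeric thresholds above concrete, here is a minimal sketch of how they could be applied. It is not part of the prompt or the project code: the function names are hypothetical, and computing the coefficient of variation over inter-trade intervals is an assumption about what the time-of-trade metric measures.

```python
from __future__ import annotations

import statistics

# Expected first-digit shares in percent, as listed in prompt1.txt.
BENFORD_EXPECTED_PCT = {1: 30.1, 2: 17.6, 3: 12.5, 4: 9.7, 5: 7.9,
                        6: 6.7, 7: 5.8, 8: 5.1, 9: 4.6}


def benford_deviations(observed_pct, threshold_pp=5.0):
    """Return {digit: deviation} for digits off by more than threshold_pp points."""
    return {
        d: round(observed_pct[d] - expected, 1)
        for d, expected in BENFORD_EXPECTED_PCT.items()
        if abs(observed_pct[d] - expected) > threshold_pp
    }


def ks_p_value_tier(p):
    """p > 0.01 good fit; 0.005 < p <= 0.01 moderate; p <= 0.005 high concern."""
    if p > 0.01:
        return "good fit"
    if p > 0.005:
        return "moderate"
    return "high concern"


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (values must be positive)."""
    return statistics.pstdev(values) / statistics.fmean(values)


def time_of_trade_band(cv):
    """CV < 0.3 suspicious uniformity; 0.3 to 1.0 normal; > 1.0 organic burstiness."""
    if cv < 0.3:
        return "suspicious uniformity"
    if cv <= 1.0:
        return "normal"
    return "organic burstiness"
```

Trades arriving at a fixed 5-second cadence give a CV of 0 and fall into the "suspicious uniformity" band, which is the bot pattern the prompt asks the model to look for.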

Cross-Metric Correlations:

  • Expanded from 4 to 6 documented cross-metric patterns, adding:
    • Average Transaction Size + Volume Distribution (order-printing bot detection)
    • Average Transaction Size + Buy/Sell Ratio Stability (coordinated algorithmic operation)
  • Numbered all patterns for clear reference

Output Format:

  • Fixed YAML front matter to match published articles: single date: YYYY-MM-DD, YAML list for entities
  • Added explicit ## Summary and ## Metrics used section structure
  • Specified descriptive ### subsection headings matching the reference article style (e.g., "Abnormal activity indicator - Average transaction size", not "avgtransactionsize")
  • Added per-subsection requirements: explain metric, present data with timestamps/values, place illustration, state conclusion
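As an illustration of the corrected front matter shape, the sketch below renders only the two fields this PR changes. The helper name is hypothetical, the date is a placeholder, and any other fields the site's schema requires are deliberately left out. Entity names are assumed to be plain strings that need no YAML quoting.

```python
from __future__ import annotations

from datetime import date


def render_front_matter(article_date: str, entities: list[str]) -> str:
    """Render a single-date, list-of-entities front matter block."""
    # fromisoformat raises ValueError for anything that is not one ISO date,
    # which rejects the old 'YYYY-MM-DD - YYYY-MM-DD' range format.
    parsed = date.fromisoformat(article_date)
    lines = ["---", f"date: {parsed.isoformat()}", "entities:"]
    lines += [f"  - {name}" for name in entities]  # YAML list, one entity per line
    lines.append("---")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_front_matter("2026-02-14", ["Huobi", "HT", "TRX", "DOGE"]))
```

The old comma-separated string `'Huobi, HT, TRX, DOGE'` becomes four list items, and a date range fails fast instead of being written into the article.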

Illustration Documentation:

  • Mapped each filename to its visualization tool output:
    • volume_hist.png → transaction volume histogram (volume distribution analysis)
    • crypto_metrics.png → 4-panel time series of volume, trade count, avg tx size, buy/sell ratio
    • benford_law.png → K-S test score vs critical value over time
    • vv_correlation.png → volume-volatility correlation over time
  • Provided illustration syntax template with alt text and caption guidance

Writing Style:

  • Added guidelines matching the Huobi reference article's narrative analytical flow
  • Specified to connect findings across subsections to build a coherent investigative case
  • Instructed to avoid generic hedging language in favor of evidence-based conclusions

What This PR Does NOT Do

  • Does not modify the Python code, visualization tool, or any non-prompt files
  • Does not change the fundamental metric definitions or their anomaly patterns
  • Does not add speculative or ungrounded analytical criteria
  • Preserves the original prompt's logical structure while extending it

Methodology

Each change maps directly to a specific gap between the current prompt output and the published article standards:

| Gap | Source of Ground Truth | Change |
| --- | --- | --- |
| No severity framework | Reference article leads with multi-metric findings | Added 3-tier evidence assessment |
| Missing avg tx size | Visualization tool plots it; reference article's primary finding | Added as explicit metric |
| No digit frequencies | Project's Benford docs list exact percentages | Added frequency table |
| No p-value tiers | Project's Benford docs define 3 tiers | Added p-value interpretation |
| Wrong date format | All published articles use single YYYY-MM-DD | Fixed format specification |
| Wrong entities format | All published articles use YAML list | Fixed to YAML list |
| No article structure | Reference article uses Summary + Metrics used | Added section requirements |
| Generic headings | Reference article uses descriptive insight headings | Added heading style guidance |
| Undocumented illustrations | Visualization tool source code | Added filename-to-chart mapping |

AI Disclosure

AI (Claude) was used as a coding assistant. All analytical decisions — the severity framework design, threshold selections, cross-metric patterns, and gap analysis — are based on systematic comparison of the existing codebase, published articles, and project documentation.

…ence-based analysis

- Enhance system prompt with clear analytical principles and role definition
- Restructure prompt1 with severity-based analytical methodology
- Add cross-metric corroboration framework (strong/moderate/weak evidence)
- Add quantitative thresholds: Benford's expected frequencies, CV for time-of-trade, K-S p-value tiers
- Add average transaction size as explicit metric with cross-exchange benchmark
- Add skewness analysis and buy/sell ratio stability anomaly pattern
- Fix output format to match example article: YAML list entities, single date, descriptive subsection headings
- Map illustration filenames to their visualization tool outputs
- Add writing style guidelines matching the Huobi reference article's narrative tone

sealfe commented Feb 14, 2026

@sofiasedlova Requesting review for this submission to issue #427. This PR refactors both prompts to improve the reporter's output alignment with the published article format and contribution guidelines.


Development

Successfully merging this pull request may close these issues.

Improve Market Health Reporter Prompts
