This repo provides the data, code, and scripts of our paper: Toward Better Temporal Structures for Geopolitical Events Forecasting. [PDF]
Authors: Kian Ahrabian*, Eric Boxer*, Jay Pujara.
Forecasting on geopolitical temporal knowledge graphs (TKGs) through the lens of large language models (LLMs) has recently gained traction. While TKGs and their generalization, hyper-relational temporal knowledge graphs (HTKGs), offer a straightforward structure to represent simple temporal relationships, they lack the expressive power to convey complex facts efficiently. One of the critical limitations of HTKGs is a lack of support for more than two primary entities in temporal facts, which commonly occur in real-world events. To address this limitation, in this work, we study a generalization of HTKGs, Hyper-Relational Temporal Knowledge Generalized Hypergraphs (HTKGHs). We first derive a formalization for HTKGHs, demonstrating their backward compatibility while supporting two complex types of facts commonly found in geopolitical incidents. Then, utilizing this formalization, we introduce the htkgh-polecat dataset, built upon the global event database POLECAT. Finally, we benchmark and analyze popular LLMs on the relation prediction task, providing insights into their adaptability and capabilities in complex forecasting scenarios.
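To make the "more than two primary entities" limitation concrete, here is a minimal Python sketch of a fact whose two sides can each hold a group of entities. The class name `HyperFact`, its field names, and the placeholder entities and relation are hypothetical illustrations; they are not taken from this repo's code or the dataset schema, and the paper's formalization may differ.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class HyperFact:
    """Hypothetical temporal fact whose two sides may each hold several entities."""

    heads: FrozenSet[str]                         # one or more primary entities
    relation: str                                 # relation label
    tails: FrozenSet[str]                         # one or more primary entities
    timestamp: str                                # e.g. an ISO date string
    qualifiers: Tuple[Tuple[str, str], ...] = ()  # extra (key, value) context

    def is_binary(self) -> bool:
        """True when the fact fits the usual one-head, one-tail HTKG shape."""
        return len(self.heads) == 1 and len(self.tails) == 1


# A classic two-entity fact is just a special case of the same container.
binary_fact = HyperFact(frozenset({"A"}), "meets", frozenset({"B"}), "2024-01-01")

# A fact with a group of entities on one side, plus one qualifier.
group_fact = HyperFact(
    frozenset({"A", "B"}), "meets", frozenset({"C"}), "2024-01-01",
    (("location", "X"),),
)
```

In this sketch, an ordinary HTKG fact is the case where `is_binary()` returns `True`, which is one simple way to picture the backward compatibility mentioned above.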
Our environment is specified by constraints.txt, but it is easier to recreate it as follows:
bash setup.sh # Creates the "htkgh" env + Installs CUDA and vLLM
conda activate htkgh
pip install -r requirements.txt -c constraints.txt
tar -xzvf data.tar.gz # Extract data
- htkgh_polecat: Overall dataset and test (_test) split.
- htkgh_polecat_anon: Anonymized variations of the htkgh_polecat dataset:
  - htkgh_polecat_anon_shuffle_ent_loc*: Shuffled entities and locations.
  - htkgh_polecat_anon_shuffle_ent_loc_rel*: Shuffled entities, locations, and relations.
- Baselines:
  - baselines.py: Calculates the frequency, recency, and copy baselines.
- GNNs:
  - htkgh_model.py: Model architectures.
  - htkgh_data.py: Data preprocessing and handling for training and evaluation.
  - train_htkgh.py: Defines the model training loop and the CLI for training and evaluation.
  - run_gnn_bag.sh, run_gnn_hg.sh: Run the bagging and hypergraph aggregation experiments, respectively.
- LLMs:
  - inference.py: Wrapper around the LLM code with the CLI for LLM evaluation.
  - utils.py: Various utilities for prompt handling, data handling, etc.
  - run_llm.sh: Runs the experiments with the given parameters.
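For readers unfamiliar with the baseline names, the sketch below shows one common way to define frequency and recency baselines for relation prediction. It is an illustration under assumptions, not the logic in baselines.py: the `(head, relation, tail, timestamp)` tuple layout, the function names, and the toy history are all hypothetical, and the copy baseline is omitted because its definition is not given here.

```python
from collections import Counter
from typing import List, Optional, Tuple

# Simplified, hypothetical record layout: (head, relation, tail, timestamp).
Fact = Tuple[str, str, str, int]


def frequency_baseline(history: List[Fact], head: str, tail: str) -> Optional[str]:
    """Predict the relation seen most often between head and tail so far."""
    counts = Counter(r for h, r, t, _ in history if h == head and t == tail)
    return counts.most_common(1)[0][0] if counts else None


def recency_baseline(history: List[Fact], head: str, tail: str) -> Optional[str]:
    """Predict the relation of the latest past fact between head and tail."""
    matches = [(ts, r) for h, r, t, ts in history if h == head and t == tail]
    return max(matches, key=lambda m: m[0])[1] if matches else None


# Toy history with placeholder entities and relations.
history = [
    ("A", "consult", "B", 1),
    ("A", "consult", "B", 2),
    ("A", "threaten", "B", 3),
]

freq_pred = frequency_baseline(history, "A", "B")  # most frequent relation
rec_pred = recency_baseline(history, "A", "B")     # most recent relation
```

Both functions return `None` when the pair has no history, so a caller has to pick a fallback (for example, the globally most frequent relation).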
If you make use of this code or data, please kindly cite the following paper:
@article{ahrabian2026toward,
title={Toward Better Temporal Structures for Geopolitical Events Forecasting},
author={Ahrabian, Kian and Boxer, Eric and Pujara, Jay},
journal={arXiv preprint arXiv:2601.00430},
year={2026}
}