GitHub - fportalesor/pcsa-community-detection

This repository contains the code and documentation for the research titled:

Application of Multiscale Community Detection Methods for Delineating Primary Care Service Areas in Chile

Abstract

In Primary Healthcare (PHC) systems where patients have free choice and may bypass their nearest centre, accurately delineating Primary Care Service Areas (PCSAs) is challenging. While previous research in hospital settings has shown the value of community detection algorithms for defining service areas from patient flows, these often generate multi-centre areas unsuitable for PHC, where each centre is expected to have a defined service area serving a local population.

Using geocoded patient-level consultation data, networks representing healthcare-seeking flows were constructed, and three community detection algorithms tailored to the PHC context were applied. Three edge-weighting schemes and multiple levels of spatial aggregation were tested to assess their influence on PCSA outcomes. A new refinement strategy was also developed to reassign spatially disconnected areas, ensuring each PCSA formed a single contiguous area.

Infomap outperformed modularity-based algorithms (Louvain and Leiden), producing 39 service areas out of 43 possible, with a mean localisation index of 0.68. The proposed “strongest connections” refinement strategy was faster while delivering delineations of comparable quality to the existing “minimum impact score” approach. A case study of Standardised Access Ratios revealed spatial disparities: most PCSAs compensated for limited private access with public care, while others faced compounded challenges, highlighting misalignments between demand and capacity.

Looking ahead, integrating GeoAI and contextual factors such as deprivation or projected demand into community detection algorithms could support goal-oriented, adaptive, and equity-sensitive PCSA delineation, enabling better planning and resource allocation.

Keywords: Primary Care Service Areas, Community Detection Algorithms, Healthcare Planning, Health Care Access

Study Area

Multiscale Community Detection Workflow

A) Stage 1: Data Preparation, B) Stage 2: Multi-Scale Regionalisation Process, C) Stage 3: Community Detection, D) Stage 4: Model Assessment

Data Availability

The dataset for this project is hosted on Zenodo.

The Zenodo repository contains the following folders:

raw/ – Original, unprocessed data files.
processed/ – Cleaned and prepared datasets used for analysis.
AZImporter/ – AZImporter (version 1.0.1 20/10/10) software downloaded from the official AZTool website.
AZTool/ – AZTool (version 1.0.3 25/8/11) software downloaded from the official AZTool website, including:
- voronoi.aat and voronoi.pat files created using AZImporter.
- 14 XML files with the parameters used to run the algorithm.
- A batch file to run all configurations sequentially.
- The 14 CSV outputs generated by AZTool.

Instructions: After downloading, place the raw/ and processed/ folders inside the data/ directory, and place the AZImporter/ and AZTool/ folders directly in the main project directory.
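The folder layout described above can be checked before running the workflow. The following is a minimal sketch and not part of the repository: the function name missing_folders is hypothetical, and the folder names are taken directly from the instructions.

```python
from pathlib import Path

# Folder names taken from the instructions above, relative to the project root.
EXPECTED_FOLDERS = ["data/raw", "data/processed", "AZImporter", "AZTool"]


def missing_folders(project_root="."):
    """Return the expected folders that do not exist under project_root."""
    root = Path(project_root)
    return [name for name in EXPECTED_FOLDERS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_folders()
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("All expected folders are in place.")
```

Running it from the main project directory lists any folder that still needs to be copied from the Zenodo download.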

Environment Setup

This project includes a Conda environment configuration file.

Note: The environment has been tested on Ubuntu. Compatibility with Windows or other operating systems is not guaranteed.

To create the environment, run:

conda env create -f environment_3_11.yml

Then activate it:

conda activate <environment_name>
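In a Conda environment file, the environment name is given by the top-level name: key, so the value to substitute for <environment_name> can be read from environment_3_11.yml. The sketch below shows one way to extract it without extra dependencies. The function environment_name and the example file content are hypothetical, since the README does not show the actual file.

```python
def environment_name(yml_text):
    """Return the value of the top-level `name:` key, or None if it is absent."""
    for line in yml_text.splitlines():
        if line.startswith("name:"):
            return line.split(":", 1)[1].strip()
    return None


# Illustrative content only; the real environment_3_11.yml will differ.
example = """name: example_env
channels:
  - conda-forge
dependencies:
  - python=3.11
"""

print(environment_name(example))
```

Alternatively, conda env list shows the names of all environments once the environment has been created.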

Example Workflow

Step 1: Combine and Standardise Chilean Census Polygons

To replicate the first step of Stage 1, execute the following script:

python multipart_relabeller.py -u manzanas_apc_2023.shp -r \
microdatos_entidad.zip -o processed_polygons.parquet

This will merge and standardise the raw urban and rural census polygons found in the data/raw directory and create a spatial file called processed_polygons.parquet, saved in data/processed.

Step 2: Generate Voronoi polygons constrained by regional boundaries

This step generates Voronoi diagrams for the seven communes in the study area. The process is parallelised across 12 CPU cores by dividing each commune into intermediate areas.

To view available options and default parameters, run:

 python voronoi_polys.py --help

To execute the default process:

 python voronoi_polys.py -i processed_polygons.parquet -r COMUNA_C17.shp

This will generate a file named voronoi.gpkg containing:

One layer per commune with constrained Voronoi diagrams.
A combined layer that merges all commune-level Voronoi polygons.

Two Jupyter notebooks located in the notebooks directory provide supporting context:

polygon_prep.ipynb documents geoprocessing steps and parameter testing before and after Voronoi generation.
hidden_polys.ipynb shows how the VoronoiProcessor class handles hidden polygons.

Step 3: Calculate Voronoi attributes

This step concludes Stage 1. It involves estimating population counts and socioeconomic groupings for each polygon based on point-level data. These enriched polygons were used as input for the AZTool software.

Although the original script calculate_attributes.py was used to perform this step, it cannot be run due to the use of confidential geocoded data. For demonstration purposes, the notebook split_polys.ipynb reproduces the logic using synthetic data.

The resulting shapefiles are stored in data/processed:

voronoi_data.shp: attributes calculated on full polygons.

voronoi_data_split.shp: attributes recalculated after polygon splitting.

To explore the resulting population distribution, use the notebook pop_distribution.ipynb.
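The idea behind estimating population counts from point-level data can be illustrated with a point-in-polygon count on synthetic data. This is a pure-Python sketch under simplifying assumptions (simple polygons without holes, hypothetical IDs and coordinates). It does not reproduce calculate_attributes.py or split_polys.ipynb, and a real workflow would typically rely on a spatial join from a GIS library.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon.

    polygon is a list of (x, y) vertices of a simple polygon without holes.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def count_points(polygons, points):
    """Count the points falling in each polygon; each point is counted at most once."""
    counts = {pid: 0 for pid in polygons}
    for x, y in points:
        for pid, polygon in polygons.items():
            if point_in_polygon(x, y, polygon):
                counts[pid] += 1
                break
    return counts


# Synthetic example: two adjacent unit squares and four points (one outside both).
polygons = {
    "A": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "B": [(1, 0), (2, 0), (2, 1), (1, 1)],
}
points = [(0.5, 0.5), (0.2, 0.8), (1.5, 0.5), (3.0, 3.0)]

print(count_points(polygons, points))
```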

Step 4: Tract Generation (AZTool)

This step forms Stage 2 of the workflow and involves using AZTool to generate 14 tract outputs.

Because AZTool is a Windows-based application (written in VB.NET), this step must be performed in a Windows environment.

Instructions:

Copy the AZTool and AZImporter folders to a Windows session (e.g. Desktop).
Open the AZImporter folder and run AZTImport.exe.
In the GUI:

Set voronoi_data_split.shp (from data/processed) as the input shapefile.
Save the output AAT file as voronoi.aat inside the AZTool folder.

Next, go to the AZTool folder, which contains 14 XML files, each corresponding to a different tract configuration. The AZTool_M.exe executable can be used to process each configuration, but for convenience, a batch script Run_AZTool_M.bat is provided.

Open Run_AZTool_M.bat in a text editor and ensure that the BASE path variable matches the actual location of the AZTool folder.

Running the batch file will sequentially execute all 14 XML configurations and produce CSV files in the format TractOutput_%%S.csv, where %%S refers to the target population.
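After the batch run, it can be useful to confirm which target populations produced an output file. The sketch below parses file names of the form TractOutput_%%S.csv. It assumes the target population is written as a plain integer, and the directory listing is hypothetical.

```python
import re

# File name pattern from the step above: TractOutput_%%S.csv.
# Assumption: %%S (the target population) is a plain integer.
TRACT_OUTPUT = re.compile(r"^TractOutput_(\d+)\.csv$")


def target_populations(filenames):
    """Return the sorted target populations found in a list of file names."""
    found = []
    for name in filenames:
        match = TRACT_OUTPUT.match(name)
        if match:
            found.append(int(match.group(1)))
    return sorted(found)


# Hypothetical listing of the AZTool folder; other files are ignored.
listing = ["TractOutput_1000.csv", "TractOutput_300.csv", "voronoi.aat", "Run_AZTool_M.bat"]

print(target_populations(listing))
```

Comparing the length of the returned list with 14 gives a quick check that every XML configuration produced a CSV file.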

Once completed, copy the AZTool folder back into your Linux session.

To generate the final Tract outputs by dissolving the Voronoi polygons based on AZTool assignments, run the following command in the same terminal:

 python create_tracts.py -i voronoi_data_split.shp -azt voronoi.pat

By default, the script will iterate over each CSV file containing tract assignments and create individual layers within a tracts.gpkg file. Each layer will be named using the format tracts_<TARGET_POP>.
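Dissolving amounts to merging all Voronoi polygons that share a tract assignment. The sketch below shows only the grouping logic, without geometry, together with the layer naming convention. The polygon and tract IDs are hypothetical, and the internals of create_tracts.py are not shown here, so this is an illustration rather than the script's implementation.

```python
from collections import defaultdict


def group_by_tract(assignments):
    """Map each tract ID to the sorted list of polygon IDs assigned to it.

    assignments is an iterable of (polygon_id, tract_id) pairs.
    """
    tracts = defaultdict(list)
    for polygon_id, tract_id in assignments:
        tracts[tract_id].append(polygon_id)
    return {tract: sorted(members) for tract, members in tracts.items()}


def layer_name(target_pop):
    """Layer naming convention described above: tracts_<TARGET_POP>."""
    return f"tracts_{target_pop}"


# Hypothetical assignments of five Voronoi polygons to two tracts.
example = [("p1", 1), ("p2", 1), ("p3", 2), ("p5", 2), ("p4", 2)]

print(layer_name(300), group_by_tract(example))
```

In a geospatial setting, each group of polygon IDs would then be merged into a single tract geometry.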

Additionally, a summary file named tracts_summary.xlsx will be generated, containing key statistics for each tract configuration.

Step 5: Community Detection Pipeline

The following script replicates the parameter grid search described in the research paper. To reproduce the process using the provided pre-calculated matrix, run:

 python run_models.py -m matrices.parquet --pop-values 1000
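The shape of such a grid search can be sketched as a Cartesian product of algorithms, edge-weighting schemes and target populations. The algorithm names come from the abstract. The weighting labels are placeholders because the three schemes are not named here, and it is an assumption that run_models.py evaluates every combination.

```python
from itertools import product

# Algorithms named in the abstract.
ALGORITHMS = ["infomap", "louvain", "leiden"]
# Placeholder labels: the three edge-weighting schemes are not named in this README.
WEIGHTINGS = ["weighting_1", "weighting_2", "weighting_3"]


def build_grid(pop_values):
    """Return one parameter dictionary per combination of settings."""
    return [
        {"algorithm": algorithm, "weighting": weighting, "pop_value": pop_value}
        for algorithm, weighting, pop_value in product(ALGORITHMS, WEIGHTINGS, pop_values)
    ]


# A single target population, as in the command above.
grid = build_grid([1000])
print(len(grid))
```

Passing more target populations enlarges the grid proportionally, which is why restricting the population values keeps a replication run short.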

To inspect the available parameters and options, use:

 python run_models.py --help

Several Jupyter notebooks are available in the notebooks folder to assist with analysing the outputs using the developed Python classes:

flow_networks.ipynb Generates flow graphs based on the pre-computed matrix.
pcsa.ipynb Illustrates how the selected community detection output is used to dissolve tracts and form the final Primary Care Service Areas (PCSAs). It also includes the calculation of the Localisation Index for each service area.
spatial_enforcement.ipynb Compares the two spatial enforcement strategies alongside the default community detection output.
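For readers unfamiliar with the Localisation Index, it is commonly defined as the share of consultations made by residents of a service area that take place within that same area. The sketch below computes it from aggregated flows. The function name, the input structure and the figures are hypothetical, and the notebook pcsa.ipynb may compute the index differently.

```python
from collections import defaultdict


def localisation_index(flows):
    """Compute the Localisation Index per service area.

    flows is an iterable of (home_area, care_area, visits) tuples, where
    home_area is the area of residence and care_area is the area where
    the consultation took place.
    """
    total = defaultdict(float)
    local = defaultdict(float)
    for home_area, care_area, visits in flows:
        total[home_area] += visits
        if home_area == care_area:
            local[home_area] += visits
    return {area: local[area] / total[area] for area in total if total[area] > 0}


# Hypothetical flows between two service areas.
flows = [("A", "A", 80), ("A", "B", 20), ("B", "B", 45), ("B", "A", 15)]

print(localisation_index(flows))
```

Values close to 1 indicate that residents mostly use services inside their own area, while low values indicate that many consultations cross service area boundaries.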

Patient flow networks to PHC centres in urban (A) and rural (B) communes, aggregated by tracts averaging 300 registered patients. PCSAs are delineated in black. Edge weights reflect visit share magnitudes, classified into five Natural Breaks categories. Small black nodes denote tract centroids, and red circles indicate PHC centres, scaled by total consultations.
