OpenCageData/address-formatting

Address Formatting

Templates and test cases for address formats used in territories around the world. The templates can be processed in any programming language (see list of processors).

Example

Given a set of address parts:

house_number:  17
road:          Rue du Médecin-Colonel Calbairac
neighbourhood: Lafourguette
suburb:        Toulouse Ouest
postcode:      31000
city:          Toulouse
county:        Toulouse
state:         Midi-Pyrénées
country:       France
country_code:  FR

We want to compile an address in the format consumers expect:

17 Rue du Médecin-Colonel Calbairac
31000 Toulouse
France
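
The example above can be sketched in a few lines of Python. This is only an illustration, not how any of the processors work: the `FR_LINES` layout and the `format_address` helper are hypothetical stand-ins for the real Mustache template, and they simply skip any part that is missing.

```python
# Address parts from the example above.
components = {
    "house_number": "17",
    "road": "Rue du Médecin-Colonel Calbairac",
    "neighbourhood": "Lafourguette",
    "suburb": "Toulouse Ouest",
    "postcode": "31000",
    "city": "Toulouse",
    "county": "Toulouse",
    "state": "Midi-Pyrénées",
    "country": "France",
    "country_code": "FR",
}

# Hypothetical stand-in for a template: each inner list is one output line.
FR_LINES = [
    ["house_number", "road"],
    ["postcode", "city"],
    ["country"],
]


def format_address(parts, layout):
    """Join the available parts line by line, dropping empty lines."""
    lines = []
    for keys in layout:
        values = [parts[key] for key in keys if parts.get(key)]
        if values:
            lines.append(" ".join(values))
    return "\n".join(lines)


print(format_address(components, FR_LINES))
```

Because missing keys are skipped, the same layout still gives a usable result for incomplete data, such as an input with only a city and a country.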

Why Use This?

The intended use case is database or geocoding systems (forward, reverse, autocomplete) where we know both the country of the address and the language of the user/reader. The formatted address is shown to a consumer (for example, in an app); it is not intended for printing on an envelope for actual postal delivery. We use it to format output from the OpenCage Geocoding API.

Scope

What we handle:

  • Incomplete data
  • Anything with a name (peaks, bridges, bus stops)

What we don't handle (unlike physical postal mail):

  • Apartment/flat numbers, floor numbers
  • PO boxes
  • Translating the address into another language (the output uses whatever language the input is in)

Processing Logic

Our goal is a series of programming language-independent templates that can be processed by whatever software you like.

Open-Source Implementations

Language     Repository                 Notes
Android      AndroidAddressFormatter
Elixir       ex_address_formatting
Go           address-formatter
Java         address-formatter-java
JavaScript   address-formatter
Kotlin       address-formatter-kotlin
Perl         Geo-Address-Formatter
PHP          address-formatter-php
PowerShell   AddressFormatter           Cross-platform
Python       addressformatting          No longer maintained
Ruby         address_composer
Rust         address-formatter-rs       No longer maintained
Scala        address-formatter

We welcome more language implementations. The more people who use the templates, the more likely bugs will be reported.

If you write a processor, please submit a pull request adding it to the list. Include this repo as a git submodule so we all use the same templates/configuration and stay in sync. See how we do it in the Perl parser for an example.

International Coverage

As of March 2024:

Metric                               Count
Known territories                    251
Territories with tests               251 (100%)
Territories with rules               251 (100%)
Territories without rules or tests   0 (0%)

This output is generated by bin/coverage.pl. Run bin/coverage.pl -d for a detailed breakdown.

The list of all known territories is in conf/country_codes.yaml.

Note: The list contains all officially assigned ISO 3166-1 alpha-2 codes. This is not a political statement about the status of any territory.

We need more language-specific abbreviations. See conf/abbreviations. Pull requests welcome!

File Format

  • Configuration: YAML format
  • Templates: Mustache with one variation: {{#first}} sections list alternatives separated by || and take the first alternative for which a variable could be interpolated

Both formats are human-readable, strict, handle escaping, and support comments. YAML allows references ("anchors") to avoid duplication; Mustache allows sub-templates ("partials").
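
The behaviour of a "first" section can be sketched as follows. This is one possible reading of the rule above, not the code of any listed processor: the `render_first` helper is hypothetical, it assumes alternatives are separated by `||`, and it only handles plain `{{name}}` or `{{{name}}}` variables.

```python
import re

# Matches {{name}} and {{{name}}} variable tags.
VARIABLE = re.compile(r"\{\{\{?\s*(\w+)\s*\}?\}\}")


def render_first(section_body, components):
    """Return the first alternative that renders to non-empty text."""
    for alternative in section_body.split("||"):
        rendered = VARIABLE.sub(
            lambda match: str(components.get(match.group(1), "")),
            alternative,
        ).strip()
        if rendered:
            return rendered
    return ""


body = " {{{road}}} || {{{place}}} || {{{hamlet}}} "
print(render_first(body, {"place": "Place du Capitole"}))
```

With only `place` present, the `road` alternative renders to an empty string and is skipped, so the second alternative is used.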

How to Add Your Country/Territory

Step 1: Create Test Cases

Edit the .yaml testcase for the country/territory in testcases/countries. File names correspond to ISO 3166-1 alpha-2 codes (see conf/country_codes.yaml).

To get sample data:

  1. Find an addressed location (house, business, etc.) in your target territory on OpenStreetMap
  2. Get the coordinates (lat, long)
  3. Enter the coordinates into the OpenCage Geocoding API demo
  4. Check the resulting JSON in the Raw Response tab
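
If you prefer a script to the demo page, the lookup can be sketched like this. The endpoint URL, the `q` and `key` parameters, and the `results[0].components` layout are assumptions about the public OpenCage Geocoding API and should be checked against its documentation. The helper names are hypothetical, and the sketch only builds the request URL and reads an already-parsed response; it does not send the request.

```python
from urllib.parse import urlencode

# Assumed endpoint of the OpenCage Geocoding API; verify against the API docs.
API_ENDPOINT = "https://api.opencagedata.com/geocode/v1/json"


def reverse_geocode_url(lat, lng, api_key):
    """Build a reverse geocoding request URL for a coordinate pair."""
    query = urlencode({"q": f"{lat},{lng}", "key": api_key})
    return f"{API_ENDPOINT}?{query}"


def extract_components(response):
    """Pull the address parts of the first result out of a parsed JSON response."""
    results = response.get("results", [])
    return results[0].get("components", {}) if results else {}


print(reverse_geocode_url(43.6, 1.44, "YOUR-API-KEY"))
```

The dictionary returned by `extract_components` is what you would paste into the `components` part of a test case.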

Step 2: Define Formatting Rules

Edit conf/countries/worldwide.yaml:

  • If your territory uses an existing generic format (defined at the top of the file): map your country_code to the generic template. You may still want to add cleanup code (see the DE entry as an example).
  • If not: define a new rule set (which may or may not be generic). You may also need to define new state/region mappings in conf/state_codes.yaml.

Step 3: Test

Process the .yaml test via a processor (see above) and verify the input produces the desired output. We run these checks automatically against pull requests to prevent regressions.
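
A test run boils down to comparing the formatter's output with the expected text for each case. The sketch below assumes each test case has already been loaded into a dict with `components`, `expected`, and an optional `description`; that layout, the `run_testcases` helper, and the toy formatter are illustrative assumptions, and a real processor would take the formatter's place.

```python
def run_testcases(testcases, formatter):
    """Return a list of (description, expected, actual) for every failing case."""
    failures = []
    for case in testcases:
        actual = formatter(case["components"]).strip()
        expected = case["expected"].strip()
        if actual != expected:
            failures.append((case.get("description", "?"), expected, actual))
    return failures


def toy_formatter(components):
    """Stand-in formatter: city and country on separate lines."""
    parts = [components.get("city"), components.get("country")]
    return "\n".join(part for part in parts if part)


cases = [
    {
        "description": "city and country only",
        "components": {"city": "Toulouse", "country": "France"},
        "expected": "Toulouse\nFrance\n",
    },
]
print(run_testcases(cases, toy_formatter))
```

An empty list means every case produced the expected output.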

Questions? Submit an issue.

Formatting Rules

Rule Description
replace: Regex operating on input values. Useful for removing bureaucratic cruft like "London Borough of". Prefix with key= (e.g., city=) to operate only on that key.
postformat_replace: Regex operating on the final output.
add_component: Add a component with format component=XXXX.
change_country: Change the country value of the input. Useful for dependent territories. Supports substitutions like $state. See testcases/countries/sh.yaml for an example.
use_country: Use the formatting configuration of another country. Useful for dependent territories to avoid duplicating configuration.
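
How a `replace` rule with and without a `key=` prefix might be applied can be sketched as below. The `apply_replace` helper and the representation of rules as (pattern, replacement) pairs are assumptions made for illustration. The sketch uses Python's regex syntax, so replacement strings with capture-group references may need adapting from other regex dialects.

```python
import re

# A leading "key=" restricts the rule to that one component.
KEY_PREFIX = re.compile(r"^(\w+)=(.*)$", re.DOTALL)


def apply_replace(components, rules):
    """Apply (pattern, replacement) rules to a copy of the input values."""
    result = dict(components)
    for pattern, replacement in rules:
        scoped = KEY_PREFIX.match(pattern)
        if scoped:
            key, regex = scoped.groups()
            if key in result:
                result[key] = re.sub(regex, replacement, result[key])
        else:
            # No prefix: the rule applies to every component value.
            for key, value in result.items():
                result[key] = re.sub(pattern, replacement, value)
    return result


rules = [
    ("^London Borough of ", ""),  # applies to all values
    ("city=^Alt-", ""),           # applies to the city value only
]
print(apply_replace({"county": "London Borough of Camden"}, rules))
```

Here the first rule strips the prefix from any value, while the second would touch only `city` and leave, say, a `road` value starting with "Alt-" unchanged.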

Roadmap

More tests are always needed. For every rule about addresses there are exceptions and edge cases.

Planned features:

  • Basic error checking (e.g., ignore values that obviously cannot be postcodes)
  • Rules for postcode format validation

We welcome your pull requests. Together we can address the world!

License

MIT License - see LICENSE.txt for details.

Resources

Testing Data

Lists of random addresses/postcodes/coordinates for testing (general or country-specific).

About OpenCage GmbH

We run a worldwide geocoding API and geosearch service based on open data. Learn more about us.

We also organize Geomob, a series of regular meetups for location-based service creators. If you like geo stuff, check out the Geomob podcast.
