Templates and test cases for address formats used in territories around the world. The templates can be processed in any programming language (see list of processors).
Given a set of address parts:
house_number: 17
road: Rue du Médecin-Colonel Calbairac
neighbourhood: Lafourguette
suburb: Toulouse Ouest
postcode: 31000
city: Toulouse
county: Toulouse
state: Midi-Pyrénées
country: France
country_code: FR
We want to compile an address in the format consumers expect:
17 Rue du Médecin-Colonel Calbairac
31000 Toulouse
France
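The compilation step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `LAYOUT` list and the `format_address` function are hypothetical stand-ins for a real template and processor, and the layout is hard-coded for this one French example rather than read from the project's configuration.

```python
# Hypothetical sketch: a hard-coded layout stands in for a real template.
components = {
    "house_number": "17",
    "road": "Rue du Médecin-Colonel Calbairac",
    "neighbourhood": "Lafourguette",
    "suburb": "Toulouse Ouest",
    "postcode": "31000",
    "city": "Toulouse",
    "county": "Toulouse",
    "state": "Midi-Pyrénées",
    "country": "France",
    "country_code": "FR",
}

# Each inner list is one output line (assumed layout, not the project's template).
LAYOUT = [
    ["house_number", "road"],
    ["postcode", "city"],
    ["country"],
]


def format_address(parts: dict[str, str], layout: list[list[str]]) -> str:
    """Join the available parts line by line, skipping anything missing."""
    lines = []
    for keys in layout:
        values = [parts[k] for k in keys if parts.get(k)]
        if values:  # drop lines that would otherwise be empty
            lines.append(" ".join(values))
    return "\n".join(lines)


formatted = format_address(components, LAYOUT)
print(formatted)
```

Parts that the layout does not mention (here neighbourhood, suburb, county and state) are simply left out, and a line disappears entirely when none of its parts are present, which is one simple way to cope with incomplete data.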
The intended use case is database or geocoding systems (forward, reverse, autocomplete) where we know both the country of the address and the language of the user/reader. The address is displayed to a consumer (for example, in an app); it is not meant to be printed on an envelope for actual postal delivery. We use it to format output from the OpenCage Geocoding API.
What we handle:
- Incomplete data
- Anything with a name (peaks, bridges, bus stops)
What we don't handle (unlike physical postal mail):
- Apartment/flat numbers, floor numbers
- PO boxes
- Translating the address into another language (the output is in whatever language the input was in)
Our goal is a series of programming language-independent templates that can be processed by whatever software you like.
| Language | Repository | Notes |
|---|---|---|
| Android | AndroidAddressFormatter | |
| Elixir | ex_address_formatting | |
| Go | address-formatter | |
| Java | address-formatter-java | |
| JavaScript | address-formatter | |
| Kotlin | address-formatter-kotlin | |
| Perl | Geo-Address-Formatter | |
| PHP | address-formatter-php | |
| PowerShell | AddressFormatter | Cross-platform |
| Python | addressformatting | No longer maintained |
| Ruby | address_composer | |
| Rust | address-formatter-rs | No longer maintained |
| Scala | address-formatter | |
We welcome more language implementations. The more people who use the templates, the more likely bugs will be reported.
If you write a processor, please submit a pull request adding it to the list. Include this repo as a git submodule so we all use the same templates/configuration and stay in sync. See how we do it in the Perl parser for an example.
As of March 2024:
| Metric | Count |
|---|---|
| Known territories | 251 |
| Territories with tests | 251 (100%) |
| Territories with rules | 251 (100%) |
| Territories without rules or tests | 0 (0%) |
This output is generated by bin/coverage.pl. Run bin/coverage.pl -d for a detailed breakdown.
The list of all known territories is in conf/country_codes.yaml.
Note: The list contains all officially assigned ISO 3166-1 alpha-2 codes. This is not a political statement about the status of any territory.
We need more language-specific abbreviations. See conf/abbreviations. Pull requests welcome!
- Configuration: YAML format
- Templates: Mustache with one variation: {{#first}} sections take the first alternative for which a variable could be interpolated
Both formats are human-readable, strict, handle escaping, and support comments. YAML allows references ("anchors") to avoid duplication; Mustache allows sub-templates ("partials").
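The behaviour of the first section can be sketched as follows. This is a simplified illustration, not a processor implementation: it assumes the alternatives inside the section are separated by `||`, it handles only plain `{{name}}` and `{{{name}}}` variables, and the helper names `interpolate` and `first` are hypothetical. It also treats an alternative as usable when its rendered text is non-empty, which matches the description only when alternatives consist of variables alone.

```python
import re

# Assumption: alternatives inside the section are separated by "||".
SEPARATOR = "||"
# Matches {{name}} and {{{name}}}; other Mustache tags are not handled here.
VARIABLE = re.compile(r"\{\{\{?\s*(\w+)\s*\}?\}\}")


def interpolate(template: str, values: dict[str, str]) -> str:
    """Replace each variable with its value, or with "" when it is missing."""
    return VARIABLE.sub(lambda m: values.get(m.group(1), ""), template)


def first(section_body: str, values: dict[str, str]) -> str:
    """Return the first alternative that is non-empty after interpolation."""
    for alternative in section_body.split(SEPARATOR):
        rendered = interpolate(alternative, values).strip()
        if rendered:
            return rendered
    return ""


body = "{{{city}}} || {{{town}}} || {{{village}}}"
print(first(body, {"town": "Cahors"}))
```

With only a town available, the city alternative renders to an empty string and is skipped, so the town is used instead.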
Edit the .yaml testcase for the country/territory in testcases/countries. File names correspond to ISO 3166-1 alpha-2 codes (see conf/country_codes.yaml).
To get sample data:
- Find an addressed location (house, business, etc.) in your target territory on OpenStreetMap
- Get the coordinates (lat, long)
- Enter the coordinates into the OpenCage Geocoding API demo
- Check the resulting JSON in the Raw Response tab
Edit conf/countries/worldwide.yaml:
- If your territory uses an existing generic format (defined at the top of the file): map your country_code to the generic template. You may still want to add cleanup code (see the DE entry as an example).
- If not: define a new rule set (which may or may not be generic). You may also need to define new state/region mappings in conf/state_codes.yaml.
Process the .yaml test via a processor (see above) and verify the input produces the desired output. We run these checks automatically against pull requests to prevent regressions.
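The verification step can be sketched roughly as below. The sketch assumes each test case carries a `components` mapping and an `expected` string; check the files in testcases/countries for the actual field names. The `run_testcase` helper and the `toy_formatter` stand-in are hypothetical; a real processor would render the templates instead.

```python
from typing import Callable

Formatter = Callable[[dict[str, str]], str]


def run_testcase(testcase: dict, formatter: Formatter) -> tuple[bool, str]:
    """Return (passed, actual output) for one test case."""
    actual = formatter(testcase["components"]).strip()
    return actual == testcase["expected"].strip(), actual


# Stand-in formatter for illustration only; it knows nothing about templates.
def toy_formatter(components: dict[str, str]) -> str:
    return "\n".join(components[k] for k in ("city", "country") if k in components)


# Assumed test case shape: input components plus the expected formatted text.
testcase = {
    "description": "city and country only",
    "components": {"city": "Toulouse", "country": "France", "country_code": "FR"},
    "expected": "Toulouse\nFrance\n",
}

passed, actual = run_testcase(testcase, toy_formatter)
print("ok" if passed else f"FAILED, got:\n{actual}")
```

Comparing after stripping surrounding whitespace keeps a trailing newline in the expected text from causing a spurious failure.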
Questions? Submit an issue.
| Rule | Description |
|---|---|
| replace: | Regex operating on input values. Useful for removing bureaucratic cruft like "London Borough of". Prefix with key= (e.g., city=) to operate only on that key. |
| postformat_replace: | Regex operating on the final output. |
| add_component: | Add a component with format component=XXXX. |
| change_country: | Change the country value of the input. Useful for dependent territories. Supports substitutions like $state. See testcases/countries/sh.yaml for an example. |
| use_country: | Use the formatting configuration of another country. Useful for dependent territories to avoid duplicating configuration. |
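As an illustration of how the two regex rules differ, here is a minimal Python sketch. It assumes rules arrive as (pattern, replacement) pairs and uses Python's regex dialect; the function names and the way a key= prefix is detected are assumptions made for this example, not the behaviour of any particular processor.

```python
import re

Rule = tuple[str, str]  # (pattern, replacement)


def apply_replace(components: dict[str, str], rules: list[Rule]) -> dict[str, str]:
    """Apply replace rules to the input values, before formatting."""
    result = dict(components)
    for pattern, replacement in rules:
        key, sep, scoped = pattern.partition("=")
        # Assumption: a pattern that starts with "<identifier>=" is key-scoped.
        if sep and key.isidentifier():
            if key in result:
                result[key] = re.sub(scoped, replacement, result[key])
        else:
            # Unscoped rules are tried against every value.
            result = {
                name: re.sub(pattern, replacement, value)
                for name, value in result.items()
            }
    return result


def apply_postformat_replace(output: str, rules: list[Rule]) -> str:
    """Apply postformat_replace rules to the final formatted text."""
    for pattern, replacement in rules:
        output = re.sub(pattern, replacement, output)
    return output


components = {"city": "London Borough of Camden", "county": "London Borough of Camden"}
cleaned = apply_replace(components, [("city=^London Borough of ", "")])
print(cleaned["city"])    # key-scoped rule changed only the city
print(cleaned["county"])  # the county value is untouched
```

The key-scoped rule leaves every other component alone, while a postformat rule such as `(r"\n{2,}", "\n")` would collapse blank lines in the assembled address rather than in any single input value.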
More tests are always needed. For every rule about addresses there are exceptions and edge cases.
Planned features:
- Basic error checking (e.g., ignore values that obviously cannot be postcodes)
- Rules for postcode format validation
We welcome your pull requests. Together we can address the world!
MIT License - see LICENSE.txt for details.
Lists of random addresses/postcodes/coordinates for testing (general or country-specific).
- Our blog post announcing this project and the motivations behind it
- Falsehoods Programmers Believe about Addresses by Michael Tandy
- OpenStreetMap - Open address data
- OpenAddresses - Open address data
- OpenCage Geocoder - Convert coordinates to formatted addresses
- what3words - An alternative to traditional addresses
We run a worldwide geocoding API and geosearch service based on open data. Learn more about us.
We also organize Geomob, a series of regular meetups for location-based service creators. If you like geo stuff, check out the Geomob podcast.
