Description
Definitions
A masking definition contains the following parts:
- the generator: describes the process used to generate a new value
- the coherence context: describes the level of coherence expected for the new value (consistency with other current values or previous values)
- the location: where the value will be written in the JSON data
The generator is usually defined by the mask part of the masking.yml, except for the "hash" and "hashInUri" masks, which also contain a coherence element.
The coherence is usually defined by properties added to the mask: seed, cache, or the hash part of the "hash" and "hashInUri" masks.
The location is defined by the selector part.
Only the generator part needs to be stored in a masking library. When a generator is applied in a given context, we can choose where to apply it (selector) and how to handle consistency (cache, seed, hash, and which source field is used).
Note: we can allow coherence information in some dedicated masks.
Note: we can allow selector information in case of multiple fields output.
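To make the three parts concrete, here is a single masking entry with each part annotated. It is only an illustrative sketch that reuses the "pimo://nameFR" generator and the "id" seed field from the examples in this issue; the field name "name" is a placeholder.

```yaml
masking:
  - selector:
      jsonpath: "name"                    # location: where the value is written
    mask:
      randomChoiceInUri: "pimo://nameFR"  # generator: how the new value is produced
    seed:
      field: "id"                         # coherence: the same id always yields the same name
```

Only the line marked "generator" would belong in a library; the other two are chosen where the generator is used.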
Examples
This generator:
- randomChoiceInUri: "pimo://nameFR"
can be used in different contexts:
# synthesize new data:
- selector:
    jsonpath: "name1"
  masks:
    - add: ""
    - randomChoiceInUri: "pimo://nameFR"
# synthesize new data consistently with another field:
- selector:
    jsonpath: "name2"
  masks:
    - add: ""
    - randomChoiceInUri: "pimo://nameFR"
  seed:
    field: "id"
# pseudonymize consistently with another field:
- selector:
    jsonpath: "name3"
  mask:
    randomChoiceInUri: "pimo://nameFR"
  seed:
    field: "id"
...
How to define a mask library
The library should expose a variety of data types:
- how to generate a French family name (locale fr_FR)
- how to generate a French SIRET
- how to generate a birth date
- etc.
This can be done by storing one file per data type, containing the list of masks to apply.
filename: person_name_fr_FR.yml
version: "1"
masking:
  - selector:
      jsonpath: "."
    mask:
      randomChoiceInUri: "pimo://nameFR"
This is similar to a normal masking, except for the "." jsonpath, which allows writing to the current location in the JSON stream (where the mask is applied).
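Following the same pattern, a birth-date generator from the list above might be stored as shown below. This is a sketch under assumptions: the file name birth_date.yml and the date bounds are made up for illustration, and the "." jsonpath is the behaviour proposed in this issue rather than an existing feature.

```yaml
# filename: birth_date.yml (hypothetical name)
version: "1"
masking:
  - selector:
      jsonpath: "."   # proposed: write at the location where the library mask is applied
    mask:
      randDate:
        dateMin: "1930-01-01T00:00:00Z"   # illustrative bounds only
        dateMax: "2010-12-31T00:00:00Z"
```

As with the name generator, the file carries no seed or cache setting, so coherence stays a decision of the masking that uses it.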
Some generators can take parameters
filename: nir.yml
masking:
  - selector:
      jsonpath: "gender" # if present, gender is used as a parameter
    masks:
      - add: true # add the parameter if not present
      - randomChoice: [1, 2]
    preserve: "value" # preserve the parameter value if present
  # other parameters ...
  - selector:
      jsonpath: "nir"
    masks:
      - add: true # in this example, the result will be created in a new subfield
      - template: '{{if eq .gender "M" }}1{{else}}2{{end}}{{.birth_date | substr 8 10}}{{.birth_date | substr 3 5}}{{.department_code | printf "%02d"}}{{.city_code | printf "%03d"}}{{.order | printf "%03d"}}'
      - template: '{{ sub 97 (mod (int64 .nir_start) 97)}}'
How to use a mask library
The library can be a folder, a git repository, a website, ...
A new property needs to be created in the masking.yml to load the library:
version: "1"
librairies:
  - "http://domain.org/mylibrary"
  - "pimo://internal-library"
  - "https+git://github.com/repo/library.git@v0.1.0"
  - "file://mylocalibrary"
A mask from the library can then be used via a new type of mask:
- selector:
    jsonpath: "nir"
  mask:
    generate:
      using: "nir" # name of the yaml file in the library
Passing parameters: option 1
- selector:
    jsonpath: "nir"
  mask:
    generate:
      using: "nir" # name of the yaml file in the library
      with:
        gender: "M"
or, if we want to use an existing field as a parameter:
- selector:
    jsonpath: "nir"
  mask:
    generate:
      using: "nir"
      with:
        gender: { from: "gender" }
Passing parameters: option 2
# precreate a param with a value
- selector:
    jsonpath: "gender"
  mask:
    constant: "M"
# call mask on the current document (selector: ".")
- selector:
    jsonpath: "."
  mask:
    generate:
      using: "nir" # name of the yaml file in the library
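Putting the pieces together, a complete masking.yml that loads a library and applies one of its generators, with the coherence chosen at the call site, might look like the following. This is only a sketch of the proposal: the librairies property and the generate mask do not exist yet, and allowing seed alongside generate is an assumption about how the two would combine.

```yaml
version: "1"
librairies:
  - "file://mylocalibrary"
masking:
  - selector:
      jsonpath: "name"                 # location, chosen by the user of the library
    mask:
      generate:
        using: "person_name_fr_FR"     # generator, taken from the library file
    seed:
      field: "id"                      # coherence, chosen by the user (assumed to apply to generate)
```

This keeps the split described in the definitions: the library file holds the generator only, while the selector and the seed live in the masking that uses it.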