Brief Description
JSON Schema is a rich language for expressing constraints on JSON data. If we strictly consider JSON Schema validation (rather than any other use of JSON Schema), in many cases there are multiple ways to express the same constraints. For example, the schema:
will have the same validation outcome on all instances as the schema:
{"enum": ["foo", "bar"]}
One might say that this second schema is "better" than the first, in a way that could be made precise.
The same is true for the schemas {"required": ["foo"]} and {"title": "My Schema", "required": ["foo"]}, and one might say the first one is "better" than the second for the purpose of validation.
We can define two schemas to be "equivalent" if any instance is valid under one if and only if it is valid under the other. Given two equivalent schemas S and S', we might wish to define an algorithm that transforms both into a single "canonical" or "normal" form, such as the preferred schemas above.
There are existing attempts to do this for various use cases, but there is no central place where a self-contained set of normalization rules is written down, and no self-contained tool that performs the procedure. Let's try to write a simple one!
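As a rough illustration of what one such rule might look like, here is a minimal sketch that covers only the `title` example above. The function name `normalize` and the `ANNOTATION_KEYWORDS` set are assumptions made for this example, not part of any existing tool, and the sketch only touches the top level of the schema (a real rule would need to recurse into subschemas without mistaking property names for keywords).

```python
from typing import Any

# Assumption: keywords treated here as annotation-only, i.e. they never
# change whether an instance is valid. The exact set is illustrative.
ANNOTATION_KEYWORDS = frozenset(
    {"title", "description", "default", "examples", "deprecated", "$comment"}
)


def normalize(schema: Any) -> Any:
    """Return a validation-equivalent schema without top-level annotations.

    Hypothetical helper: only the top level is handled, and boolean
    schemas (True / False) are returned unchanged.
    """
    if not isinstance(schema, dict):
        return schema
    return {
        keyword: value
        for keyword, value in schema.items()
        if keyword not in ANNOTATION_KEYWORDS
    }


if __name__ == "__main__":
    print(normalize({"title": "My Schema", "required": ["foo"]}))
```

Under this rule, both `{"required": ["foo"]}` and `{"title": "My Schema", "required": ["foo"]}` map to the same output, which is what makes the output a candidate canonical form for that pair.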
Expected Outcomes
Investigate the existing implementations of normalization in the wild. There are at least two known ones, one being here.
Define a set of normalization rules, with configurability for cases where there are multiple reasonable canonical forms
Define a set of test cases for schemas which are equivalent under these rules, and for the target canonical form for each set of schemas
Write a Python library which performs the normalization and emits the normalized schema
Empirically test our normalization procedure by running normalized schemas through Bowtie and comparing whether a given implementation returns the same results
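One possible shape for the test cases mentioned above is sketched below: each case groups schemas that should be equivalent together with their expected canonical form, and a small checker runs any candidate normalizer against them. The names `NormalizationCase`, `failures` and `drop_title` are hypothetical and exist only for this example; `drop_title` is a stand-in, not a proposed rule set.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable


@dataclass(frozen=True)
class NormalizationCase:
    """A group of equivalent schemas and the canonical form they share."""

    description: str
    equivalent: tuple  # schemas expected to normalize to `canonical`
    canonical: Any


def failures(
    cases: Iterable[NormalizationCase],
    normalize: Callable[[Any], Any],
) -> list:
    """Return a message for every schema that misses its canonical form."""
    failed = []
    for case in cases:
        for schema in case.equivalent:
            if normalize(schema) != case.canonical:
                failed.append(f"{case.description}: {schema!r}")
    return failed


# Example case taken from the `title` example in the description.
CASES = [
    NormalizationCase(
        description="title does not affect validation",
        equivalent=(
            {"required": ["foo"]},
            {"title": "My Schema", "required": ["foo"]},
        ),
        canonical={"required": ["foo"]},
    ),
]


def drop_title(schema: Any) -> Any:
    """Stand-in normalizer used only to exercise the checker."""
    if not isinstance(schema, dict):
        return schema
    return {k: v for k, v in schema.items() if k != "title"}


if __name__ == "__main__":
    print(failures(CASES, drop_title))
```

Note that Python's `==` treats `1` and `True` as equal, whereas JSON Schema does not, so a real checker would likely need a stricter comparison than the one used here.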
Skills Required
An existing understanding of JSON Schema's keywords, which can be used to think about areas that might create "denormalization" (e.g. keywords which, when used together, overlap)
Familiarity with writing Python, and ideally with using JSON Schema from Python
Experience testing pieces of software by writing test cases, here likely in the form of writing JSON Schema + instance examples
Diligence in reading and understanding the existing procedures used (in the link above, as well as in a number of JSON Schema journal articles) and the ability to compare these previous works with one another
I love this, and I think it has some interesting overlap with my linting proposal: #856. The things that should be normalised can make very good linting rules that we can aim to auto-fix for schemas. If I can help in any way, please count me in.
Mentors
@Julian
Expected Difficulty
medium
Expected Time Commitment
175