The previous unit demonstrated how to retrieve configuration details for a dataflow, stored as reference metadata, and how to iterate over these attributes to drive process steps.
These metadata can be utilized to customize the behavior of diverse statistical processes, including data collection, validation, and mapping.
In this unit, we'll take a closer look at this configuration process and at the ways in which pysdmx can assist with four tasks: creating the physical data model for a dataflow, validating data, mapping data, and generating the filesystem structure. For each task, we'll also cover the metadata it requires. We'll conclude by going over the process of using vtlengine (VTL) for validation.
In a scenario where we receive a data submission for validation, mapping, and integration, each step can be configured differently.
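As a minimal sketch of what "configured differently" can mean in practice, the snippet below drives three processing steps from a per-step configuration. The option names (`enabled`, `level`, `mode`) and the step functions are hypothetical and hard-coded for illustration; in the course scenario such options would be read from SDMX reference metadata, and nothing here is pysdmx API.

```python
# Hypothetical per-step options. In the course scenario these would come
# from SDMX reference metadata rather than being hard-coded.
CONFIG = {
    "validation": {"enabled": True, "level": "structural"},
    "mapping": {"enabled": False},
    "integration": {"enabled": True, "mode": "append"},
}


def validate(options):
    # Placeholder step: only reports what it would do.
    return f"validation ({options['level']})"


def map_data(options):
    return "mapping"


def integrate(options):
    return f"integration ({options['mode']})"


# Steps in the order in which a submission is processed.
STEPS = {
    "validation": validate,
    "mapping": map_data,
    "integration": integrate,
}


def run_pipeline(config):
    """Run each enabled step with its own options and list what ran."""
    executed = []
    for name, step in STEPS.items():
        options = config.get(name, {})
        if options.get("enabled", False):
            executed.append(step(options))
    return executed


print(run_pipeline(CONFIG))
```

Because the steps read their behavior from the configuration, changing an option (for example, enabling mapping) changes the process without touching the code.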
Configuration options
Configuration options may depend on the ingested data or business unit practices. For instance, consider validation:
Configuration steps
These configuration options can be captured using SDMX reference metadata.
To do this:
Dataflow example
For example, for the BIS_MACRO dataflow maintained by BIS, options could include:
The basic steps to follow to create the physical data model are:
A description of each of these steps, along with Python code, can be found on the pysdmx site.
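Without reproducing those steps, the general idea of deriving a physical model from structural metadata can be sketched as follows. This assumes the physical data model is a relational table; the component list, the type mapping and the table name are all hypothetical, and the code does not use pysdmx.

```python
# Hypothetical components of a data structure: (id, role, data type).
COMPONENTS = [
    ("FREQ", "dimension", "string"),
    ("CURRENCY", "dimension", "string"),
    ("TIME_PERIOD", "dimension", "string"),
    ("OBS_VALUE", "measure", "double"),
]

# Assumed mapping from metadata data types to SQL column types.
SQL_TYPES = {"string": "VARCHAR", "double": "DOUBLE PRECISION"}


def create_table_sql(table, components):
    """Build a CREATE TABLE statement: dimensions form the primary key."""
    columns = []
    keys = []
    for comp_id, role, data_type in components:
        column = f"{comp_id} {SQL_TYPES[data_type]}"
        if role == "dimension":
            column += " NOT NULL"
            keys.append(comp_id)
        columns.append(column)
    columns.append(f"PRIMARY KEY ({', '.join(keys)})")
    body = ",\n  ".join(columns)
    return f"CREATE TABLE {table} (\n  {body}\n);"


sql = create_table_sql("bis_macro", COMPONENTS)
print(sql)
```

The design choice illustrated here is that the table definition is never written by hand: if the structure changes in the registry, the same function yields the updated model.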
There are various types of validation; in this scenario we'll focus on structural validation, which checks that the structure of the data matches what is expected.
For this scenario, the necessary metadata depends on the desired thoroughness of validation. At a minimum, we need the data structure information. However, for more comprehensive validation, we may consider additional constraints from the dataflow or provision agreement.
Let's look at each piece of metadata needed for this scenario.
Expanding on this last example, we could define this subset of data using constraints: set the frequency dimension to "daily" and restrict the currency codes to those published on a daily basis (e.g. CHF, CNY, EUR, JPY, USD). We would then "attach" these constraints to the dataflow. Taking these additional constraints into account makes the validation stricter.
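The effect of such constraints can be illustrated with a minimal, library-independent sketch. The dictionaries below are simplified, hypothetical stand-ins for a data structure and for constraints attached to a dataflow. `XYZ` is a placeholder for any currency code that is in the codelist but outside the daily subset, and records are reduced to their dimensions. This is not how pysdmx represents these artefacts.

```python
# Hypothetical data structure: dimension ID -> codes allowed by its codelist.
STRUCTURE = {
    "FREQ": {"A", "M", "D"},
    "CURRENCY": {"CHF", "CNY", "EUR", "JPY", "USD", "XYZ"},
}

# Hypothetical constraints attached to the dataflow: daily data only,
# and only the currencies published on a daily basis.
DATAFLOW_CONSTRAINTS = {
    "FREQ": {"D"},
    "CURRENCY": {"CHF", "CNY", "EUR", "JPY", "USD"},
}


def validate_record(record, structure, constraints=None):
    """Return a list of structural errors for one record (empty if valid)."""
    errors = []
    for dim in structure:
        if dim not in record:
            errors.append(f"{dim}: missing")
    for dim in record:
        if dim not in structure:
            errors.append(f"{dim}: not defined in the structure")
    for dim, codes in structure.items():
        if dim not in record:
            continue
        allowed = codes
        if constraints and dim in constraints:
            # Constraints can only narrow what the codelist allows.
            allowed = codes & constraints[dim]
        if record[dim] not in allowed:
            errors.append(f"{dim}: code {record[dim]!r} is not allowed")
    return errors


record = {"FREQ": "M", "CURRENCY": "EUR"}
print(validate_record(record, STRUCTURE))
print(validate_record(record, STRUCTURE, DATAFLOW_CONSTRAINTS))
```

The same monthly record passes when only the data structure is checked and fails once the dataflow constraints are applied, which is the sense in which the validation becomes stricter.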
The basic steps to follow are:
A description of each of these steps, along with Python code, can be found on the pysdmx site.
Pysdmx facilitates mapping data in a metadata-driven fashion, relying solely on the metadata stored in an SDMX Registry.
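To make "metadata-driven" concrete, here is a small sketch in which the mapping rules are plain data rather than code. The source column names, the component mapping and the code mapping are hypothetical; in the scenario described here they would be retrieved from the registry, and the function below is not part of pysdmx.

```python
# Hypothetical mapping rules, standing in for metadata held in a registry.
# Source column -> target component.
COMPONENT_MAP = {"freq": "FREQ", "ccy": "CURRENCY", "value": "OBS_VALUE"}
# Target component -> {source value -> target code}.
CODE_MAP = {"FREQ": {"daily": "D", "monthly": "M"}}


def map_record(source):
    """Rename columns and translate codes according to the mapping rules."""
    target = {}
    for source_name, value in source.items():
        if source_name not in COMPONENT_MAP:
            continue  # Columns without a rule are dropped.
        target_name = COMPONENT_MAP[source_name]
        # Values without a code rule are copied unchanged.
        target[target_name] = CODE_MAP.get(target_name, {}).get(value, value)
    return target


mapped = map_record({"freq": "daily", "ccy": "CHF", "value": 1.1, "note": "x"})
print(mapped)
```

Changing a rule in the metadata changes the output of `map_record` without any change to the code, which is the main benefit of the approach.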
For our example, the objective is to store data in folders organized by dataflows. In each dataflow folder, we want to have sub-folders by data providers. Access to folders should be granted via appropriate roles with access requests approved by the manager of the organizational unit owning the dataflow.
Pysdmx can aid in generating the filesystem structure in a metadata-driven fashion, relying solely on metadata stored in an SDMX Registry.
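The target layout itself can be sketched with the standard library alone. The provider IDs, the root folder and the role-naming convention below are hypothetical; in practice the list of providers per dataflow would come from the registry rather than from a hard-coded dictionary.

```python
from pathlib import PurePosixPath

# Hypothetical input: the providers supplying data for each dataflow.
PROVIDERS_BY_DATAFLOW = {"BIS_MACRO": ["PROVIDER_B", "PROVIDER_A"]}


def build_layout(root, providers_by_dataflow):
    """Return one folder per dataflow and provider: root/dataflow/provider."""
    paths = []
    for dataflow in sorted(providers_by_dataflow):
        for provider in sorted(providers_by_dataflow[dataflow]):
            paths.append(PurePosixPath(root) / dataflow / provider)
    return paths


def access_role(dataflow):
    """Hypothetical naming convention for the role granting folder access."""
    return f"{dataflow.lower()}_reader"


layout = [str(path) for path in build_layout("/data", PROVIDERS_BY_DATAFLOW)]
print(layout)
print(access_role("BIS_MACRO"))
```

The sketch only computes paths; to create the folders, each path could be passed to `pathlib.Path(...).mkdir(parents=True, exist_ok=True)`. Approval of access requests by the owning unit's manager is an organizational step and is not modeled here.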
The basic steps to follow are:
More information on using pysdmx to create a filesystem layout, organize dataflows, and grant access via dedicated roles may be found here.
More information on pysdmx and vtlengine integration is available here.
Let's complete one final question before concluding. Which of the following metadata indicate which providers supply data for a dataflow?
Provision agreements and data providers indicate which providers supply data for a dataflow. Constraints, such as expecting a provider to supply data only for its own country, can be applied.
The correct answer is option 4.