Unit 3: Defining the Universe of Data

The universe of data is determined at the time a Reporting Template is built and is based on a number of factors.

This unit describes how to specify and, if desired, constrain the universe of data in the Reporting Template.

About the universe of data

The universe of data for a Reporting Template describes the cells which will be built into, or excluded from, the data reporting form.

For example, if a dataflow has two dimensions, Country and Employment Status, the universe could look like this:


However, data may be collected at the level of the Country, so that each Country has its own Reporting Template containing only the data it is expected to provide. For the United Kingdom (UK), this could result in a universe of data that looks like this:


It is possible that one country does not report, or is not expected to report, data on Students, so a Reporting Template for France may differ from the one for the UK:

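The three cases described above (the full universe, a UK-only template, and a France template without Students) can be sketched as a Cartesian product of codelist values. This is a minimal illustration only: the codes EMPLOYED and UNEMPLOYED, the identifiers, and the function build_universe are assumptions made for the example, not values taken from a real dataflow or from the tool itself.

```python
from itertools import product

# Hypothetical codes for illustration; real values come from the dataflow's codelists.
COUNTRY = ["UK", "FR"]
EMPLOYMENT_STATUS = ["EMPLOYED", "UNEMPLOYED", "STUDENT"]


def build_universe(countries, statuses, excluded=frozenset()):
    """Return every (country, status) cell that is not explicitly excluded."""
    return [cell for cell in product(countries, statuses) if cell not in excluded]


# Full universe: every country combined with every employment status.
full = build_universe(COUNTRY, EMPLOYMENT_STATUS)

# One template per country: the country list shrinks to a single value.
uk = build_universe(["UK"], EMPLOYMENT_STATUS)

# France is not expected to report Students, so that cell is excluded.
fr = build_universe(["FR"], EMPLOYMENT_STATUS, excluded={("FR", "STUDENT")})

print(len(full), len(uk), len(fr))  # 6 3 2
```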

Factors determining the universe of data

The universe of data is determined at the time a Reporting Template is built, based on the dimensions set to be variable in the definition, and the following factors:

  • The enumerated list of values defined by the codelist for each dimension in the dataflow (excluding the frequency and time dimensions).
  • The restrictions applied to these lists, based on the dataflow or data provider (content constraint).


Data structure definition

The full universe can be calculated by multiplying together the lengths of the codelists for the dimensions defined by the data structure definition. This universe can be quite large and generally needs further restriction by applying the second factor, content constraints.

Note that the frequency dimension is excluded from this calculation because the XLSX workbook is generated for a specific time period or range, which fixes the frequency. The time period is defined using SDMX date formatting; for example, a template generated for 2002-2004 fixes the frequency of the XLSX workbook to “annual”.
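As a rough illustration of how the format of a period can imply a frequency, the sketch below maps a few common SDMX-style period formats to frequency codes. The pattern table, the function names and the rule that both ends of a range must share a format are assumptions made for this example; the text above does not describe the exact logic the tool applies.

```python
import re

# Assumed, illustrative subset of period formats and the frequency each implies.
PERIOD_PATTERNS = [
    (re.compile(r"^\d{4}$"), "A"),                                   # 2002
    (re.compile(r"^\d{4}-S[1-2]$"), "S"),                            # 2002-S1
    (re.compile(r"^\d{4}-Q[1-4]$"), "Q"),                            # 2002-Q1
    (re.compile(r"^\d{4}-(0[1-9]|1[0-2])$"), "M"),                   # 2002-01
    (re.compile(r"^\d{4}-(0[1-9]|1[0-2])-([0-2]\d|3[01])$"), "D"),   # 2002-01-31
]


def infer_frequency(period: str) -> str:
    """Return the frequency code implied by the format of a single period."""
    for pattern, freq in PERIOD_PATTERNS:
        if pattern.match(period):
            return freq
    raise ValueError(f"Unrecognised period format: {period!r}")


def frequency_for_range(start: str, end: str) -> str:
    """Return the frequency for a range, requiring both ends to agree."""
    start_freq, end_freq = infer_frequency(start), infer_frequency(end)
    if start_freq != end_freq:
        raise ValueError("Start and end periods imply different frequencies")
    return start_freq


print(frequency_for_range("2002", "2004"))  # A (annual)
```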

Content constraints

Content constraints help to tailor a worksheet to a specific collection or data provider.

For example, take a data structure with four dimensions with the given number of possible values:

  1. Frequency (8)
  2. Domain (20)
  3. Reference Area (240)
  4. Indicator (65,000)

The total reporting data universe is:

  • 20 * 240 * 65,000 = 312,000,000 (312 million) cells. Frequency is not included in the calculation.

The data structure may be re-used across multiple dataflows: Central Government Debt, Trade, Balance of Payments, etc. Each dataflow only collects data for a subset of indicators and fixes the domain.

In this instance, a content constraint would be defined for the Trade dataflow to restrict the domain to a single value and to restrict the indicators to only those collected for Trade, for example 20 of them.

The data universe is now only:

  • 1 * 240 * 20 = 4,800 observation cells.

The next layer of restrictions can be applied to either the provision agreement or the data provider. When the collection is from UK organisations, the reference area dimension is restricted so that only the UK can be reported. The XLSX file will now contain only:

  • 20 cells for the United Kingdom to complete for the Trade collection.
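The layered arithmetic above (312 million, then 4,800, then 20 cells) can be reproduced with a short sketch. The dimension identifiers, the use of integer ranges in place of real codes, and the helper functions are all assumptions made for illustration; only the codelist sizes come from the example.

```python
from math import prod

# Codelist sizes from the example; range() stands in for the real codes.
codelists = {
    "FREQ": set(range(8)),
    "DOMAIN": set(range(20)),
    "REF_AREA": set(range(240)),
    "INDICATOR": set(range(65_000)),
}
EXCLUDED = {"FREQ"}  # frequency is not part of the universe calculation


def apply_constraint(allowed, constraint):
    """Intersect each constrained dimension with its permitted values."""
    return {dim: values & constraint.get(dim, values) for dim, values in allowed.items()}


def universe_size(allowed):
    """Multiply the number of permitted values across the counted dimensions."""
    return prod(len(values) for dim, values in allowed.items() if dim not in EXCLUDED)


# Hypothetical constraints: one domain and 20 indicators for Trade, one area for the UK.
dataflow_constraint = {"DOMAIN": {0}, "INDICATOR": set(range(20))}
provider_constraint = {"REF_AREA": {0}}

trade = apply_constraint(codelists, dataflow_constraint)
uk_trade = apply_constraint(trade, provider_constraint)

print(universe_size(codelists), universe_size(trade), universe_size(uk_trade))
# 312000000 4800 20
```

Modelling each constraint as a set intersection keeps the layers independent, so the dataflow and data provider restrictions can be applied in either order with the same result.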

Content constraints can become quite complicated; for example, partial series can be defined for either inclusion or exclusion. As a result, an XLSX worksheet may contain cells against which observations should not be reported. Such cells are hashed out and non-editable.
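One way to picture why some cells end up hashed out: when a partial series is excluded, the row and column it sits on may still be needed for other series, so the cell stays on the grid but is locked. The sketch below assumes a simple dictionary representation of partial keys and a "####" marker for locked cells; the names and the data are hypothetical and do not describe the tool's internal model.

```python
# A partial key fixes some dimensions and leaves the others open (hypothetical data).
excluded_partial_keys = [{"COUNTRY": "FR", "EMPLOYMENT_STATUS": "STUDENT"}]


def matches(cell, partial_key):
    """True if the cell agrees with every dimension the partial key fixes."""
    return all(cell.get(dim) == value for dim, value in partial_key.items())


def is_editable(cell, excluded):
    """A cell is editable unless it matches at least one excluded partial key."""
    return not any(matches(cell, key) for key in excluded)


def render(rows, cols, excluded):
    """Build a grid of markers: '' for an editable cell, '####' for a locked one."""
    return [
        [
            "" if is_editable({"COUNTRY": c, "EMPLOYMENT_STATUS": r}, excluded) else "####"
            for c in cols
        ]
        for r in rows
    ]


grid = render(["EMPLOYED", "UNEMPLOYED", "STUDENT"], ["UK", "FR"], excluded_partial_keys)
for row in grid:
    print(row)
```

Here the STUDENT row must remain because the UK reports it, so the France cell on that row is the one that is locked.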

Fixed values

When an XLSX file is generated for a specific data provider, some dimensions may be restricted to a single, fixed value.

For example:

  • If the reporting country dimension is restricted to one country for a data provider, the default is to place the fixed value in the header section of the worksheet.
  • However, it is also possible to configure the worksheet to exclude (or hide) any number of dimensions from the worksheet if they have a fixed value.
  • The value will still be read back in when the FMR imports/converts the file, but it won’t appear on the worksheet.
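The handling of fixed values described above could be sketched as follows. The function names, the hide_fixed option and the sample codes are assumptions made for this illustration; they are not the tool's actual configuration settings.

```python
def split_dimensions(allowed_values):
    """Separate dimensions with exactly one permitted value from the rest."""
    fixed = {dim: next(iter(vals)) for dim, vals in allowed_values.items() if len(vals) == 1}
    variable = [dim for dim, vals in allowed_values.items() if len(vals) > 1]
    return fixed, variable


def layout(allowed_values, hide_fixed=frozenset()):
    """Decide where each dimension goes: header, hidden, or the data grid."""
    fixed, variable = split_dimensions(allowed_values)
    return {
        # Default: a fixed value is shown in the worksheet header.
        "header": {dim: val for dim, val in fixed.items() if dim not in hide_fixed},
        # Hidden values are not displayed but are kept so they can be read back on import.
        "hidden": {dim: val for dim, val in fixed.items() if dim in hide_fixed},
        "grid": variable,
    }


# Hypothetical permitted values after all constraints have been applied.
allowed = {
    "DOMAIN": {"TRADE"},
    "REF_AREA": {"UK"},
    "INDICATOR": {"I1", "I2", "I3"},
}

result = layout(allowed, hide_fixed={"DOMAIN"})
print(result)
```

In this example REF_AREA appears in the header, DOMAIN is hidden but retained, and only INDICATOR varies across the grid.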

What do you know?

The universe of data is determined at the time a Reporting Template is built, and depends on which of the following factors?


Coming next …

Now that you’ve completed the step in the Reporting Template design workflow where we restrict the reporting universe, let’s turn our attention to the next step – Defining ancillary metadata.
