Generic Statistical Business Process Model
- Level 0, the statistical business process;
- Level 1, the nine phases of the statistical business process;
- Level 2, the sub-processes within each phase;
- Level 3, a description of those sub-processes.
- Purpose (value added);
- Guides (for example manuals and documentation);
- Enablers (people and systems);
- Feedback loops or mechanisms.
- Quality management - This process includes quality assessment and control mechanisms. It recognizes the importance of evaluation and feedback throughout the statistical business process;
- Metadata management - Metadata are generated and processed within each phase; there is therefore a strong requirement for a metadata management system to ensure that the appropriate metadata retain their links with data throughout the GSBPM;
- Statistical framework management - This includes developing standards, for example methodologies, concepts and classifications that apply across multiple processes;
- Statistical programme management - This includes systematic monitoring and reviewing of emerging information requirements and emerging and changing data sources across all statistical domains. It may result in the definition of new statistical business processes or the redesign of existing ones;
- Knowledge management - This ensures that statistical business processes are repeatable, mainly through the maintenance of process documentation;
- Data management - This includes process-independent considerations such as general data security, custodianship and ownership;
- Process data management - This includes the management of data and metadata generated by and providing information on all parts of the statistical business process.
- Provider management - This includes cross-process burden management, as well as topics such as profiling and management of contact information (and thus has particularly close links with statistical business processes that maintain registers);
- Customer management - This includes general marketing activities, promoting statistical literacy, and dealing with non-specific customer feedback.
- Human resource management;
- Financial management;
- Project management;
- Legal framework management;
- Organizational framework management;
- Strategic planning.
Source: Information Systems Architecture for National and International Statistical Offices - Guidelines and Recommendations, United Nations, 1999, http://www.unece.org/stats/documents/information_systems_architecture/1.e.pdf
- The GSBPM places data archiving at the end of the process, after the analysis phase. It may also form the end of processing within a specific organization in the DDI model, but a key difference is that the DDI model is not necessarily limited to processes within one organization. Steps such as "Data analysis" and "Repurposing" may be carried out by different organizations to the one that collected the data.
- The DDI model replaces the dissemination phase with "Data Distribution" which takes place before the analysis phase. This reflects a difference in focus between the research and official statistics communities, with the latter putting a stronger emphasis on disseminating data, rather than research based on data disseminated by others.
- The DDI model contains the process of "Repurposing", defined as the secondary use of a data set, or the creation of a real or virtual harmonized data set. This generally refers to some re-use of a data set that was not originally foreseen in the design and collect phases. This is covered in the GSBPM phase 1 (Specify Needs), where there is a sub-process to check the availability of existing data, and use them wherever possible. It is also reflected in the data integration sub-process within phase 5 (Process).
- The DDI model has separate phases for data discovery and data analysis, whereas these functions are combined within phase 6 (Analysis) in the GSBPM. In some cases, elements of the GSBPM analysis phase may also be covered in the DDI "Data Processing" phase, depending on the extent of analytical work prior to the "Data distribution" phase.
Source: Data Documentation Initiative (DDI) Technical Specification, Part I: Overview, Version 3.0, April 2008, http://www.ddialliance.org.
- determines the need for the statistics;
- confirms, in more detail, the statistical needs of the stakeholders;
- establishes the high level objectives of the statistical outputs;
- identifies the relevant concepts and variables for which data are required;
- checks if current collections and / or methodologies can meet these needs;
- prepares the business case to get approval to produce the statistics.
- A description of the "As-Is" business process (if it already exists), with information on how the current statistics are produced, highlighting any inefficiencies and issues to be addressed;
- The proposed "To-Be" solution, detailing how the statistical business process will be developed to produce the new or revised statistics;
- An assessment of costs and benefits, as well as any external constraints.
- producing documentation about the process components, including technical documentation and user manuals;
- training the business users on how to operate the process;
- moving the process components into the production environment, and ensuring they work as expected in that environment (this activity may also be part of sub-process 3.4 (Test production system)).
- preparing a collection strategy;
- training collection staff;
- ensuring collection resources are available (e.g. laptops);
- configuring collection systems to request and receive the data;
- ensuring the security of data to be collected;
- preparing collection instruments (e.g. printing questionnaires, pre-filling them with existing data, loading questionnaires and data onto interviewers' computers etc.).
- matching / record linkage routines, with the aim of linking data from different sources, where those data refer to the same unit;
- prioritising, when two or more sources contain data for the same variable (with potentially different values).
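The two steps above can be sketched in code. The following is a minimal, illustrative sketch only, not part of the GSBPM itself: it assumes each source is a list of records (dicts) sharing a common "unit_id" key, and that listing sources in priority order is how conflicts between values for the same variable are resolved. Real record linkage would typically also involve probabilistic matching rather than an exact key.

```python
# Minimal sketch of deterministic record linkage and source prioritisation.
# Field names ("unit_id", "turnover", "employees") are illustrative.

def link_and_prioritise(sources):
    """Merge records referring to the same unit, preferring earlier sources."""
    linked = {}
    for source in sources:                       # sources listed in priority order
        for record in source:
            unit = linked.setdefault(record["unit_id"], {})
            for variable, value in record.items():
                # setdefault keeps the value from the higher-priority source.
                unit.setdefault(variable, value)
    return linked

survey = [{"unit_id": 1, "turnover": 120, "employees": 8}]
admin = [{"unit_id": 1, "turnover": 115}, {"unit_id": 2, "turnover": 40}]

merged = link_and_prioritise([survey, admin])
# Unit 1 keeps the survey turnover (higher priority); unit 2 comes from admin data.
```

The design choice here is that prioritisation is expressed purely by source ordering, which keeps the conflict-resolution rule visible in one place.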
- the identification of potential errors and gaps;
- the selection of data to include or exclude from imputation routines;
- imputation using one or more pre-defined methods (e.g. "hot-deck" or "cold-deck");
- writing the imputed data back to the data set, and flagging them as imputed;
- the production of metadata on the imputation process.
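The steps above can be illustrated with a simple hot-deck sketch, in which missing values are replaced by values drawn from responding units in the same data set and the imputed values are flagged. This is an assumption-laden illustration: the variable name "turnover", the use of `None` for gaps, and random donor selection are all choices made for the example, not prescribed by the GSBPM.

```python
import random

# Illustrative hot-deck imputation with flagging of imputed values.
# Missing values are represented as None; donors are responding units.

def hot_deck_impute(records, variable, rng=None):
    rng = rng or random.Random(0)       # seeded for reproducibility in this sketch
    donors = [r[variable] for r in records if r[variable] is not None]
    for r in records:
        if r[variable] is None:                   # identified gap
            r[variable] = rng.choice(donors)      # draw from a donor record
            r[f"{variable}_imputed"] = True       # flag as imputed (process metadata)
        else:
            r[f"{variable}_imputed"] = False
    return records

data = [{"turnover": 100}, {"turnover": None}, {"turnover": 80}]
imputed = hot_deck_impute(data, "turnover")
# Every record now carries a value and an imputation flag.
```

Writing the flag alongside the value corresponds to the last two bullets: imputed data are written back to the data set and metadata on the imputation are produced.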
- checking that the population coverage and response rates are as required;
- comparing the statistics with previous cycles (if applicable);
- confronting the statistics against other relevant data (both internal and external);
- investigating inconsistencies in the statistics;
- performing macro editing;
- validating the statistics against expectations and domain intelligence.
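A simple macro-editing check of the kind listed above can be sketched as a comparison of current aggregates against the previous cycle, flagging movements beyond a tolerance. The 15% threshold and the domain names are illustrative assumptions; in practice tolerances are set from domain intelligence.

```python
# Hedged sketch of a macro-editing check: flag domains whose aggregate
# moved more than `tolerance` relative to the previous cycle.

def flag_suspicious(current, previous, tolerance=0.15):
    flags = {}
    for domain, value in current.items():
        prior = previous.get(domain)
        if prior:
            change = (value - prior) / prior      # relative movement
            flags[domain] = abs(change) > tolerance
    return flags

current = {"retail": 105.0, "manufacturing": 140.0}
previous = {"retail": 100.0, "manufacturing": 100.0}
flags = flag_suspicious(current, previous)
# retail moved 5% (within tolerance); manufacturing moved 40% (flagged).
```

A flagged domain would then be investigated for inconsistencies, as the bullets describe, rather than automatically corrected.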
- completing consistency checks;
- determining the level of release, and applying caveats;
- collating supporting information, including interpretation, briefings, measures of uncertainty and any other necessary metadata;
- producing the supporting internal documents;
- pre-release discussion with appropriate internal subject matter experts;
- approving the statistical content for release.
- formatting data and metadata ready to be put into output databases;
- loading data and metadata into output databases;
- ensuring data are linked to the relevant metadata.
- preparing the product components (explanatory text, tables, charts etc.);
- assembling the components into products;
- editing the products and checking that they meet publication standards.
- maintaining catalogues of data and metadata archives, with sufficient information to ensure that individual data or metadata sets can be easily retrieved;
- testing retrieval processes;
- periodic checking of the integrity of archived data and metadata;
- upgrading software-specific archive formats when software changes.
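The periodic integrity checking mentioned above is commonly implemented with checksums. The following is a minimal sketch under stated assumptions: the file name, content, and catalogue structure are invented for the example; it records a SHA-256 digest at archiving time and re-computes it during a later check.

```python
import hashlib

# Sketch of checksum-based integrity checking for archived data.

def checksum(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def verify(entry: dict, data: bytes) -> bool:
    """Re-compute the checksum and compare with the catalogue entry."""
    return checksum(data) == entry["sha256"]

# At archiving time: record the checksum in the catalogue entry.
archived = b"unit_id;turnover\n1;120\n2;40\n"
catalogue_entry = {"file": "survey_2008.csv", "sha256": checksum(archived)}

# Periodic check: verify(catalogue_entry, archived) holds as long as the
# archived bytes are unchanged; any corruption changes the digest.
```

Storing the digest in the catalogue ties the integrity check to the same record used for retrieval, so a periodic sweep can walk the catalogue and re-verify each archived set.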
- identifying data and metadata for archiving in line with the rules defined in 8.1;
- formatting those data and metadata for the repository;
- loading or transferring data and metadata to the repository;
- cataloguing the archived data and metadata;
- verifying that the data and metadata have been successfully archived.
- identifying data and metadata for disposal, in line with the rules defined in 8.1;
- disposal of those data and metadata;
- recording that those data and metadata have been disposed of.
- Seeking and analysing user feedback;
- Reviewing operations and documenting lessons learned;
- Examining process metadata and other system metrics;
- Benchmarking or peer reviewing processes with other organizations.
Metadata handling:
i. Statistical Business Process Model: Manage metadata with a focus on the overall statistical business process model.
ii. Active not passive: Make metadata active to the greatest extent possible. Active metadata are metadata that drive other processes and actions. Treating metadata this way will ensure they are accurate and up-to-date.
iii. Reuse: Reuse metadata where possible for statistical integration as well as efficiency reasons.
iv. Versions: Preserve history (old versions) of metadata.
Metadata Authority:
i. Registration: Ensure the registration process (workflow) associated with each metadata element is well documented so there is clear identification of ownership, approval status, date of operation, etc.
ii. Single source: Ensure that a single, authoritative source ('registration authority') for each metadata element exists.
iii. One entry/update: Minimize errors by entering once and updating in one place.
iv. Standards variations: Ensure that variations from standards are tightly managed/approved, documented and visible.
Relationship to Statistical Cycle / Processes:
i. Integrity: Make metadata-related work an integral part of business processes across the organization.
ii. Matching metadata: Ensure that metadata presented to the end-users match the metadata that drove the business process or were created during the process.
iii. Describe flow: Describe metadata flow with the statistical and business processes (alongside the data flow and business logic).
iv. Capture at source: Capture metadata at their source, preferably automatically as a by-product of other processes.
v. Exchange and use: Exchange metadata and use them to inform both computer-based processes and human interpretation. The infrastructure for the exchange of data and associated metadata should be based on loosely coupled components, with a choice of standard exchange languages, such as XML.
Users:
i. Identify users: Ensure that users are clearly identified for all metadata processes, and that all metadata capturing will create value for them.
ii. Different formats: The diversity of metadata is recognized and there are different views corresponding to the different uses of the data. Different users require different levels of detail. Metadata appear in different formats depending on the processes and goals for which they are produced and used.
iii. Availability: Ensure that metadata are readily available and usable in the context of the users' information needs (whether internal or external).