Enterprise data standardization is expected to return value to the business through standardized data management, so the importance of data quality cannot be overemphasized. Some low-quality data is inevitable in this process: bulk data initialization, the spread of unprocessed historical data, and urgent business needs all affect the quality of the standard data coding library. What an enterprise can do is limit how often low-quality data is generated, discover it promptly, and deal with it effectively. Enterprise data quality management is therefore correctly understood not as preventing low-quality data altogether, but as using sound, effective, and professional management and technical support to reduce and control the rate at which low-quality data is produced and the proportion that persists, to discover and handle it promptly, and to keep the standard code library in good health.
The Data Quality Management Platform (DQMP) is the core standard component of SunwayWorld's information standardization and management integration platform solution (6P+2E+Mobile). It safeguards the data quality of the enterprise data standard library and provides transparent, visual analysis of the standard data code library. Because of the large volume of data, the complexity of the data, and the specialist knowledge that the data coding library demands, quality is difficult to guarantee manually. The standard data coding library should therefore be checked with professional quality management tools to find incomplete and abnormal (but real) data that needs processing, as well as repeated and noisy data that should be removed. A professional data quality management platform should provide data health analysis to steer data cleaning and governance, ensuring uniqueness, integrity, and consistency and improving data quality.
Detects constraints on data content related to classification codes and metadata, including measurement units, prefix symbols, suffix symbols, joining symbols, maximum and minimum values, associated values, and other constraints.
Detects the integrity of data content related to classification codes and metadata, including whether blank values are permitted, length checks, enumeration checks, association verification, and consistency checks.
Detects the similarity of data content related to classification codes and metadata, providing similarity detection methods based on edit distance and cosine similarity, at either the letter or the word level. In other words, similarity detection builds on similarity algorithms from lexical and grammatical analysis and makes use of the terminology of different industries.
Supports configuring soundness analysis parameters, routinely monitors and analyzes the standard coding library, produces status analysis reports for the various master data coding libraries based on the soundness parameter model, and provides a list of data to be processed, which serves as a basis for data cleaning.
Detects the uniqueness of data content related to classification codes and metadata to ensure that data is unique.
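The two similarity measures named above can be illustrated with a minimal Python sketch. It is not DQMP's implementation: the function names are hypothetical, the edit distance is the standard Levenshtein distance, and the cosine measure is computed over simple count vectors. Both functions accept either a string (letter-based comparison) or a list of tokens (word-based comparison).

```python
import math
from collections import Counter


def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or token lists)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def edit_similarity(a, b):
    """Scale edit distance to a score in [0, 1]; 1.0 means identical."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def cosine_similarity(a, b):
    """Cosine of the angle between the count vectors of two sequences."""
    va, vb = Counter(a), Counter(b)
    dot = sum(va[k] * vb[k] for k in va.keys() & vb.keys())
    sq_a = sum(v * v for v in va.values())
    sq_b = sum(v * v for v in vb.values())
    norm = math.sqrt(sq_a * sq_b)
    return 0.0 if norm == 0 else dot / norm


def tokens(text):
    """Word-level view of a description: lowercase, whitespace-separated."""
    return text.lower().split()


# Letter-based: compare the raw strings character by character.
letter_score = edit_similarity("hex bolt m8", "hex bolt m10")
# Word-based: compare token lists, so word order does not matter for cosine.
word_score = cosine_similarity(tokens("Hex Bolt M8"), tokens("bolt hex m8"))
```

Edit distance is sensitive to word order while the count-based cosine measure is not, which is one reason a platform might offer both. Industry terminology could be incorporated by replacing `tokens` with a domain-aware tokenizer; that step is left out here.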
Data quality control and configuration
Through the Data Quality Management Platform, data models of different types are configured with corresponding quality control and analysis parameters, so that standard data of each type is routinely monitored and managed for quality. This enables both exact and fuzzy duplicate checking across data and provides configurable data verification functions. The platform also supports verification and inspection of data uniqueness, completeness, and consistency.
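As an illustration of per-model configuration, the sketch below attaches a small rule set to one data model and checks records against it. The rule keys (`required`, `max_length`, `enum`, `min`, `max`), the `material` field names, and the `validate` function are all assumptions made for this example; they do not describe DQMP's actual configuration schema.

```python
# Hypothetical rule configuration for one data model ("material").
# Field names and rule keys are illustrative only.
MATERIAL_RULES = {
    "code":   {"required": True, "max_length": 10},
    "name":   {"required": True, "max_length": 40},
    "unit":   {"required": True, "enum": {"EA", "KG", "M"}},
    "weight": {"required": False, "min": 0.0, "max": 5000.0},
}


def validate(record, rules):
    """Return a list of (field, problem) tuples; an empty list means the record passed."""
    problems = []
    for field, rule in rules.items():
        value = record.get(field)
        if value is None or value == "":
            if rule.get("required"):
                problems.append((field, "blank not permitted"))
            continue  # nothing further to check on an empty value
        if "max_length" in rule and len(str(value)) > rule["max_length"]:
            problems.append((field, "too long"))
        if "enum" in rule and value not in rule["enum"]:
            problems.append((field, "not in enumeration"))
        if "min" in rule and value < rule["min"]:
            problems.append((field, "below minimum"))
        if "max" in rule and value > rule["max"]:
            problems.append((field, "above maximum"))
    return problems


good = {"code": "M-001", "name": "Hex bolt", "unit": "EA", "weight": 0.02}
bad = {"code": "M-002", "name": "", "unit": "BOX", "weight": -1}
good_problems = validate(good, MATERIAL_RULES)  # no violations
bad_problems = validate(bad, MATERIAL_RULES)    # blank name, unknown unit, negative weight
```

Keeping the rules as data rather than code means a different data model only needs a different dictionary, which is the general idea behind configurable verification.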
Supports configuring matching conditions for similar data.
Supports regular repeated-code inspection of master data and provides a list of repeated master data codes.
Supports exact duplicate checking and the configuration of duplicate checking rules.
Supports batch export of the list of repeated master data codes.
Supports establishing a unified approval process.
Supports publishing the list of repeated codes and collecting opinions: the list of repeated master data codes is announced only to the subsidiaries or business units that use the to-be-deleted master data in their business systems.
Provides various data inspection functions through configurable data inspection conditions.
Tracks how each business system processes the released list of repeated codes: establishes the mapping between repeated master data codes and tracks the business processing status of deleted master data (including the processing status of outstanding business and of the master data itself).
Supports approval, publication, opinion collection, release, and export of the list of repeated codes, and supports establishing data constraint rules.
Provides mandatory-field inspection.
Provides inspection of relationship fields.
Supports regular health analysis of master data and provides master data health analysis reports.
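A rough sketch of how exact duplicate checking, the mapping of repeated codes, and a simple health figure might fit together is given below. Everything in it is an assumption made for illustration: the function names, the normalization used for matching, the policy of keeping the first code in each group, and the definition of the health score as the share of non-redundant records. The platform's own rules and health model are configurable and are not specified here.

```python
from collections import defaultdict


def normalize(value):
    """Case- and whitespace-insensitive form used for exact matching."""
    return " ".join(str(value).lower().split())


def find_repeated_codes(records, key_fields):
    """Group the codes of records whose key fields match after normalization."""
    groups = defaultdict(list)
    for rec in records:
        key = tuple(normalize(rec[f]) for f in key_fields)
        groups[key].append(rec["code"])
    return [codes for codes in groups.values() if len(codes) > 1]


def build_mapping(repeated_groups):
    """Map each to-be-deleted code to the code that is kept.

    Assumption: the first code in each group is the one retained.
    """
    return {dup: codes[0] for codes in repeated_groups for dup in codes[1:]}


def health_score(total, repeated_groups):
    """Share of records that are not redundant copies, in [0, 1]."""
    redundant = sum(len(codes) - 1 for codes in repeated_groups)
    return 1.0 if total == 0 else (total - redundant) / total


# Made-up sample data; the duplicate checking rule here is "same name and spec".
records = [
    {"code": "M-001", "name": "Hex Bolt", "spec": "M8x30"},
    {"code": "M-002", "name": "hex  bolt", "spec": "m8x30"},
    {"code": "M-003", "name": "Hex Nut", "spec": "M8"},
]
groups = find_repeated_codes(records, ["name", "spec"])  # [["M-001", "M-002"]]
mapping = build_mapping(groups)                          # {"M-002": "M-001"}
score = health_score(len(records), groups)               # 2 of 3 records are non-redundant
```

The mapping is what a downstream business system would need in order to redirect references from a deleted code to the retained one.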