Data Editing
Processing and Data Consistency
This activity comprises a set of tasks whose main purpose is to store the data collected in the census in an organized database, which will facilitate the analysis of the census results through tabulations. It has the following objectives:
- To implement a centralized process at the headquarters in the department of Lima to digitize, verify, and ensure the consistency of the data.
- To implement a secure network to process census records.
- To ensure computer security at the software, hardware, and communications levels.
- To carry out quality control.
- To use the internet to process, monitor, and supervise the activities and processes.
The functional organization of the data processing task comprises the following tasks:
Information Processing Systems:
This refers to the development of the programs needed to record the data, to code responses to open-ended questions (automatically and with assistance), to process the information and check its consistency (coverage, flows, ranges, arithmetic-logical relationships, and cross relationships), and to produce the tabulations, so that consistent and coherent information is available to meet the objectives of the statistical research.
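The kinds of consistency checks listed above (ranges, arithmetic-logical relationships, and flows) can be illustrated with a minimal sketch. The field names, limits, and rules below are hypothetical and chosen only for illustration; they are not taken from the census questionnaire or from the actual processing programs.

```python
# Hypothetical fields and limits, used only to illustrate a range check.
RANGES = {"age": (0, 120), "total_area_ha": (0.0, 100000.0)}


def check_record(rec: dict) -> list:
    """Return the list of consistency errors found in one record."""
    errors = []

    # Range check: each value must lie within its allowed interval.
    for field, (low, high) in RANGES.items():
        value = rec.get(field)
        if value is None:
            errors.append(f"{field}: missing")
        elif not low <= value <= high:
            errors.append(f"{field}: out of range")

    # Arithmetic-logical check: the parts must add up to the total.
    parts = rec.get("cultivated_ha", 0.0) + rec.get("uncultivated_ha", 0.0)
    total = rec.get("total_area_ha")
    if total is not None and abs(parts - total) > 1e-6:
        errors.append("total_area_ha: does not equal sum of parts")

    # Flow check: a follow-up question may only be answered
    # when the filter question allows it.
    if rec.get("has_livestock") == "no" and rec.get("cattle_count", 0) > 0:
        errors.append("cattle_count: answered despite has_livestock == 'no'")

    return errors
```

Returning a list of error labels, rather than a single pass/fail flag, lets later steps count errors by type.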
Quality control:
This refers to the quality assurance process applied to the questions processed in each batch of work. An algorithm automatically returns the error rates arising from omitted responses and from variable inconsistencies. The algorithm and the design of the basic reports that the system must issue to control the coverage of omitted data are provided by the Executive Directorate of Sampling and Sample Framework (DMMM).
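The DMMM algorithm itself is not described in this document. As a rough illustration of what an omission rate per batch could look like, the following sketch computes, for each question, the share of records in a batch with no response. The function name and the record layout (one dictionary per record) are assumptions.

```python
def omission_rates(batch: list, questions: list) -> dict:
    """Share of records in the batch with no response, per question."""
    if not batch:
        return {q: 0.0 for q in questions}
    rates = {}
    for q in questions:
        # A response counts as omitted when it is absent, None, or empty.
        omitted = sum(1 for rec in batch if rec.get(q) in (None, ""))
        rates[q] = omitted / len(batch)
    return rates
```

A batch whose rates exceed an agreed threshold could then be returned for review; any such threshold is not specified in the source.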
Consistency of Census Coverage and Variables:
This phase is carried out once a district is complete. It consists of verifying that the data are complete at all levels of aggregation (geographic breakdowns). If data are missing, the missing records are located and processed until the district is fully complete. Consistency rules are then applied to identify the types of errors present and to generate dynamic tables as input for the next phase.
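The coverage verification described above can be sketched as a comparison between the number of records expected in each geographic area and the number actually processed. The `area_code` field and the `expected` counts are hypothetical names introduced for this example.

```python
from collections import Counter


def missing_by_area(expected: dict, records: list) -> dict:
    """Return, per area code, how many expected records are still unprocessed."""
    processed = Counter(rec["area_code"] for rec in records)
    return {
        area: n_expected - processed[area]
        for area, n_expected in expected.items()
        if processed[area] < n_expected
    }
```

An empty result would indicate that every listed area has reached its expected count, which is the condition under which the consistency rules would then be applied.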
Coding: Questions for which codes are assigned automatically fall under "Automatic Coding".
For records to which the alias table is applied, the code is assigned automatically; this is also referred to as "Automatic Coding".
For records to which a specialist operator assigns the corresponding code, the process is referred to as "Assisted Coding".
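The split between the two coding modes can be sketched as a lookup in an alias table, with anything not found being routed to an operator. The alias entries and codes below are invented for illustration, and the normalization step (trimming and lowercasing) is an assumption about how responses might be matched.

```python
from typing import Optional, Tuple

# Hypothetical alias table: response text -> code (illustrative values only).
ALIAS_TABLE = {"papa": "0101", "patata": "0101", "maiz": "0102"}


def code_response(text: str) -> Tuple[Optional[str], str]:
    """Return (code, coding mode) for one open-ended response."""
    key = text.strip().lower()
    code = ALIAS_TABLE.get(key)
    if code is not None:
        return code, "Automatic Coding"
    # No alias matched: the record is left for a specialist operator.
    return None, "Assisted Coding"
```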
Imputation: This consists of detecting data errors and applying consistency rules so that special-purpose correction programs can automatically correct errors or inconsistencies in the data.
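As a minimal sketch of rule-based automatic correction, the function below applies two deterministic rules and logs each change. The rules and field names are hypothetical; the source does not state which imputation rules or methods the actual programs use.

```python
from typing import List, Tuple


def impute(rec: dict) -> Tuple[dict, List[str]]:
    """Return a corrected copy of the record and a log of the changes made."""
    fixed = dict(rec)  # leave the original record untouched
    log = []

    # Rule 1: if the filter question is "no", the follow-up count must be 0.
    if fixed.get("has_livestock") == "no" and fixed.get("cattle_count", 0) != 0:
        fixed["cattle_count"] = 0
        log.append("cattle_count set to 0")

    # Rule 2: a missing total is derived from its parts when both are present.
    parts = (fixed.get("cultivated_ha"), fixed.get("uncultivated_ha"))
    if fixed.get("total_area_ha") is None and None not in parts:
        fixed["total_area_ha"] = parts[0] + parts[1]
        log.append("total_area_ha derived from parts")

    return fixed, log
```

Keeping the original record and a change log makes each automatic correction traceable.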
Database:
- Standardized Database: During the data processing phase (coverage, structure, consistency, and imputation), the data will be organized in a standardized database. Standardization yields smaller data structures that are simpler and more stable, make data maintenance easier, avoid data redundancy, and protect data integrity.
- Final Database: This is the database generated from the fully corrected and consistent data; it is the one used to generate the reports and statistical tables. The anonymized database is planned to be made available to users through the MINAG and INEI web pages.
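Assuming that "standardized" here means a normalized relational schema, the following sketch shows how such a structure can avoid redundancy and protect integrity. Table and column names are hypothetical, and SQLite is used only because it ships with Python; the source does not name the database system.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a small normalized schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce references
    conn.executescript(
        """
        -- Each district is stored once, so its name is never repeated.
        CREATE TABLE district (
            district_code TEXT PRIMARY KEY,
            name          TEXT NOT NULL
        );
        -- Each holding refers to its district by code only.
        CREATE TABLE holding (
            holding_id    INTEGER PRIMARY KEY,
            district_code TEXT NOT NULL REFERENCES district(district_code),
            total_area_ha REAL NOT NULL CHECK (total_area_ha >= 0)
        );
        """
    )
    return conn
```

With this layout, a holding that points to a non-existent district is rejected by the database itself rather than by application code.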
Tabulation Generation
Tables will be generated consistently and coherently once the departmental databases are available, enabling the analysis of the final census results.
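A tabulation of this kind amounts to aggregating a variable over a grouping key. The sketch below sums a numeric field by group; the function and field names are assumptions and do not reflect the actual tabulation plan.

```python
from collections import defaultdict


def tabulate(records: list, by: str, value: str) -> dict:
    """Sum the `value` field over all records, grouped by the `by` field."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec[by]] += rec[value]
    # Sort by group key so the table has a stable row order.
    return dict(sorted(totals.items()))
```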
Documentation
This consists of organizing and ordering all documentation concerning the design and implementation of the processing systems. Every computer application used must be documented, detailing the functions it performs, the files it works with (input and output), the listings it produces, and its relationship to other components of the process.
Preparation of Reports
This refers to the preparation and management of reports that must be issued by those responsible for the processing tasks. These reports must be issued promptly for each task so that the progress of the activities can be monitored.
This task also includes the preparation of a final report in which everything carried out during the process will be presented in detail, which will constitute a reference document for the development of similar activities in the future.