MEIRU data processing description

Paper questionnaire-based data collection

 image for paper based data collection

Forms are printed and allocated to field interviewers; depending on the study requirements, these may be blank or pre-filled with some participant information. The forms are then administered to respondents in the field to gather responses. The completed forms are brought back to the data office, where they are double-entered into two separate MS Access databases. Discrepancies between the entered records in the two databases are resolved and one authoritative database is maintained. For some questions in the forms, interviewers enter responses as free text. Data officers then use lookup tables to assign appropriate codes to the free-text responses.
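
As an illustration only, the double-entry comparison and the lookup-table coding described above could be sketched in Python as follows. MEIRU's actual tools are built in MS Access and VBA; the function names, field names, form IDs and codes here are hypothetical and are not taken from any MEIRU system.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]


def find_discrepancies(
    first_entry: Dict[str, Record],
    second_entry: Dict[str, Record],
) -> List[Tuple[str, str, object, object]]:
    """Return (form_id, field, first_value, second_value) for every mismatch."""
    discrepancies = []
    for form_id in sorted(set(first_entry) | set(second_entry)):
        rec_a = first_entry.get(form_id)
        rec_b = second_entry.get(form_id)
        if rec_a is None or rec_b is None:
            # The form was keyed into only one of the two databases.
            discrepancies.append((form_id, "<whole record>", rec_a, rec_b))
            continue
        for field in sorted(set(rec_a) | set(rec_b)):
            if rec_a.get(field) != rec_b.get(field):
                discrepancies.append(
                    (form_id, field, rec_a.get(field), rec_b.get(field))
                )
    return discrepancies


def code_free_text(response: str, lookup: Dict[str, int], uncoded: int = -1) -> int:
    """Map a free-text response to a code, ignoring case and outer whitespace."""
    return lookup.get(response.strip().lower(), uncoded)


# Hypothetical example data: the same two forms keyed by two data officers.
first = {
    "F001": {"age": "34", "occupation": "Farmer"},
    "F002": {"age": "27", "occupation": "teacher"},
}
second = {
    "F001": {"age": "43", "occupation": "Farmer"},
    "F002": {"age": "27", "occupation": "teacher"},
}
occupation_codes = {"farmer": 1, "teacher": 2}  # hypothetical lookup table

mismatches = find_discrepancies(first, second)
for form_id, field, value_a, value_b in mismatches:
    print(f"{form_id}: {field} differs ({value_a!r} vs {value_b!r})")
```

Each reported mismatch would be checked against the paper form before the authoritative record is updated. Responses that do not appear in the lookup table receive the placeholder code so that a data officer can review them by hand.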

After double entry and resolution of any discrepancies between the records, the data are further validated using a standard set of checks defined by a senior data manager and the researchers responsible for the project. Identified errors are resolved by the data officer working on the data. If needed, revisits to the field are made to verify responses with the interviewees. The data are then made available to the researcher for monitoring and/or to the data scientist for consistency checks against other existing data and, where needed, merging with data from other studies.
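
A minimal sketch of what a standard set of validation checks might look like is given below, assuming each check is a named rule applied to every record. The rules, field names and sample records are hypothetical; the real checks are defined per study by the data manager and researchers.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]
Check = Tuple[str, Callable[[Record], bool]]

# Hypothetical checks: each pairs an error message with a rule that
# returns True when the record passes.
CHECKS: List[Check] = [
    (
        "age must be between 0 and 120",
        lambda r: isinstance(r.get("age"), int) and 0 <= r["age"] <= 120,
    ),
    ("sex must be 'M' or 'F'", lambda r: r.get("sex") in {"M", "F"}),
    ("interview date must be present", lambda r: bool(r.get("interview_date"))),
]


def validate(records: List[Record], checks: List[Check] = CHECKS) -> List[Tuple[object, str]]:
    """Return (form_id, message) for every check that a record fails."""
    errors = []
    for record in records:
        for message, passes in checks:
            if not passes(record):
                errors.append((record.get("form_id"), message))
    return errors


records = [
    {"form_id": "F001", "age": 34, "sex": "F", "interview_date": "2020-01-15"},
    {"form_id": "F002", "age": 150, "sex": "X", "interview_date": ""},
]

errors = validate(records)
for form_id, message in errors:
    print(f"{form_id}: {message}")
```

Keeping the checks in a single list means the same list can be reviewed by the researchers and run unchanged after each round of cleaning.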

The data scientist then makes the analysis-ready or the raw datasets available to the researchers responsible for the project for analysis. Any issues identified in the data by the researchers are reported to the data officers for cleaning. When there are no further edits needed to the data, they are transferred to the archiving database.

A suite of tailored VBA programs and MS Access forms is used to administer the various data collection, entry, validation, and cleaning activities for a study. Properly anonymized subsets of the data will be prepared and made available to bona fide third-party researchers.
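
The text does not say how the anonymized subsets are produced. As one possible approach, assumed here purely for illustration, direct identifiers can be dropped and the participant ID replaced with a keyed hash so that records remain linkable within the shared subset. The field names, the secret key and the helper functions are all hypothetical, and this step alone is pseudonymisation: a real release would also need review of indirect identifiers such as dates and locations.

```python
import hashlib
import hmac
from typing import Dict, List

# Hypothetical list of fields that identify a person directly.
DIRECT_IDENTIFIERS = {"name", "phone", "village"}


def pseudonymise_id(participant_id: str, secret_key: bytes) -> str:
    """Replace an ID with a keyed SHA-256 digest (HMAC), truncated for readability."""
    digest = hmac.new(secret_key, participant_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def anonymised_subset(
    records: List[Dict[str, object]],
    secret_key: bytes,
    keep_fields: List[str],
) -> List[Dict[str, object]]:
    """Keep only the requested non-identifying fields plus a pseudonymous ID."""
    subset = []
    for record in records:
        row = {
            field: record[field]
            for field in keep_fields
            if field in record and field not in DIRECT_IDENTIFIERS
        }
        row["pid"] = pseudonymise_id(str(record["participant_id"]), secret_key)
        subset.append(row)
    return subset


# Hypothetical example data and key; a real key would be stored securely.
people = [
    {"participant_id": "P001", "name": "A. Banda", "age": 34, "bmi": 22.5},
    {"participant_id": "P002", "name": "C. Phiri", "age": 51, "bmi": 27.1},
]
shared = anonymised_subset(people, b"example-secret", ["age", "bmi", "name"])
```

In this sketch, a field that is requested but listed as a direct identifier is silently excluded, so a mistake in the request cannot leak names into the shared file.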

Electronic data collection

So far, MEIRU has used Open Data Kit, REDCap, SurveyCTO and other platforms chosen by collaborators.

1. Open Data Kit (ODK)


                              image for electronic based data collection

Participants are allocated to interviewers, who conduct the interviews on a tablet or phone loaded with the correct form. Depending on the requirements of the study, forms may be blank or pre-filled with participant information. At the end of each day of fieldwork, the tablets are brought to the office and connected to the server through the Local Area Network so that the data can be copied from the tablets to the servers. Bespoke middleware developed in VBA or C# is used to transfer data from the ODK MySQL databases into MEIRU's main MS Access database systems. Once the data are in MS Access, standard validations defined by the designated data manager and the researchers are performed on the data to identify and correct errors. The validations and cleaning are done by a designated data officer. The data officer liaises with the interviewers and researchers for the project as needed in cleaning the data. Where necessary, the interviewer may go back to the field to resolve errors that cannot be resolved from the data office.

The data are then made available to the researcher for monitoring and/or to the data scientist for consistency checks against other existing data and, where needed, merging with data from other studies. The data scientist then makes the analysis-ready or raw datasets available to the researchers responsible for the project for analysis. Any issues identified in the data by the researchers are reported to the data officers for cleaning. When there are no further edits needed to the data, they are transferred to the archiving database. Properly anonymized subsets of the data will be prepared and made available to bona fide third-party researchers.

2. SurveyCTO

     image for paper based data collection
  1. Whenever an interviewer finalises a form, that form is immediately encrypted using the public key (which is created by the SurveyCTO admin).
    • That means that even if a completed form stays on the tablet without being uploaded, e.g., due to limited connectivity, it cannot be read/accessed by a third party (or even by the interviewer).
  2. Completed forms are then transmitted to the server using the SSL protocol, which adds a second layer of encryption (double encryption).
  3. The SurveyCTO admin decrypts the data using the private key created earlier and downloads them from the server using the SurveyCTO desktop application. The data are downloaded in the form of a CSV file.
  4. Decrypted data are transferred into MEIRU's main MS Access databases using bespoke VBA/C# middleware.
  5. Once the data are in MS Access, standard validations defined by the designated data manager and the researchers are performed on the data to identify and correct errors. The validations and cleaning are done by a designated data officer.
  6. The data officer liaises with the interviewers and researchers for the project as needed in cleaning the data. Where necessary, the interviewer may go back to the field to resolve errors that cannot be resolved from the data office.
  7. The data are then made available to the researcher for monitoring and/or to the data scientist for consistency checks against other existing data and, where needed, merging with data from other studies.
  8. The data scientist then makes the analysis-ready or raw datasets available to the researchers responsible for the project for analysis.
  9. Any issues identified in the data by the researchers are reported to the data officers for cleaning.
  10. When there are no further edits needed to the data, they are transferred to the archiving database.
  11. During the data collection phase, the SurveyCTO admin downloads data from the servers and transfers them to the MEIRU databases weekly or at another agreed frequency. Copies of the data on the SurveyCTO servers will be deleted. Properly anonymized subsets of the data will be prepared and made available to bona fide third-party researchers.
  12. For more on encryption in SurveyCTO, see the SurveyCTO documentation.
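
Steps 3, 4 and 11 above (downloading a decrypted CSV file and loading it into the main database at an agreed frequency) could be sketched roughly as follows. This is not MEIRU's middleware, which is written in VBA/C# and targets MS Access: the sketch uses Python with an in-memory SQLite database as a stand-in, and the table name, column names and sample rows are hypothetical. It assumes each submission carries a unique identifier so that repeated downloads do not create duplicate rows.

```python
import csv
import io
import sqlite3


def load_submissions(csv_text: str, conn: sqlite3.Connection) -> int:
    """Load CSV rows into a staging table; return how many new rows were added."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS submissions ("
        "submission_id TEXT PRIMARY KEY, participant_id TEXT, age TEXT)"
    )
    before = conn.execute("SELECT COUNT(*) FROM submissions").fetchone()[0]
    rows = [
        (r["submission_id"], r["participant_id"], r["age"])
        for r in csv.DictReader(io.StringIO(csv_text))
    ]
    # INSERT OR IGNORE skips rows whose submission_id is already present,
    # so overlapping weekly downloads are safe to load again.
    conn.executemany("INSERT OR IGNORE INTO submissions VALUES (?, ?, ?)", rows)
    conn.commit()
    after = conn.execute("SELECT COUNT(*) FROM submissions").fetchone()[0]
    return after - before


# Hypothetical weekly exports; the second overlaps with the first.
week1 = "submission_id,participant_id,age\ns-001,P001,34\ns-002,P002,51\n"
week2 = "submission_id,participant_id,age\ns-002,P002,51\ns-003,P003,29\n"

conn = sqlite3.connect(":memory:")
added_week1 = load_submissions(week1, conn)
added_week2 = load_submissions(week2, conn)
total = conn.execute("SELECT COUNT(*) FROM submissions").fetchone()[0]
print(added_week1, added_week2, total)
```

Making the load idempotent in this way means the order and frequency of downloads do not affect the final contents of the staging table, which then feeds the validation and cleaning steps.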