Backcheck Protocol
Overview
All ARCED projects are required to perform backchecks for their surveys as part of quality control. This manual serves as a comprehensive guide to backchecks. It contains ARCED’s protocols and best practices on how to implement a backcheck, design the questionnaire, and analyze the data to take appropriate actions based on the results.
💡Tip: Apart from some special cases, backcheck isn’t optional; it should be integrated into the project plan from the get-go.
What is a Backcheck?
Background
Even with the best intentions, many issues can go unaddressed and mistakes can be made. This affects the overall quality of the data and ends up affecting the project and its findings. In some cases, enumerators might also cut corners or use shortcuts while conducting a survey, knowingly or unknowingly. These shortcuts might include:
- Skipping sections
- Failing to prompt properly (leading or probing questions)
- Modifying scripts, examples or entire questions.
Sometimes, it might also be the case that the questionnaire is not performing well in the field, i.e. the answers of the respondents keep changing or key outcomes vary every time the question is asked. In these cases, moderation and updating of the questionnaire might be required in order to collect high-quality data.
In order to disincentivize the enumerators from cutting corners and also to methodologically monitor how the questionnaire is performing, a backcheck survey is conducted.
⭐Note: The goal of the backcheck survey is to assess enumerator performance, question performance and data quality – not to correct “wrong answers”.
Method
In a back-check survey, a highly dependable enumerator with a similar profile to the main survey team visits the respondent a second time after the main survey is conducted (a few hours or a few days later depending on the survey). The surveyor conducts a mini-survey with the respondent that includes important questions and key outcome variables in order to check back with the respondent’s original responses, ensuring accuracy and accountability. For this reason, a back-check survey is also sometimes referred to as a field audit or reinterview.
When not to conduct backcheck surveys?
Although back-check surveying is an essential tool for quality assurance, there are some scenarios where it is not appropriate to conduct back-check surveys. These might include:
- In small-scale surveys with lower risk and higher supervision on the field directly, for example: listing surveys, pilot surveys, etc. However, this can change considering the size and difficulty of the questionnaire.
- In surveys where the tracking of the respondent is not possible, for example: the survey is conducted with passersby in a particular location.
- In surveys that only ask time-sensitive questions such as activities or goods consumed in the past week.
- In surveys that are randomized in a way that asking the question a second time might affect the treatment.
- In highly sensitive situations where ethical considerations need to be prioritized, for example, an interview of a violence victim.
- In surveys where data is collected automatically via IoT devices or from secondary sources. In these cases, different methods for quality control need to be adopted.
Designing the Backcheck Survey
Key Considerations
This section is a how-to for backcheck planning. It outlines the key considerations to take into account while designing and implementing the backcheck survey:
Considerations | Issue | Best Practice |
General | | |
How many backchecks will be conducted? | | |
When to start the backcheck survey and when should they be conducted? | | |
How long should the backcheck survey be? | | |
Quality Control | | |
How do you select the sample? | | |
How many versions of the survey should there be? | | |
Attrition | | |
What should be done about missing and replacement respondents? | | |
How many times will your backcheck team visit a hard-to-find respondent? | | |
Logistics & Team Structure | | |
How should the backcheck surveyors be selected and how to train them? | | |
How will you spread them across your enumeration areas, surveyors, and survey period? | | |
Budget | | |
How do I budget for backchecks? | | |
Selecting Respondents
Backcheck respondents should always be selected randomly. This can be done either by writing a Stata do file or directly through SurveyCTO. For shorter surveys, the backcheck sample should be stratified by surveyor, and each enumerator should be checked consistently across the enumeration period. For larger surveys, stratification is not required.
Some randomization Stata code examples are provided here: (https://github.com/ARCED-Foundation/code-samples/tree/master/Stata/Randomization%20Code%20Examples). You may have to separately add missing or replaced respondents to the list. Do not assign backcheck samples before the actual survey takes place or is attempted.
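For readers who do not use Stata, the same idea can be sketched in Python. This is only an illustration and is not taken from the linked repository: the function name `select_backchecks`, the 15% share, the seed, and the household and enumerator IDs are all assumptions. It draws a fixed share of completed surveys within each enumerator, so every enumerator is covered.

```python
import math
import random
from collections import defaultdict


def select_backchecks(surveys, share=0.15, seed=20240101):
    """Randomly pick a share of completed surveys within each enumerator.

    surveys: list of (hhid, enumerator) tuples for surveys already completed.
    share and seed are illustrative assumptions, not ARCED-mandated values.
    """
    rng = random.Random(seed)  # fixed seed keeps the draw reproducible
    by_enum = defaultdict(list)
    for hhid, enumerator in surveys:
        by_enum[enumerator].append(hhid)

    selected = []
    for enumerator in sorted(by_enum):
        # Sort so the draw does not depend on the input order.
        hhids = sorted(by_enum[enumerator])
        # Round up and keep at least one backcheck per enumerator.
        n_pick = max(1, math.ceil(share * len(hhids)))
        selected.extend(rng.sample(hhids, n_pick))
    return sorted(selected)


# Hypothetical completed surveys: 40 households spread over 4 enumerators.
surveys = [(f"HH-{i:03d}", f"enum_{i % 4}") for i in range(40)]
picked = select_backchecks(surveys)
```

Because the selection runs on completed surveys only, it is consistent with the rule of not assigning backcheck samples before the survey is attempted. Missing or replaced respondents would still need to be added to the list separately.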
This can also be done in SurveyCTO directly. For this, a variable needs to be created in the form for each respondent which randomly assigns them for backcheck. For example:
type | name | relevance | calculation |
calculate | bc_rand | | once(random()) |
calculate | bc_selected | | if(${bc_rand}<=.15, 1, 0) |
There is no direct way to create enumerator-wise stratification in SurveyCTO.
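To see what this form logic implies, the following Python sketch imitates it outside SurveyCTO. It is an illustration under assumptions: `simulate_bc_selected`, the seed, and the respondent count are hypothetical, and Python's random generator only stands in for SurveyCTO's `random()`. Each respondent gets one independent uniform draw and is selected when the draw is at most 0.15.

```python
import random


def simulate_bc_selected(n_respondents, threshold=0.15, seed=1):
    """Imitate the bc_rand / bc_selected calculate fields for many respondents."""
    rng = random.Random(seed)
    flags = []
    for _ in range(n_respondents):
        bc_rand = rng.random()  # stands in for once(random())
        # Mirrors if(${bc_rand}<=.15, 1, 0)
        flags.append(1 if bc_rand <= threshold else 0)
    return flags


flags = simulate_bc_selected(1000)
share_selected = sum(flags) / len(flags)  # close to 0.15, but not exactly 0.15
```

Since every draw is independent, the selected share is only approximately 15%, and the number of backchecks per enumerator is not controlled.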
This variable should be published into a dataset so that it is updated in real-time and then added as a preload for the backcheck form. For more information on SurveyCTO server datasets:
💡Tip: Make sure to provide any pertinent information about missing or replacement respondents the back-checker may need to know.
Field Materials
Do’s
- Provide a list of communities and respondents along with information on how to find them. This can be given directly in the SurveyCTO form or print-outs if necessary.
- Create an amended consent and introduction highlighting this as a follow-up [survey for data quality reasons].
- Provide copies of research authorizations, local surveying approvals, etc.
- Prepare paper and other materials for training and fieldwork similar to the main team.
Don’ts
- Do not provide them with the original responses, or with the treatment assigned if the study is an impact evaluation, RCT, or experiment.
- Do not give them information on whether the respondent was considered missing or replaced in the original survey.
Analysis Framework
The backcheck data should be analyzed daily to find errors, discrepancies, and difficulties as part of ARCED dataflow. In order to do so, a variety of questions of three distinct types should always be included in the backcheck survey, along with some questions for identifying the respondent.
Identifying Respondent And Interview Information
Always include the following questions to confirm if the interview occurred and provide details that may help explain discrepancies, including location, who was present, and date:
- Identifying information for the respondent to make sure you have the right person and the person was surveyed before.
- If the respondent was read the full informed consent statement, and whether they understood it
- If the respondent received the gift or incentive, if there was one
- How comfortable the respondent felt with the surveyor
- If the surveyor was polite
- If the surveyor answered all the respondents’ questions
- The GPS location of the back-check interview
- For sensitive questions, note who was present during the backcheck interview
- If the backcheck cannot be done, reasons why
- If a missing respondent, whether the person was absent on the day of the survey
- If there are anthropometric measures, behavioral games, or health tests, were they asked to complete them?
Question Types
Based on the acceptable rate of error, the type of questions asked, and the stability of measures, backcheck variables have been grouped into three categories by long-term practitioners in the field. This categorization has been widely accepted by organizations such as J-PAL, IPA, and World Bank DIME. ARCED Foundation has also adopted it in its backchecking process.
Type 1
Check whether the surveyors are a) performing the interview and b) with the right respondent.
These are questions whose answers should never change, regardless of interviewer, location or time of day. A discrepancy on this type of question is a major red flag that indicates the interview may not have occurred. Examples of these questions include:
- Gender
- Household size (number of members in household)
- Age (within a certain range. Age can be tricky for older adults, especially in the context of rural Bangladesh, for example)
- Facts (marriage, education, number of children, etc.)
- Sample Frame (bank customer, school student)
The acceptable error rate for these variables is 5-10%. We can do a signed-rank test to understand the significance of the error.
⭐Note: Based on the survey context (for example, older adults in rural Bangladesh), sometimes variables like age, year of marriage, etc. can be type-2 variables.
Type 2
Assess how well the surveyors are administering the survey.
These questions assess how well the surveyors are following survey protocols. The responses to these questions are unlikely to change, but they are questions where the team will be tempted to cut corners. These may have been difficult for surveyors to understand or to administer due to complexity or sensitivity, including:
- Categorization questions (i.e. the surveyor categorizes the respondent’s answer)
- Questions with a lot of examples
- Skip Questions
For these questions, it is difficult to determine whether errors are due to the respondent or the surveyor, as the questions are hard to understand. The acceptable error rate for these variables is 10-15%. We can do a signed-rank test to understand the significance of the error.
Type 3
Check the stability of your measures on key outcomes.
These are questions where you want to understand the stability of the measure, such as key outcomes, interaction, or stratification variables that are integral to understanding the intervention. These questions aim to understand the survey, not the surveyor, performance. These include:
- Income or consumption models
- Profits, revenues or costs
- Quantities of inputs or goods
- Labor supply or plot size
- Scales, preferences, or opinions
PIs should be consulted when selecting these questions. The acceptable error rate for these variables is 15-20%. We can do a t-test to understand the significance of the error.
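The error-rate ranges for the three types can be applied mechanically. The sketch below is a minimal Python illustration, not the ARCED dataflow or ipabcstats implementation: it treats the upper bound of each range as the flagging threshold (an assumption), and the variable names and data are hypothetical.

```python
# Upper bounds of the error-rate ranges quoted for each question type.
# Using the upper bound as the flagging threshold is an assumption.
MAX_ERROR_RATE = {1: 0.10, 2: 0.15, 3: 0.20}


def error_rate(pairs):
    """Share of (survey, backcheck) value pairs that disagree."""
    compared = [(s, b) for s, b in pairs if s is not None and b is not None]
    if not compared:
        return None  # nothing to compare for this variable
    return sum(1 for s, b in compared if s != b) / len(compared)


def flag_variables(data, var_types):
    """Return variables whose error rate exceeds the bound for their type."""
    flagged = {}
    for var, pairs in data.items():
        rate = error_rate(pairs)
        if rate is not None and rate > MAX_ERROR_RATE[var_types[var]]:
            flagged[var] = rate
    return flagged


# Hypothetical paired answers: (main survey value, backcheck value).
data = {
    "gender": [("F", "F")] * 19 + [("F", "M")],   # 1 of 20 differ
    "occupation_cat": [(1, 1)] * 16 + [(1, 2)] * 4,  # 4 of 20 differ
    "plot_size": [(2, 2)] * 17 + [(2, 3)] * 3,    # 3 of 20 differ
}
var_types = {"gender": 1, "occupation_cat": 2, "plot_size": 3}
flagged = flag_variables(data, var_types)
```

Here only `occupation_cat` is flagged, because 4 of 20 mismatches exceeds the Type 2 bound. For continuous Type 3 variables, an exact-match comparison may be too strict, so a tolerance around the original value is a reasonable refinement.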
⭐Note: The reliability ratio of the question is also a good measure of how stable the question is across time and interviewer.
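One common formulation from the reinterview literature defines the reliability ratio as one minus the simple response variance divided by the total variance. The Python sketch below follows that formulation as an assumption; the exact formula used by ipabcstats may differ, and the income figures are hypothetical.

```python
import statistics


def reliability_ratio(survey, backcheck):
    """1 - (simple response variance / total variance) for paired values.

    Assumed formulation, not necessarily the one used by ipabcstats.
    """
    n = len(survey)
    # Simple response variance: half the mean squared difference between visits.
    srv = sum((s - b) ** 2 for s, b in zip(survey, backcheck)) / (2 * n)
    # Total variance estimated as the average of the two visits' variances.
    total_var = (statistics.pvariance(survey) + statistics.pvariance(backcheck)) / 2
    if total_var == 0:
        return None  # no variation, so the ratio is undefined
    return 1 - srv / total_var


# Hypothetical monthly income reported in the main survey and the backcheck.
survey = [100, 200, 300, 400]
backcheck = [110, 190, 310, 390]
ratio = reliability_ratio(survey, backcheck)
```

A ratio close to 1 suggests the measure is stable across visits, while a low ratio suggests that much of the observed variation is noise. This also illustrates why questions with little variation are poor backcheck candidates: the total variance in the denominator becomes very small.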
Do's and Don’ts:
Do’s
- During questionnaire design, create an analysis framework and make some preliminary decisions on what to consider an error.
- Record discrepancies, maintain a log of errors, and set realistic expectations with clear action steps.
- During the initial weeks, be very involved in order to identify errors & discrepancies early.
- Aim to integrate this into your master tracking system to keep all field statistics in the same place.
Don’ts
- Avoid asking subjective questions such as preferences, rankings, or hypothetical judgements as the response might change.
- Avoid time-sensitive questions as they can change without indicating an error.
- Don’t pick questions that have little variation as this decreases the probability of detecting an error.
Coding for Backcheck
Backcheck analysis is included in ARCED dataflow, and the analysis can be executed directly by running the code. ARCED dataflow uses the ipabcstats Stata command for the backcheck analysis. Some small adjustments need to be made to the setup.do file for this. The steps are listed below:
Step-1: Turn on the switch for backcheck in the actions section, i.e. change it from 0 to 1.
Step-2: In the file path section, change the location of the bcdata file to where you want to store it. It should look like this:
Step-3: Fill up the backcheck survey information as necessary. The details of what this should contain are provided in the setup.do file.
Backcheck Report
Once the code for Backcheck has been run, it will produce a report that looks like this:
The first sheet provides a summary of the error rate for each type of error throughout the survey period.
hhid | enumsl | bcsl | differences | # compared | % different |
OAA-3607 | Sharmeen | Ovee Biswas | 10 | 30 | 33.33% |
OAA-7849 | Ummay Ayesha | Mazajul Islam | 10 | 30 | 33.33% |
OAA-918 | Forsheda begum | Md Mustafa | 9 | 30 | 30.00% |
The third sheet lists and compares specific questions with the highest differences or discrepancies.
The fourth sheet lists enumerator stats based on the three types of variables.
The fifth sheet lists backchecker stats based on the three types of variables.
The final sheet lists variable stats showing which variable fell under what category along with differences, error rate and mean comparison test results if there were any.
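The differences, # compared and % different columns in the report rows can be reproduced by hand for a single household. The Python sketch below is only an illustration of that arithmetic, with hypothetical question names and answers; it is not how ipabcstats computes the report internally.

```python
def compare_survey(main, backcheck):
    """Count differing answers among variables present in both interviews."""
    shared = [var for var in main if var in backcheck]
    differences = sum(1 for var in shared if main[var] != backcheck[var])
    pct_different = 100 * differences / len(shared) if shared else 0.0
    return differences, len(shared), round(pct_different, 2)


# Hypothetical household with 30 backchecked questions, 10 answered differently.
main = {f"q{i}": 1 for i in range(1, 31)}
backcheck = dict(main)
for i in range(1, 11):
    backcheck[f"q{i}"] = 2

result = compare_survey(main, backcheck)  # (10, 30, 33.33)
```

With 10 differences out of 30 compared variables, the share is 10 / 30 = 33.33%, which matches the first row of the sample table.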
Action
Important considerations
- The results of the backcheck comparison should be discussed with the field leadership team. Only if the surveyor needs to be monitored, re-trained or fired should the results be addressed directly with the individual surveyor.
- Make an action plan and set clear standards when to fire surveyors, when to re-train, when to follow up with respondents, and when to re-do surveys.
- Decisions on re-doing surveys need to happen in consultation with PIs and generally only happen if surveys were systematically falsified.
- If you plan to redo surveys, important issues to consider include respondent fatigue, cost implications, and how you can mimic the conditions in which a respondent’s cohort was interviewed.
- In cases where a surveyor falsified only a specific portion of the survey, it is usually only necessary to re-administer that portion.
- It’s usually not necessary to change the dataset based on the back-check results, even if you identify that the original data is incorrect. If the number of errors is enough to make a statistically significant change in the data, you might need to redo the entire survey.
- Your action plan to respond to backcheck results is important to share with your PI. Some PIs are more interested than others, but all PIs should agree on when it’s necessary to take more serious measures in response to backcheck results (re-surveying, firing, etc).
💡Tip: Data correction should not be done based on backcheck.
💡Tip: You do not need to create an entirely separate dataflow for the backcheck survey. Just download the backcheck data while downloading the main survey data. For details, refer to the How to Go with the Flow (ARCED dataflow guide).
Action Plan
See the box below for an example of an action plan for dealing with discrepancies:
- In case of a discrepancy between the main survey and the backcheck, the audio audit should be checked, if available, to understand what the issues were.
- The enumerator comments should be checked to identify whether the issue was flagged beforehand. If necessary, the issue should be discussed with the enumerator and the backchecker in person.
- If the need arises, the field supervisor should be sent to the household in question to re-administer part of the survey or redo it completely.
- Additional surveys conducted by the same enumerator should be audited to ensure similar issues did not arise in other surveys conducted by them.
- In case of a first-time offense, the enumerator should receive a warning so that such instances are not repeated.
- In case of a second-time offense, the enumerator should receive another warning with salary deduction or other punitive measures as necessary.
- If there is no improvement even after repeated warnings and offenses keep recurring, this can lead to firing or termination of the contract.
It must be noted that discrepancies do not automatically mean a breach of instructions. Sometimes they arise from the survey instrument, inconsistent answers from the respondent, time sensitivity, human error, etc. Thus they should be dealt with diligently and with empathy.