Slide 2
Data Quality: Assessment & Management
Welcome to our last session! During this session we will start discussing data quality. This is a key issue. If you can’t trust your data, you can’t trust anything about your M&E results. We’ll talk about five criteria we can use to evaluate the quality of our data, and briefly discuss legal and ethical considerations, of which we all need to be aware.
Slide 4
When we approach data quality, we will always have to balance it with costs. Realistically, we’re not going to be able to eliminate all threats to quality, but we can identify them and manage them well. In some cases, the best we can do will be to explicitly acknowledge that there is a threat that we can’t mitigate. That’s still important to do. In other cases, we’ll be able to take concrete steps to protect quality.
Data quality can be managed by systematically addressing five principal threats.
Slide 5
But before we discuss how to manage quality, let’s figure out what “quality” really means.
Can I borrow someone’s mobile phone? Actually, I need 4 volunteers to show me their phones. Thanks!
Ok, let’s examine these 4 phones. They are all just a bit different. Which one do you think is the ‘best’ mobile phone?
[Ask for several opinions] I like this one because it’s the biggest! Ok, WHY do you think this phone or that phone is the ‘best’?
[Ask for several reasons why one mobile phone may be ‘better’ or ‘worse’ than the others]
So you see, we all have slightly or greatly different ideas of what constitutes “quality”.
Slide 6
1. Create a definition of “quality” in groups of 3-4 people (10 min)
2. Present each group’s definition (10 min)
3. Work toward a consensus definition (30 min)
When it comes to data quality, we can’t all have different definitions. It’s important that we share an idea of what quality means in this context so that we can analyse, interpret, and share results of our M&E process in a way that generates useable information.
So, let’s focus on how we should define “quality,” not for smartphones, but for data. Here’s how the exercise will work:
• Now, we will divide up into groups of about 3-4 people [change as necessary according to the group size, space, etc.]
• Each group should produce a definition of “quality” as it applies to data. Be prepared to argue for your definition! You have 10 minutes to produce your argument, then we will come back together. Each group will present their definition and then we’ll see if we can produce a consensus.
[Use your blackboard, whiteboard or flip chart to capture the main points raised by each group, then guide the group discussion towards a consensus. In the next slides, you will present the definition that we use in this workshop, which you can compare with the definitions created by the students.]
Slide 8
Questions:
• Is there a relationship between the activity and what you are measuring?
• What is the data transcription process? What is the potential for error?
• Are steps being taken to limit transcription error?
Threats:
• Definitional Issues
• Proxy Measures
• Inclusions / Exclusions
• Data Sources
Validity means we are measuring what we have intended to measure or capture. We can’t assume that this is the case. It’s important to go back to the purpose of an indicator to be clear on what we are measuring and why.
Threats to validity include:
Definitional issues. Remember, any word in an indicator could be understood differently by different people, or in different contexts. Even superficially simple words, like “service,” “people,” “person,” or “number of” might not be clear enough.
Proxy measures: We’ve already mentioned proxy measures in our indicator session. Are we sure that the proxy we’ve selected is really getting at the indicator we want to measure?
Inclusions / Exclusions.
Data sources.
In the upper right-hand corner we see an indicator that Maria included in her M&E Framework. What is a threat to validity in the wording of this indicator?
What about the word “trained”? How is this defined? By people who passed a test? By people who attended all the days of the workshop? By people who attended one day? What would you do with this indicator to try to mitigate some of that possible confusion?
Slide 9
Questions:
• Is the same instrument used from year to year, from site to site?
• Is the same data collection process used from year to year, site to site?
• Are there procedures in place to ensure that data are free of significant errors and that bias is not introduced?
Threats:
• People
• Time
• Place
• Collection methodologies
• Instruments / Tools
• Personnel training
• Analysis and manipulation methodologies
What do we mean when we say that something is “reliable”?
Reliable measures are consistent. They measure the same thing, the same way, every time.
There are three factors to consider when we look at threats to reliability: people, places, and time.
These three factors can affect our ability to do things consistently.
Collection methodologies, when not standardized or understood in the same way across collection areas, can affect data reliability. People may use different collection methods, or use the same collection methods differently.
Collection instruments can also lead to inconsistent measures.
You can have data quality assurance measures in place, but if data collection personnel are not fully trained, a wide range of threats to the reliability of your data may still exist. Training personnel does not simply mean handing over an instruction sheet or providing an overview of the data collection tool. Thorough training is key.
Look at the upper right-hand part of the page. Maria’s workshop is held over two days, and full attendance is required in order to pass and receive the certificate. On the first day of the workshop, Maria passed this sign-in sheet around the classroom at 9am, when the first session started. She then collected it, took it back to her office, and transferred the names to an Excel spreadsheet. On the second day, her colleague taught the workshop. Instead of passing the sheet around at 9am, she marked the sheet herself at 2pm, when the group returned from the lunch break. Then, she delivered the sheet to Maria. What are some potential issues with the collection instrument they are using to measure attendance?
Slide 10
Questions:
• Are data available on a frequent enough basis to inform decisions?
• Is a regular schedule of data collection in place to meet management needs?
• Are data from within the reporting period of interest (i.e., are the data from a point in time after the activity has begun)?
• Are the data reported as soon as possible after collection?
Threats:
• Collection frequencies
• Reporting frequencies
• Time dependency
Data are collected for use. So, it only makes sense that we need to collect data within a timeframe in which they are useful. If data can’t be collected, reported or used while they are still relevant and useful, then it’s probably not worth the effort or resources to collect them at all.
Threats to timeliness include: Collection frequencies, which can be too frequent or not frequent enough in relation to the data, reporting, or decision making.
If we don’t leave enough time to collect data, we might be tempted to rush, leaving us open to error, but if we leave too much time, we might not capture the information we need.
Let’s think about Maria’s case. With the clinical trial set to start in two months, and the Foundation asking for an evaluation report, what is the timeframe you think she has to work with for collecting post-training work observations? How would you organize it?
Slide 11
Questions:
• Is the margin of error less than the expected change being measured?
• Are the margins of error acceptable for decision making?
• Have issues around precision been reported?
• Would an increase in the degree of accuracy be more costly than the increased value of the information?
Threats:
• Source error/bias
• Instrumentation error
• Transcription error
• Manipulation error
When we use data, we know that there will probably be some errors made at some point. This is something that we try to avoid, but it often occurs despite our best intentions. However, is the margin of error in the data less than the expected change the activity was intended to effect? If not, what are we measuring? An introduction of bias into the data likewise affects its accuracy.
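For facilitators who want a concrete illustration (this is not on the slide, and the numbers are invented): for a simple proportion, the margin of error at 95% confidence is approximately
\[ \mathrm{MoE} \approx 1.96\sqrt{\frac{p(1-p)}{n}} \]
With p = 0.5 and n = 100 respondents, MoE ≈ 1.96 × 0.05 ≈ 0.10, or about 10 percentage points. If the activity is only expected to move the indicator by 5 percentage points, that change cannot be distinguished from noise in the data.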
Threats to precision include:
Source error: the person or tool doing the measuring introduces bias into the measurement.
Instrumentation error: a measurement or data capture tool or method is designed in a way that introduces error.
Manipulation error: even something as simple as converting numbers into averages or percentages can introduce error.
Transcription error: copying data from one format or system to another is a classic way to introduce error.
What if Maria makes a mistake when she moves her sign-in sheet from a paper version to an Excel version? What could she do to avoid that?
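One common safeguard is double entry: transcribe the sign-in sheet twice, independently, and compare the two versions automatically. A minimal sketch in Python (the file names and single-column layout are assumptions for illustration, not part of Maria’s actual process):

```python
import csv

def load_names(path):
    """Read a single column of attendee names from a CSV transcription."""
    with open(path, newline="", encoding="utf-8") as f:
        return [row[0].strip().lower() for row in csv.reader(f) if row]

# Two independent transcriptions of the same paper sign-in sheet (hypothetical files).
first = load_names("attendance_entry1.csv")
second = load_names("attendance_entry2.csv")

if len(first) != len(second):
    print("The two transcriptions have different numbers of rows -- recount the sheet")

# Flag any row where the two transcriptions disagree so it can be
# checked against the original paper sheet before analysis.
for i, (a, b) in enumerate(zip(first, second), start=1):
    if a != b:
        print(f"Row {i}: '{a}' vs '{b}' -- verify against the paper sheet")
```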
Slide 12
Questions:
• Are there measures in place to detect and correct potential manipulations of the data?
• Are the objectives realistically achievable given the resources used?
• Are there ways to ensure that tools designed to collect the data are available as needed and that the filled forms and databases are kept safely?
Threats:
• Time
• Technology
• Temptation
• Corruption
• Intentional or unintentional
• Personal manipulation
• Technological failures
• Lack of audit verification and validation
When we talk about integrity in data quality, we are talking about the truthfulness of data.
Threats to data integrity can occur either willfully, or by accident. There are three key threats: time, temptation, and technology.
What could be some incentives to produce data that are false? What if Maria’s GCP course ends and it turns out that no one has passed? The Foundation is expecting a report, and the clinical trial is set to start within two months. Maria would never falsify her data, but we must be aware that people can be tempted to produce false data.
Technology can facilitate data management, but it can also cause problems. Look at Maria’s spreadsheet in the upper right-hand corner. She’s using it to calculate pass and fail grades. What could she have done to mitigate this threat?
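As one illustration of a mitigation (a sketch only; the names, scores, and 80% pass mark are invented for this example), the grading could be scripted so that the rule is written down once, impossible values are caught, and the calculation can be re-run and audited:

```python
PASS_MARK = 80  # assumed pass mark for this illustration

# Hypothetical scores keyed by participant name.
scores = {"Ana": 85, "Ben": 78, "Carla": 92, "David": 80}

# Integrity check: reject impossible values before calculating results.
for name, score in scores.items():
    if not 0 <= score <= 100:
        raise ValueError(f"Impossible score for {name}: {score}")

# Apply the pass/fail rule consistently to every participant.
results = {name: ("pass" if score >= PASS_MARK else "fail")
           for name, score in scores.items()}

passed = sum(1 for outcome in results.values() if outcome == "pass")
print(f"{passed} of {len(results)} participants passed")
```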
Other typical examples of loss of data integrity could be:
When those who are collecting data are paid according to whether they meet targets that are hard to reach, they may be tempted to introduce false data.
Errors that happen in transcription – especially from paper forms to electronic databases.
A data collection tool depends on online access that isn’t reliable.
Water leaks, fires or other accidents that destroy documents.
Accidentally deleted data.
Slide 13
Safeguard privacy and ensure ethical handling of data
Ensure transparency, accountability, and collaboration
There are many legal and ethical considerations when it comes to collecting, handling, and reporting data. In training, we frequently deal with personally identifiable information. In many countries, there are strict laws that deal with how that data can be handled, shared, or used. It’s always important to know how to use data well, legally, and ethically.
On the other hand, there is a growing consensus that we should share the results of evaluations, of all kinds of projects, activities, and programmes, widely and publicly if possible. This ensures that we are transparent, and accountable for our work...and it helps others by adding to our shared body of research.
What are the key issues in your institutions, countries or fields when it comes to the legal and ethical handling of data?
Slide 14
1. Place the sticky notes where they belong on the wall (5 min)
2. Return to the group and discuss any disagreements in classification (15 min)
[Prepare for this exercise by creating packets of 6 labels with the 6 major stages of the data management process: Source, Collection, Collation, Analysis, Reporting, Usage. Create a space on your wall or board (or a table if necessary) where students can stick their notes or, alternatively, write the names of the stages. You can label columns 1 through 6 so they have a sense of where to place the labels / write.]
So, we’ve talked about data quality, and how to mitigate five major threats to it. But data don’t exist at only one point in our M&E process, and we don’t deal with or interact with data in only one way.
In fact, we can talk about a “data management process” in M&E that identifies a number of clear steps at which we use or interact with data. Threats to data quality can present themselves at any of these steps. But what are they? We’ve said that indicators are where data enter our M&E system. Where do they go from there?
Let’s do a quick exercise to think about the flow of our data management process. I’ll hand each of you a packet of six sticky notes, each with a step of the data management process. Using your intuition, each of you will go to the wall and individually stick them under the numbers from 1 to 6 to indicate how you envision this process.
Then, we’ll return to the group and ask a few of you to explain how you’ve chosen to arrange these steps. Once we have heard from you, we’ll go back to take a closer look at each of the stages of the data management process.
[Once all the labels are placed, look at the arrangement together. Ask some students to explain their choices. If there is disagreement, facilitate a discussion to try to get to a consensus view. After 10-15 minutes you can move to the detailed definitions. This exercise should be kept short.]
Slide 15
So, here’s how we arrange the steps or stages of the data management process. Does this make sense? Let’s get into the details.
For the purpose of this workshop, we are going to create a simple data quality plan that addresses threats to the five data quality criteria we’ve just discussed at each stage of the data management process. In a more complex plan, you would need to include indicator information sheets, a risk analysis, and an audit trail reference.
At each stage of the data management process, you’ll need to indicate how you are going to mitigate threats to the five criteria for data quality for each indicator. For this workshop, we’ll ask you to go through this process for at least two indicators. An operational M&E Framework should include this process for all indicators used.
Slide 16
Threats:
• Data entered in instrument is incomplete, inconsistent, or mislabeled
• Instrument can’t be used due to location or resources (no Internet connection)
• Training is insufficient and collectors don’t know how to use instruments
• Collection instrument is changed mid-stream
Mitigations:
• Provide detailed trainings
• Observe data collection to ensure consistency
• Provide necessary materials for the context
• Document data collection process or instrument changes
What data quality threats can be introduced at the source?
Let’s think of some. How would we mitigate them?
Involving the producers, collectors and processors of data in this conversation is key to detecting and mitigating potential problems. In some cases, it might be possible to do so with instrument design or special training for collectors.
Slide 17
Threats:
• Data entered in instrument is incomplete, inconsistent, or mislabeled
• Instrument can’t be used due to location or resources (no Internet connection)
• Training is insufficient and collectors don’t know how to use instruments
• Collection instrument is changed mid-stream
Mitigations:
• Provide detailed trainings
• Observe data collection to ensure consistency
• Provide necessary materials for the context
• Document data collection process or instrument changes
What data quality threats can be introduced at collection?
Let’s think of some. How would we mitigate them?
Develop and test instruments. Provide clear, consistent training. Observe how data are being collected, and develop one routine way of performing this task along with standard operating procedures. Make sure that all items needed to perform data collection are available and appropriate for the context.
Slide 18
Threats:
• Transcription errors
• Lost files or misplaced data
• Technical problems when compiling data sets
Mitigations:
• Establish protocols and / or checklists
• Randomly sample and verify data
• Make note of errors you discover and report them
What data quality threats can be introduced at collation?
Let’s think of some. How would we mitigate them?
Determine a clear process for collating data and follow it each time, using a checklist or other protocol. Establish a verification process to avoid data entry errors and other mistakes, and report any errors you do discover.
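As a sketch of what “randomly sample and verify” can look like in practice, assuming the collated records live in a list and each record keeps the ID of the paper form it came from (the record structure and the 10% sample size are assumptions for illustration):

```python
import random

# Hypothetical collated records; each keeps the ID of its paper source form.
records = [{"form_id": i, "attended_day1": True, "attended_day2": True}
           for i in range(1, 201)]

# Draw a random sample (here roughly 10%) to re-check against the originals.
sample_size = max(1, len(records) // 10)
to_verify = random.sample(records, sample_size)

for record in to_verify:
    print(f"Check form {record['form_id']} against the paper original")
```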
Slide 19
Threats:
• Technological problems
• Use of inadequate analysis techniques
• Incorrect assumptions
Mitigations:
• Explicitly describe analysis techniques and assumptions
• Verify tools used for analysis
• Ask for expert opinion
What data quality threats can be introduced at analysis?
Let’s think of some. How would we mitigate them?
One of the most important steps here is to be clear about your analysis: how you have manipulated which data, under what conditions, and with what tools. If you’re not sure which techniques or methods to use, consult with others who have more experience.
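One lightweight way to be explicit is to keep the analysis in a short script in which each assumption is written down next to the calculation, rather than hidden in spreadsheet formulas. A minimal sketch with invented data, assuming “trained” is defined as attending both days and scoring at least 80:

```python
# Assumption: "trained" means attended both days AND scored at least 80 on the test.
# Assumption: participants with a missing score are excluded, not counted as failed.
participants = [
    {"name": "Ana", "days_attended": 2, "score": 85},
    {"name": "Ben", "days_attended": 1, "score": 90},
    {"name": "Carla", "days_attended": 2, "score": None},
]

with_scores = [p for p in participants if p["score"] is not None]
trained = [p for p in with_scores
           if p["days_attended"] == 2 and p["score"] >= 80]

print(f"Trained: {len(trained)} of {len(with_scores)} participants with complete data")
```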
Slide 20
Threats:
• Violations of data privacy
• Using a format or synthesis that is not useful to the audience
• Selective use of results
• Creating a narrative that doesn’t match the data
• Simple errors
Mitigations:
• Use an external reviewer to get feedback on reports
• Create reports for a specific audience
• Be honest in your reporting: include all your results, not just the ones that look good
• Lead with the data
• Protect data privacy and confidentiality when reporting
What data quality threats can be introduced at reporting?
Let’s think of some. How would we mitigate them?
Think carefully about the audience for your report. What do they need to know and what’s the best way to get them that information? Let the data lead your reporting, and be transparent, even when that means reporting unflattering data. Always be careful to protect data privacy and confidentiality when necessary. Use an external reviewer to check for errors.
Slide 21
Threats:
• Creating a story and fitting the data to it
• Withholding data from all or specific audiences
• Not fully understanding the data
Mitigations:
• Look at and analyse the data before deciding what it means
• Share all relevant data
• Make sure you understand your own data – discuss it in your team
What data quality threats can be introduced at usage?
Let’s think of some. How would we mitigate them?
It’s important that you understand your own data and feel comfortable with it. If you don’t, bring in other team members to discuss it, or even external colleagues or experts. Don’t share or report on data that you don’t understand. When you do share data, share all of it, even if it doesn’t look good for your activity. Transparency is essential to credibility. You will learn and help others learn. Look at the data before you create a story about what it means. Be sure that any narratives you create reflect what you’ve seen in the data.
Slide 22
1. Using the M&E Framework template, complete a data quality plan for at least two indicators (1 hr)
2. In pairs, discuss and give feedback on your data quality plan (30 min)
3. Return to the group to present your work (45 min)
Now it’s time to get back into your M&E frameworks. You’ll have an hour in this session to review your work so far and then complete, or at least move forward with, section 4. You’ll need to define a data quality plan for at least two of your indicators. For these two indicators, identify potential threats to the five criteria for data quality at each stage of the data management process and indicate how you plan to mitigate and recognize these threats.
Once you’ve had some time to work, we will ask you to find another person to work with. Discuss your data quality approach, give each other feedback, and work through any questions that you have.
Then, we’ll return to the group for a round of presentations, open discussion and questions.
Slide 23
Here’s what we covered in this session:
1. Defining “data quality”
2. Five criteria for quality
3. Legal and ethical considerations
4. Structure of data quality plans
Here were our learning objectives:
1. Define “Monitoring” and “Evaluation”; compare and contrast definitions
2. Explain the importance of M&E for research capacity building activities
3. Establish goals and objectives for research capacity building activities