The W. Edwards Deming Institute Blog

Analyzing Data Requires an Understanding of the System Generating the Data


This recent article, Has Uber Forced Taxi Drivers to Step Up Their Game?, explores how competition from Uber has pressured the legacy taxi business to improve customer service.

In exploring whether the data supports the idea that there has been an effort to improve customer service, the article mentions that complaints about taxis increased in 2012 and then provides an explanation: a system change that is likely the cause. In that year taxis were required to prominently display the phone number for filing complaints. Without knowing more than the article tells us, that seems like a logical explanation of the increase to me. And that understanding is essential to knowing what the data is telling us.

This highlights a very important factor when looking at data: you must understand the processes and system that generated the data. If you do not, you will draw faulty conclusions.

If you bring in a new effort to focus on customers and solicit more feedback, and you don’t get an increase in complaints, that is likely not an indication of success but an indication of failure. One of the easiest ways to reduce the number of complaints counted is to make complaining, in a way that is counted, difficult.
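As a rough illustration of why the count alone can mislead, here is a minimal sketch. It assumes a deliberately simple model in which the complaints that show up in the data are the number of rides, times the share of unhappy riders, times the share of unhappy riders who actually file a complaint in a way that gets counted. The function name and every number below are hypothetical, made up for this sketch and not taken from the article.

```python
def counted_complaints(rides, dissatisfied_rate, reporting_rate):
    """Expected complaints that end up in the data under the simple model.

    rides             -- number of rides in the period (hypothetical)
    dissatisfied_rate -- share of rides that leave the customer unhappy
    reporting_rate    -- share of unhappy customers whose complaint is counted
    """
    return rides * dissatisfied_rate * reporting_rate


# Hypothetical "before": complaining is hard, so few unhappy riders are counted.
before = counted_complaints(rides=100_000, dissatisfied_rate=0.05, reporting_rate=0.02)

# Hypothetical "after": service is better (fewer unhappy riders), but the
# complaint phone number is now prominently displayed, so more of them call.
after = counted_complaints(rides=100_000, dissatisfied_rate=0.04, reporting_rate=0.05)

print(f"counted complaints before: {before:.0f}")
print(f"counted complaints after:  {after:.0f}")
```

In this made-up case the counted complaints roughly double even though a smaller share of riders is unhappy. The same arithmetic runs the other way: make complaining harder and the count falls while service stays the same or gets worse.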

If you tie performance appraisals or bonuses to improved results, you will drive behavior to make the number look better (which isn’t the same as driving better results for the business and customers). Making the numbers look better through manipulation of the data or the system is usually much easier than improving the process so people are actually happier with your service. For example, you can change the process to make it harder to complain, or just not record verbal complaints even if the operational definition for the collection of data says those should be recorded.

Data is important. Using data to measure the effectiveness of new efforts is important. But you need to understand the risks of being led astray. That risk is much greater if those analyzing the data are not intimately familiar with the processes generating the data and the operational definitions used to collect the data.

Related: Dangers of Forgetting the Proxy Nature of Data | Unknown and Unknowable Data | Customer, or User, Gemba | Executive Leadership

Categorised as: data


  1. Doug Stilwell says:

    “But it Doesn’t Mean Anything”

    For the past 10-15 years educators have been diligently working to become more aware of and familiar with student learning data in an attempt to improve learning and achievement for students, and this is a good thing. In Iowa we have called this being “data driven.” Conversations are filled with questions such as “What’s the data?” or “What do the data say?” so that the best decisions can be made to improve student learning.

    However, much like Mr. Hunter indicates in his blog, it is important to not just focus on what the data “is,” but to focus on what it “means.” In order to better understand what the data means, “context is king.” In order to understand what the data means, one must understand more broadly the system that is producing the data. In other words, it would be a mistake to look at a data point or even a set of data points and make instructional decisions without fully understanding how classroom “learning systems” operate and produce results.

    In the movie “The Sound of Music” there is a scene during which Maria is teaching the children, for whom she is the governess, to sing. At one point when the children are echoing Maria’s “do-so-la-fa-mi-do-re,” Brigitta (one of the von Trapp children) stops and says, “But it doesn’t mean anything.” This is a wise insight educators should heed. We know what the data “is,” but do we know what it “means”?

  2. […] there is a significant risk that data doesn’t provide a decent view of reality. Without an appreciation for the gemba, where the data was collected, it is easy to be misled by the […]
