What are statistical tools?
Statistics is a branch of science that deals with samples, measurements and calculations in order to estimate the properties of a population.
Statistical Process Control
Statistics found its application in manufacturing back in the 1920s with Dr Walter Shewhart of Bell Laboratories, who is acclaimed for applying statistics in manufacturing science to predict when a process will produce a defect.
From there on, the application of statistics in manufacturing spread across the world and became instrumental in the Japanese Quality Revolution.
According to Dr Shewhart, people in manufacturing make only two mistakes:
- Reacting when they are not supposed to react, and
- Not reacting when they are supposed to react.
In other words, two types of variation occur in any process. The first is the inherent variation of the process (also known as common-cause variation). The second is variation caused by unique disturbances (also known as assignable or special causes).
He recommended a statistical method for differentiating between these two types of causes, and he devised control charts to make that differentiation simple.
7 QC Tools
When Dr Deming and Dr Juran visited Japan in 1951, they instilled a culture of data-based quality improvement. Dr Deming introduced process flow diagrams and other basic graphical tools, while Dr Juran taught tools such as Pareto charts. Industry experts in Japan devised further tools, such as the cause-and-effect diagram (Dr Ishikawa), why-why analysis and more.
Statistics and Six Sigma
Statistician Bill Smith, with the help of Dr Mikel Harry, developed a statistics-based problem-solving method at Motorola. By applying this methodology, a process can reach a quality level of 99.99966% acceptance. This level of quality is denoted as the sixth level on the sigma scale.
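The arithmetic behind the sigma scale can be sketched in a few lines of Python. This is an illustration, not part of the original methodology description; it assumes the conventional 1.5-sigma long-term shift of the process mean, under which the defect rate at a given sigma level is the upper tail of the standard normal distribution:

```python
from math import erf, sqrt

def normal_cdf(z: float) -> float:
    # Standard normal cumulative distribution function
    return 0.5 * (1 + erf(z / sqrt(2)))

def dpmo(sigma_level: float, shift: float = 1.5) -> float:
    # Long-term defects per million opportunities, assuming the
    # conventional 1.5-sigma shift of the process mean
    return (1 - normal_cdf(sigma_level - shift)) * 1_000_000

print(round(dpmo(6.0), 1))  # roughly 3.4 defects per million
print(round(dpmo(3.0)))     # a 3-sigma process: roughly 66,807 per million
```

A 6-sigma process therefore yields about 3.4 defects per million opportunities, i.e. roughly 99.99966% acceptable output.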
Bob Galvin, the then CEO of Motorola, named the method after its result: the Six Sigma methodology.
The Six Sigma methodology is a framework that combines the best practices of project management with the principles of statistics and proven statistical and improvement techniques.
We frequently use the statistical techniques below in the Six Sigma methodology:
- Sampling: sampling methods and sample size
- Data collection and visualisation
- Screening of data
- Stratification: using QC tools
- Measurement system analysis
- Process stability: using control charts
- Process capability measurements
- Comparison of multiple data sets
- Establishing the relationship between two or more variables
- Using control charts to control the parameters after improvement
Application of Statistical Tools
Now we have a plethora of statistical tools and an ocean of know-how in the repository.
But I frequently see process experts getting overwhelmed by the tools. Sometimes we end up applying the wrong tool; even worse, we expect tools to give us information beyond their intended scope.
We need to remember that every tool is developed for a defined purpose, and hence tools work well only when we use them correctly.
Many a time, people ask me whether there is a ready reckoner for beginners as well as experts.
Tools are tools!
Tools are tools; they benefit the user depending on how correctly and how well they are used. So I am sharing the Application Matrix.
For each tool, this matrix states its intended purpose, the outcome we can derive from it, when to apply it, and its prerequisites.
Statistical Tools – Application Matrix
For each tool, the matrix lists: the key word, what the tool is, when to use it, its decision-making role (decisive or indicative), the deciding factor, the outcome, prerequisites and points of caution.

Histogram (Data Distribution)
What is it: A tool that describes data in terms of frequency and its distribution.
When to use: When you want to visualise data in the form of a curve.
Decision making: Indicative.
Outcome: The central location and minimum/maximum values of the data, its spread across the range and the shape of the distribution.
Prerequisite: Continuous data.
Caution: Only one data set can be studied per histogram. Not suitable for sequential (process) properties.
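As a quick sketch of what a histogram summarises, here is a minimal illustration with simulated continuous data (the data is made up; it is not part of the matrix):

```python
import numpy as np

# Simulated continuous measurement data (e.g. 500 shaft diameters)
rng = np.random.default_rng(42)
data = rng.normal(loc=50.0, scale=2.0, size=500)

counts, bin_edges = np.histogram(data, bins=10)

# The histogram exposes the central location, the min/max range
# and the shape of the distribution
print("min:", data.min().round(2), "max:", data.max().round(2))
print("bin counts:", counts)
```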

Pareto Chart (Prioritise & Vital Few)
What is it: A tool to compare different attributes based on their frequency of appearance.
When to use: When you want to prioritise the major categories or major failures.
Decision making: Decisive.
Deciding factor: Cumulative contribution > 80%.
Outcome: The vital 20% of categories among those analysed.
Prerequisite: Frequency or count (repetition in the raw data).
Caution: Requires process expertise to categorise.
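The "> 80% cumulative contribution" decision rule can be sketched as follows (the category names and counts are hypothetical):

```python
# Hypothetical defect counts by category
defects = {"scratch": 120, "dent": 80, "misalignment": 30,
           "discolouration": 15, "crack": 5}

total = sum(defects.values())
cumulative = 0.0
vital_few = []
# Walk the categories from most to least frequent
for category, count in sorted(defects.items(), key=lambda kv: -kv[1]):
    cumulative += 100 * count / total
    vital_few.append(category)
    if cumulative >= 80:  # stop once the vital few explain >= 80%
        break

print(vital_few)  # the categories to prioritise: ['scratch', 'dent']
```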
Box Plot (Data Distribution & Comparison)
What is it: A tool to describe data distribution and to compare the location and spread of multiple data sets.
When to use: Similar to a histogram, but it highlights the median and outliers; also when you want to compare the distributions of more than one data set.
Decision making: Indicative / Decisive.
Deciding factor: The median line, the height of the box and the length of the whiskers.
Outcome: Central location (concentration), spread of the data (height of the box), skewness (position of the median line) and outliers. Multiple data sets can be compared in a single graph with multiple box plots.
Prerequisite: One parameter measured across multiple data sets.
Caution: Supports indicative or comparative decisions; it is not an absolute decision-making tool like Pareto or regression.
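The quantities a box plot displays (median, quartiles, whisker limits, outliers) can be computed directly. A minimal sketch with made-up data, using Tukey's 1.5 × IQR rule for the whisker fences:

```python
import numpy as np

data = np.array([4.8, 5.0, 5.1, 5.2, 5.3, 5.4, 5.5, 5.6, 5.8, 9.0])

q1, median, q3 = np.percentile(data, [25, 50, 75])
iqr = q3 - q1                   # height of the box
lower_fence = q1 - 1.5 * iqr    # whisker limits (Tukey's rule)
upper_fence = q3 + 1.5 * iqr
outliers = data[(data < lower_fence) | (data > upper_fence)]

print("median:", median, "IQR:", round(iqr, 2), "outliers:", outliers)
```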
Time Series (Trends and Patterns)
What is it: A tool to observe process behaviour over a period of time, or changes in process behaviour over time.
When to use: When you want to understand how the data changes against a specific interval of time or sequence.
Decision making: Indicative (suspect).
Outcome: Can identify typing errors, visually validate data correctness, and reveal trends and patterns in the data.
Prerequisite: The interval must be constant (time or sequence interval); simple systematic sampling must be followed.
Caution: Needs process expertise to interpret the graph.
Control Chart (Statistical Stability)
What is it: A tool to check whether the process needs adjustment or improvement.
When to use: When you want to see whether the process is under statistical control, i.e. stable and free of disturbance.
Decision making: Decisive / Indicative (suspect).
Deciding factor: Any point outside the control limits.
Outcome: The statistical stability of the process and the presence of any special-cause variation. A process with one or more special causes is considered not ready for improvement. The control limits also indicate where to expect the next data point.
Prerequisite: Systematic sampling at a constant interval (time or sequence). Systematic subgroup sampling is required to construct an X-bar R chart or an X-bar S chart.
Caution: Higher stability does not mean higher capability.
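As one concrete example of a control chart, the limits of an individuals (I-MR) chart can be sketched as below (the measurements are made up). The process standard deviation is estimated from the average moving range; 1.128 is the standard d2 factor for subgroups of size 2:

```python
import numpy as np

# Measurements collected at a constant interval (systematic sampling)
x = np.array([10.1, 10.3, 9.9, 10.0, 10.2, 10.4, 9.8, 10.1, 10.0, 10.3])

centre = x.mean()
moving_range = np.abs(np.diff(x))
sigma_hat = moving_range.mean() / 1.128  # d2 for subgroups of size 2

ucl = centre + 3 * sigma_hat  # upper control limit
lcl = centre - 3 * sigma_hat  # lower control limit

# Any point outside [lcl, ucl] suggests a special cause
out_of_control = (x > ucl) | (x < lcl)
print("UCL:", round(ucl, 3), "LCL:", round(lcl, 3))
print("special-cause points:", x[out_of_control])
```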
Normality Test (Normality)
What is it: A tool to check whether the data follows a normal distribution (bell-shaped curve).
When to use: When you want to measure how closely the data resembles the ideal normal curve.
Decision making: Decisive.
Deciding factor: p-value (more than 0.05 is considered close enough to normal).
Outcome: Same as the purpose.
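One common implementation of this check is the Shapiro–Wilk test, shown here with SciPy as an illustrative choice (the matrix does not prescribe a specific test), on simulated data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
data = rng.normal(loc=100.0, scale=5.0, size=80)

statistic, p_value = stats.shapiro(data)

# p > 0.05: no evidence against normality, so treat the data as normal
print("p-value:", round(p_value, 3), "normal enough:", p_value > 0.05)
```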
Process Capability Cp (Capability)
What is it: A tool to measure the ability of the process to meet the customer requirement for the studied parameter in the short term.
When to use: When you want to measure the capability of the process to meet customer specifications.
Decision making: Decisive.
Deciding factor: Cp value.
Outcome: How small the process variation is compared with the allowed variation.
Prerequisite: The interval must be constant (time or sequence interval).
Caution: To be used only for an established process (not for a new process). Uses the short-term standard deviation.
Process Capability Cpk (Capability)
What is it: A tool to measure the ability of the process to meet the customer requirement for the studied parameter in the short term.
When to use: When you want to measure the capability of the process to meet customer specifications.
Decision making: Decisive.
Deciding factor: Cpk value (the lesser of Cpk-upper and Cpk-lower is taken as the process Cpk).
Outcome: How close the mean is to the target, and how small the process variation is compared with the allowed variation.
Prerequisite: The interval must be constant (time or sequence interval).
Caution: Uses the short-term standard deviation.
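Both capability indices reduce to simple formulas. A sketch with hypothetical specification limits and process figures:

```python
def process_capability(mean, std, lsl, usl):
    """Short-term capability indices.

    Cp  compares the allowed spread (USL - LSL) with the process
        spread (6 standard deviations); it ignores centring.
    Cpk additionally penalises an off-centre mean: it is the lesser
        of the upper and lower one-sided capabilities.
    """
    cp = (usl - lsl) / (6 * std)
    cpk = min((usl - mean) / (3 * std), (mean - lsl) / (3 * std))
    return cp, cpk

# Hypothetical process: mean 10.2, short-term std 0.1, spec 10.0 +/- 0.5
cp, cpk = process_capability(mean=10.2, std=0.1, lsl=9.5, usl=10.5)
print(cp, cpk)  # Cp about 1.67, Cpk 1.0: low spread but off-centre
```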
DPMO (Capability)
What is it: A tool to measure the number of defects the process will produce.
When to use: When you want to assess process capability in terms of the number of defects the process produces.
Decision making: Decisive.
Deciding factor: DPMO value.
Outcome: How many defects the process will produce in the future.
Prerequisite: Count the number of defects produced, not the number of defective pieces; i.e., % rejection data cannot be converted into DPMO.
Caution: It is different from PPM. Choose an optimum number of defect opportunities.
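DPMO itself is simple counting; note that the denominator uses defect opportunities, not units alone. A sketch with hypothetical counts:

```python
def dpmo_from_counts(defects: int, units: int,
                     opportunities_per_unit: int) -> float:
    # Defects per million opportunities: count defects, not defective pieces
    return defects / (units * opportunities_per_unit) * 1_000_000

# Hypothetical: 15 defects found in 1,000 units, 5 opportunities each
print(dpmo_from_counts(defects=15, units=1000,
                       opportunities_per_unit=5))  # 3000.0
```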
Brainstorming (Cause Analysis)
What is it: A tool to collect expert ideas or opinions on a particular subject from a small team.
When to use: When you want to gather all possible causes of a rejection, or all possible ideas for a solution.
Decision making: Indicative (suspect).
Outcome: A list of all possible causes of failure, or of solutions, that is mutually exclusive and collectively exhaustive.
Prerequisite: Participants must have basic knowledge of the process and the problem, and everyone has to participate.
Caution: Every point counts; aim to be exhaustive.
Fishbone Analysis (Cause Analysis)
What is it: A framework to collect ideas from people across six categories of failure (the 6Ms).
When to use: When you want to gather all possible causes of a rejection, or all possible ideas for a solution.
Decision making: Indicative (suspect).
Outcome: If all six categories are covered, there is a high probability that all possible causes have been covered.
Prerequisite: Participants must have basic knowledge of the process and the problem.
Caution: Not everything can be done using spreadsheets. 'Man' should be treated as the fifth category, not the first.
Why-Why Analysis (Cause Analysis)
What is it: A tool to dig down to the root cause.
When to use: When you want to explore all root causes.
Decision making: Indicative (suspect).
Outcome: A list of all possible root causes.
Prerequisite: Focus and dedicated time; participants must have basic knowledge of the process and the problem.
Caution: Knowing where to stop.
Regression (Relationship)
What is it: A tool to check the relationship between two factors.
When to use: When you want to mathematically determine the relationship between two process variables.
Decision making: Decisive.
Deciding factor: R-squared value.
Outcome: A mathematical equation stating the relationship between the analysed variables: whether one variable impacts the other and, if so, how strongly.
Prerequisite: Data on two variables (more than two is also possible), preferably collected at the same time (simultaneously).
Caution: Correlation is not causation.
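A simple linear regression and its R-squared value can be sketched with NumPy; the paired data below is made up for illustration:

```python
import numpy as np

# Hypothetical paired observations: oven temperature vs. part hardness
x = np.array([150.0, 160.0, 170.0, 180.0, 190.0, 200.0])
y = np.array([52.1, 54.0, 55.8, 58.1, 59.9, 62.2])

# Least-squares straight-line fit
slope, intercept = np.polyfit(x, y, deg=1)
y_pred = slope * x + intercept

# R-squared: fraction of the variation in y explained by x
ss_res = np.sum((y - y_pred) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot

print(f"y = {slope:.3f} x + {intercept:.2f}, R-squared = {r_squared:.3f}")
```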
Hypothesis Testing (Comparison & Validity of Assumption)
What is it: A tool to compare two data sets for equality, or one data set against a standard.
When to use: When you want to statistically validate or compare the difference between properties of two or more data sets, or a property of one data set against a standard value (specification).
Decision making: Decisive.
Deciding factor: p-value (more than 0.05 favours the assumption of equality).
Outcome: A statistical (scientific) decision about the populations of the data sets: whether they are equal or significantly different.
Prerequisite: Clarity on what to compare (mean or dispersion), and correctly formed statements of assumption (null and alternative hypotheses).
Caution: May contradict the purely mathematical decision. 'Equal' does not mean exactly equal; it means we were unable to find a significant difference.
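A two-sample comparison of means follows exactly the p-value rule above. Here is a sketch using a Welch t-test via SciPy, as one illustrative choice of test, on simulated data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical: the same part measured on two machines
machine_a = rng.normal(loc=10.00, scale=0.05, size=30)
machine_b = rng.normal(loc=10.02, scale=0.05, size=30)

# Null hypothesis: the population means are equal
t_stat, p_value = stats.ttest_ind(machine_a, machine_b, equal_var=False)

if p_value > 0.05:
    print("No significant difference found (which is not proof of equality)")
else:
    print("The means differ significantly")
```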