Statistical Tools & Application Matrix - CSense Management Solutions Pvt Ltd

June 24, 2021

What are statistical tools?

Statistics is a branch of science that deals with samples, measurements, calculations and estimation of population properties.

Statistical Process Control

Statistics found its application in manufacturing way back in the 1920s with Dr Shewhart of Bell Laboratories. We acclaim him for his application of Statistics in Manufacturing Science to predict when the process will produce a defect.

From there on, the application of statistics in manufacturing spread across the world and became instrumental in the Japanese Quality Revolution.

According to Dr Shewhart, there are only two mistakes the people in manufacturing are doing –

Reacting when they are not supposed to react and
Not reacting when they supposed to react

i.e., there are two types of deviations happen in any process. One being inherent variations (aka, common cause variations) of the process. And the second is variations happening due to some unique disturbances (aka assessable causes or special causes).

He recommended using the statistical process of differentiating between these two types of causes. To simplify the differentiation, he used control charts for the same.

7 QC Tools

When Dr Deming and Dr Juran visited Japan in 1951, they instilled a culture of data-based quality improvement. Dr Deming introduced process flow diagrams, other basic graphical tools. At the same time, Dr Juran taught tools Pareto charts. Similarly, industry experts in Japan had devised tools like cause-and-effect diagram (Dr Ishikawa), why-why analysis and more.

Statistics and Six Sigma

Statistician Bill Smith with the help of Dr Michael Harry developed a problem-solving method based on statistics in Motorola. Processes can benefit by reaching a quality level of 99.99967% of acceptance through applying this statistical methodology. This level of quality is denoted as sixth level in Sigma Scale.

Bob Galvin, the then CEO of Motorola named the method after its result, i.e., Six Sigma Methodology.

Six Sigma methodology is a framework that uses the best practices of project management, principles of statistics and best practices of statistical and improvement techniques.

We frequently use below Statistical techniques in Six Sigma methodology.

Sampling – sampling methods and then size,
Data Collection and Visualisation,
Screening of Data,
Stratification – using QC tools,
Measurement System Analysis,
Stability of process – using control charts,
Process capability measurements,
Comparison of multiple data sets,
Establishing the relationship between two or more variables,
Utilising control charts to control the parameters after improvement.

Application of Statistical Tools

Now we have a plethora of statistical tools, and we have an ocean of know-hows in the repository.

But I frequently see that process experts getting overwhelmed by the tools. And sometimes, we end up applying the wrong tool. Even worse, we expect tools to give us more information out of the intended scope.

We need to remember, that every tool is developed for a defined purpose. And hence, they work well only when we use them correctly.

Many a time, people ask me is there a ready reckoner for beginners as well as experts.

Tools are tools!

Tools are tools; They benefit the user depending on how rightly he uses them and how well he uses them. So, I am sharing the Application Matrix.

This matrix talks about the tool, its intended purpose of application and the outcome we can derive from it. And it clarifies when to apply the tool and prerequisites.

Statistical Tools – Application Matrix

Statistical Tools Application Matrix.csv

	Key word	What is it?	When to use	Decision Making	Deciding Factor	Outcome	Prerequisite	Point of caution
Histogram	Data Distribution	A tool that describes data in terms of frequency and its distribution	When you want to visualise data in the form of a curve	Indicative		Central Location and Min & Max value of data, distribution across the range and shape of the data	Continuous Data	Only one data set can be studied using one histogram. Not suitable for Sequential properties (process)
Pareto Chart	Prioritise & Vital Few	A tool to compare different attributes based on their frequency of appearance	When you want to prioritise on major category / major failures	Decisive	% Cumulative contribution > 80%	Vital 20% categories among the analysed categories	Frequency or count (repetition in raw data)	Require process expertise to categorise
Box Plot	Data Distribution & Compare	A tool to describe data distribution & compare the location and spread of multiple data sets	Similar to histogram but points out median and outliers in data; and when you want to compare the distribution of more than one data set	Indicative / Decisive	Median Line, Height of box and length of Whisker	Central location (concentration), spread of data (height of the box), skewness (position of median line) and Outliers. We can compare multiple data sets in a single graph with multiple box plots.	One parameter of multiple data sets	Indicative or comparative decisions can be taken. Not an absolute decision making tool like Pareto or Regression
Time Series	Trends and Patterns	A tool to see process behaviours over a time period / change of process behaviour over the time	When you want to understand the change in the data against a specific interval of time or sequence	Indicative (Suspect)		Can Identify typo errors, visual validation of data correctness, trends and patterns of data	The interval has to be constant (time interval / sequence interval). We need to follow Simple Systematic sampling	Need process expertise to interpret the graph
Control Chart	Statistical Stability	A tool to check whether the process needs adjustments or improvements	When you want to see whether the process is under statistical control / stability (without disturbance)	Decisive / Indicative (suspect)	Any point of control limit	Statistical stability of process & presence of any special cause variation. A process with one or more special causes is considered not ready for improvement. Control Limits also indicate where we can expect the next data point	We need to follow Systematic sampling to derive a control chart. The interval has to be constant (time interval / sequence interval). Systematic Sub-group sampling will be required to construct a X-bar R Chart of X-bar S Chart	Higher stability does not mean higher capability
Normality Test	Normality	A tool to check whether the data is following a normal distribution (bell-shaped curve) or not	When you want to measure how much close (resembles) the data is to the ideal normal curve	Decisive	p-value (more than 0.05 is said to be close enough to Normal data)	Same as purpose
Process Capability Cp	Capability	A tool to measure the ability of our process to meet customer requirement with respect to the studied parameter in short term	When you want to measure our capability of process to meet customer specifications	Decisive	Cp Value	How less is our process variation in comparison with the allowed variation	The interval has to be constant (time interval / sequence interval).	To be used for only established process (not for a new process). Use of short term Standard Deviation.
Process Capability Cpk	Capability	A tool to measure the ability of our process to meet customer requirement with respect to the studied parameter in short term	When you want to measure the capability of process to meet customer specifications	Decisive	Cpk Value (Least of Cpk-Upper and Cpk-Lower is considered as Process Cpk	How much close is my mean to the target and how much less my process variations compared to allowed variations	The interval has to be constant (time interval / sequence interval).	Use of short term Standard Deviation
DPMO	Capability	A tool to measure the number of defects the process will produce	When you want to assess our process capability in terms of number of defects produced by the process	Decisive	DPMO Value	How many defects will my process produce in future	We need to count the number of defects produced in a process and not the number of defective pieces. i.e., % rejection data cannot be converted into DPMO	It is different from PPM. Optimum number of Defect Opportunities
Brainstorming	Cause Analysis	A tool to collect expert ideas or opinions on a particular subject from a small team	When you want to gather all possible causes of a rejection / all possible ideas for a solution	Indicative (Suspect)	-	List of all possible cause of failure or solutions that are mutually exclusive and collectively exhaustive	People participating in brainstorming has to have some basic knowledge about the process and the problem. Everyone has to participate.	Every point counts. To be exhaustive
Fish-bone Analysis	Cause Analysis	A framework to collect ideas from people related to 6 Categories of failures (6Ms)	When you want to gather all possible causes of a rejection / all possible ideas for a solution	Indicative (Suspect)	-	If we covered all the 6 categories then there is a high probability that we have covered all possible causes	People participating in brainstorming has to have some basic knowledge about the process and the problem.	Everything can not be done using spreadsheets. 'Man' has to be considered as 5th category not first.
Why-Why Analysis	Cause Analysis	A tool to dig into the root cause	When you want explore all root causes	Indicative (Suspect)	-	List of all possible root causes	Focus and dedication of Time. People participating in brainstorming has to have some basic knowledge about the process and the problem.	Knowing where to stop
Regression	Relationship	A tool to check the relationship between two factors	When you want mathematically determine the relationship between two process variables	Decisive	R-Square Value	Mathematical equation stating the relationship between the analysed variables. Whether one variable is impacting the other variable, if so how strongly it impacts.	Data of two variables (more than 2 is also possible) preferably collected at same time (simultaneously)	Correlation is not causation
Hypothesis Testing	Comparison & Validity of Assumption	A tool to compare two data sets for equality or one data set against a standard	When you want to statistically validate / compare the difference between properties of two or more data sets or one data set property against a standard value (spec)	Decisive	p-value (more than 0.05 is favouring assumption of equality)	Statistical (scientific) decision about the population of data sets - whether equal/same or significantly different.	Clarity on what to compare - mean or dispersion. Right way of forming statement of assumptions (Null & Alternate Hypotheses).	May contradict mathematical decision. 'Equal' does not mean equal - rather means unable to find significant difference.

Click here to download PDF version

Statistical-Tools-Application-Matrix Download

Further Clarification

Feel this article useful?

Categories: Statistics