Introduction
This paper presents a quantitative analysis of car ownership by state for the years 2015 and 2016. The analysis includes descriptive statistics: mean, median, mode, variance, minimum, maximum, and standard deviation. Graphical analyses, such as frequency histograms, box plots, and scatter plots, are also presented. The data for 2015 and 2016 are then compared, and their correlation is determined.
- Arranging the 2016 Vehicles per Household data in descending order generates the table below
Jurisdiction | 2016 Vehicles per Household |
Murrieta, California | 2.36 |
Jurupa Valley, California | 2.32 |
Moreno Valley, California | 2.32 |
West Jordan, Utah | 2.3 |
Corona, California | 2.29 |
Simi Valley, California | 2.29 |
Fontana, California | 2.27 |
Norwalk, California | 2.27 |
Pomona, California | 2.27 |
Santa Ana, California | 2.25 |
Temecula, California | 2.25 |
In this data set, Jurupa Valley and Moreno Valley, California, share an average of 2.32; Corona and Simi Valley, California, share an average of 2.29; Fontana, Norwalk, and Pomona, California, share an average of 2.27; and Santa Ana and Temecula, California, share an average of 2.25. These shared values result in the 4 classes observed in the histogram.
Plotting these averages in Excel generated the histogram below.
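The sorting and grouping described above can also be reproduced outside Excel. The following is a minimal Python sketch that uses only the eleven values listed in the table; the variable names are illustrative and are not taken from the original workbook.

```python
from collections import Counter

# Top eleven 2016 vehicles-per-household values, copied from the table above.
vehicles_2016 = {
    "Murrieta, California": 2.36,
    "Jurupa Valley, California": 2.32,
    "Moreno Valley, California": 2.32,
    "West Jordan, Utah": 2.30,
    "Corona, California": 2.29,
    "Simi Valley, California": 2.29,
    "Fontana, California": 2.27,
    "Norwalk, California": 2.27,
    "Pomona, California": 2.27,
    "Santa Ana, California": 2.25,
    "Temecula, California": 2.25,
}

# Sort jurisdictions from the highest to the lowest value.
ranked = sorted(vehicles_2016.items(), key=lambda item: item[1], reverse=True)

# Count how many jurisdictions report each value.
counts = Counter(vehicles_2016.values())

# Keep only the values shared by two or more jurisdictions.
shared = {value: n for value, n in counts.items() if n > 1}

for value in sorted(shared, reverse=True):
    print(f"{value:.2f} is shared by {shared[value]} jurisdictions")
```

Under these inputs, the dictionary `shared` holds one entry per group of tied values, which corresponds to the classes discussed above.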

Figure 1 2016 Histogram
- Relative Frequency Graph
The frequency plot for the 2016 percentage of households without vehicles was very unusual: the graph did not follow a normal distribution pattern. The graph is presented in the chart below. Its unusual shape may be due to errors in the research model or in data collection.

Figure 2 2016 frequency histogram
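As a rough illustration of how a relative frequency graph is built, the sketch below bins percentages into 10-point classes and divides each class count by the total. The input list is hypothetical placeholder data, not the study data, and the function name `relative_frequencies` and the bin width are assumptions made for this example.

```python
def relative_frequencies(percentages, bin_width=10):
    """Return {bin_start: share of values in [bin_start, bin_start + bin_width)}."""
    counts = {}
    for value in percentages:
        # Integer index of the class this value falls into.
        bin_index = int(value // bin_width)
        counts[bin_index] = counts.get(bin_index, 0) + 1
    total = len(percentages)
    # Convert class counts to shares of the whole sample.
    return {index * bin_width: count / total for index, count in sorted(counts.items())}


# Hypothetical percentages of households without vehicles (placeholder values).
placeholder_percentages = [0.7, 3.2, 4.9, 6.1, 7.3, 8.8, 10.9, 15.4, 22.0, 54.4]

frequencies = relative_frequencies(placeholder_percentages)
for bin_start, share in frequencies.items():
    print(f"{bin_start}% to {bin_start + 10}%: {share:.2f}")
```

A strongly right-skewed sample such as this one concentrates most of its relative frequency in the first class, which is one way a histogram can fail to look normal without any data error.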
- Box plot
To construct the box plot, the minimum, first quartile, median, third quartile, and maximum values are determined. The differences between consecutive values (labelled P-values in the original analysis) are then computed in Excel and displayed in the second column of values in Table 1.
The box plot is displayed in Figure 3.
Table 1 Box Plot Values
Statistic | Value | Difference from previous value |
Minimum | 0.70% | 0.70% |
Quartile 1 | 4.90% | 4.20% |
Median | 7.30% | 2.40% |
Quartile 3 | 10.90% | 3.60% |
Maximum | 54.40% | 43.50% |
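The second column of values in Table 1 appears to hold the first value followed by the difference between each value and the one before it, which is the form a stacked-bar box plot in Excel requires. This reading is inferred from the arithmetic of the table rather than stated in the original analysis. A minimal Python sketch of the calculation, with an illustrative function name, is:

```python
# Five-number summary for the 2016 percentage of households without vehicles,
# in percent, copied from Table 1.
five_number = {
    "Minimum": 0.70,
    "Quartile 1": 4.90,
    "Median": 7.30,
    "Quartile 3": 10.90,
    "Maximum": 54.40,
}


def box_segments(summary):
    """Return the first value, then the difference between consecutive values."""
    labels = list(summary)
    values = list(summary.values())
    segments = {labels[0]: values[0]}
    for previous, label, value in zip(values, labels[1:], values[1:]):
        # Round to two decimals to match the precision used in the table.
        segments[label] = round(value - previous, 2)
    return segments


segments = box_segments(five_number)
for label, height in segments.items():
    print(f"{label}: {height:.2f}%")
```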

Figure 3 2016 box plot
Figure 4 box chart
- Numerical Descriptive Statistics for the “2016 Vehicles per Household”
The descriptive analysis of the 2016 vehicles per household data generated the results displayed in the table below.
Table 2 2016 numerical (descriptive) analysis
Average | 1.715362776 |
Standard Deviation | 0.298962781 |
Maximum | 2.36 |
Minimum | 0.63 |
Median | 1.72 |
Mode | 1.69 |
Variance | 0.089378745 |
The average value for the 2016 vehicles per household is 1.7154 (correct to 4 decimal places). The mode is 1.69 and the median is 1.72. These three values lie close together, so each can be used as a measure of central tendency for the distribution of cars per household. The maximum value is 2.36 and the minimum value is 0.63; these describe the range of the data rather than its centre. The standard deviation is 0.299, correct to three decimal places. This is small relative to the mean (about 17% of it), which indicates that the number of cars per household is concentrated around the mean. The variance is 0.0894, correct to four decimal places. It is the square of the standard deviation and therefore conveys the same information about spread; it is smaller than the standard deviation only because the standard deviation is below 1.
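The measures in Table 2 can be computed with Python's standard `statistics` module, as in the sketch below. Two assumptions apply: the sample (n - 1) formulas are used, since the analysis does not state whether the sample or population version was used in Excel, and the input is only the eleven highest values listed earlier, so the output will not match Table 2, which is based on the full data set.

```python
import statistics


def describe(values):
    """Return the summary measures listed in Table 2 for a list of numbers."""
    return {
        "average": statistics.mean(values),
        "standard deviation": statistics.stdev(values),  # sample (n - 1) formula
        "maximum": max(values),
        "minimum": min(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "variance": statistics.variance(values),  # sample (n - 1) formula
    }


# Only the eleven highest 2016 values from the first table, not the full data set.
top_eleven = [2.36, 2.32, 2.32, 2.30, 2.29, 2.29, 2.27, 2.27, 2.27, 2.25, 2.25]

summary = describe(top_eleven)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

Whatever the input, the variance returned here equals the square of the standard deviation, which is the relationship visible between the two values reported in Table 2.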
- Outliers in 2016 Percentage Of Households Without Vehicle
To determine the outliers, the first quartile, third quartile, and inter-quartile range were calculated. These values were then used to obtain the lower and upper bounds, and each observation was tested against the bounds in Excel. The test returned TRUE for more than 95% of the values, meaning that they were flagged as outliers. This indicates that the data set was greatly biased or skewed, and the implication is that the data are unreliable. Outliers affect the quality of a dataset and should be dealt with if they result from experimental error.
Table 3 Outliers in 2016 Percentage of Households without Vehicle
First Quartile | 0.049 |
Third Quartile | 0.109 |
Inter-Quartile Range | 0.06 |
Upper Bound | 0.199 |
Lower Bound | -0.041 |
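The bounds in Table 3 are consistent with the common 1.5 × IQR rule: lower bound = Q1 - 1.5 × IQR and upper bound = Q3 + 1.5 × IQR. The sketch below reproduces them from the two quartiles. The multiplier of 1.5 is inferred from the table values, and the function names are illustrative.

```python
def iqr_fences(q1, q3, k=1.5):
    """Return (lower, upper) outlier bounds using the k * IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def is_outlier(value, lower, upper):
    """A value is an outlier if it lies outside the closed interval [lower, upper]."""
    return value < lower or value > upper


# Quartiles from Table 3, expressed as proportions.
lower, upper = iqr_fences(0.049, 0.109)
print(f"Lower bound: {lower:.3f}, upper bound: {upper:.3f}")

# Check the 2016 minimum, median and maximum from Table 1 against the bounds.
for label, value in [("minimum", 0.007), ("median", 0.073), ("maximum", 0.544)]:
    print(label, is_outlier(value, lower, upper))
```

With these bounds, the 2016 maximum of 54.40% from Table 1 is flagged as an outlier, while the minimum of 0.70% and the median of 7.30% are not.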
- Comparing 2016 and 2015 percentages for households without vehicles.
- Descriptive analysis
The descriptive analysis of the 2015 data set generated the following results.
Table 4 2015 Descriptive Analysis Result
average | 9.70% |
standard deviation | 0.07331056 |
maximum | 54.50% |
minimum | 1.00% |
median | 7.80% |
mode | 5.50% |
variance | 0.005374438 |
The descriptive statistics calculated for 2016 were for vehicles per household and are not percentages, whereas the 2015 values above are. The two sets of results are therefore measured in different units, and they cannot be compared directly unless both are expressed in the same unit.
- Outliers.
The values used to determine the outliers are shown below. In Excel, the outlier test returned FALSE for all of the values, meaning that none were flagged as outliers. This suggests that this part of the data was collected accurately, which supports the results of the study.
Table 5 2015 outlier values
First Quartile | 0.049 |
Third Quartile | 0.109 |
Inter-Quartile Range | 0.06 |
Upper Bound | 0.199 |
Lower Bound | -0.041 |
- Frequency Histogram.
The frequency plot for the 2015 results was similar to the plot for 2016: the data set did not follow a normal distribution curve. The plot is displayed in the chart below and is also included in the Excel workbook.

Figure 5 frequency histogram for the 2015 dataset
- box plot
The box plot for the 2015 data set is similar to that of 2016. The plot is shown below. The third series lies below 10.00%, while the fourth series lies largely below 10.00%, with a small portion spilling over into the second range of 10-20%.

Figure 6 2015 box plot
- Scatter plot for the 2015 and 2016 average number of households with vehicles
The scatter plot displays an upward-sloping linear trend, which is an indicator of a positive correlation. Thus there is a positive correlation between the two data sets.
Figure 8 scatter plot for the 2015 and 2016 data on average households with vehicles
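The strength of the linear relationship seen in a scatter plot is usually summarised with the Pearson correlation coefficient, which ranges from -1 to 1. The sketch below shows the calculation. The paired values are hypothetical placeholders, since the paired 2015 and 2016 observations are not listed in this paper, and the function name `pearson_r` is illustrative.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of products and squares of the deviations from each mean.
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    if sum_xx == 0 or sum_yy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sum_xy / math.sqrt(sum_xx * sum_yy)


# Hypothetical paired values for the same jurisdictions (placeholder data).
values_2015 = [1.50, 1.62, 1.71, 1.85, 2.10]
values_2016 = [1.52, 1.60, 1.74, 1.88, 2.08]

r = pearson_r(values_2015, values_2016)
print(f"Pearson r = {r:.3f}")
```

A coefficient close to 1 corresponds to the rising linear pattern described above, a value near 0 to no linear relationship, and a value close to -1 to a falling pattern.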
Conclusion
The quantitative analysis indicates a positive correlation between the data samples for 2015 and 2016. However, the two years differ in the outlier analysis: almost 100% of the 2016 values were flagged as outliers, while the 2015 dataset had none. This suggests either that the research procedure for the 2016 analysis was flawed or falsified, or that the results are accurate but affected by a confounding variable.