Quantitative analysis: households with vehicles

Introduction

This paper conducts a quantitative analysis of car ownership by state in the years 2015 and 2016. The analysis includes descriptive statistics: mean, median, mode, variance, minimum, maximum and standard deviation. Graphical analyses such as a frequency histogram, box plot and scatter plot are also conducted. The data for 2015 and 2016 are then compared, and their correlation determined.

  1. Arranging the 2016 Vehicles per Household data in descending order generates the table below.
Jurisdiction                    2016 Vehicles per Household
Murrieta, California            2.36
Jurupa Valley, California       2.32
Moreno Valley, California       2.32
West Jordan, Utah               2.3
Corona, California              2.29
Simi Valley, California         2.29
Fontana, California             2.27
Norwalk, California             2.27
Pomona, California              2.27
Santa Ana, California           2.25
Temecula, California            2.25

In this data set, Jurupa Valley and Moreno Valley, California share an average of 2.32; Corona and Simi Valley, California share an average of 2.29; Fontana, Norwalk and Pomona share an average of 2.27; and Santa Ana and Temecula, California share an average of 2.25. This results in 4 classes, as observed in the histogram.

Plotting these averages in Excel generated the histogram chart below.
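The ranking and the shared values can be reproduced outside Excel. The following is a minimal Python sketch, assuming only the eleven rows shown in the table; the variable names `top_2016`, `ranked` and `shared` are illustrative and not part of the original workbook.

```python
from collections import Counter

# The eleven rows shown in the table (2016 vehicles per household).
top_2016 = {
    "Murrieta, California": 2.36,
    "Jurupa Valley, California": 2.32,
    "Moreno Valley, California": 2.32,
    "West Jordan, Utah": 2.30,
    "Corona, California": 2.29,
    "Simi Valley, California": 2.29,
    "Fontana, California": 2.27,
    "Norwalk, California": 2.27,
    "Pomona, California": 2.27,
    "Santa Ana, California": 2.25,
    "Temecula, California": 2.25,
}

# Sort jurisdictions from the highest to the lowest value.
ranked = sorted(top_2016.items(), key=lambda item: item[1], reverse=True)

# Count how many jurisdictions share each value.
value_counts = Counter(top_2016.values())

# Keep only the values that occur more than once, highest first.
shared = {v: n for v, n in sorted(value_counts.items(), reverse=True) if n > 1}

for value, n in shared.items():
    print(f"{value:.2f} is shared by {n} jurisdictions")
```

The number of entries in `shared` corresponds to the number of classes of tied values described in the text.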

Figure 1 2016 Histogram

  • Relative Frequency Graph

The frequency plot for the 2016 percentage of households without vehicles was very unusual. The graph did not show a normal distribution pattern for the data set, which may be due to errors in the research model or data collection. The graph is presented in the chart below.

Figure 2 2016 frequency histogram
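A relative frequency plot of this kind can be sketched in Python by binning the values and dividing each bin count by the total. The data below are hypothetical percentages chosen only for illustration (the full data set is not reproduced in this paper), and the bin width of 10 percentage points is likewise an assumption.

```python
from collections import Counter

def relative_frequencies(values, bin_width):
    """Map each bin's lower edge to the share of values in [edge, edge + bin_width)."""
    counts = Counter(int(v // bin_width) * bin_width for v in values)
    total = len(values)
    return {edge: counts[edge] / total for edge in sorted(counts)}

# Hypothetical percentages of households without a vehicle (illustration only).
sample_pct = [0.7, 3.2, 4.9, 6.1, 7.3, 8.8, 10.9, 12.4, 23.0, 54.4]

freq = relative_frequencies(sample_pct, bin_width=10)
for edge, share in freq.items():
    print(f"{edge:>2}-{edge + 10}%: {share:.0%}")
```

With data like these, most of the mass sits in the first bin and a long tail stretches to the right, which is one way a plot can depart from the bell shape of a normal distribution.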

  • Box plot

In order to construct the box plot, the minimum, first quartile, median, third quartile and maximum values are determined. The differences between consecutive values are then calculated in Excel and displayed in the last column of the table.

The box plot is displayed in figure 3.

Table 1 Box Plot Values

Statistic     Value     Difference from previous value
Minimum       0.70%     0.70%
Quartile 1    4.90%     4.20%
Median        7.30%     2.40%
Quartile 3    10.90%    3.60%
Maximum       54.40%    43.50%
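The last column of Table 1 appears to be each value minus the one before it, which is what a stacked-bar box plot in Excel needs as segment lengths. The short Python sketch below checks that reading against the table; the function name `segment_lengths` is hypothetical.

```python
# Five-number summary from Table 1 (percent of households without a vehicle, 2016).
five_number = [
    ("Minimum", 0.70),
    ("Quartile 1", 4.90),
    ("Median", 7.30),
    ("Quartile 3", 10.90),
    ("Maximum", 54.40),
]

def segment_lengths(summary):
    """Return each value minus the previous one; the first value is measured from zero."""
    lengths = []
    previous = 0.0
    for label, value in summary:
        lengths.append((label, round(value - previous, 2)))
        previous = value
    return lengths

for label, length in segment_lengths(five_number):
    print(f"{label}: {length:.2f}%")
```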

Figure 3 2016 box plot

Figure 4 box plot

  •  Numerical Descriptive Statistics for the “2016 Vehicles per Household”

The descriptive analysis for the 2016 vehicles per household generated the result displayed in the table below.

Table 2 2016 numerical (descriptive) analysis

Average               1.715362776
Standard Deviation    0.298962781
Maximum               2.36
Minimum               0.63
Median                1.72
Mode                  1.69
Variance              0.089378745

The average value for the 2016 vehicles per household is 1.7154 (correct to four decimal places). The maximum value is 2.36 while the minimum value is 0.63, giving a range of 1.73. The mean, the median (1.72) and the mode (1.69) lie close to one another, and these three values serve as measures of central tendency for the distribution of cars per household. The standard deviation is 0.299 (correct to three decimal places). This low standard deviation indicates that the number of cars per household is concentrated around the mean. The variance, 0.0894 (correct to four decimal places), is the square of the standard deviation and supports the same conclusion; it is smaller than the standard deviation only because the standard deviation is less than 1.
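The same statistics can be computed with Python's standard `statistics` module. The sketch below is illustrative only: the full 2016 data set is not reproduced here, so the function is applied to the eleven highest values listed at the start of the paper, and its output will not match Table 2. It also assumes the sample (n - 1) forms of the standard deviation and variance, since the Excel functions used are not stated.

```python
import math
import statistics

def describe(values):
    """Return the summary statistics reported in Table 2 for a list of numbers."""
    return {
        "average": statistics.mean(values),
        "standard deviation": statistics.stdev(values),  # sample (n - 1) form, assumed
        "maximum": max(values),
        "minimum": min(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "variance": statistics.variance(values),          # sample (n - 1) form, assumed
    }

# Illustration only: the eleven highest 2016 values, not the full data set.
sample = [2.36, 2.32, 2.32, 2.30, 2.29, 2.29, 2.27, 2.27, 2.27, 2.25, 2.25]
summary = describe(sample)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")

# Consistency check on the reported figures: variance should equal the
# square of the standard deviation, up to rounding.
reported_sd = 0.298962781
reported_variance = 0.089378745
consistent = math.isclose(reported_sd ** 2, reported_variance, rel_tol=1e-6)
print("reported variance matches sd squared:", consistent)
```

The final check uses only the two figures from Table 2 and confirms that they agree with each other.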

  • Outliers in 2016 Percentage Of Households Without Vehicle

In the determination of outliers, the first quartile, third quartile and inter-quartile range were determined. These values were then used to obtain the lower and upper bounds. The bounds were then used to flag the outliers in Excel, as displayed in the Excel file. The analysis returned TRUE (outlier) for more than 95% of the values. This indicates that the data set was greatly biased or skewed. The impact of this is that the data is unreliable. Outliers affect the quality of a dataset and should be dealt with if they are the result of experimental error.

Table 3 Outliers in 2016 Percentage of Households without Vehicle

First Quartile          0.049
Third Quartile          0.109
Inter-Quartile Range    0.06
Upper Bound             0.199
Lower Bound             -0.041
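
The bounds in the table are consistent with the common rule of placing the fences 1.5 inter-quartile ranges beyond the quartiles. The multiplier is not stated in the paper, so the 1.5 in the sketch below is an assumption, although it does reproduce the tabulated bounds; the function names are hypothetical.

```python
def iqr_bounds(q1, q3, k=1.5):
    """Return (lower, upper) fences placed k inter-quartile ranges beyond the quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def is_outlier(value, lower, upper):
    """A value is flagged when it falls outside the closed interval [lower, upper]."""
    return value < lower or value > upper

# Quartiles from the table, expressed as fractions (0.049 = 4.90%).
lower, upper = iqr_bounds(0.049, 0.109)
print(round(lower, 3), round(upper, 3))

# The 2016 maximum from Table 1 (54.40%) lies above the upper bound.
print(is_outlier(0.544, lower, upper))
```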
  • Comparing 2016 and 2015 percentages for households without vehicles.
  • Descriptive analysis

The descriptive analysis for the data set generated the following results.

Table 4 2015 Descriptive Analysis Result

Average               9.70%
Standard Deviation    0.07331056
Maximum               54.50%
Minimum               1.00%
Median                7.80%
Mode                  0.055
Variance              0.005374438

The values generated in the 2016 calculations were not percentages, as these are. The two sets of results are therefore expressed in different units, which makes their comparison impossible unless they are converted to the same unit.

  • Outliers.

The values used in the determination of the outliers are indicated below. The outlier test returned FALSE for all the values, as contained in the Excel file. This suggests that the data in this section were accurately collected, which supports the result of the study.

Table 5 2015 outlier values

First Quartile          0.049
Third Quartile          0.109
Inter-Quartile Range    0.06
Upper Bound             0.199
Lower Bound             -0.041
  •  Frequency Histogram.

The frequency plot for 2015 was similar to the plot for 2016. The data set did not follow a normal distribution curve. The plot is displayed in the chart below and is also contained in the Excel file.

Figure 5 frequency histogram for the 2015 dataset

  •  Box plot

The box plot for the 2015 data set is similar to that of 2016. The plot is shown below. The third series lies below 10.00%, while the fourth series lies largely below 10.00%, with a small portion spilling over into the second range of 10-20%.

Figure 6 2015 box plot

  • Scatter plot for the 2015 and 2016 average number of households with vehicles

The scatter plot displays a rising, roughly linear pattern, which is an indicator of a positive correlation. Thus the 2015 and 2016 data sets are positively correlated.

Figure 8 scatter plot for the 2015 and 2016 data on average households with vehicles
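
The strength of the relationship seen in a scatter plot can be quantified with the Pearson correlation coefficient. The sketch below is a minimal Python version; the paired values are hypothetical and serve only to illustrate the calculation, since the full 2015 and 2016 data are not reproduced in this paper.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical paired values for the same jurisdictions (illustration only).
vehicles_2015 = [1.60, 1.72, 1.85, 2.10, 2.30]
vehicles_2016 = [1.62, 1.70, 1.88, 2.12, 2.36]

r = pearson_r(vehicles_2015, vehicles_2016)
print(f"r = {r:.3f}")
```

A value of r close to +1 corresponds to the rising linear pattern described above, a value close to -1 to a falling one, and a value near 0 to no linear relationship.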

Conclusion

The quantitative analysis indicates a positive correlation between the data samples for 2015 and 2016. However, the outlier analysis shows a difference: almost 100% of the 2016 values were flagged as outliers, while the 2015 dataset had none. This suggests that the research procedure for the 2016 analysis could have been falsified, or that the results were accurate but affected by a confounding variable.
