  • MechaCar_Statistical_Analysis-main
  • Resources
  • Photos
  • Deliverable2_total_summary.png
  • Deliverable3_sampletest_Lot3.png
  • Deliverable3_sampletest_Lot2.png
  • Deliverable3_sampletest_Lot1.png
  • Deliverable1.png
  • Deliverable2_lot_summary.png
  • Deliverable3_sampletest.png
  • Suspension_Coil.csv
  • MechaCar_mpg.csv
  • MechaCarChallenge.RScript.R
# MechaCar_Statistical_Analysis

## Project Overview

A few weeks after starting his new role, Jeremy is approached by upper management about a special project. AutosRUs’ newest prototype, the MechaCar, is suffering from production troubles that are blocking the manufacturing team’s progress. AutosRUs’ upper management has called on Jeremy and the data analytics team to review the production data for insights that may help the manufacturing team.

In this challenge, you’ll help Jeremy and the data analytics team do the following:

- Perform multiple linear regression analysis to identify which variables in the dataset predict the mpg of MechaCar prototypes
- Collect summary statistics on the pounds per square inch (PSI) of the suspension coils from the manufacturing lots
- Run t-tests to determine if the manufacturing lots are statistically different from the population mean
- Design a statistical study to compare vehicle performance of the MechaCar vehicles against vehicles from other manufacturers.

For each statistical analysis, you’ll write a summary interpretation of the findings.

## Linear Regression to Predict MPG

![](Resources/Photos/Deliverable1.png)

From the output above, we can tell that:

- Vehicle length and vehicle ground clearance provide non-random amounts of variance to the model. Both have a statistically significant impact on miles per gallon (mpg) for the MechaCar prototype. The other three variables (vehicle weight, spoiler angle, and AWD, or all-wheel drive) have p-values indicating that they contribute only a random amount of variance within the dataset.
- The overall p-value of the model is smaller than the assumed significance level of 0.05 (5%). This gives us enough evidence to reject the null hypothesis, which means that the slope of the linear model is not zero.
- The r-squared value of this linear model is 0.7149, which means that the model explains approximately 71% of the variability in mpg.
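To make the r-squared reading above more concrete, here is a minimal sketch of a multiple linear regression fitted by ordinary least squares. It is written in standard-library Python rather than taken from the repository's R script, and the six data rows, the variable names and the helper functions are all hypothetical, chosen only for illustration. They are not the contents of `MechaCar_mpg.csv`.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augment the matrix so every row operation also updates the right-hand side.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_ols(predictors: Sequence[Sequence[float]], response: Sequence[float]) -> List[float]:
    """Return [intercept, slope_1, ..., slope_k] from the normal equations."""
    design = [[1.0] + list(row) for row in predictors]  # leading 1.0 is the intercept
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, response)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefs: Sequence[float], row: Sequence[float]) -> float:
    """Evaluate the fitted linear model for one observation."""
    return coefs[0] + sum(c * x for c, x in zip(coefs[1:], row))


def r_squared(predictors, response, coefs) -> float:
    """Share of the variance in the response that the fitted model explains."""
    mean_y = sum(response) / len(response)
    ss_res = sum((y - predict(coefs, row)) ** 2 for row, y in zip(predictors, response))
    ss_tot = sum((y - mean_y) ** 2 for y in response)
    return 1.0 - ss_res / ss_tot


# Hypothetical observations: (vehicle length, ground clearance) and mpg.
rows = [(14.0, 10.0), (15.0, 12.0), (16.0, 11.0), (13.0, 14.0), (17.0, 13.0), (15.0, 15.0)]
mpg = [14.5, 25.0, 29.5, 20.5, 40.0, 35.5]

coefs = fit_ols(rows, mpg)
r2 = r_squared(rows, mpg, coefs)
print("intercept and slopes:", [round(c, 3) for c in coefs])
print("r-squared:", round(r2, 4))
```

An r-squared close to 1 means the predictors account for almost all of the variation in the response, while the 0.7149 reported for the MechaCar model leaves roughly 29% of the variation in mpg unexplained. This sketch does not compute the per-coefficient p-values discussed in the list above.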

Based on this, we can confirm that this model does predict the mpg of MechaCar prototypes effectively.

## Summary Statistics on Suspension Coils

![](Resources/Photos/Deliverable2_total_summary.png)

![](Resources/Photos/Deliverable2_lot_summary.png)

When we look at all of the manufacturing lots together and ask whether the design fits the specifications for MechaCar suspension coils, we notice that the variance of the coils is approximately 62.3 PSI, which is well within the 100 PSI variance requirement. Because of this, the current manufacturing data does meet the design specification of 100 pounds per square inch for all manufacturing lots in total.

When we break the data down into the respective lots (1, 2 and 3), we notice that lot 1 (0.98 variance) and lot 2 (7.47 variance) are well within the 100 PSI variance requirement. Lot 3, however, has a much higher variance (170.29), which exceeds the requirement and accounts for a disproportionate share of the variance seen across all lots combined. In other words, the current manufacturing data does not meet the design specification of 100 pounds per square inch for each lot individually, because lot 3 fails it.

## T-Tests on Suspension Coils

The next step in our process is to run t-tests on the suspension coil data so that we can determine whether there is a statistical difference between the mean of our dataset and the presumed population mean. We used a population mean of 1500 in the following analysis:

### Sample t-test

![](Resources/Photos/Deliverable3_sampletest.png)

- Above is the summary of the t-test results from all manufacturing lots. In this test, we find that the sample mean is 1498.78. We cannot reject the null hypothesis because there is not enough evidence to do so. Combined, all three manufacturing lots are not statistically different from the presumed population mean of 1500.

### 1. Sample t-test: lot1

![](Resources/Photos/Deliverable3_sampletest_Lot1.png)
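
The per-lot variance check and the one-sample t-tests in this README can be sketched as follows. This is an illustrative sketch in standard-library Python, not the repository's R script. The population mean of 1500 and the variance limit of 100 come from the text above, while the two lots of readings, the function names and the numerical integration used for the p-value are assumptions made for the example. The readings are hypothetical and are not taken from `Suspension_Coil.csv`.

```python
import math
import statistics


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p_value(t_stat: float, df: int, steps: int = 2000) -> float:
    """Two-sided p-value, integrating the density with Simpson's rule.

    steps must be even. A statistics library would normally supply this value;
    it is integrated by hand here only to keep the sketch dependency-free.
    """
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3
    # The density is symmetric, so the central area is twice the half area.
    return max(0.0, 1.0 - 2.0 * area_zero_to_t)


def summarize_lot(readings, population_mean=1500.0, variance_limit=100.0):
    """Summary statistics and a one-sample t-test for one manufacturing lot."""
    n = len(readings)
    mean = statistics.mean(readings)
    variance = statistics.variance(readings)  # sample variance (n - 1 denominator)
    t_stat = (mean - population_mean) / math.sqrt(variance / n)
    return {
        "mean": mean,
        "variance": variance,
        "within_variance_limit": variance <= variance_limit,
        "t_stat": t_stat,
        "p_value": two_sided_p_value(t_stat, n - 1),
    }


# Hypothetical PSI readings for two made-up lots.
hypothetical_lots = {
    "lot_a": [1501.0, 1502.0, 1500.0, 1501.0, 1501.0],
    "lot_b": [1480.0, 1510.0, 1495.0, 1520.0, 1485.0],
}

for name, readings in hypothetical_lots.items():
    summary = summarize_lot(readings)
    verdict = "reject" if summary["p_value"] < 0.05 else "cannot reject"
    print(name, summary, "->", verdict, "the null hypothesis at the 0.05 level")
```

The two checks answer different questions. The variance limit is about how consistent the coils within a lot are, while the t-test asks whether the lot mean differs from 1500, so a lot can pass one check and fail the other.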

### 2. Sample t-test: lot2

![](Resources/Photos/Deliverable3_sampletest_Lot2.png)

### 3. Sample t-test: lot3

![](Resources/Photos/Deliverable3_sampletest_Lot3.png)

## Challenge Summary