Github and R Markdown, and mastery in the practice of regression analysis. (Statistics Project Sample)
Weight (% of final grade):
11:59 ADT December 10th, 2020
Upon successful completion of this project, students will possess a working knowledge of Githuband R Markdown, and mastery in the practice of regression analysis. These skills are highly valuedglobally by employers in search of data scientists.
Each group will be assigned a dataset. Collectively, group members are to perform a completeregression analysis of their data, details of which must be presented on Github(https://github.com) using R Markdown (https://rmarkdown.rstudio.com/articles_intro.html).The following sections must be included:
Abstract (150 words or less)
Introduction (must contain a thorough description of the questions of interest)Data Description (must contain data visualizations that are properly labelled and explained)Methods (must contain a complete description of all analysis tools used)Results (all figures should be properly labelled and discussed)Conclusion (must contain a concise discussion of what has been learned from the analysis)Appendix (must include all data and R Markdown files for reproducibility)
Datasets are found at https://lionbridge.ai/datasets/10-open-datasets-for-linear-regression/.Groups 1-5 are to analyse Dataset 1 (Cancer), Groups 6-10 are to analyse Dataset 2 (CDC) ,Groups 11-15 are to analyse Dataset 3 (Fish Market), Groups 16-20 are to analyse Dataset 4(Medical Insurance), Groups 21-25 are to analyse Dataset 5 (New York Stock Exchange), Groups26-30 are to analyse Dataset 7 (Real Estate), Groups 31-35 are to analyse Dataset 8 (Red Wine),Groups 36-40 are to analyse Dataset 9 (Vehicle), Groups 41-45 are to analyse Dataset 10 (WHO).Note: Before commencing your analysis, you must introduce one new additional data point intoyour assigned dataset. A description of this unique data point must be included in your DataDescription section along with some rationale for the values chosen.
Grading Scheme:6 Overall presentation and organization of materials3 Quality of data visualizations6 Correctness of analysis4 Quality and selection of relevant figures6 Interpretation of results--25Regression and Analysis of Variance STAT 3340 / MATH 3340Fall 2020Final Project
Regression and Prediction Using R
As part of daily life, machine learning is used to make decisions, especially by data scientists. This paper aimed to incorporate machine learning algorithms in the prediction of vehicle prices. First, the car.csv dataset was inspected, cleaned, and organized. A final dataset was arrived at and used for further analysis. The dataset was fitted with Present_Price as the response variable and the rest as explanatory variables using the basic linear model. Three algorithms, linear regression, random forest, and support vector machines, were selected for modeling. Data were partitioned into two; training and testing set. The training data was used to predict the prices on the testing set. The models' performances were evaluated and improved using tuning, cross-validation, and checked for overfitting. Lastly, the models were compared against one another using a calculated RMSE (Root Mean Squared Error). The best performing model was chosen, hence leading to the arrival of meaningful conclusions.
Keywords: Algorithms, linear regression, random forest, basic linear model, dataset, regression, and output.
Every day, applications for machine learning are typical to come across. The algorithms help in making critical decisions in every field of work. For instance, media sites rely on machine learning to sift through millions of options to give you song or movie recommendations, and retailers use it to gain insight into their customers' purchasing behavior. Closer to home, data scientists using machine learning to advise on future data patterns and behavior that could be encouraged or discouraged. Besides, it entails building models that offer predictive power and can be used to understand data not yet collected.
YOU MAY ALSO LIKE
You Might Also Like Other Topics Related to song analysis:
- Song Violeta Parra-Volver a los 17 letra. Literature & Language EssayDescription: Violeta Parra's Volver a Los 17, translated as Returning to Seventeen, is a Spanish song that talks about the youthful experiences of someone.The song Volver a Los 17 roughly expresses a person's feelings and growth as seventeen years old....1 page/≈275 words | MLA | Literature & Language | Essay |
- Media anaylsis paper Communications & Media Essay PaperDescription: Typically, the music evokes certain emotions of far-reaching consequences psychologically, cognitively, socially, and culturally. For as far back as humans started making music, music has been a staple of human life for a multiplicity of entertainment, educational, medical, and sporting purposes...4 pages/≈1100 words | APA | Communications & Media | Essay |
- Violence and Redemption in the Film ParasiteDescription: Looking at the last scenes in the movie, one can unpack the loaded social critique within them. This paper explores violence in the film's last scenes by examining what led to it and how it adds to the film's meaning....3 pages/≈825 words | MLA | Visual & Performing Arts | Movie Review |