Pages:
5 pages/≈1375 words
Sources:
3 Sources
Style:
APA
Subject:
Management
Type:
Other (Not Listed)
Language:
English (U.S.)
Document:
MS Word
Topic:

Identify Patterns In Data: Student Rate

Other (Not Listed) Instructions:

Download and open the dataset Most-Recent-Cohorts-Scorecard-Elements.csv. To download the dataset, go to the Books and Resources for this Week module and click on the Scorecard Data link.

  1. Install the randtests package into R: install.packages("randtests").
  2. Determine whether the data are normally distributed and identify any missing values.
  3. Run runs.test on the SATAVG variable after removing the NA values (hint: use the na.omit() command).
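
The three steps above can be sketched in R roughly as follows. This is a minimal illustration, not a model answer: the helper names (load_scorecard, describe_column, run_satavg_test) are hypothetical, and treating blank and "NULL" cells as missing is an assumption about how the file encodes missing values. The Shapiro-Wilk test is used here as one possible normality check; the assignment does not prescribe a specific one.

```r
# Step 1 (one-time install, as given in the instructions):
# install.packages("randtests")

# Read the Scorecard CSV. Blank cells and the literal string "NULL"
# are assumed to mark missing values and are converted to NA.
load_scorecard <- function(path) {
  read.csv(path, na.strings = c("", "NA", "NULL"), stringsAsFactors = FALSE)
}

# Step 2: count missing values and check normality for one column.
describe_column <- function(values) {
  # Any remaining non-numeric code is coerced to NA.
  values <- suppressWarnings(as.numeric(values))
  observed <- as.numeric(na.omit(values))
  list(
    n_missing  = sum(is.na(values)),
    n_observed = length(observed),
    # shapiro.test() requires between 3 and 5000 non-missing values.
    shapiro    = shapiro.test(observed)
  )
}

# Step 3: runs test on SATAVG after dropping NA values.
# By default runs.test() splits the series at its median.
run_satavg_test <- function(scorecard) {
  sat <- as.numeric(na.omit(scorecard$SATAVG))
  randtests::runs.test(sat)
}

# Example usage (requires the downloaded file and the randtests package):
# scorecard <- load_scorecard("Most-Recent-Cohorts-Scorecard-Elements.csv")
# describe_column(scorecard$SATAVG)
# run_satavg_test(scorecard)
```

A small Shapiro-Wilk p-value suggests the values are not normally distributed, and a small runs-test p-value suggests the ordering of the values in the file is not random. A histogram or Q-Q plot (hist(), qqnorm()) is a useful visual complement to either test.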

Then address the following in a research paper with an introduction, method, and results section:

  1. Identify the dataset and variables and provide context to the research used to collect the data (in an introduction section).
  2. Identify the analysis performed, pros and cons of calculation, and why they are used (in a method section).
  3. Describe and interpret results (in a results section).
  4. Include references.
  5. Include all R code as an appendix.

Length: 5-7 pages

References: Include a minimum of 3 scholarly resources

Your assignment should demonstrate thoughtful consideration of the ideas and concepts that are presented in the course and provide new thoughts and insights relating directly to this topic. Your response should reflect graduate-level writing and APA standards.

Other (Not Listed) Sample Content Preview:
Identify Patterns in Data
Student’s Name
Institutional Affiliation

Identify Patterns in Data

Identify the dataset and variables and provide context to the research used to collect the data (in an introduction section).

On September 12, 2015, the US Department of Education launched an online tool to help consumers, most likely students, compare higher learning institutions based on their costs and value. The administration of former US President Barack Obama spearheaded the launch of the website, which is named College Scorecard. The website consolidates vital information about colleges and universities, such as yearly cost, average student debt, completion or graduation rate, expected job earnings, and employment rate. The College Scorecard is structured to improve transparency by giving the public the power to check how well various schools are serving their students. Because people’s choices and preferences vary, different families and students may choose schools based on yearly costs, average student rate, employment rate, and so on.

Over the years, the initial version of the scorecard was criticized because it lacked qualitative information, e.g., the location of the schools listed. This limitation made the tool misleading and less reliable in guiding students to their best-fit college. Additional criticism arose upon the revelation that inaccurate loan repayment rates had been published on the site for most colleges (Hanley, 2017, p. 504). The department moved quickly to fix the mistake, which was a ‘coding error’. In 2017, Donald Trump’s administration released an updated version of the site. Today, prospective students can create a customized search for their dream college, compare the colleges returned by the search, and decide which school to choose.

The Data.gov website contains datasets that are accessible and ready for use by the public. The dataset featured in this paper is also available on data.gov in RDF, CSV, JSON, and XML form. The Most Recent Cohorts Scorecard Elements dataset has 102 variables, both categorical, such as map location, and numerical, e.g., the OPEID variable. There are several blank entries, NULL values, and 0 values. Specifically, the selected variable (SATAVG) has both numeric and null values.

Identify the analysis performed, pros and cons of calculation, and why they are used (in a method section)

Data distribution is a fundamental concept in statistics. By definition, a data distribution is a function that describes all possible values of the data and how frequently each occurs. Two kinds of measures describe a distribution: measures of central tendency (mean and median) and measures of spread (standard deviation). The most common distribution is the normal, or Gaussian, distribution, which is defined by a symmetric bell-shaped curve; its standardized form is the Z distribution. Other distributions include the Poisson, binomial, Bernoulli, chi-squared, and log-normal. Distributions can be symmetric, like the Z distribution, or asymmetric, like skewed distributions. Right-skewed distributions have tails to the right, with the mean greater than the median, and are common in biological data. The mean tends to be more susceptible to affect ...