Pages:
2 pages/≈550 words
Sources:
Check Instructions
Style:
APA
Subject:
Mathematics & Economics
Type:
Statistics Project
Language:
English (U.S.)
Document:
MS Word
Topic:

NYCHA Resident Data Book Summary

Statistics Project Instructions:

Project instructions: Choose a dataset from the NYC Open Data website (https://opendata.cityofnewyork.us/data/) and write a report on a hypothesis test using the dataset. Your report should introduce the dataset and state your objective. The hypothesis test should be complete (with all four steps). The report should be written and submitted as a Word document. You may use Excel for your work (include a screenshot of the Excel output). The project is due Dec 4, 11:59 pm EST on Blackboard. Attached is a template project you can refer to when writing your own report. (Note: you may need to clean and filter the selected dataset before it can be used for analysis.)


Project template


Title: Comparison of the average number of employees between the finance and retail businesses in New York City


The dataset I chose is "NYC Business Acceleration Businesses Served and Jobs Created". The dataset lists the number of businesses that NYC Business Acceleration has assisted in opening and how many jobs were created by those businesses. Each row in the original dataset represents one business. My study goal is to analyze how two different business sectors compare in their average number of jobs created.


First, I selected the business sectors "Finance and Insurance" and "Retail Trade". I then deleted the businesses that did not report the "number of employees". I rearranged the selected data into two columns: one column (named "Finance and Insurance") listing the number of employees that are in the finance and insurance businesses (sample size is 17); and the other column (named "Retail Trade") listing the number of employees that are in the retail trade businesses (sample size is 125). Below is part of the filtered dataset.





Finance and Insurance    Retail Trade
12                       25
12                       25
14                       3
14                       5
7                        1
14                       1
14                       350
15                       125
14                       5
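The filtering step described above (keep two sectors, drop businesses with no reported employee count) can be sketched in Python as follows. This is only an illustration: the field names `sector` and `number_of_employees` are assumptions, not the dataset's actual column labels, and the handful of records is made up for the example rather than copied from the full dataset.

```python
# Hypothetical field names; the real export may label these columns differently.
SECTOR = "sector"
EMPLOYEES = "number_of_employees"

# A few illustrative records (not the full dataset).
rows = [
    {SECTOR: "Finance and Insurance", EMPLOYEES: "12"},
    {SECTOR: "Retail Trade", EMPLOYEES: "25"},
    {SECTOR: "Retail Trade", EMPLOYEES: ""},          # not reported: dropped
    {SECTOR: "Finance and Insurance", EMPLOYEES: "14"},
    {SECTOR: "Retail Trade", EMPLOYEES: "350"},
    {SECTOR: "Construction", EMPLOYEES: "40"},        # other sector: ignored
]


def employees_by_sector(records, sector):
    """Return the reported employee counts for one business sector."""
    values = []
    for record in records:
        raw = (record.get(EMPLOYEES) or "").strip()
        # Keep the row only if it is in the requested sector and has a count.
        if record.get(SECTOR) == sector and raw:
            values.append(int(raw))
    return values


finance = employees_by_sector(rows, "Finance and Insurance")
retail = employees_by_sector(rows, "Retail Trade")
print("Finance and Insurance:", finance)
print("Retail Trade:", retail)
```

Each resulting list plays the role of one column in the rearranged data, ready to be passed to a two-sample test.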




Since the two columns list different companies from different sectors, I assume that they are independent samples. For businesses in Finance and Insurance, the sample size is relatively small (<30), so I assume that the number of employees across all Finance and Insurance businesses in NYC is normally distributed. Finally, since I only have sample data, the population standard deviation of the number of employees in each sector is unknown. Under these assumptions, I set up the hypotheses as H0: μ1 = μ2 versus Ha: μ1 ≠ μ2, where μ1 denotes the average number of employees in the Finance and Insurance businesses while μ2 denotes the average number of employees in the Retail Trade businesses.


Next, I set the significance level (Type I error limit) at α = 0.05.


Then, using Excel (Data---Data Analysis---t Test: Two-Sample Assuming Unequal Variances) and keeping the alpha at 0.05, Excel shows the following output:



t-Test: Two-Sample Assuming Unequal Variances

                                Variable 1    Variable 2
Mean                            13.88235      64.296
Variance                        4.110294      10715.24
Observations                    17            125
Hypothesized Mean Difference    0
df                              125
t Stat                          -5.43739
P(T<=t) one-tail                1.36E-07
t Critical one-tail             1.657135
P(T<=t) two-tail                2.73E-07
t Critical two-tail             1.979124



 





 




The t statistic is -5.44. Since this is a two-tailed test, the p-value equals 0.000000273, which is much less than α = 0.05. Therefore, we reject the null hypothesis and conclude that the Finance and Insurance and Retail Trade businesses have different average numbers of employees in NYC. This is probably because a few Retail Trade businesses have extremely large numbers of employees, which inflates the Retail Trade sample mean. It would be worth investigating what these businesses are.
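As a cross-check on the Excel output, the t statistic and degrees of freedom can be recomputed from the reported means, variances, and sample sizes. The sketch below uses the standard unequal-variances (Welch) formulas with the Welch-Satterthwaite degrees of freedom; the variable names are my own, and the critical value is simply copied from the Excel output rather than computed.

```python
import math

# Summary statistics copied from the Excel output.
mean_1, var_1, n_1 = 13.88235, 4.110294, 17     # Variable 1
mean_2, var_2, n_2 = 64.296, 10715.24, 125      # Variable 2

# Squared standard error contributed by each sample.
se_sq_1 = var_1 / n_1
se_sq_2 = var_2 / n_2

# Welch t statistic for a hypothesized mean difference of 0.
t_stat = (mean_1 - mean_2) / math.sqrt(se_sq_1 + se_sq_2)

# Welch-Satterthwaite approximation to the degrees of freedom.
df = (se_sq_1 + se_sq_2) ** 2 / (
    se_sq_1 ** 2 / (n_1 - 1) + se_sq_2 ** 2 / (n_2 - 1)
)

# Two-tailed critical value at alpha = 0.05, taken from the Excel output.
t_critical_two_tail = 1.979124
reject_null = abs(t_stat) > t_critical_two_tail

print(f"t = {t_stat:.5f}")
print(f"df = {df:.2f} (rounded: {round(df)})")
print(f"Reject H0 at alpha = 0.05: {reject_null}")
```

Comparing |t| with the two-tailed critical value leads to the same decision as comparing the two-tailed p-value with alpha.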




 



Statistics Project Sample Content Preview:
Name Course Instructor Date
Title: Relationship between average total gross income and total head of household (HOH) being 62 years and over
The dataset is "NYCHA Resident Data Book Summary", published on NYC Open Data using data from the New York City Housing Authority (NYCHA). It contains resident demographic data along with housing and development data. The variables chosen are "All Average Total Gross Income" and "Total HOH 62 Years and Over as Percent of Families" (sample size is 33) (NYC Open Data).
I located the dataset by browsing by agency, selecting the New York City Housing Authority (NYCHA), and then opening the NYCHA Resident Data Book Summary. Then, I filtered the dataset to include two columns: "All Average Total Gross Income" and "Total HOH 62 Years and Over as Percent of Families". Below is the dataset.
Hypothesis
* H0: The relationship between average total gross income and total head of household (HOH) 62 years and over = 0
* Ha: The relationship between average total gross income and total head of household (HOH) 62 years and over ≠ 0
The level of significance is set at 5%.
Then, using Excel (ANOVA and regression) and keeping the alpha at 0.05, Excel shows the following output:
ANOVA

              df    SS             MS         F             Significance F
Regression    1     272126583.1    2.72E+8    271.013627    7.08723E-17
Residual      31    31127305.4     1004107
Total         32    303253889

              Coefficients    Standard Error    t Stat    P-value    Lower 95%
Intercept     15822.77416
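The F statistic in the ANOVA table can be checked by hand from the sums of squares and degrees of freedom. The sketch below is a rough verification only: it assumes the regression sum of squares equals the total minus the residual sum of squares, and because the total is shown rounded, the result agrees with Excel only to about two decimal places. The R-squared value it prints is derived here and is not part of the output shown.

```python
# Values copied from the ANOVA table in the Excel regression output.
df_regression = 1
df_residual = 31
ss_residual = 31127305.4
ss_total = 303253889.0

# Assumption: SS(regression) is recovered as SS(total) - SS(residual).
ss_regression = ss_total - ss_residual

# Mean squares are sums of squares divided by their degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# F statistic for the overall regression, and the share of variance explained.
f_statistic = ms_regression / ms_residual
r_squared = ss_regression / ss_total

print(f"MS residual = {ms_residual:.0f}")
print(f"F = {f_statistic:.2f}")
print(f"R^2 = {r_squared:.3f}")
```

With a single predictor, this F test is equivalent to the two-tailed t test on the slope, so a Significance F far below 0.05 points toward rejecting H0.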