SCM200 Mathematics & Economics Statistics Project Paper

SCM200
Subject and Section
1 On initial inspection, the data sheet is composed of many kinds of values, both qualitative (ordinal and nominal) and quantitative (integer, interval, and ratio). Many fields (e.g., ID, dti_joint, pymnt_plan) have missing values, but this does not threaten the precision of the data, because most of these values were omitted only for confidentiality or because the information was unavailable. Unlike many other data sets, this one does not substitute a numeric placeholder (0) for an absent value (that is, nothing), which lessens the risks associated with automated data processing. There are also no duplicated fields, and the formatting of values is consistent throughout the data sheet (e.g., dates follow a month-date format). Additionally, the data provided are specific enough for their intended purpose, and possible misconceptions are mitigated because the data sheet includes a data dictionary. One possible problem, however, is the inconsistent number of decimal places among some of the items. The data sheet generally records values to the hundredths place, but some values appear only as whole numbers. It is not made clear whether such a whole number implies zeros after the decimal point or is a rounded estimate. While this may not be a serious problem for automated data processing, it could lead to serious errors when data are manually transcribed from one data sheet to another. All in all, the data presented in the data sheet entitled “LoanStats3a” are fairly accurate and representative of what the data sheet seeks to understand and represent.
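The checks described above (missing values, duplicated fields, and inconsistent decimal places) could be run along the following lines. This is only a minimal sketch, not the procedure actually used for this paper: the function name `audit` and the sample rows are hypothetical, and the values are invented for illustration rather than taken from “LoanStats3a”. The column names are borrowed from fields mentioned in the text or assumed to be typical of a loan data sheet.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows. The values are invented for illustration only
# and are not copied from the LoanStats3a data sheet.
SAMPLE = """id,loan_amnt,int_rate,dti_joint,pymnt_plan
,5000,10.65,,n
,2500,15.27,,n
,2400,16,,
"""


def audit(text):
    """Return missing-value counts, duplicated headers, and decimal-place tallies."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    rows = list(reader)

    # Field names that appear more than once in the header row.
    duplicates = [name for name, n in Counter(header).items() if n > 1]

    missing = {name: 0 for name in header}
    decimals = {name: Counter() for name in header}
    for row in rows:
        for name, value in zip(header, row):
            if value.strip() == "":
                missing[name] += 1  # empty cell counts as a missing value
                continue
            try:
                float(value)
            except ValueError:
                continue  # qualitative value, so no decimal places to count
            # Number of digits after the decimal point: "10.65" gives 2, "5000" gives 0.
            places = len(value.partition(".")[2])
            decimals[name][places] += 1
    return missing, duplicates, decimals


missing, duplicates, decimals = audit(SAMPLE)
```

In this sample, `decimals["int_rate"]` tallies two values recorded to the hundredths place and one whole number, which is the kind of inconsistency noted above. A field whose tally contains more than one number of decimal places is a candidate for manual review.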

