Pages:
7 pages/≈1925 words
Sources:
Check Instructions
Style:
MLA
Subject:
Mathematics & Economics
Type:
Statistics Project
Language:
English (U.S.)
Document:
MS Word
Topic:

Differential Analysis Using R’s edgeR: Economics Statistics Project

Statistics Project Instructions:

Follow the instructions. Use R code for the analysis and summarize the results in words. The statistical output from the R code must be attached to support the conclusions in the paper. All materials, including the progress made so far, are in the file I uploaded.
This is group work, and the full report may be at most 20 pages. I only need you to write the report following the instructions, and I need just 7 pages.
The data is in the other PDF, which is a public online resource that you can access.
Let me know if you cannot find the right one.

 

1 Project Detail
Your final project report should be at most 20 pages. Your write-up should contain the following sections:
• Introduction: It must contain the following
– Describe the dataset.
– Identify the problem of interest: choose a data set, describe the data set and identify the problem you are interested in.
• Methodology:
– Describe the software you intend to use.
– Describe in detail the methods you have chosen.
– Does your data have missing data? Describe how you treat missing data and why.
• Results and Discussion:
– All the results (figures, tables etc.) go in this section.
– Discuss your findings and what the results mean.
– All your tables and figures need to be labelled properly.
• Conclusion:
– What conclusion do you draw from your analysis?
• References:
– List of references that you have cited in your work.
• Appendix:
– Any additional information you want to add.
• Individual Contribution:
– In addition to the final report, each student must submit a page summarizing what their contribution to the project was.
A few things to consider in your report as they will be used for evaluation:
Criteria
• Information is presented in a logical sequence.
• Complexity and appropriateness of the analysis for the class.
• Provides introduction to dataset and problem.
• Provides introduction to statistical methods and software packages.
• Technical terms well-defined.
• The figures and tables are well labelled.
• There is an obvious conclusion from the study.
• References are cited appropriately.
• Report is well prepared and readable.
VERY IMPORTANT: You will be penalized for grammatical errors.

Statistics Project Sample Content Preview:
Student’s Name
Professor’s Name
Course
Date
Differential Analysis Using R’s edgeR
Introduction
The datasets
Two datasets were used in this analysis. Each was analyzed in depth independently, and the results were then compared. The first dataset, obtained from The Cancer Genome Atlas (TCGA), consisted of lung adenocarcinoma gene expression data. The mRNAseq preprocessor picked the “scaled estimate” value from the Illumina HiSeq/GA2 mRNAseq level_3 (v2) dataset and built a log2-transformed mRNAseq matrix for downstream analysis. Preprocessing had already been done, but the raw data were available if necessary. The second dataset, from a study on lung cancer gene expression, was obtained from Bioconductor. The data were initially published on Bioconductor in 2004 (Scharpf R, Zhong S, Parmigiani G (2019). lungExpression: ExpressionSets for Parmigiani et al., 2004 Clinical Cancer Research paper., R package version 0.24.0). The dataset, called “lungExpression,” was represented as an ExpressionSet and was already preprocessed.
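The log2 transformation mentioned above can be illustrated with a minimal sketch. It is written in Python only to show the arithmetic. The function name `log2_transform`, the pseudocount of 1, and the toy values are assumptions made for illustration; they are not taken from the TCGA preprocessing pipeline, whose exact scaling is not stated here.

```python
import math

def log2_transform(matrix, pseudocount=1.0):
    """Apply log2(value + pseudocount) to every entry of a genes-by-samples matrix.

    The pseudocount (an assumption of this sketch) keeps zero expression
    values defined, since log2(0) does not exist.
    """
    return [[math.log2(value + pseudocount) for value in row] for row in matrix]

# Hypothetical expression values: two genes (rows) by three samples (columns).
toy_expression = [
    [0.0, 1.0, 3.0],
    [7.0, 15.0, 31.0],
]
transformed = log2_transform(toy_expression)
print(transformed)
```

A log scale compresses the very wide range of expression values, so that a doubling of expression corresponds to a change of about one unit regardless of the starting level.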
After the two datasets were examined, a search was undertaken for the best way to integrate the two platforms usefully. The challenge lay in finding the most effective means to this end, which remains a current research goal. The aim was to unify the two studies and to report how their respective findings compare after integration, in the hope of strengthening the findings made on each platform. Such findings ought ultimately to help in understanding the relationship between the human genome and lung cancer. The first need was to find the best way to get the dataset from the Broad Institute into a usable format in R. The datasets would then have to be translated into a form that could be integrated for further analysis. Finally, edgeR would be used to perform a differential expression analysis of the integrated dataset and to compare it with the prior results.
However, it might prove impossible to integrate the two platforms because the information needed to do so is lacking. In that case, an attempt would be made to obtain what is required from the publishers of the data or from another publication. If that failed, an in-depth analysis would be performed on both datasets independently and the results compared against each other. RNA sequencing (RNA-seq) has become a widely used technology for profiling gene expression. One of the most common aims of RNA-seq profiling is to identify genes or molecular pathways that are differentially expressed (DE) between two or more biological conditions. In summary, the goal of this analysis was to find out which genetic features are related to lung cancer (FireBrowse, firebrowse.org/?cohort=LUSC#).
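The idea of differential expression between two conditions can be made concrete with a small sketch. edgeR is an R package that fits negative binomial models to raw read counts. The Python code below does not reproduce that model or any significance test; it only illustrates, under stated assumptions, the two quantities that such an analysis starts from: counts per million (CPM) and a log2 fold change between groups. The gene names, counts, group labels, and the prior of 1 are all hypothetical.

```python
import math

def counts_per_million(counts):
    """Scale each sample (column) so that its counts sum to one million."""
    n_samples = len(next(iter(counts.values())))
    library_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    return {
        gene: [row[j] * 1e6 / library_sizes[j] for j in range(n_samples)]
        for gene, row in counts.items()
    }

def log2_fold_change(cpm, groups, case, control, prior=1.0):
    """Log2 ratio of mean CPM in `case` samples to mean CPM in `control` samples.

    The small prior (an assumption of this sketch) avoids division by zero.
    """
    case_idx = [j for j, g in enumerate(groups) if g == case]
    control_idx = [j for j, g in enumerate(groups) if g == control]
    result = {}
    for gene, row in cpm.items():
        case_mean = sum(row[j] for j in case_idx) / len(case_idx)
        control_mean = sum(row[j] for j in control_idx) / len(control_idx)
        result[gene] = math.log2((case_mean + prior) / (control_mean + prior))
    return result

# Hypothetical raw counts: two genes measured in two normal and two tumor samples.
counts = {"GENE_A": [10, 20, 40, 80], "GENE_B": [90, 180, 60, 120]}
groups = ["normal", "normal", "tumor", "tumor"]

cpm = counts_per_million(counts)
lfc = log2_fold_change(cpm, groups, case="tumor", control="normal")
for gene, value in lfc.items():
    print(gene, round(value, 2))
```

A positive value indicates higher expression in the tumor group and a negative value indicates lower expression. A fold change alone says nothing about statistical significance, which is why a dedicated package such as edgeR is used for the actual analysis.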
Methodology
The paper demonstrated a computational workflow for detecting differentially expressed genes and pathways from RNA-seq data by providing a complete analysis of an RNA-seq experiment that profiled genetic features related to lung cancer. The workflow used R software packages from the open-source Bioconductor project and covered all steps of the analysis pipeline, including alignment of read sequences, data exploration, differential expression analysis, visuali...