Pages:
5 pages/≈1375 words
Sources:
No Sources
Style:
MLA
Subject:
IT & Computer Science
Type:
Coursework
Language:
English (U.S.)
Document:
MS Word
Topic:

How Data and Machine Learning are Used within This Industry

Coursework Instructions:

Please upload your response to the lecture on Enterprise Search (Lecture 6: Abhishek Singh Tomar), answering the base questions listed in the course information.
Prepare a short (~2-3 pages, 12-point, single-spaced) report that addresses as many of the following questions as are relevant:
• Describe the market sector or sub-space covered in this lecture.
• What data science related skills and technologies are commonly used in this sector?
• How are data and computing related methods used in typical workflows in this sector? Illustrate with an example.
• What are the data science related challenges one might encounter in this domain?
• What do you find interesting about the nature of data science opportunities in this domain?
In addition,
(i) What's the difference between a forward index and an inverted index? (10 pts of the 80 C+R points in the rubric)
(ii) Describe the high-level architectural components of web search. (10 pts of the 80 C+R points in the rubric)
(iii) Also, answer the following multiple-choice questions: You can list the question number and the letter corresponding to the correct choice as Answer in your report, (2x5 = 10 pts of the 80 C+R points in the rubric)
Q1: Based on the lecture, there are 3 different actors in Web Search: the search engine users, the search engine providers, and the advertisers. Different actors have different expectations of web search results. Select the INCORRECT statement about web search actors’ expectations.

A. Search engine users want high-quality search results and fast response time
B. Search engine providers want to attract more users and reduce operational costs
C. Advertisers on search engines want to attract more users to their sites
D. Advertisers on search engines want to increase ad revenue

Q2: Based on the lecture, the indexing system performs several tasks. Which of these is NOT a task of the indexing system?

A. Performs information extraction, filtering, and classification on downloaded web pages
B. Provides meta-data, metrics, and other kinds of feedback to the crawling and query processing systems
C. Based on the query data, indexes the retrieved textual content of pages for ranking
D. Converts the pages in the web repository into appropriate index structures that facilitate searching the textual content of pages

Q3: Based on the lecture, there are many textual content processing techniques used in the Query Interpretation System. Select all the text processing techniques mentioned in the lecture in this context.

1. Spelling correction
2. Stop-words removal
3. Word tokenization
4. Word stemming
5. Lemmatization
6. Geotagging

A. 1,2,3,4,5 B. 1,2,3,4,6 C. 1,2,4,5,6 D. All of Them

Q4. Based on the lecture, there are several processing steps in the Query Interpretation System. Select the steps in the order of the pipeline described in the lecture.

1. Spelling Correction
2. Normalization
3. Segmentation
4. Annotation
5. Stemming
6. Term Expansion
7. Query Rewriting

A. 1,2,3,4,5,6,7 B. 2,1,3,5,4,6,7 C. 2,1,3,4,5,6,7 D. 1,2,3,5,4,6,7

Q5. Based on the lecture, Machine Learning has many use cases in the enterprise. Select ALL the Machine Learning use case scenarios mentioned.

1. Transformational HR Services
2. Self-Driving Customer Service
3. Conversational Bots
4. Student services

A. 1,2,3 B. 1,2,4 C. 2,3,4 D. All of them

Coursework Sample Content Preview:
How Data and Machine Learning are Used within This Industry
Market Sector
Lecture 6 explores the enterprise market sector, specifically how data and machine learning are used within it. The lecture uses enterprise search as a case study to demonstrate the ML and information retrieval techniques applied in this sector.
Description
Enterprise content is distributed across multiple data sources in various formats. For example, a company’s employee data can be found in the employee database, the internal collaboration system, and the finance ERP. This wide distribution can make retrieving relevant content challenging. Enterprise search addresses the problem by letting users find relevant content across multiple data sources without worrying about where the information is stored.
Key Data Science Skills and Technologies
Indexing
Indexing systems are an essential part of enterprise search. Beyond information extraction, filtering, and classification, they convert the pages in a web repository into index structures that make the textual content of those pages searchable. Indexing pipelines pre-process documents and perform further extraction tasks on them. To excel in enterprise search, data science professionals must understand full-text indexing and text document properties such as metadata and structure. They must also understand the use of pipelines for document processing, term selection, and the removal of non-content-bearing words (stop words).
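To make the idea of index structures concrete, the following is a minimal sketch of a forward index and an inverted index built over a toy corpus. The corpus, document IDs, and function names are all hypothetical and are not taken from the lecture; a real indexing pipeline would also handle tokenization, metadata, and ranking signals.

```python
from collections import defaultdict

# Hypothetical toy corpus; document IDs and text are made up for illustration.
DOCS = {
    "doc1": "employee payroll records",
    "doc2": "employee onboarding guide",
    "doc3": "payroll tax guide",
}


def build_forward_index(docs):
    """Forward index: map each document ID to the list of terms it contains."""
    return {doc_id: text.lower().split() for doc_id, text in docs.items()}


def build_inverted_index(forward_index):
    """Inverted index: map each term to the sorted document IDs containing it."""
    inverted = defaultdict(set)
    for doc_id, terms in forward_index.items():
        for term in terms:
            inverted[term].add(doc_id)
    return {term: sorted(ids) for term, ids in inverted.items()}


def search(inverted_index, query):
    """Return IDs of documents that contain every query term (AND semantics)."""
    terms = query.lower().split()
    if not terms:
        return []
    postings = [set(inverted_index.get(term, [])) for term in terms]
    return sorted(set.intersection(*postings))


forward = build_forward_index(DOCS)
inverted = build_inverted_index(forward)
print(search(inverted, "payroll guide"))  # ['doc3']
```

The forward index answers "which terms are in this document?", while the inverted index answers "which documents contain this term?", which is the lookup a search query needs.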
Deep Learning
Deep learning is an essential data science technology in the enterprise market because it supports query interpretation, that is, matching users’ queries to relevant documents. Query term expansion, one of the mechanisms used in a query interpretation system, applies deep learning alongside synonyms, tokenization, and the inverted index to maximize the chance that a query matches relevant results (documents).
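As a rough illustration of query term expansion, the sketch below adds synonyms to a query before matching. It uses a hand-written synonym table as a stand-in for whatever learned model a real system would use; the table contents and the `expand_query` name are assumptions made for this example only.

```python
# Hypothetical synonym table. A production system might derive such
# relations from a learned model instead of a hand-written dictionary.
SYNONYMS = {
    "salary": ["pay", "wages"],
    "employee": ["staff"],
}


def expand_query(query, synonyms=SYNONYMS):
    """Return the query terms plus known synonyms, in order, without duplicates."""
    expanded = []
    for term in query.lower().split():
        for candidate in [term, *synonyms.get(term, [])]:
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


print(expand_query("employee salary report"))
# ['employee', 'staff', 'salary', 'pay', 'wages', 'report']
```

Each expanded term can then be looked up in the inverted index, so a document that says "staff pay" can still match a query for "employee salary".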
Machine Learning
Machine learning has extensive benefits in the enterprise sector. It plays a significant part in enterprise search because it is involved in processes such as tokenization, stop-word removal, stemming, and query interpretation. Supervised ML is especially valuable when data science professionals need to develop a model that addresses the needs of their users.
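The text-processing steps named above can be sketched as a small pipeline. This is a simplified, rule-based illustration, not the lecture's implementation: the stop-word list is a tiny hypothetical sample, and `naive_stem` is a crude suffix stripper rather than a real stemming algorithm such as Porter's.

```python
import re

# Hypothetical, deliberately small stop-word list for illustration.
STOP_WORDS = {"the", "a", "an", "of", "for", "and", "in", "to", "is"}

# Suffixes tried in order; this is a crude stand-in for a real stemmer.
SUFFIXES = ("ing", "ed", "es", "s")


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that carry little content on their own."""
    return [token for token in tokens if token not in STOP_WORDS]


def naive_stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenize, remove stop words, then stem each remaining token."""
    return [naive_stem(token) for token in remove_stop_words(tokenize(text))]


print(preprocess("Searching the indexed documents for reports"))
# ['search', 'index', 'document', 'report']
```

Applying the same pipeline to both documents and queries means that "searching" in a query and "searched" in a document reduce to the same index term.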
Use of Data and Computing-related methods
All three high-level architectural components of web search use data and computing-related methods to execute their tasks. Because queries follow statistical patterns, data and computing are often leveraged to speed up search. For example, tailoring search results to a specific user is a complex process that requires several computing methods. Search engines also often compute link and word-frequency statistics and feed this data to their ML algorithms.
Challenges in Enterprise Search
Redundancy
Redundancy is a common problem in the enterprise sector because project teams traditionally load data marts from existing data sources into the data lake. Teams would then add their unique data that matched the...