Trie Articles Solution


A file, companies.dat, contains a list of company names, one company per line. Company names may have multiple words in the name. Sometimes, a company might have multiple names associated with it. The first name is the primary name, and the remainder are synonyms. In this case, the company names will be separated by a tab on the same line. (Create a sample version of this file for your testing. The final file used for grading is not published.)
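The file layout described above could be parsed along these lines. This is a minimal sketch, not a reference solution: the function name `parse_company_lines` and the sample entries are hypothetical, and it assumes that blank lines and empty tab fields can be skipped.

```python
def parse_company_lines(lines):
    """Map every name (primary or synonym) to its company's primary name."""
    primary_of = {}
    for line in lines:
        # Names on one line are tab-separated; the first one is the primary name.
        names = [name.strip() for name in line.rstrip("\n").split("\t")]
        names = [name for name in names if name]  # assumption: ignore empty fields
        if not names:
            continue
        primary = names[0]
        for name in names:
            primary_of[name] = primary
    return primary_of


# Hypothetical sample content, standing in for a test version of companies.dat.
sample_lines = [
    "Microsoft\tMicrosoft Corporation, Inc.\n",
    "Apple Inc.\tApple\n",
    "Verizon Wireless\n",
]
primary_of = parse_company_lines(sample_lines)
print(primary_of["Apple"])
```

In a real program the lines would come from the data file opened by its relative name, for example `with open("companies.dat") as f: parse_company_lines(f)`.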

Write a program that can read a news article from standard input. Keep reading until you get a line in the article that consists entirely of a period symbol (.).
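One way to read the article up to the terminating line is sketched below. The helper name `read_article` is hypothetical, and the sketch assumes that whitespace around the lone period may be ignored. It accepts any iterable of lines so that it can be exercised without a terminal.

```python
import io


def read_article(stream):
    """Collect lines until one consists solely of a period."""
    lines = []
    for line in stream:
        if line.strip() == ".":  # assumption: surrounding whitespace is ignored
            break
        lines.append(line.rstrip("\n"))
    return "\n".join(lines)


# Demonstration with an in-memory stream; a real program would pass sys.stdin.
demo = io.StringIO(
    "Microsoft released new products today.\nApple followed.\n.\nignored\n"
)
article = read_article(demo)
print(article)
```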

Identify each company name in the article, and display each company name on the screen, one line at a time. Always display the primary name of the company identified, not the synonym you found in the text. On the same line, display the "relevance" of the company name hit. Relevance is defined as the frequency of the company name appearing in the article divided by the number of words in the article. For example, Microsoft in "Microsoft released new products today." should result in a relevance of 1/5, or 20%. If two names for the same company match, they count as matches for the same company. Display the relevance as a percentage. You should ignore the following words in the article (but not in the company name) when considering relevance:

a, an, the, and, or, but
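The relevance arithmetic can be illustrated with a short sketch. The names `count_words` and `relevance` are hypothetical. The sketch assumes stop words are compared case-insensitively, which the assignment does not state, and it does not handle the case where a stop word is part of a matched company name; a full solution would need to count those.

```python
import string

STOP_WORDS = {"a", "an", "the", "and", "or", "but"}


def count_words(text):
    """Count the words in text, skipping the listed stop words."""
    count = 0
    for token in text.split():
        word = token.strip(string.punctuation)
        if not word:
            continue  # token was punctuation only
        if word.lower() in STOP_WORDS:  # assumption: "The" is treated like "the"
            continue
        count += 1
    return count


def relevance(hit_count, total_words):
    """Return hit_count / total_words as a percentage."""
    return 100.0 * hit_count / total_words if total_words else 0.0


total = count_words("Microsoft released new products today.")
print(total, relevance(1, total))
```

For the example sentence this gives 5 words and a relevance of 20%, matching the figure in the assignment text.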

You must normalize the company names for the search. Punctuation and other symbols should not impact the search. So the appearance of Microsoft Corporation, Inc. in the companies.dat file should match with Microsoft Corporation Inc in the article. However, the search must be case sensitive.
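A possible shape for the normalized, case-sensitive search is sketched below. It is an illustration under assumptions rather than the required design: it builds a trie keyed on whole words instead of characters, prefers the longest matching name at each position, and uses hypothetical names (`normalize`, `TrieNode`, `CompanyTrie`).

```python
import re


def normalize(text):
    """Split text into word tokens, dropping punctuation but keeping case."""
    return re.findall(r"[^\W_]+", text)


class TrieNode:
    def __init__(self):
        self.children = {}   # next word -> TrieNode
        self.primary = None  # primary name if a company name ends at this node


class CompanyTrie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, name, primary):
        """Store a name (primary or synonym) under its primary name."""
        node = self.root
        for word in normalize(name):
            node = node.children.setdefault(word, TrieNode())
        node.primary = primary

    def count_hits(self, text):
        """Return {primary name: hit count}, taking the longest match each time."""
        tokens = normalize(text)
        hits = {}
        i = 0
        while i < len(tokens):
            node, j, best = self.root, i, None
            while j < len(tokens) and tokens[j] in node.children:
                node = node.children[tokens[j]]
                j += 1
                if node.primary is not None:
                    best = (node.primary, j)  # remember the longest match so far
            if best:
                hits[best[0]] = hits.get(best[0], 0) + 1
                i = best[1]  # resume after the matched name
            else:
                i += 1
        return hits


trie = CompanyTrie()
trie.insert("Microsoft", "Microsoft")
trie.insert("Microsoft Corporation, Inc.", "Microsoft")
trie.insert("Apple Inc.", "Apple Inc.")
hits = trie.count_hits(
    "Microsoft Corporation Inc and Apple Inc. met; microsoft was absent. Microsoft agreed."
)
print(hits)
```

Because both the stored names and the article pass through the same `normalize` step, "Microsoft Corporation, Inc." in the file matches "Microsoft Corporation Inc" in the article, while the lowercase "microsoft" is not counted.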

Output:

| Company | Hit Count | Relevance |
| --- | --- | --- |
| Microsoft | 6 | 4.38889% |
| Apple Inc. | 4 | 3.08333% |
| Verizon Wireless | 2 | 2.38889% |
| Total | 12 | 10% |
| Total Words | 120 | |

Output should consist of

Each Company Name, Hit Count, and the Relevance (Relevance = HitCount / Total Number of Words).

The second to last row of your output should read Total, Total Hit Count, and Total Relevance.

The last row should simply output the total number of words in the file.
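The row layout described above might be produced as in the following sketch. The function name `format_report`, the tab-separated layout, and the choice of four decimal places are assumptions; the hit counts are made-up illustration values based on the one-sentence example in the assignment text.

```python
def format_report(hits, total_words):
    """Build the report lines: header, one row per company, Total, Total Words."""
    def pct(count):
        # Assumption: relevance is shown with four decimal places.
        if not total_words:
            return "0.0000%"
        return f"{100.0 * count / total_words:.4f}%"

    rows = [("Company", "Hit Count", "Relevance")]
    for name, count in hits.items():
        rows.append((name, str(count), pct(count)))
    total_hits = sum(hits.values())
    rows.append(("Total", str(total_hits), pct(total_hits)))

    lines = ["\t".join(row) for row in rows]
    lines.append(f"Total Words\t{total_words}")
    return lines


# Hypothetical input: one hit in a five-word article.
report = format_report({"Microsoft": 1}, 5)
for line in report:
    print(line)
```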

Note: You must not submit your "node_modules" folder if you are working in Node.js/JavaScript. (Just submit your JavaScript source code and package.json file.)

Trie Articles

| Criteria | Full Marks | No Marks | Pts |
| --- | --- | --- | --- |
| Input: Prompt user for a news article. | 5.0 pts | 0.0 pts | 5.0 pts |
| Input: Read data from file named "company.dat". (No points if the filename is incorrect or an absolute path is used.) | 5.0 pts | 0.0 pts | 5.0 pts |
| Calculate: Company's hit count (includes synonyms) | 5.0 pts | 0.0 pts | 5.0 pts |
| Calculate: Company's relevance (Must be a decimal value with up to 4 digits. Ex: 6.000%) | 5.0 pts | 0.0 pts | 5.0 pts |
| Stopwords: Ignore the words "a", "an", "the", "and", "or", and "but". (-8 points if these words are ignored in company names.) | 10.0 pts | 0.0 pts | 10.0 pts |
| Output: Every line should have Company Name, Hit Count, and the Relevance | 5.0 pts | 0.0 pts | 5.0 pts |
| Output: Second to last row should have Total, Total Hit Count, and Total Relevance. | 5.0 pts | 0.0 pts | 5.0 pts |
| Output: The last row should have the total number of words in the file. | 5.0 pts | 0.0 pts | 5.0 pts |
| Data Structure: Implementation of Tries | 30.0 pts | 0.0 pts | 30.0 pts |
| Search: Normalize company name | 5.0 pts | 0.0 pts | 5.0 pts |
| Search: No impact of punctuation and other symbols | 5.0 pts | 0.0 pts | 5.0 pts |
| Search: Case sensitive | 5.0 pts | 0.0 pts | 5.0 pts |
| Coding Style and Test Cases | 10.0 pts | 0.0 pts | 10.0 pts |
| Note: (a) Late submission penalty per policy. (b) 5-point penalty for improperly formatted or indented output. | 0.0 pts | 0.0 pts | 0.0 pts |

Total Points: 100.0
