Starting from:
$30

$24

CSE Data Structures Project #2 Solution

Storm data, provided by the National Weather Service (NWS), contain a chronological listing, by state, of hurricanes, tornadoes, thunderstorms, hail, oods, drought conditions, lightning, high winds, snow, tem-perature extremes and other weather phenomena. The data also contain statistics on personal injuries and damage estimates. Data is available from 1950 to the present for the United States of America.

This project goal of this project is to implement a storm event application that manages storm event data and uses it to answer queries meeting given selection criteria. The storm event data will be indexed by a hash table, and a binary search tree and max-heap as appropriate to support the queries.

Note: This project is to be completed individually. Your implementation must use C/C++ and ultimately your code must run on the Linux machine general.asu.edu.

Your application is required to implement the search tree, hash table, and max-heap and the needed operations on them yourself. Any dynamic memory allocation must be done yourself, using either malloc() and free(), or new() and delete(). You may not use any external libraries to implement any part of this project, aside from the standard libraries for I/O, le I/O, and string functions. If you are in doubt about what you may use, ask me.

You must use a version control system as you develop your solution to this project, e.g., GitHub or similar. Your code repository must be private to prevent anyone from plagiarizing your work.

The rest of this project description is organized as follows. x1 describes the storm and fatality event data format, and the queries that your storm event application must support. x2 summarizes the requirements for the storm event application for this project. x3 describes measures to help evaluate the e ectiveness of the data structures used in the application. Finally, x4 describes the submission requirements for the milestone and the full project deadlines.

    • Storm Event Application: Data and Query Format

Your storm event application, named storm, reads command line argument n, indicating the number of years of storm data to manage. Following this are n years on the command line, each a distinct four digit year YYYY, 1950 YYYY 2019. For each year YYYY, there are two input data les. The rst is named details-YYYY.csv, and is a comma separated value (CSV) le containing the storm event data. The second is named fatalities-YYYY.csv, also a CSV le, containing the storm fatality data. The format of these data les is described next.

1.1    The Storm and Fatality Event Data Format

For each year, each line of the le details-YYYY.csv except the rst contains details of a storm event described by the following 14 elds, in order. The rst line of the le contains the eld names, and should be skipped.

    1. event id: An identi er assigned by the NWS for a speci c storm event; it links the storm event with

the information in the fatalities-YYYY.csv  le. Example: 383097.

    2. state: The state name, spelled in all capital letters, where the event occurred. Example: GEORGIA.

    3. year: The four digit year for the event in this record. Example: 2000.

    4. month name: Name of the month for the event in this record spelled out. Example: January.



1

Table 1: Valid Event Types
Event Name
Designator
Event Name
Designator




Astronomical Low Tide
Z
Hurricane (Typhoon)
Z
Avalanche
Z
Ice Storm
Z
Blizzard
Z
Lake-E ect Snow
Z
Coastal Flood
Z
Lakeshore Flood
Z
Cold/Wind Chill
Z
Lightning
C
Debris Flow
C
Marine Hail
M
Dense Fog
Z
Marine High Wind
M
Dense Smoke
Z
Marine Strong Wind
M
Drought
Z
Marine Thunderstorm Wind
M
Dust Devil
C
Rip Current
Z
Dust Storm
Z
Seiche
Z
Excessive Heat
Z
Sleet
Z
Extreme Cold/Wind Chill
Z
Storm Surge/Tide
Z
Flash Flood
C
Strong Wind
Z
Flood
C
Thunderstorm Wind
C
Frost/Freeze
Z
Tornado
C
Funnel Cloud
C
Tropical Depression
Z
Freezing Fog
Z
Tropical Storm
Z
Hail
C
Tsunami
Z
Heat
Z
Volcanic Ash
Z
Heavy Rain
C
Waterspout
M
Heavy Snow
Z
Wild re
Z
High Surf
Z
Winter Storm
Z
High Wind
Z
Winter Weather
Z






    5. event type: The event types permitted are listed in Table 1, spelled out.

    6. cz type: Indicates whether the event happened in a county/parish (C), zone (Z), or marine (M).

    7. cz name: County/Parish, Zone or Marine name assigned to the county or zone. Example: AIKEN.

    8. injuries direct: The number of injuries directly related to the weather event. Examples: 0, 56.

    9. injuries indirect: The number of injuries indirectly related to the weather event. Examples: 0, 87.

    10. deaths direct: The number of deaths directly related to the weather event. Examples: 0, 23.

    11. deaths indirect: The number of deaths indirectly related to the weather event. Examples: 0, 4, 6.

    12. damage property: The estimated amount of damage to property incurred by the weather event, e.g.,

10.00K = $10,000; 10.00M = $10,000,000. Examples: 10.00K, 0.00K, 10.00M.

    13. damage crops: The estimated amount of damage to crops incurred by the weather event e.g., 10.00K

= $10,000; 10.00M = $10,000,000. Examples: 0.00K, 500.00K, 15.00M.

    14. tor f scale: Enhanced Fujita Scale describes the strength of the tornado based on the amount and type of damage caused by the tornado. The F-scale of damage varies in the destruction area; therefore, the highest value of the F-scale is recorded for each event. Examples: EF0, EF1, EF2, EF3, EF4, EF5.


The storm fatality data is also provided in a CSV le named fatalities-YYYY.csv. Each line of the le, except the rst, contains information on fatalities that occurred within a storm event, described by the following 7 elds, in order. The rst line of the le contains the eld names, and should be skipped.

    1. fatality id: An identi er assigned by NWS to denote the individual fatality that occurred within a

storm event. Example: 17582.

    2. event id: An identi er assigned by NWS to denote a speci c storm event; it links the fatality with

the storm event in the details-YYYY.csv  le. Example: 383097.

    3. fatality type: D represents a direct fatality, whereas I represents an indirect fatality. Example: D.



2

Table 2: Direct Fatality Location Table
Abbreviation
Location


BF
Ball Field
BO
Boating
BU
Business
CA
Camping
CH
Church
EQ
Heavy Equip/Construction
GF
Gol ng
IW
In Water
LS
Long Span Roof
MH
Mobile/Trailer Home
OT
Other/Unknown
OU
Outside/Open Areas
PH
Permanent Home
PS
Permanent Structure
SC
School
TE
Telephone
UT
Under Tree
VE
Vehicle and/or Towed Trailer




    4. fatality date:  Date and time of fatality in MM/DD/YYYY HH:MM:SS format, 24 hour time.

Example: 04/03/2012 12:00:00.

    5. fatality age: The age of the person su ering the fatality. Example: 38.

    6. fatality sex: The gender of the person su ering the fatality. Example: F.

    7. fatality location: The fatality locations permitted are listed in Table 2, spelled out. Example: Under Tree.



1.2    Storm Data Query Format

In the following, <> bracket parameters to the query, while the other strings are literals. There are 4 queries that your storm application must be able to support:

    1. find event <event id>, searches the hash table for the storm event with identi er event id. If found, it follows the index for the storm event for the given year to print the information associated with the event (i.e., the elds of the struct storm event in order, with each eld labelled on a separate line); otherwise it prints "Storm event <event id> not found." If there are no fatalities associated with the event, print "No fatalities." Otherwise, it should print the elds of each fatality event associated with this event id, with each eld on a separate line.

Example: find event 383097 //  nd storm event with given id

    2. find max <number> <YYYY> <damage type>, where number is an integer 50, YYYY is either a speci c four digit year or the literal all, and damage type is either damage property or damage crops. This query builds a max-heap on the eld damage type for the given year YYYY or all years of storm data. It executes number Delete-Max operations on the heap, each time printing information for the most expensive storm in terms of the damage it caused to property or crops. Speci cally, for each event print the event id, event type, and amount of damage of as a dollar amount, on a single line.

Example: find max 4 1950 damage property // nd 4 most expensive storms to property in 1950 find max 10 all damage crops // nd 10 most expensive storms to crops in all years




3

    3. find max fatality <number> <YYYY>, where number is an integer 50, YYYY is either a speci c four digit year or the literal all. This query builds a max-heap on the number of fatalities for the given year YYYY or all years of storm data. It executes number Delete-Max operations on the heap, each time printing information for the most fatal storm. Speci cally, for each fatality print all elds of the fatality event.

Example: find max fatality 4 1950 // nd 4 most fatal storms in 1950 find max fatality 10 all // nd 10 most fatal storms in all years

    4. range <YYYY> <field name> <low> <high>, where YYYY is either a speci c four digit year or the literal all, where field name is either state or month name, and low and high are strings. To answer this query, construct a binary search tree (BST) for the given year(s) on primary key field name and secondary key event id, i.e., the tree is ordered rst by field name, and for equal field name then ordered by event id. Then, perform an in-order traversal of the search tree printing event information for events whose field name is greater than or equal to low and less than or equal to high. If no applications are found whose field name is in the given range print "No storm events found for given range." Otherwise, your application should print for each event within the range of the search parameters, rst ordered by field name, and then ordered by event id, the following speci c elds: For field name of state, print the state, event id, year, event type, cz type, and cz name. For field name of month name, print the month name, event id, year, event type, cz type, and cz name.

Example: range 1950 state A C // all storms in states alphabetically from A to C in 1950 range all month name January January // all storms in January in all years


    • Program Requirements for Project #2

        1. Write a C/C++ storm event application, named storm, that reads command line argument n, indi-cating the number of years of storm data to manage. Following this are n years on the command line, each a distinct four digit year YYYY, 1950 YYYY 2019. That is, the application is invoked as: storm n YYYY1 : : : YYYYn. For example,

storm 1 1950

storm 3 1950 1951 1952

For each year YYYY, there are two input data les. The rst is named details-YYYY.csv, and is a comma separated value (CSV) le containing the storm event data. The second is also a CSV e, named fatalities-YYYY.csv; it contains the storm fatality data.

    2. Provided for you is a le named defns.h in which structures have been de ned for storing the storm and fatality event data with elds matching this format. For each year i, 1 i n of storm data, initialize an array of struct annual storms. This involves:

        (a) First initializing an array of struct storm event of size si where si is the number of events in the le details-YYYYi.csv.

        (b) As you initialize each entry in the storm event array, you must also insert a subset of the elds of the event into a hash table. The hash table is an array of pointers to struct hash table entry all initialized to Nil. Each hash table entry has three elds: The event id and year of the storm event, and the index position event index at which the event is stored in the storm event array.

The hash function is computed as the event id modulo the hash table size. The hash table size is the rst prime number larger than 2 (s1 + + sn), where si is the number of storm events in details-YYYYi.csv, 1 i n. (You may nd the function in prime.cc useful for this purpose.)




4
        (c) Now, process the fatalities-YYYYi.csv le. Hash on the event id of each fatality event to nd the event index in the storm event array. Insert the fatality into a linked list of fatality events for the corresponding storm event.

    3. Once the array(s) and hash table have been initialized, your storm event application is ready to start processing queries. It reads q, the number of queries, from stdin, followed by q queries, one per line. The format of queries follows the format described in x1.2. The output must be written to stdout.

Example: 3 // q=3 queries

range 1951 month name February March // all storms February and March in 1951 find event 383097 // nd storm event with given event id

find max fatality 10 1950 //  nd 10 most fatal storms in 1950

    4. After processing a query, collect and output the characteristics of the data structures you’ve built as described in x3.

    5. After a query is processed, any max-heap or binary search tree data structures created dynamically must be deallocated on answering the query.


Sample input les that adhere to the format described in x1.1 ansd x1.2 will be provided on Canvas; use them to test the correctness of your program.

    • Characteristics of the Data Structures

For a BST constructed to answer a query of the form range <YYYY> <field name> <low> <high>:


    1. Print a count of the total number of nodes in the BST.

    2. Print the height of the root of the BST, and the height of its left and its right subtree.

For a max-heap constructed to answer a query of the form find max <number> <YYYY> <damage type>, or of the form find max fatality <number> <YYYY>:


    1. Print a count of the total number of nodes in the max-heap.

    2. Print the total height of the max-heap, and the height of its left and right subtrees.

After processing all queries, print a summary of the hash table that includes:

    1. Print a table that lists for each chain length ‘, 0 ‘ ‘max, the number of chains of length ‘, up to the maximum chain length ‘max that your hash table contains.

    2. Compute and print the load factor for the hash table.

    • Submission Instructions

Submissions are always due before 11:59pm on the deadline date.

    1. For SLN 84794 (MW) the milestone is due on Monday, 10/21/2019. For SLN 83657 (TTh) the milestone is due on Tuesday, 10/22/2019. See x4.1 for requirements.

    2. For SLN 84794 (MW) the complete project is due on Monday, 11/04/2019. For SLN 83657 (TTh) the complete project is due on Tuesday, 11/05/2019. See x4.2 for requirements.

It is your responsibility to submit your project well before the time deadline!!! Late projects are not accepted. Do not expect the clock on your machine to be synchronized with the one on Canvas!

An unlimited number of submissions are allowed. The last submission will be graded.



5
4.1    Requirements for Milestone Deadline

For the milestone deadline you must complete steps 1 and 2 of x2, the Program Requirements for Project #2. You also must be able to answer find event queries. As well, you must print characteristics of the hash table as described in x3.
Using the submission link on Canvas for the Project #2 milestone, submit a zip1    le named:

yourFirstName-yourLastName.zip

that unzips into the following:

Project State (10%): In a folder (directory) named State provide a brief report (.pdf preferred) that addresses the following:

    1. Describe any problems encountered in your implementation for this project milestone.

    2. Describe any known bugs and/or incomplete implementation in the project milestone.

    3. While this project is to be completed individually, describe any signi cant interactions with anyone (peers or otherwise) that may have occurred.

    4. Cite any external books, and/or websites used or referenced.

Implementation (40%): In a folder (directory) named Code provide:

    1. In one or more les, your well documented C/C++ source code implementing the requirements for this project milestone.

    2. A makefile that compiles your program and produces an executable named storm on general.asu.edu. Our TA will write a script to compile and run all student submissions on general.asu.edu; there-fore executing the command make storm in the Code directory must produce the executable storm also located in the Code directory.

Correctness (50%): The correctness of your program will be evaluated by running find event queries on storm event and fatality les, some of which will be provided to you on Canvas prior to the deadline for testing purposes.

The milestone is worth 30% of the total project grade.

4.2    Requirements for Complete Project Deadline

For the full project deadline, you must implement all the requirements for Project #2 as described in x2. Using the submission link on Canvas for the complete Project #2, submit a zip2 le named

yourFirstName-yourLastName.zip

that unzips into the following:

Project State (10%): Follow the same instructions for Project State as in x4.1 except applied to all project requirements.

Implementation (40%): In a folder (directory) named Code provide:

    1. In one or more les, your well documented C/C++ source code implementing all requirements for this project.

    2. A makefile that compiles your program and produces an executable named storm on general.asu.edu. Our TA will write a script to compile and run all student submissions on general.asu.edu; there-fore executing the command make storm in the Code directory must produce the executable storm also located in the Code directory.

Correctness (50%): The correctness of your program will be evaluated by testing all queries in x1.2 on storm event and fatality les, some of which will be provided to you on Canvas prior to the deadline for testing purposes.


1Do not use any other archiving program except zip.
2Do not use any other archiving program except zip.

6

More products