Homework Assignment #2 240

Objectives

• Gain some familiarity with setting up environments for cloud development using AWS.

• Start to become familiar with at least one AWS SDK.

• Gain some familiarity with the AWS Command-Line Interface.

Overview

In this assignment, you will write a program similar to the Consumer program used in CS5250. Specifically, this Consumer program will read objects (Widget Requests) from an S3 bucket (namely, Bucket 2) and then process those requests. Each request results in a single Widget creation, update, or deletion in either another S3 bucket (Bucket 3) or a DynamoDB table. You may write this program using any language and development stack that you choose, as long as the language supports an AWS SDK. You may even use Cloud9, if you wish. The instructor will provide an executable Producer program that will allow you to do some system testing.

Instructions

Step 0 – Preliminaries
Be sure that your AWS workspace includes the VPC, EC2 instances, S3 buckets, and DynamoDB table described in the preliminary step of HW1. If you are unsure of the state of these resources,

• test your connectivity to the EC2 instances using SSH and your key pair,

• test your Bucket 3 website,

• test your DynamoDB table by adding some widgets to it manually through the DynamoDB console, and

• test everything by running the instructor-provided Producer on Host A and Consumer on Host B.

Step 1 – Design your Consumer program
As you should remember from CS5250, the Consumer program processes Widget Requests to create, update, or delete Widgets. Your Consumer program should be able to store the widgets either in Bucket 3 or in the widgets DynamoDB table.

Your Consumer program needs to allow the user to specify command-line arguments for the storage strategy and resources to use, like the instructor-provided Consumer program. The syntax for the arguments is up to you. Look for a good open-source third-party library that can help you implement command-line arguments with minimal coding effort.
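
As one possible starting point, the sketch below uses Python's standard-library argparse module rather than a third-party library. The syntax is left up to you, so every flag name here (--request-bucket, --widget-bucket, --widget-table, --poll-interval-ms) is an assumption made for illustration and is not the instructor-provided Consumer's syntax.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build the command-line parser; every flag name here is illustrative."""
    parser = argparse.ArgumentParser(
        prog="consumer",
        description="Process Widget Requests from an S3 bucket.",
    )
    parser.add_argument("--request-bucket", required=True,
                        help="bucket to read Widget Requests from (Bucket 2)")
    # Exactly one storage target must be chosen.
    target = parser.add_mutually_exclusive_group(required=True)
    target.add_argument("--widget-bucket",
                        help="store widgets in this S3 bucket (Bucket 3)")
    target.add_argument("--widget-table",
                        help="store widgets in this DynamoDB table")
    parser.add_argument("--poll-interval-ms", type=int, default=100,
                        help="wait time when no request is available")
    return parser
```

Calling build_parser().parse_args() in your entry point would then give you an object with request_bucket, widget_bucket, widget_table, and poll_interval_ms attributes.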

Your Consumer program will need to periodically try to read a single Widget Request from Bucket 2 in key order (e.g., smallest key first). Don’t try to read all existing requests at one time, for three reasons. First, the Producer program may be running concurrently and adding to the set of requests. Second, in later homework assignments, you will be extending this system so multiple Consumer programs can run at the same time without getting in each other’s way; if you read only one Widget Request at a time, your application will be more scalable than if you tried to read them all at once. Third, repeatedly retrieving a list of all keys in an S3 bucket may not be as efficient as retrieving a small number of keys (like just one) at a time.

If the Consumer succeeds in reading a request, then it should delete the request, process the request, and then immediately go back to trying to read another request.
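
A minimal sketch of the read-then-delete step, assuming Python with boto3. The function name take_next_request is hypothetical. The s3 argument is assumed to be a boto3 S3 client (for example, one created with boto3.client("s3")); list_objects_v2, get_object, and delete_object are the client calls it relies on. The sketch also assumes a general purpose bucket, where listings come back in ascending key order.

```python
from typing import Any, Optional


def take_next_request(s3: Any, bucket: str) -> Optional[bytes]:
    """Fetch and delete the request with the smallest key, or return None.

    `s3` is expected to behave like a boto3 S3 client.
    """
    # Listings are returned in ascending key order, so MaxKeys=1
    # yields only the smallest key currently in the bucket.
    listing = s3.list_objects_v2(Bucket=bucket, MaxKeys=1)
    contents = listing.get("Contents", [])
    if not contents:
        return None  # nothing to process right now
    key = contents[0]["Key"]
    body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
    # Delete before processing, in the order the assignment describes.
    s3.delete_object(Bucket=bucket, Key=key)
    return body
```

Passing the client in as a parameter, instead of creating it inside the function, makes it possible to substitute a fake client in unit tests.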

If there is no available request, then the Consumer should wait for a little while (e.g., 100ms) before trying again. In other words, the reads should be done in a typical polling loop:

Loop until some stop condition met
    Try to get request
    If got request
        Process request
    Else
        Wait a while (100ms)
End loop
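
The loop above could be sketched in Python as follows. The function name poll and its three callable parameters are assumptions chosen for illustration; the point is only that retrieval, processing, and the stop condition are passed in, which keeps the loop itself easy to test.

```python
import time
from typing import Callable, Optional


def poll(get_request: Callable[[], Optional[bytes]],
         process: Callable[[bytes], None],
         should_stop: Callable[[], bool],
         wait_seconds: float = 0.1) -> None:
    """Run the polling loop until should_stop() returns True."""
    while not should_stop():
        request = get_request()
        if request is not None:
            process(request)          # go straight back for the next request
        else:
            time.sleep(wait_seconds)  # nothing available; wait about 100 ms
```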

The Widget Requests are in a JSON format according to the schema given in Appendix A. This schema is available as a download with the assignment if needed. Some sample Widget Requests are also available for download with the assignment. Note: Do not try to write your own JSON parser. Please find an appropriate library that can parse JSON text into objects, if your development stack does not already include one. There are multiple open-source JSON parsers for most common languages.
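
For instance, in Python the standard-library json module can do the parsing. The sketch below maps the properties named in Appendix A onto a dataclass; the class name WidgetRequest, the function name parse_request, and the snake_case field names are assumptions, not part of the assignment.

```python
import json
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class WidgetRequest:
    type: str
    request_id: str
    widget_id: str
    owner: str
    label: Optional[str] = None
    description: Optional[str] = None
    other_attributes: Dict[str, str] = field(default_factory=dict)


def parse_request(text: str) -> WidgetRequest:
    """Parse one Widget Request; raises KeyError if a required property is missing."""
    raw = json.loads(text)
    return WidgetRequest(
        type=raw["type"],
        request_id=raw["requestId"],
        widget_id=raw["widgetId"],
        owner=raw["owner"],
        label=raw.get("label"),
        description=raw.get("description"),
        # Flatten the [{"name": ..., "value": ...}] array into a plain dict.
        other_attributes={a["name"]: a["value"]
                          for a in raw.get("otherAttributes", [])},
    )
```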

Requests come in three flavors (create, update, and delete):

Widget Create Request

When the Consumer processes a Widget Create Request, it will create the specified widget and store it in either Bucket 3 or the DynamoDB table.

Widget Delete Request

When the Consumer processes a Widget Delete Request, it needs to make sure that the specified object does not exist. If it does not currently exist, the Consumer should not throw an error. Instead, it should simply log a warning and move on to the next request.

Note: Although you need to consider handling delete requests in your design, you do not need to implement this requirement until HW3.

Widget Update Request

When the Consumer processes a Widget Update Request, it will first retrieve the specified widget and then change all the attributes mentioned in the request. If a property is not included in an update request or if its value is null, its current value should not be changed. If the value of a property in an update request is the empty string, the corresponding property of the widget should be set to null or, if it is one of the other attributes, deleted. A widget’s id and owner cannot be changed. If the specified widget does not exist, the Consumer should not throw an error. Instead, it should simply log a warning.
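
The update rules can be illustrated with a small Python sketch. It assumes, purely for illustration, that both the stored widget and the requested changes are represented as flat name-to-value dictionaries, and that the core properties are named widget_id, owner, label, and description; apply_update is a hypothetical name.

```python
from typing import Any, Dict, Optional


def apply_update(widget: Dict[str, Any],
                 changes: Dict[str, Optional[str]]) -> Dict[str, Any]:
    """Return a copy of `widget` with `changes` applied.

    Both arguments are flat name -> value dicts (an assumed representation).
    """
    core = {"label", "description"}      # set to None when cleared
    protected = {"widget_id", "owner"}   # can never be changed
    updated = dict(widget)
    for name, value in changes.items():
        if name in protected or value is None:
            continue                     # null or protected: keep current value
        if value == "":
            if name in core:
                updated[name] = None     # core property: set to null
            else:
                updated.pop(name, None)  # other attribute: delete it
        else:
            updated[name] = value
    return updated
```

Returning a modified copy rather than mutating the stored widget keeps the function free of side effects, which makes each rule straightforward to check in a unit test.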

Note: Although you need to consider handling update requests in your design, you do not need to implement this requirement until HW3.

As mentioned above, your HW2 implementation only needs to process Widget Create Requests. However, your design should anticipate the handling of Widget Delete and Update Requests in the near future.

A widget needs to contain all the data found in a Widget Create Request. When a Widget needs to be stored in Bucket 3, you should serialize it into a JSON string and store that string data. Its key should be based on the following pattern:

widgets/{owner}/{widget id}

where {owner} is derived from the widget’s owner and {widget id} is derived from the widget’s id. The {owner} part of the key should be computed from the owner property by replacing spaces with dashes and converting the whole string to lower case.
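
A short Python sketch of the key computation; the function name widget_key is hypothetical.

```python
def widget_key(owner: str, widget_id: str) -> str:
    """Build the Bucket 3 object key: widgets/{owner}/{widget id}."""
    # Replace spaces with dashes, then lower-case the whole owner string.
    owner_part = owner.replace(" ", "-").lower()
    return f"widgets/{owner_part}/{widget_id}"
```

For example, widget_key("Mary Matthews", "w-42") returns "widgets/mary-matthews/w-42".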

When a widget needs to be stored in the DynamoDB table, place every widget attribute in the request in its own attribute in the DynamoDB object. In other words, in addition to the widget_id, owner, label, and description, all the properties listed in the otherAttributes property need to be stored as attributes in the DynamoDB object and not as a single map or list.
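
The flattening could look like the following Python sketch, which takes an already-parsed request dictionary. The function name to_dynamodb_item is hypothetical, and the attribute name widget_id is taken from the paragraph above; check it against the key actually defined on your table.

```python
from typing import Any, Dict


def to_dynamodb_item(request: Dict[str, Any]) -> Dict[str, str]:
    """Flatten a parsed Widget Create Request into one DynamoDB item."""
    item = {"widget_id": request["widgetId"], "owner": request["owner"]}
    for name in ("label", "description"):
        if request.get(name) is not None:
            item[name] = request[name]
    # Each entry of otherAttributes becomes its own top-level attribute.
    for attr in request.get("otherAttributes", []):
        item[attr["name"]] = attr["value"]
    return item
```

With boto3, a dictionary like this can be passed as the Item argument of a Table resource's put_item call.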

In a future homework assignment, requests will come from a source other than an S3 bucket, like a queue. Therefore, design your Consumer so the logic for retrieving requests can be easily swapped out at runtime with a different algorithm. At a minimum, encapsulate the request retrieval logic into a method or function, but try to go a step further in ensuring low coupling, high cohesion, and good modularization. Consider using a Strategy design pattern to solve this design problem.
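
One way the Strategy pattern might be applied here is sketched below in Python. All names (RequestSource, InMemoryRequestSource, drain) are hypothetical. An S3-backed strategy would implement the same next_request method by wrapping the SDK's list, get, and delete calls, and a queue-backed one could be added later without touching the code that consumes requests.

```python
from abc import ABC, abstractmethod
from collections import deque
from typing import Deque, Iterable, List, Optional


class RequestSource(ABC):
    """Strategy interface: where the Consumer gets its next request."""

    @abstractmethod
    def next_request(self) -> Optional[bytes]:
        """Return the next raw request, or None if none is available."""


class InMemoryRequestSource(RequestSource):
    """A concrete strategy backed by a local queue; handy in unit tests."""

    def __init__(self, requests: Iterable[bytes]) -> None:
        self._pending: Deque[bytes] = deque(requests)

    def next_request(self) -> Optional[bytes]:
        return self._pending.popleft() if self._pending else None


def drain(source: RequestSource) -> List[bytes]:
    """Consume requests until the source is empty; works with any strategy."""
    handled = []
    while (request := source.next_request()) is not None:
        handled.append(request)
    return handled
```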

Finally, you need to describe your design in a simple document. This document could consist of a set of meaningful UML diagrams or other kinds of diagrams. Being able to communicate your designs clearly and concisely is an important skill and one that needs regular practice. For this assignment, imagine that you need to explain your design to a technical peer or a supervisor and that the level of detail needs to be sufficient for that person to implement your design.

It does not need to contain pseudocode or describe the body of trivial methods or functions. Instead, it needs to communicate the structure of the software components, their relationships with each other, and the flow of control and/or data. UML Class Diagrams and Interaction Diagrams can do this for object-oriented designs. Structure Charts and Flow Charts can do this for structured designs. If necessary, brush up on these modeling languages. These or similar modeling languages should have been covered in CS3450, which is in the prerequisite chain for this class.

The Canvas website provides downloads for two sample designs: an object-oriented design and a structured design. These are just sample designs to stimulate your thinking. They are by no means the only possible designs and they lack some details. You are free to adapt either of these designs to suit your purposes or you may develop one from scratch. In the end, you must take ownership of and responsibility for your design, and your design document must represent an accurate specification for your program.

You must submit a PDF of your design document with your assignment.

Step 5 – Implement components of the Consumer program with test cases
In this step, you need to implement your design and test each component (e.g., class) using executable unit test cases.

It is important to note that testing is not an optional part of this assignment. It is a critical part of any software engineering project in industry and should be part of any software programming assignment in academia. In fact, the department established this as a goal in response to the Industry Advisory Committee’s overwhelming request to do so.

Your unit test cases for this assignment should provide a reasonable level of confidence that each component (class, method, or function) is working correctly. Note that doing unit testing with executable unit test cases should have been covered in CS1440, CS1410, CS2420, and CS3450, all of which are in the prerequisite chain for this class. If you are not familiar with unit testing and the testing tools for your chosen environment, you may need to do some extra reading on this subject. Also, the TA and instructor can provide some help in this area if needed.

If you find that a method or function is hard to test, then it is very likely that it is trying to do too much. Break it up into smaller methods or functions so each one focuses on doing one thing. This is consistent with the principles of good modularization and the “Single Responsibility Principle” of SOLID. Again, these are concepts that should have been covered in CS3450.

Finally, your Consumer should also produce a log file that records what it does. There are many open-source logging libraries available for every popular programming language. Choose a good one and use it. Don’t try to do all the logging by brute-force file I/O.
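
In Python, for example, the standard-library logging module already covers this need. The sketch below is one possible setup; the function name make_logger, the logger name "consumer", and the record format are all assumptions.

```python
import logging


def make_logger(path: str) -> logging.Logger:
    """Create a logger that appends timestamped records to `path`."""
    logger = logging.getLogger("consumer")
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.FileHandler(path)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger
```

A call such as make_logger("consumer.log").info("processed request %s", request_id) would then append one timestamped line to the file, where request_id stands for whatever identifier you choose to record.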

Step 6 – Complete some system testing
Once you have completed your implementation and unit testing, do some ad hoc system testing by running the instructor-provided Producer with your Consumer. You may run the Producer on Host A or on your own machine. If you have set up the AWS credentials file for the AWS CLI properly, this should work just fine because the Producer will use this file when it tries to create an S3 client or DynamoDB client.

To see the possible parameters and examples, execute the following:

java -jar producer.jar --help

Similarly, you can run your Consumer on Host B or on your own machine. During debugging, it will be considerably easier to run it on your own machine, and probably directly in your IDE.

Keep logs from Producer and Consumer for at least one successful run and submit them as part of the homework assignment.

Step 6 – Commit your work to a git repository
Manage all your project’s artifacts with Git, including project files, build instructions, etc. The only things you don’t need to keep in the Git repository are artifacts that are generated during the build process or at runtime (e.g., the log files, requests, and Widgets). Commit to your Git repository frequently. Your Git commit log will be examined during the review and must show a meaningful workflow. A single commit when you are all finished is not acceptable.

Also, you will need to take a snapshot of your Git repository’s commit log and submit it with your assignment.

Step 7 – Complete a Design-and-Code Review
Schedule and complete a 10-minute online design-and-code review with either the instructor or TA. An online signup sheet will be posted to help you schedule a time. If you can’t find a time slot in the signup sheet that works for you, simply email the instructor AND the TA with 3 possible times that work for you. Either the instructor or the TA will respond to your email and give you a time.

Before the review, submit your work artifacts to Canvas (see Submission instructions below). The submission to Canvas must be made before the due date to avoid being counted as late. The review can be after the due date.

During the review, you should do the following:

• Talk through your design (2 minutes)

• Walk through the key parts of your code (4 minutes)

• Walk through some of the most interesting test cases (4 minutes)

You do not need to execute your system unless the instructor or TA asks you to do so. So, be prepared to compile and run your system if requested.

10 minutes is not a long time, so you must be prepared and efficient with your walkthrough.

Hints and Other Thoughts

Apply SE principles. Follow the principles of Abstraction, Modularization, and Encapsulation (AME), taught in CS1440 and CS3450. The AME principles encompass SOLID principles, which focus primarily on achieving good modularity. Both the AME and the SOLID principles involve creating a modular design with highly cohesive and loosely coupled components. Doing this will make testing and debugging much easier. If you try to build this (or any non-trivial) system as one large component, you will have an extremely tough time trying to test it. In fact, if you ever find that a component is hard to test, stop and re-think your design. Students who really struggle with testing do so because their designs do not follow either the AME or SOLID principles. So, be wise and take some time to think through your design, ensuring that your components are highly cohesive and loosely coupled.

Use patterns. Consider making use of common patterns, like the Strategy, Adapter, Template Method, and Factory patterns, in your design. Some of these patterns should have been covered in CS1440 and CS3450. You will not be docked any points for not using design patterns, but you will probably find your design much easier to implement and test if you do make use of them. Patterns are a way for you to leverage tried and proven solutions to recurring design problems. Get to know them and practice using them.

Review how to create executable test cases. The use of executable test cases should have been covered in CS1410, CS1440, CS2420, and any other class that requires programming. If you are uncertain of how to set up and program executable test cases in your chosen language for this assignment, do some searching online for testing tutorials. Virtually every modern programming language has at least one readily available testing framework and harness. The framework is a library of macros, objects, or classes that allow you to define test cases and mocks (if needed), and to compare expected results with observed results. The harness allows you to execute the test cases and often includes other features like coverage reports. For some development stacks, the testing framework and harness are combined into one tool.

Make sure the executable test cases are meaningful. Each test case should adhere to the following pattern:

1. Set up an initial state, if needed

2. Stimulate the thing to be tested (i.e., run the method or function with the desired parameters)

3. Compare expected results to observed results. Be sure to include comparisons for all relevant portions of the resultant state.

Not doing sufficient meaningful comparisons is the primary reason a test case might fail to provide real value.
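
The three-part pattern can be seen in the following Python sketch, which uses the standard-library unittest framework. InMemoryWidgetStore is a hypothetical component invented only to give the test something to exercise; it is not part of the assignment.

```python
import unittest


class InMemoryWidgetStore:
    """Hypothetical component under test: keeps widgets in a dict."""

    def __init__(self) -> None:
        self.widgets = {}

    def create(self, widget_id: str, owner: str) -> None:
        self.widgets[widget_id] = {"widget_id": widget_id, "owner": owner}


class CreateWidgetTest(unittest.TestCase):
    def test_create_adds_exactly_one_widget(self):
        # 1. Set up an initial state.
        store = InMemoryWidgetStore()
        store.create("w1", "Ann")
        # 2. Stimulate the thing to be tested.
        store.create("w2", "Bob")
        # 3. Compare expected and observed results, covering the whole
        #    resultant state: the new widget and the untouched old one.
        self.assertEqual(len(store.widgets), 2)
        self.assertEqual(store.widgets["w2"],
                         {"widget_id": "w2", "owner": "Bob"})
        self.assertEqual(store.widgets["w1"],
                         {"widget_id": "w1", "owner": "Ann"})
```

Checking only that "w2" exists would let a bug that overwrote or removed "w1" slip through, which is why the comparisons cover the entire resulting state.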

Use Path and Input Validation Testing Techniques. Reasonable coverage of your code can be achieved by using a combination of Path Testing and Input Validation Testing. With Path Testing, each test case exercises a different path through a target method or function. A thorough suite for a target method or function is one that ensures that every statement is executed at least once, every conditional statement is tried for each possible outcome, each loop boundary is checked, and the throwing of each possible exception is exercised.

With Input Validation Testing, the input domain for a method or function (which can include the state of an associated object or even the state of the whole system) is partitioned into meaningful subsets. Reasonable coverage involves testing an example input from each subset. For example, to test a method that takes a “Long” integer as a parameter and does not rely on anything else, the input domain is the set of all long integers. Meaningful partitioning of this domain would be {{Positive Long integers}, {0}, {Negative Long integers}, {the Minimum Long Integer value}, {the Maximum Long Integer value}, and {null or undefined}}.
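
As an illustration of that partitioning, the Python sketch below tests a made-up function, sign, with one representative input per subset. Python integers are unbounded, so the 64-bit limits are modeled with the assumed constants LONG_MIN and LONG_MAX; all names here are hypothetical.

```python
from typing import Optional

LONG_MIN, LONG_MAX = -2**63, 2**63 - 1   # 64-bit signed range


def sign(n: Optional[int]) -> int:
    """Hypothetical function under test: -1, 0, or 1 for a 64-bit integer."""
    if n is None:
        raise ValueError("n must not be None")
    if not LONG_MIN <= n <= LONG_MAX:
        raise OverflowError("n is outside the 64-bit range")
    return (n > 0) - (n < 0)


# One representative input per partition, paired with the expected result.
PARTITION_CASES = [
    (17, 1),          # positive
    (0, 0),           # zero
    (-17, -1),        # negative
    (LONG_MIN, -1),   # minimum value
    (LONG_MAX, 1),    # maximum value
]


def run_partition_cases() -> None:
    """Check every partition, including the null/undefined one."""
    for value, expected in PARTITION_CASES:
        assert sign(value) == expected, f"sign({value})"
    try:
        sign(None)    # the null partition must be rejected
    except ValueError:
        pass
    else:
        raise AssertionError("sign(None) should raise ValueError")
```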

Keep it simple. Don’t over-design, i.e., do not invent requirements or try to anticipate future requirements except those that are explicitly mentioned. Also, don’t over-implement, i.e., do not build components that have no purpose relative to your design. Use open-source software where possible, e.g., for logging.

Be flexible. If you get into the coding and testing and discover a problem with your design, go back and fix it.

Don’t cheat. Don’t even think about decompiling and copying any code that has been provided with this or any previous assignment. There are some idiosyncrasies that I will recognize. I don’t expect your design to be the same as mine. Also, do not copy each other’s designs or code. Do your own thing. That’s the only way you will learn.

You may use publicly available design ideas (such as design patterns) and code snippets. However, if you include anything that is not your creation, you must give credit to the source; otherwise, it is considered plagiarism. Plagiarism and all forms of cheating will be severely penalized and reported.

Submission

Your submission must include the following:

• All screen snapshots mentioned in the above steps.

• A document that clearly and concisely communicates the design of your Consumer program.

• An archive file of your entire project (including hidden project files and build instructions, but not necessarily any compiled artifacts).

• The log files from at least one successful system test.

Grading Criteria

Criterion                                                          Max Points

A document that clearly and concisely communicates the design
of your Consumer program                                           20

A working and tested implementation of the Consumer program
that processes Widget Create Requests                              50

Log files that demonstrate at least one successful system test     20

Screen snapshot(s) that show meaningful commits to a git
repository                                                         10

Deductions (applied on top of scores given for the above):         -200 to 0

• Git not used or commits not done frequently

• Failure to complete a design and code review (-50)

• Late (-20 points per day up to -50 points)

• Cheating, e.g., copying someone else’s work. This will be an automatic -200 (i.e., a negative twice the maximum points).

Max Points                                                         100

Appendix A

Below is the JSON schema for Widget Requests. Note that the owner property can be any string of upper- and lower-case letters and spaces.

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "type": "object",
  "properties": {
    "type": {
      "type": "string",
      "pattern": "create|delete|update"
    },
    "requestId": {
      "type": "string"
    },
    "widgetId": {
      "type": "string"
    },
    "owner": {
      "type": "string",
      "pattern": "[A-Za-z ]+"
    },
    "label": {
      "type": "string"
    },
    "description": {
      "type": "string"
    },
    "otherAttributes": {
      "type": "array",
      "items": [
        {
          "type": "object",
          "properties": {
            "name": {
              "type": "string"
            },
            "value": {
              "type": "string"
            }
          },
          "required": [
            "name",
            "value"
          ]
        }
      ]
    }
  },
  "required": [
    "type",
    "requestId",
    "widgetId",
    "owner"
  ]
}

Appendix B