Objective. The purpose of this assignment [1] is to create a symbol table data type whose keys are two-dimensional points. We'll use a 2d-tree to support efficient range search (find all the points contained in a query rectangle) and k-nearest neighbor search (find k points that are closest to a query point). 2d-trees have numerous applications, ranging from classifying astronomical objects to computer animation to speeding up neural networks to mining data to image retrieval.
Geometric Primitives. Use the immutable data type Point2D for points in the plane.
Here is the subset of its API that you may use:
public class Point2D implements Comparable<Point2D>
    // Construct the point (x, y).
    Point2D(double x, double y)
    // x-coordinate.
    double x()
    // y-coordinate.
    double y()
    // Square of Euclidean distance between this point and that.
    double distanceSquaredTo(Point2D that)
    // For use in an ordered symbol table.
    int compareTo(Point2D that)
    // Compares two points by distance to this point.
    Comparator<Point2D> DISTANCE_TO_ORDER
    // Does this point equal that object?
    boolean equals(Object that)
    // String representation.
    String toString()
Use the immutable data type RectHV for axis-aligned rectangles. Here is the subset of its API that you may use:
public class RectHV
    // Construct the rectangle [xmin, xmax] x [ymin, ymax].
    RectHV(double xmin, double ymin, double xmax, double ymax)
    // Minimum x-coordinate of rectangle.
    double xmin()
    // Minimum y-coordinate of rectangle.
    double ymin()
    // Maximum x-coordinate of rectangle.
    double xmax()
    // Maximum y-coordinate of rectangle.
    double ymax()
    // Does this rectangle contain the point p (either inside or on boundary)?
    boolean contains(Point2D p)
    // Does this rectangle intersect that rectangle (at one or more points)?
    boolean intersects(RectHV that)
    // Square of Euclidean distance from point p to closest point in rectangle.
    double distanceSquaredTo(Point2D p)
    // Does this rectangle equal that object?
    boolean equals(Object that)
    // String representation.
    String toString()
You are not allowed to modify the Point2D and RectHV types.
[1] Adapted from www.cs.princeton.edu/courses/archive/spring15/cos226/assignments/kdtree.html.
Spring 2015 1 of 8
CS210 Project 4 (Kd-Trees) Swami Iyer
Symbol Table API. Here is the Java interface representing the API for the symbol table data type whose keys are two-dimensional points (represented as Point2D objects):
public interface ST<Value>
    // Return true if the symbol table is empty, and false otherwise.
    boolean isEmpty()
    // Return the number of points in the symbol table.
    int size()
    // Associate the value val with point p.
    void put(Point2D p, Value value)
    // Return the value associated with point p.
    Value get(Point2D p)
    // Return true if the symbol table contains the point p, and false otherwise.
    boolean contains(Point2D p)
    // Return all points in the symbol table.
    Iterable<Point2D> points()
    // Return all points in the symbol table that are inside the rectangle rect.
    Iterable<Point2D> range(RectHV rect)
    // Return a nearest neighbor to point p; null if the symbol table is empty.
    Point2D nearest(Point2D p)
    // Return k points that are closest to point p.
    Iterable<Point2D> nearest(Point2D p, int k)
Problem 1. (Brute-force Implementation) Write a mutable data type PointST that implements the above interface by using a red-black BST (use RedBlackBST that is provided). Your implementation should support put(), get(), and contains() in time proportional to the logarithm of the number of points in the symbol table in the worst case; it should support points(), range(), and nearest() in time proportional to the number of points in the symbol table.
$ java PointST < input10K.txt
st.empty()? false
st.size() = 10000
First five values:
  3380
  1585
  8903
  4168
  5971
  7265
st.contains((0.661633, 0.287141))? true
st.contains((0.0, 0.0))? false
st.range([0.65, 0.68] x [0.28, 0.29]):
  (0.663908, 0.285337)
  (0.661633, 0.287141)
  (0.671793, 0.288608)
st.nearest((0.661633, 0.287141)) = (0.663908, 0.285337)
st.nearest((0.661633, 0.287141)):
  (0.663908, 0.285337)
  (0.658329, 0.290039)
  (0.671793, 0.288608)
  (0.65471, 0.276885)
  (0.668229, 0.276482)
  (0.653311, 0.277389)
  (0.646629, 0.288799)
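To make the brute-force idea concrete, here is a minimal sketch, not a reference solution. It assumes java.util.TreeMap (itself a red-black tree) as a stand-in for the provided RedBlackBST, and a hypothetical Pt class in place of Point2D; the class and method names are illustrative only. Lookups go through the balanced tree, while range() and nearest() simply scan every key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Brute-force sketch. TreeMap (a red-black tree) stands in for RedBlackBST,
// and Pt is a hypothetical stand-in for Point2D.
final class BruteForceSketch<Value> {
    static final class Pt implements Comparable<Pt> {
        final double x, y;
        Pt(double x, double y) { this.x = x; this.y = y; }

        double distanceSquaredTo(Pt that) {
            double dx = x - that.x, dy = y - that.y;
            return dx * dx + dy * dy;
        }

        // Any total order works for the BST; this one compares y first, then x.
        public int compareTo(Pt that) {
            int c = Double.compare(y, that.y);
            return c != 0 ? c : Double.compare(x, that.x);
        }
    }

    private final TreeMap<Pt, Value> bst = new TreeMap<>();

    // Logarithmic-time operations delegated to the balanced tree.
    void put(Pt p, Value value) { bst.put(p, value); }
    Value get(Pt p) { return bst.get(p); }
    boolean contains(Pt p) { return bst.containsKey(p); }

    // Linear scan: keep every key inside [xmin, xmax] x [ymin, ymax].
    List<Pt> range(double xmin, double ymin, double xmax, double ymax) {
        List<Pt> hits = new ArrayList<>();
        for (Pt p : bst.keySet())
            if (p.x >= xmin && p.x <= xmax && p.y >= ymin && p.y <= ymax) hits.add(p);
        return hits;
    }

    // Linear scan: remember the key with the smallest squared distance to q.
    Pt nearest(Pt q) {
        Pt best = null;
        for (Pt p : bst.keySet())
            if (best == null || p.distanceSquaredTo(q) < best.distanceSquaredTo(q)) best = p;
        return best;
    }
}
```

Note that the sample run above reports a neighbor different from the query point even though the query point is in the table; this sketch returns the query point itself in that case and makes no attempt to reproduce that behavior.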
Problem 2. (2d-tree Implementation) Write a mutable data type KdTreeST that uses a 2d-tree to implement the above symbol table API. A 2d-tree is a generalization of a BST to two-dimensional keys. The idea is to build a BST with points in the nodes, using the x- and y-coordinates of the points as keys in strictly alternating sequence, starting with the x-coordinates.
Search and insert. The algorithms for search and insert are similar to those for BSTs, but at the root we use the x-coordinate (if the point to be inserted has a smaller x-coordinate than the point at the root, go left; otherwise go right); then at the next level, we use the y-coordinate (if the point to be inserted has a smaller y-coordinate than the point in the node, go left; otherwise go right); then at the next level the x-coordinate, and so forth.
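The alternating comparison can be sketched as follows. This is an illustration under assumptions, not the required KdTreeST: it takes raw coordinates instead of Point2D, and the names KdInsertSketch, Node, lb, rt, and useX are hypothetical. A boolean flag that flips at every level decides whether x or y is compared.

```java
// Sketch of 2d-tree search and insert; all names are hypothetical.
final class KdInsertSketch<Value> {
    private final class Node {
        final double x, y;
        Value val;
        Node lb, rt;   // left/bottom and right/top subtrees
        Node(double x, double y, Value val) { this.x = x; this.y = y; this.val = val; }
    }

    private Node root;
    private int n;

    int size() { return n; }

    void put(double x, double y, Value val) { root = put(root, x, y, val, true); }

    // useX is true on levels that compare x-coordinates and flips at each level.
    private Node put(Node h, double x, double y, Value val, boolean useX) {
        if (h == null) { n++; return new Node(x, y, val); }
        if (h.x == x && h.y == y) { h.val = val; return h; }   // same point: update value
        boolean goLeft = useX ? x < h.x : y < h.y;               // smaller goes left, otherwise right
        if (goLeft) h.lb = put(h.lb, x, y, val, !useX);
        else        h.rt = put(h.rt, x, y, val, !useX);
        return h;
    }

    // Search follows the same path that insert would take.
    Value get(double x, double y) {
        Node h = root;
        boolean useX = true;
        while (h != null) {
            if (h.x == x && h.y == y) return h.val;
            boolean goLeft = useX ? x < h.x : y < h.y;
            h = goLeft ? h.lb : h.rt;
            useX = !useX;
        }
        return null;
    }
}
```

Passing the orientation as a parameter avoids storing it in each node; storing it in the node would work equally well.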
Level-order traversal. The points() method should return the points in level-order: first the root, then all children of the root (from left/bottom to right/top), then all grandchildren of the root (from left to right), and so forth. The level-order traversal of the 2d-tree above is (0.7, 0.2), (0.5, 0.4), (0.9, 0.6), (0.2, 0.3), (0.4, 0.7).
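A level-order traversal is a breadth-first walk driven by a FIFO queue. The sketch below is illustrative only: LevelOrderSketch, Node, insert, and levelOrder are hypothetical names, points are raw coordinates rather than Point2D, and the result is returned as strings for easy inspection.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Sketch of level-order traversal of a 2d-tree; all names are hypothetical.
final class LevelOrderSketch {
    static final class Node {
        final double x, y;
        Node lb, rt;   // left/bottom and right/top subtrees
        Node(double x, double y) { this.x = x; this.y = y; }
    }

    // Standard 2d-tree insert: compare x at even depth, y at odd depth.
    static Node insert(Node h, double x, double y, boolean useX) {
        if (h == null) return new Node(x, y);
        if (h.x == x && h.y == y) return h;
        if (useX ? x < h.x : y < h.y) h.lb = insert(h.lb, x, y, !useX);
        else                          h.rt = insert(h.rt, x, y, !useX);
        return h;
    }

    // Visit nodes in FIFO order, enqueueing the left/bottom child before the right/top child.
    static List<String> levelOrder(Node root) {
        List<String> out = new ArrayList<>();
        Queue<Node> queue = new ArrayDeque<>();
        if (root != null) queue.add(root);
        while (!queue.isEmpty()) {
            Node h = queue.remove();
            out.add("(" + h.x + ", " + h.y + ")");
            if (h.lb != null) queue.add(h.lb);
            if (h.rt != null) queue.add(h.rt);
        }
        return out;
    }
}
```

Assuming the five points are inserted in the order (0.7, 0.2), (0.5, 0.4), (0.2, 0.3), (0.4, 0.7), (0.9, 0.6), which is an assumption since the handout does not state the insertion order, levelOrder returns them in the level-order sequence quoted above.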
The prime advantage of a 2d-tree over a BST is that it supports efficient implementation of range search, nearest neighbor, and k-nearest neighbor search. Each node corresponds to an axis-aligned rectangle, which encloses all of the points in its subtree. The root corresponds to the infinitely large square from [(-∞, -∞), (+∞, +∞)]; the left and right children of the root correspond to the two rectangles split by the x-coordinate of the point at the root; and so forth.
Range search. To find all points contained in a given query rectangle, start at the root and recursively search for points in both subtrees using the following pruning rule: if the query rectangle does not intersect the rectangle corresponding to a node, there is no need to explore that node (or its subtrees). That is, you should search a subtree only if it might contain a point contained in the query rectangle.
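The pruning rule can be sketched as below. This is a minimal illustration under assumptions: rectangles are plain double arrays {xmin, ymin, xmax, ymax} rather than RectHV objects, each node's rectangle is recomputed on the way down instead of being stored, and every name (RangeSearchSketch, cell, intersects, and so on) is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of pruned range search; rectangles are {xmin, ymin, xmax, ymax}. Names are hypothetical.
final class RangeSearchSketch {
    static final class Node {
        final double x, y;
        Node lb, rt;
        Node(double x, double y) { this.x = x; this.y = y; }
    }

    static Node insert(Node h, double x, double y, boolean useX) {
        if (h == null) return new Node(x, y);
        if (h.x == x && h.y == y) return h;
        if (useX ? x < h.x : y < h.y) h.lb = insert(h.lb, x, y, !useX);
        else                          h.rt = insert(h.rt, x, y, !useX);
        return h;
    }

    // Closed rectangles intersect if they overlap in both x and y (touching counts).
    static boolean intersects(double[] a, double[] b) {
        return a[2] >= b[0] && a[0] <= b[2] && a[3] >= b[1] && a[1] <= b[3];
    }

    static List<double[]> range(Node root, double[] query) {
        List<double[]> hits = new ArrayList<>();
        double inf = Double.POSITIVE_INFINITY;
        range(root, new double[] {-inf, -inf, inf, inf}, query, true, hits);
        return hits;
    }

    private static void range(Node h, double[] cell, double[] q, boolean useX, List<double[]> hits) {
        if (h == null || !intersects(cell, q)) return;   // pruning rule: skip the whole subtree
        if (h.x >= q[0] && h.x <= q[2] && h.y >= q[1] && h.y <= q[3])
            hits.add(new double[] {h.x, h.y});
        // Split this node's rectangle along its coordinate to get the children's rectangles.
        double[] lo = cell.clone(), hi = cell.clone();
        if (useX) { lo[2] = h.x; hi[0] = h.x; }
        else      { lo[3] = h.y; hi[1] = h.y; }
        range(h.lb, lo, q, !useX, hits);
        range(h.rt, hi, q, !useX, hits);
    }
}
```

Storing the rectangle in each node at insertion time is an alternative that trades memory for less work per query.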
Nearest neighbor search. To find a closest point to a given query point, start at the root and recursively search in both subtrees using the following pruning rule: if the closest point discovered so far is closer than the distance between the query point and the rectangle corresponding to a node, there is no need to explore that node (or its subtrees). That is, you should search a node only if it might contain a point that is closer than the best one found so far. The effectiveness of the pruning rule depends on quickly finding a nearby point. To do this, organize your recursive method so that when there are two possible subtrees to go down, you choose first the subtree that is on the same side of the splitting line as the query point; the closest point found while exploring the first subtree may enable pruning of the second subtree.
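The following sketch shows one way the pruning rule and the same-side-first ordering might fit together. It is an illustration under assumptions: rectangles are double arrays {xmin, ymin, xmax, ymax} rather than RectHV objects, squared distances are compared throughout, and all names (NearestSketch, cell, distSqToCell, and so on) are hypothetical.

```java
// Sketch of pruned nearest-neighbor search; rectangles are {xmin, ymin, xmax, ymax}. Names are hypothetical.
final class NearestSketch {
    static final class Node {
        final double x, y;
        Node lb, rt;
        Node(double x, double y) { this.x = x; this.y = y; }
    }

    static Node insert(Node h, double x, double y, boolean useX) {
        if (h == null) return new Node(x, y);
        if (h.x == x && h.y == y) return h;
        if (useX ? x < h.x : y < h.y) h.lb = insert(h.lb, x, y, !useX);
        else                          h.rt = insert(h.rt, x, y, !useX);
        return h;
    }

    static double distSq(Node h, double qx, double qy) {
        double dx = h.x - qx, dy = h.y - qy;
        return dx * dx + dy * dy;
    }

    // Squared distance from the query to the closest point of the rectangle (0 if inside).
    static double distSqToCell(double[] cell, double qx, double qy) {
        double dx = Math.max(0.0, Math.max(cell[0] - qx, qx - cell[2]));
        double dy = Math.max(0.0, Math.max(cell[1] - qy, qy - cell[3]));
        return dx * dx + dy * dy;
    }

    // Returns {x, y} of a nearest point, or null if the tree is empty.
    static double[] nearest(Node root, double qx, double qy) {
        double inf = Double.POSITIVE_INFINITY;
        Node best = nearest(root, new double[] {-inf, -inf, inf, inf}, qx, qy, true, null);
        return best == null ? null : new double[] {best.x, best.y};
    }

    private static Node nearest(Node h, double[] cell, double qx, double qy, boolean useX, Node best) {
        if (h == null) return best;
        // Pruning rule: the best point so far is closer than anything in this rectangle can be.
        if (best != null && distSq(best, qx, qy) < distSqToCell(cell, qx, qy)) return best;
        if (best == null || distSq(h, qx, qy) < distSq(best, qx, qy)) best = h;
        double[] lo = cell.clone(), hi = cell.clone();
        if (useX) { lo[2] = h.x; hi[0] = h.x; }
        else      { lo[3] = h.y; hi[1] = h.y; }
        // Explore the subtree on the query point's side of the splitting line first.
        boolean queryInLo = useX ? qx < h.x : qy < h.y;
        if (queryInLo) {
            best = nearest(h.lb, lo, qx, qy, !useX, best);
            best = nearest(h.rt, hi, qx, qy, !useX, best);
        } else {
            best = nearest(h.rt, hi, qx, qy, !useX, best);
            best = nearest(h.lb, lo, qx, qy, !useX, best);
        }
        return best;
    }
}
```

Comparing squared distances avoids a square root and does not change which point is closest.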
k-nearest neighbor search. Use the technique from kd-tree nearest neighbor search described above.
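One possible way to extend the nearest-neighbor technique to k neighbors is to keep the k best candidates in a bounded max-priority queue and prune against the farthest of them. The handout does not prescribe this data structure, so treat it as an assumption; the sketch uses java.util.PriorityQueue, double-array rectangles {xmin, ymin, xmax, ymax}, and hypothetical names throughout.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Sketch of k-nearest search with a bounded max-heap; all names are hypothetical.
final class KNearestSketch {
    static final class Node {
        final double x, y;
        Node lb, rt;
        Node(double x, double y) { this.x = x; this.y = y; }
    }

    static Node insert(Node h, double x, double y, boolean useX) {
        if (h == null) return new Node(x, y);
        if (h.x == x && h.y == y) return h;
        if (useX ? x < h.x : y < h.y) h.lb = insert(h.lb, x, y, !useX);
        else                          h.rt = insert(h.rt, x, y, !useX);
        return h;
    }

    static double distSq(Node h, double qx, double qy) {
        double dx = h.x - qx, dy = h.y - qy;
        return dx * dx + dy * dy;
    }

    static double distSqToCell(double[] cell, double qx, double qy) {
        double dx = Math.max(0.0, Math.max(cell[0] - qx, qx - cell[2]));
        double dy = Math.max(0.0, Math.max(cell[1] - qy, qy - cell[3]));
        return dx * dx + dy * dy;
    }

    // Returns up to k points as {x, y}, closest first.
    static List<double[]> nearest(Node root, double qx, double qy, int k) {
        List<double[]> out = new ArrayList<>();
        if (k <= 0) return out;
        // Max-heap on squared distance: the head is the farthest current candidate.
        PriorityQueue<Node> heap = new PriorityQueue<>(
            Comparator.comparingDouble((Node n) -> distSq(n, qx, qy)).reversed());
        double inf = Double.POSITIVE_INFINITY;
        collect(root, new double[] {-inf, -inf, inf, inf}, qx, qy, k, true, heap);
        while (!heap.isEmpty()) {
            Node n = heap.remove();                    // farthest remaining candidate
            out.add(0, new double[] {n.x, n.y});       // prepend so the closest ends up first
        }
        return out;
    }

    private static void collect(Node h, double[] cell, double qx, double qy, int k,
                                boolean useX, PriorityQueue<Node> heap) {
        if (h == null) return;
        // Prune only once k candidates exist and even the farthest beats this rectangle.
        if (heap.size() == k && distSq(heap.peek(), qx, qy) < distSqToCell(cell, qx, qy)) return;
        heap.add(h);
        if (heap.size() > k) heap.remove();            // drop the farthest candidate
        double[] lo = cell.clone(), hi = cell.clone();
        if (useX) { lo[2] = h.x; hi[0] = h.x; }
        else      { lo[3] = h.y; hi[1] = h.y; }
        boolean queryInLo = useX ? qx < h.x : qy < h.y;
        collect(queryInLo ? h.lb : h.rt, queryInLo ? lo : hi, qx, qy, k, !useX, heap);
        collect(queryInLo ? h.rt : h.lb, queryInLo ? hi : lo, qx, qy, k, !useX, heap);
    }
}
```

With k = 1 this reduces to the single nearest-neighbor search; for larger k the pruning becomes effective only after the heap has filled.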
$ java KdTreeST < input10K.txt
st.empty()? false
st.size() = 10000
First five values:
  0
  2
  1
  4
  3
  62
st.contains((0.661633, 0.287141))? true
st.contains((0.0, 0.0))? false
st.range([0.65, 0.68] x [0.28, 0.29]):
  (0.671793, 0.288608)
  (0.663908, 0.285337)
  (0.661633, 0.287141)
st.nearest((0.661633, 0.287141)) = (0.663908, 0.285337)
st.nearest((0.661633, 0.287141)):
  (0.646629, 0.288799)
  (0.653311, 0.277389)
  (0.668229, 0.276482)
  (0.65471, 0.276885)
  (0.671793, 0.288608)
  (0.658329, 0.290039)
  (0.663908, 0.285337)
Interactive Clients. In addition to the test clients provided in PointST and KdTreeST, you may use the following interactive client programs to test and debug your code:
RangeSearchVisualizer reads a sequence of points from a file (specified as a command-line argument) and inserts those points into PointST and KdTreeST based symbol tables brute and kdtree respectively. Then, it performs range searches on the axis-aligned rectangles dragged by the user in the standard drawing window, and displays the points obtained from brute in red and those obtained from kdtree in blue.
$ java RangeSearchVisualizer input100.txt
NearestNeighborVisualizer reads a sequence of points from a file (specified as a command-line argument) and inserts those points into PointST and KdTreeST based symbol tables brute and kdtree respectively. Then, it performs k-nearest neighbor queries (with k specified as the second command-line argument) on the point corresponding to the location of the mouse in the standard drawing window, and displays the neighbors obtained from brute in red and those obtained from kdtree in blue.
$ java NearestNeighborVisualizer input100.txt 5
BoidSimulator is an implementation of Craig Reynolds's Boids program [2] to simulate the flocking behavior of birds, using a PointST or KdTreeST data type. The first command-line argument specifies which data type to use (brute for PointST or kdtree for KdTreeST), the second argument specifies the number of boids, and the third argument specifies the number of friends each boid has [3].
$ java BoidSimulator brute 100 10
$ java BoidSimulator kdtree 1000 10
[2] See www.en.wikipedia.org/wiki/Boids.
[3] Note that the program does not scale well with the number of boids when using PointST, which is after all a brute-force implementation of the ST interface. However, the program does scale quite well when using KdTreeST.
Files to Submit:
PointST.java
KdTreeST.java
report.txt