Inverted Index – java coding needed

Basic concept

The inverted index is a tool that maps occurrences of data (in our case, words) back to locations in a structure. This is frequently used to support full text support for information stored in databases. It is typically used to return records or documents that contain the queried word. We are going to use it to return lines of text.

Save Time On Research and Writing
Hire a Pro to Write You a 100% Plagiarism-Free Paper.
Get My Paper

There are a number of ways that we can implement the inverted index — it basically just requires an associative structure. Of course, we will be implementing it with a hash map. The keys in this case will be words. The values will be a list of the lines in which that word occurred. When a user queries the index with a word, she will be able to see all of the lines containing the queried word.

Part one: The hash map [15 pts]

The first thing you will need to implement is a hash map in a class called HashMap. This should fullyimplement the Map interface we have been working with. The storage within the HashMap should be provided by an array, and you should use external chaining to resolve collisions. For a hashing function, use thehashCode() method that is already defined for all Objects.

In addition to the methods required by the Map interface, you should implement the following additional methods on the HashMap:

public HashMap(int size)This is the constructor. The size variable determines the number of slots in the underlying hash map.public int getMaxChainLength()This method should return the length of the longest chain in the HashMap.public double getLoadFactor()This method should return the load factor of the HashMap.

Save Time On Research and Writing
Hire a Pro to Write You a 100% Plagiarism-Free Paper.
Get My Paper

Part two: Building the index [35 pts]

The index should be implemented in a new class called InvertedIndex. The interface of this class should look like this:

public InvertedIndex(File file)The constructor of the class. This reads in a File and builds the index.public InvertedIndex(File file, int tableSize)This performs the same function as above, but allows the size of the hash table to be specified (the first constructor should just call this one with a default value).public Iterator getLineNumberIterator(String word)Returns an Iterator that walks through each line number associated with a particular word. The Iterator does not need to implement remove.public Iterator getHighlightedLines(String word)Returns an Iterator that returns Strings containing the line number, followed by a colon, followed by a line containing at least one instance of the word. Each instance of the word should be surrounded by # characters. This should also not implement remove.public int numOccurrences(String word)Return the number of occurrences of the word in the document.

The constructor will take the file and store it, line by line, in an ArrayList (use the one in java.util). This is just a copy of the document that we can easily access by line number. The HashTable will be used to maintain our inverted index, which will map words (the keys) to occurrences in the document. We want to store both line number and word number within the line. You should create a simple wrapper class for holding these two pieces of information. Most words will occur more than once in a document, so the “value” stored in the hash table will be a List of the small wrapper objects. Use the LinkedList class fromjava.util for this purpose. As an implementation detail, convert all of the words to lowercase before you store them in the hash table, this will allow for case insensitive searching.

Part three: An application [10 pts]

Add a main method to your InvertedIndex class. This should take two arguments on the command line: the first should be a number which should determine the number of slots in the hash table, and the second will be a file name. You will, of course, use the file and size to build an InvertedIndex instance. Print out the two hash table statistics available through the methods specified earlier. You should add some methods to the InvertedIndex to allow acces to these.

Once the statistics are printed out, your application should enter an interactive mode. When the user enters a word, your index should look it up and return the lines that contain it. Each line will be printed out in the following format (the < and > just indicate that they are fields being described, they shouldn’t be included in the actual output):

:

If there are no matching lines, there should be an appropriate error message. When the user uses theRETURN key without entering any text, the application should quit.

import static org.junit.Assert.*;
import org.junit.Before;
import org.junit.Test;
/**
* BinaryTreeTest tests to see if Binary Tree functions as expected.
* @author tai-lanhirabayashi
*
*/
public class BinaryTreeTest {
BinaryTree t;

/**
* before sets up a base case.
*/
@Before
public void before(){
t= new BinaryTree(null, “head”, null);
t.setLeftChild(new BinaryTree(null,”second”,null));
t.getLeftChild().setLeftChild(new BinaryTree(null, “third”, null));
}

/**
* testSetup makes sure the test has been initialized.
*/
@Test
public void testSetup() {
assertEquals(t.getValue(), “head”);
}
/**
* tests the getLeft function
*/
@Test
public void testGetLeft(){
assertEquals(t.getLeftChild().getValue(),”second”);
}

/**
* Tests the get right function
*/
@Test
public void testGetRight(){
assertEquals(t.getRightChild(),null);
}

/**
* Tests the isLeaf function.
*/
@Test
public void isLeaf(){
assertEquals(t.getLeftChild().getLeftChild().isLeaf(),true);
}

/**
* Tests the setLeft function
*/
@SuppressWarnings(“unchecked”)
@Test
public void setLeft(){
t.setLeftChild(new BinaryTree(null,”replace”, null));
assertEquals(t.getLeftChild().getValue(),”replace”);
}
/**
* tests the setRightChild function
*/
@SuppressWarnings(“unchecked”)
@Test
public void setRight(){
t.setRightChild(new BinaryTree(null,”right”, null));
assertEquals(t.getRightChild().getValue(),”right”);
}
/**
* Tests the setValue function.
*/
@Test
public void setValue(){
t.getLeftChild().setValue(“reset”);
assertEquals(t.getLeftChild().getValue(),”reset”);
}

}

/**
* BinaryTree is a form of linked nodes that form a tree.
*
* @author tai-lan hirabayashi
*
* @param the object value that is within each node.
*/
public class BinaryTree {
E v;
BinaryTree treeLeft;
BinaryTree treeRight;

/**
* BinaryTree creates a new node binaryTree which holds an object value.
* It takes in the value, left and right child and holds them within the node.
* @param left the left child of the node.
* @param value the object the node holds
* @param right the right child of the node
*/
BinaryTree(BinaryTree left, E value, BinaryTree right){
v=value;
treeLeft=left;
treeRight=right;
}

/**
* getLeftChild returns the left child node.
* @return the left child, a binary tree node.
*/
BinaryTree getLeftChild(){
return treeLeft;

}

/**
* getRightChild returns the right child node.
* @return the right child,a binaryTree node.
*/
BinaryTree getRightChild(){
return treeRight;
}

/**
* setLeftChild, sets the left child of the current node.
* @param l is the left child, a binaryTree node.
*/
void setLeftChild(BinaryTree l){
treeLeft=l;
}

/**
* setRightChild, sets the right child of the current node.
* @param r the right child, a binaryTree node.
*/
void setRightChild(BinaryTree r){
treeRight=r;
}

/**
* setValue sets the value of a node.
* @param object value of the node.
*/
void setValue(E object){
v=object;
}

/**
* getValue returns the value held in the node.
* @return the object value of the node.
*/
E getValue(){
return v;
}

/**
* isLeaf checks if the node is a leaf node by checking if it has children.
* @return boolean if the node is a leaf node.
*/
boolean isLeaf(){
if(getLeftChild()==null && getRightChild()==null){
return true;
}
return false;

}
}

import static org.junit.Assert.*;
import org.junit.Before;
import org.junit.Test;
/**
* BinaryTreeTest tests to see if Binary Tree functions as expected.
* @author tai-lanhirabayashi
*
*/
public class BinaryTreeTest {
BinaryTree t;

/**
* before sets up a base case.
*/
@Before
public void before(){
t= new BinaryTree(null, “head”, null);
t.setLeftChild(new BinaryTree(null,”second”,null));
t.getLeftChild().setLeftChild(new BinaryTree(null, “third”, null));
}

/**
* testSetup makes sure the test has been initialized.
*/
@Test
public void testSetup() {
assertEquals(t.getValue(), “head”);
}
/**
* tests the getLeft function
*/
@Test
public void testGetLeft(){
assertEquals(t.getLeftChild().getValue(),”second”);
}

/**
* Tests the get right function
*/
@Test
public void testGetRight(){
assertEquals(t.getRightChild(),null);
}

/**
* Tests the isLeaf function.
*/
@Test
public void isLeaf(){
assertEquals(t.getLeftChild().getLeftChild().isLeaf(),true);
}

/**
* Tests the setLeft function
*/
@SuppressWarnings(“unchecked”)
@Test
public void setLeft(){
t.setLeftChild(new BinaryTree(null,”replace”, null));
assertEquals(t.getLeftChild().getValue(),”replace”);
}
/**
* tests the setRightChild function
*/
@SuppressWarnings(“unchecked”)
@Test
public void setRight(){
t.setRightChild(new BinaryTree(null,”right”, null));
assertEquals(t.getRightChild().getValue(),”right”);
}
/**
* Tests the setValue function.
*/
@Test
public void setValue(){
t.getLeftChild().setValue(“reset”);
assertEquals(t.getLeftChild().getValue(),”reset”);
}

}

Still stressed with your coursework?
Get quality coursework help from an expert!