Hive:
1. Add each of the 3 attached files to HDFS (creating a new directory and putting them in it is recommended).
(I attached the 3 files as a zip.)
2. Make a new external table for each file in Hive.
3. Execute a join query between two of the tables.
4. Drop one of the tables.
For full credit provide:
- Screenshot of the table creation commands run from the Hive prompt
- Screenshot of both tables using the DESCRIBE command
- Screenshot of the query run from your prompt, with its result
- Screenshot of your table drop being executed
Impala:
1. Add the attached weather.txt file to HDFS.
2. Make a new table for the attached weather.txt file in Impala. The weather.txt file is comma-delimited: the year is the 2nd field and the temperature is the 4th field.
3. Create a second table from the same data source but with a different schema.
4. Execute a query to find the maximum temperature, grouped by year, in the new table created from weather.txt.
For full credit provide:
- Screenshot of the table creation commands run from the Impala prompt
- Screenshot of both tables using the DESCRIBE command
- Screenshot of the query run from your prompt, with its result
Sample rows from weather.txt:
006701199099999,1950,051507004888888888888888888888888888888888888888888888888889999999N9,+0000,1,+99999999999
004301199099999,1950,051512004888888888888888888888888888888888888888888888888889999999N9,+0022,1,+99999999999
004301199099999,1950,051518004888888888888888888888888888888888888888888888888889999999N9,-0011,1,+99999999999
004301265099999,1949,032412004888888888888888888888888888888888888888888888888880500001N9,+0111,1,+99999999999
004301265099999,1949,032418004888888888888888888888888888888888888888888888888880500001N9,+0078,1,+99999999999
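
One way to sanity-check the result of the Impala step 4 query is to compute the same aggregate by hand on the sample rows above. The following is only a minimal Python sketch, not part of the assignment: it assumes the rows are split on commas, takes the 2nd field as the year and the 4th field as the temperature (as the task describes), and treats the temperature as a plain signed integer. The names SAMPLE_ROWS and max_temperature_by_year are made up for illustration.

```python
# Sample rows copied from the weather.txt excerpt above.
SAMPLE_ROWS = [
    "006701199099999,1950,051507004888888888888888888888888888888888888888888888888889999999N9,+0000,1,+99999999999",
    "004301199099999,1950,051512004888888888888888888888888888888888888888888888888889999999N9,+0022,1,+99999999999",
    "004301199099999,1950,051518004888888888888888888888888888888888888888888888888889999999N9,-0011,1,+99999999999",
    "004301265099999,1949,032412004888888888888888888888888888888888888888888888888880500001N9,+0111,1,+99999999999",
    "004301265099999,1949,032418004888888888888888888888888888888888888888888888888880500001N9,+0078,1,+99999999999",
]


def max_temperature_by_year(lines):
    """Return {year: max temperature}, mimicking MAX(...) GROUP BY year."""
    result = {}
    for line in lines:
        fields = line.strip().split(",")
        year = int(fields[1])         # 2nd field: year
        temperature = int(fields[3])  # 4th field: signed value such as "+0022"
        # Keep the larger of the stored value and the current one.
        result[year] = max(temperature, result.get(year, temperature))
    return result


if __name__ == "__main__":
    for year, temperature in sorted(max_temperature_by_year(SAMPLE_ROWS).items()):
        print(year, temperature)
    # prints:
    # 1949 111
    # 1950 22
```

If the Impala table exposes the 2nd and 4th fields as numeric columns, a query that takes the maximum of the temperature column grouped by the year column should return these same two rows for the sample data. Note that a temperature column declared as a string type would be compared as text rather than numerically, so the result could differ.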