Issue
When the Hive metastore contains a partition (for example, partition=x) but the matching partition directory (partition=x) does not exist on HDFS, some Hive SQL statements fail with the error: org.apache.hadoop.mapred.InvalidInputException: Input path does not exist
This is a known directory inconsistency problem in Tez.
See issue: https://issues.apache.org/jira/browse/HIVE-13781
Affected: Hive on Tez, Hive 3
Solutions
Make sure the HDFS directories and the Hive metadata are consistent before executing Hive SQL
Fall back to the MapReduce (MR) execution engine
Drop all partitions, then run MSCK REPAIR TABLE to repair the whole table (or use MSCK REPAIR TABLE ... SYNC PARTITIONS to synchronize the metadata with the partition directories on HDFS)
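
The last two workarounds can be sketched in HiveQL roughly as follows. This is a minimal sketch, not a verified fix for every setup: my_db.my_table is a hypothetical table name, and the SYNC PARTITIONS option is assumed to be available, which should be the case on Hive 3.

```sql
-- Fall back to MapReduce for the current session only
-- (assumes the cluster still allows the mr engine).
SET hive.execution.engine=mr;

-- Or reconcile the metastore with HDFS instead.
-- my_db.my_table is a hypothetical name; replace it with the affected table.
-- SYNC PARTITIONS adds partitions that exist only on HDFS and drops
-- metastore entries whose directories are missing.
MSCK REPAIR TABLE my_db.my_table SYNC PARTITIONS;

-- Check which partitions the metastore now knows about.
SHOW PARTITIONS my_db.my_table;
```

Without an option, MSCK REPAIR TABLE only adds partitions that it finds on HDFS. It does not remove stale metastore entries, which is why the plain form is paired with dropping the partitions first.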