[Solved] CDH 6.3.2 Hive on Spark Error: is running beyond physical memory limits

Hue reports the following error when running Hive SQL:

java.lang.IllegalStateException: Connection to remote Spark driver was lost

The YARN error log shows the following:

Container [pid=41355,containerID=container_1451456053773_0001_01_000002] is running beyond physical memory limits.
Current usage: 2.0 GB of 2 GB physical memory used; 5.2 GB of 4.2 GB virtual memory used. Killing container.

The job most likely exceeded the memory configured for the map and reduce tasks, causing them to fail. Increasing the map and reduce memory eliminated the problem. The relevant parameters are described below:

The ResourceManager (RM) memory limits are controlled by the following two parameters (both are YARN platform settings and should be configured in yarn-site.xml):
yarn.scheduler.minimum-allocation-mb
yarn.scheduler.maximum-allocation-mb
Description: the minimum and maximum memory a single container can request. An application cannot request more than the maximum, and a request below the minimum is rounded up to the minimum. In that sense the minimum value is similar to the page size in an operating system. The minimum value has another purpose as well: it is used to calculate the maximum number of containers a node can host. Note: once these two values are set, they cannot be changed dynamically (that is, while an application is running).
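
For reference, a minimal yarn-site.xml fragment setting these two limits might look like the following; the values here are illustrative, not recommendations:

        <property>
            <name>yarn.scheduler.minimum-allocation-mb</name>
            <value>1024</value>  <!-- smallest container the scheduler will allocate; smaller requests are rounded up -->
        </property>
        <property>
            <name>yarn.scheduler.maximum-allocation-mb</name>
            <value>8192</value>  <!-- largest container a single request may ask for -->
        </property>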

The NodeManager (NM) memory limits are controlled by the following two parameters (also YARN platform settings, configured in yarn-site.xml):
yarn.nodemanager.resource.memory-mb
yarn.nodemanager.vmem-pmem-ratio
Description: the first is the maximum memory available for containers on a node; the two RM values above should not exceed it. It also determines the maximum number of containers per node: divide it by the minimum container memory configured in the RM. The second is the virtual memory ratio, i.e. how much virtual memory a task may use per unit of physical memory; the default is 2.1. Note: yarn.nodemanager.resource.memory-mb cannot be modified dynamically once set, and its default is 8 GB; even on a machine with less than 8 GB of RAM, YARN will still assume 8 GB.
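
A minimal yarn-site.xml sketch for the NodeManager side, again with illustrative values (here assuming a worker node that dedicates 16 GB to containers):

        <property>
            <name>yarn.nodemanager.resource.memory-mb</name>
            <value>16384</value>  <!-- total memory this node offers to containers (illustrative) -->
        </property>
        <property>
            <name>yarn.nodemanager.vmem-pmem-ratio</name>
            <value>2.1</value>  <!-- virtual memory allowed per MB of physical memory; 2.1 is the default -->
        </property>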

The ApplicationMaster (AM) side memory parameters are described here using MapReduce as an example (these two values are per-application settings and should be configured in mapred-site.xml):
mapreduce.map.memory.mb
mapreduce.reduce.memory.mb
Description: these two parameters specify the container memory for MapReduce's Map and Reduce tasks, and their values should fall between the RM's minimum and maximum container sizes. If they are not configured, a reasonable starting point is the simple formula:
max(MIN_CONTAINER_SIZE, (Total Available RAM) / containers)
For example, with 48 GB of RAM available for containers and 16 containers per node, this gives max(MIN_CONTAINER_SIZE, 49152 / 16) = 3072 MB per container. As a rule of thumb, the reduce memory should be about twice the map memory. Note: both values can also be overridden per job when the application is submitted; a sketch follows below.
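
As a sketch, the task sizes might be set in mapred-site.xml as below; the values are illustrative and follow the reduce ≈ 2 × map rule of thumb:

        <property>
            <name>mapreduce.map.memory.mb</name>
            <value>2048</value>  <!-- container size for each Map task (illustrative) -->
        </property>
        <property>
            <name>mapreduce.reduce.memory.mb</name>
            <value>4096</value>  <!-- container size for each Reduce task, twice the map size -->
        </property>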

Other AM memory and JVM-related parameters can be configured through the following options:
mapreduce.map.java.opts
mapreduce.reduce.java.opts
Description: these two parameters pass options to the JVMs that run the tasks (Java, Scala, etc.), including memory flags such as -Xmx and -Xms. The heap size set here should not exceed the corresponding mapreduce.map.memory.mb and mapreduce.reduce.memory.mb container sizes above, or the container will be killed for exceeding its limit.
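
For example, the JVM heaps could be constrained in mapred-site.xml as below; the -Xmx values are illustrative and deliberately kept below the 2048/4096 MB container sizes sketched above (roughly 80% of the container size is a common rule of thumb, leaving headroom for non-heap memory):

        <property>
            <name>mapreduce.map.java.opts</name>
            <value>-Xmx1638m</value>  <!-- heap for Map JVMs, below mapreduce.map.memory.mb -->
        </property>
        <property>
            <name>mapreduce.reduce.java.opts</name>
            <value>-Xmx3276m</value>  <!-- heap for Reduce JVMs, below mapreduce.reduce.memory.mb -->
        </property>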

To summarize, configuring YARN memory mainly comes down to three limits: the physical memory available to each Map and Reduce task; the JVM heap size of each task; and the virtual memory allowance.

The following concrete error example illustrates these memory settings. The error is:
Container [pid=41884,containerID=container_1405950053048_0016_01_000284] is running beyond virtual memory limits. Current usage: 314.6 MB of 2.9 GB physical memory used; 8.7 GB of 6.2 GB virtual memory used. Killing container.
The configuration was as follows:

        <property>
            <name>yarn.nodemanager.resource.memory-mb</name>
            <value>100000</value>
        </property>
        <property>
            <name>yarn.scheduler.maximum-allocation-mb</name>
            <value>10000</value>
        </property>
        <property>
            <name>yarn.scheduler.minimum-allocation-mb</name>
            <value>3000</value>
        </property>
        <property>
            <name>mapreduce.reduce.memory.mb</name>
            <value>2000</value>
        </property>

From this configuration, the minimum and maximum container sizes are 3000 MB and 10000 MB respectively. mapreduce.reduce.memory.mb is set to 2000 MB, which is below the 3000 MB minimum, and the map memory is not set at all, so both task types are rounded up to 3000 MB; that is the "2.9 GB physical memory used" in the log. Since the default virtual memory ratio of 2.1 applies, each Map or Reduce task may use at most 3000 MB * 2.1 ≈ 6.2 GB of virtual memory. The application's virtual memory usage (8.7 GB) exceeded that limit, so the container was killed.
Solution: either increase the virtual memory ratio (yarn.nodemanager.vmem-pmem-ratio) when starting YARN, or increase the memory requested when submitting the application.

Summary: in this case, the problem was solved by raising the yarn.scheduler.minimum-allocation-mb parameter to 6000.
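
In configuration terms, the fix amounts to the following yarn-site.xml fragment (6000 is the value that worked here; size it to your own workload):

        <property>
            <name>yarn.scheduler.minimum-allocation-mb</name>
            <value>6000</value>  <!-- raised so every container receives more physical memory -->
        </property>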

