Troubleshooting a CDH Upgrade Using Cloudera Manager
HDFS DataNodes fail to start
After upgrading, HDFS DataNodes fail to start with exception:
Exception in secureMainjava.lang.RuntimeException: Cannot start datanode because the configured max locked memory size (dfs.datanode.max.locked.memory) of 4294967296 bytes is more than the datanode's available RLIMIT_MEMLOCK ulimit of 65536 bytes.
Possible Reasons
HDFS caching, which is enabled by default in CDH 5 and higher, requires new memlock functionality from Cloudera Manager Agents.
Possible Solutions
Do the following:
- Stop all CDH and managed services.
- On all hosts with Cloudera Manager Agents, hard-restart the Agents. Before performing this step, ensure you understand the semantics of the hard_restart
command by reading Hard Stopping and Restarting Agents.
- RHEL-compatible 7 and higher:
$ sudo service cloudera-scm-agent next_stop_hard $ sudo service cloudera-scm-agent restart
- All other Linux distributions:
sudo service cloudera-scm-agent hard_restart
- RHEL-compatible 7 and higher:
- Start all services.
Upgrading to CDH 5.8.0 or CDH 5.8.1 When Using the Flume Kafka Client
Due to the change of offset storage from ZooKeeper to Kafka in the CDH 5.8 Flume Kafka client, data might not be consumed by the Flume agents, or might be duplicated (if kafka.auto.offset.reset=smallest) during an upgrade to CDH 5.8.0 or CDH 5.8.1. To prevent this, perform the steps described below before you upgrade your system.
Important: This issue has been fixed for CDH 5.8.2 and higher. If you are upgrading to
CDH 5.8.2 or higher, you do not need to perform this procedure. For more information, see Cloudera Distribution of Apache Kafka Known Issues.
The upgrade process is based on this example configuration:tier1.sources = source1 tier1.channels = channel1 tier1.sinks = sink1 tier1.sources.source1.type = org.apache.flume.source.kafka.KafkaSource tier1.sources.source1.zookeeperConnect = zkhost:2181 tier1.sources.source1.topic = flumetopic tier1.sources.source1.groupId = flume tier1.sources.source1.channels = channel1 tier1.sources.source1.interceptors = i1 i2 tier1.sources.source1.interceptors.i1.type = timestamp tier1.sources.source1.interceptors.i2.type = host tier1.sources.source1.kafka.consumer.timeout.ms = 100 tier1.channels.channel1.type = org.apache.flume.channel.kafka.KafkaChannel tier1.channels.channel1.brokerList=broker1:9092,broker2:9092 tier1.channels.channel1.zookeeperConnect=zkhost:2181 tier1.channels.channel1.topic=flumechannel1 tier1.channels.channel1.groupId = flumechannel tier1.channels.channel1.capacity = 10000 tier1.channels.channel1.transactionCapacity = 1000 tier1.sinks.sink1.type = hdfs tier1.sinks.sink1.hdfs.path = /tmp/kafka/%{topic}/ tier1.sinks.sink1.hdfs.filePrefix = %{host}- tier1.sinks.sink1.hdfs.rollInterval = 60 tier1.sinks.sink1.hdfs.rollSize = 0 tier1.sinks.sink1.hdfs.rollCount = 0 tier1.sinks.sink1.hdfs.fileType = DataStream tier1.sinks.sink1.channel = channel1 tier1.sinks.sink1.hdfs.kerberosKeytab = $KERBEROS_KEYTAB tier1.sinks.sink1.hdfs.kerberosPrincipal = $KERBEROS_PRINCIPAL
Perform the following steps to upgrade CDH.
- If you are using a version lower than CDH 5.7, first upgrade to CDH 5.7. If for some reason you cannot upgrade to CDH 5.7, contact your Cloudera Sales Engineer for assistance, or file a support case with specific versions from which and to which you are upgrading.
- Add the following sections to the source and channel for the Flume configuration:
# Added for source upgrade compatability tier1.sources.source1.kafka.bootstrap.servers = broker1:9092,broker2:9092 tier1.sources.source1.kafka.offsets.storage = kafka tier1.sources.source1.kafka.dual.commit.enabled = true tier1.sources.source1.kafka.consumer.group.id = flume tier1.sources.source1.kafka.topics = flumetopic # Added for channel upgrade compatability tier1.channels.channel1.kafka.topic = flumechannel1 tier1.channels.channel1.kafka.bootstrap.servers = broker1:9092,broker2:9092 tier1.channels.channel1.kafka.consumer.group.id = flumechannel tier1.channels.channel1.kafka.offsets.storage = kafka tier1.channels.channel1.kafka.dual.commit.enabled = true
- Restart (or rolling restart) the Flume agents. This switches offsets.storage to Kafka, but keeps both the Kafka and ZooKeeper offsets updated because the
dual.commit.enabled property is set to true. Confirm that Kafka messages are flowing through the Flume servers. Updating the offsets
only occurs when new messages are consumed, so there must be at least one Kafka message consumed by the Flume agent, or one event passed through the Flume channel. Use the following commands to
verify that Flume is properly updating the offsets in Kafka (the egrep command is used to match the correct topic names: in this example, flumetopic and flumechannel) :
echo "exclude.internal.topics=false" > /tmp/consumer.config kafka-console-consumer --consumer.config /tmp/consumer.config --formatter "kafka.coordinator.GroupMetadataManager\$OffsetsMessageFormatter" --zookeeper zkhost:2181 --topic __consumer_offsets |egrep -e "flumetopic|flumechannel1"
Output should be similar to the following and show that the Flume source and/or channel topics offsets are being incremented:
[flume,flumetopic,0]::[OffsetMetadata[70,cf9e5630-214e-4689-9869-5e077c936ffb],CommitTime 1469827951129,ExpirationTime 1469914351129] [flumechannel,flumechannel1,0]::[OffsetMetadata[61,875e7a82-1c22-43be-acaa-eb4d63e7f71e],CommitTime 1469827951128,ExpirationTime 1469914351128] [flumechannel,flumechannel1,0]::[OffsetMetadata[62,66bda888-0a70-4a02-a286-7e2e7d14050d],CommitTime 1469827951131,ExpirationTime 1469914351131]
- Perform the upgrade to CDH 5.8.0 or CDH 5.8.1. The Flume agents are restarted during the process. Flume continues to consume the source topic where it left off, and the sinks continue
draining from the Kafka channels where they left off. Post upgrade, remove the following deprecated properties from flume.conf because they are no longer used in CDH
5.8.0 or higher:
tier1.sources.source1.zookeeperConnect = zkhost:2181 tier1.sources.source1.topic = flumetopic tier1.sources.source1.groupId = flume tier1.channels.channel1.zookeeperConnect=zkhost:2181 tier1.channels.channel1.topic=flumechannel1 tier1.channels.channel1.groupId = flumechannel
Page generated May 18, 2018.
<< Upgrading CDH | ©2016 Cloudera, Inc. All rights reserved | Performing Upgrade Wizard Actions Manually >> |
Terms and Conditions Privacy Policy |