Cloudera Enterprise 5.15.x | Other versions

Configuring Short-Circuit Reads

So-called "short-circuit" reads bypass the DataNode, allowing a client to read the file directly, as long as the client is co-located with the data. Short-circuit reads provide a substantial performance boost to many applications and help improve HBase random read profile and Impala performance.

Short-circuit reads require libhadoop.so (the Hadoop Native Library) to be accessible to both the server and the client. libhadoop.so is not available if you have installed from a tarball. You must install from an .rpm, .deb, or parcel to use short-circuit local reads.

Configuring Short-Circuit Reads Using Cloudera Manager

Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)

  Note: Short-circuit reads are enabled by default in Cloudera Manager.
  1. Go to the HDFS service.
  2. Click the Configuration tab.
  3. Select Scope > Gateway or HDFS (Service-Wide).
  4. Select Category > Performance.
  5. Locate the Enable HDFS Short Circuit Read property or search for it by typing its name in the Search box. Check the box to enable it.

    To apply this configuration property to other role groups as needed, edit the value for the appropriate role group. See Modifying Configuration Properties Using Cloudera Manager.

  6. Click Save Changes to commit the changes.

Configuring Short-Circuit Reads Using the Command Line

  Important:
  • Follow these command-line instructions on systems that do not use Cloudera Manager.
  • This information applies specifically to CDH 5.15.0. See Cloudera Documentation for information specific to other releases.
Configure the following properties in hdfs-site.xml to enable short-circuit reads in a cluster that is not managed by Cloudera Manager:
<property>
    <name>dfs.client.read.shortcircuit</name>
    <value>true</value>
</property>

<property>
    <name>dfs.client.read.shortcircuit.streams.cache.size</name>
    <value>1000</value>
</property>


<property>
    <name>dfs.client.read.shortcircuit.streams.cache.expiry.ms</name>
    <value>10000</value>
</property>

<property>
    <name>dfs.domain.socket.path</name>
    <value>/var/run/hadoop-hdfs/dn._PORT</value>
</property>
  Note: The text _PORT appears just as shown; you do not need to substitute a number.

If /var/run/hadoop-hdfs/ is group-writable, make sure its group is root.

Page generated May 18, 2018.