Deploying Spark and HDFS on Docker Swarm doesn’t enable data locality

Isn’t this linked to the use of the following setting?

    <property>
        <name>dfs.client.use.datanode.hostname</name>
        <value>true</value>
    </property>

If I’m correct, using the hostname means the client is bound to an individual container rather than to the service itself.
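To make the concern concrete, here is a toy sketch in Python. It is not Spark's scheduler code. It only assumes that locality is decided by comparing the host strings reported for a block's replicas with the host strings the executors register under. The function name `locality_level` and every hostname below are hypothetical.

```python
# Toy model of hostname-based locality matching (an assumption for
# illustration, not Spark's real implementation).

def locality_level(block_hosts, executor_hosts):
    """Return "NODE_LOCAL" if a replica host equals an executor host, else "ANY"."""
    return "NODE_LOCAL" if set(block_hosts) & set(executor_hosts) else "ANY"


# Case 1: each container reports its own hostname (hypothetical names).
# DataNode and executor containers never share a name, so nothing matches.
print(locality_level(["datanode-1", "datanode-2"],
                     ["spark-worker-1", "spark-worker-2"]))

# Case 2: both sides report the name of the underlying Swarm node
# (hypothetical names), so a co-located replica can be recognised.
print(locality_level(["swarm-node-a", "swarm-node-b"],
                     ["swarm-node-a"]))
```

Under this assumption the first call prints `ANY` and the second prints `NODE_LOCAL`, which would explain why a DataNode and an executor sitting on the same physical host are still not treated as local when each is identified by its container hostname.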
