Hadoop: …be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and no node(s) are excluded in this operation

This error comes from the HDFS block replication system: it could not manage to make any copy of a specific block of the file in question. Common reasons for that: only a NameNode instance is running and it is not in safe mode; there are no DataNode instances up and running, or some are dead. … Read more
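
If you hit this error, a quick first step is to check whether the NameNode actually sees any live DataNodes and whether it is stuck in safe mode. The dfsadmin commands below are standard HDFS administration tools:

# Show live/dead DataNodes and per-node capacity as the NameNode sees them
hdfs dfsadmin -report

# Check whether the NameNode is in safe mode; leave it if it is stuck there
hdfs dfsadmin -safemode get
hdfs dfsadmin -safemode leave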

hdfs dfs -mkdir, No such file or directory

This happens because the parent directories do not exist yet. Try hdfs dfs -mkdir -p /user/Hadoop/twitter_data. The -p flag tells HDFS to create every nonexistent directory leading up to the given path as well. As for the question you posed in the comments, simply type into your browser http://<host name of the … Read more
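
As a quick sketch of the difference (assuming /user/Hadoop does not exist yet):

# Without -p this fails, because the parent /user/Hadoop is missing:
hdfs dfs -mkdir /user/Hadoop/twitter_data

# With -p the missing parents are created along the way:
hdfs dfs -mkdir -p /user/Hadoop/twitter_data

# Verify the result:
hdfs dfs -ls /user/Hadoop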

How to find the size of an HDFS file

I also find myself using hadoop fs -dus <path> a great deal. For example, if a directory on HDFS named “/user/frylock/input” contains 100 files and you need the total size of all of those files, you could run: hadoop fs -dus /user/frylock/input and you would get back the total size (in bytes) of all of … Read more
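
Note that on Hadoop 2.x and later, -dus is deprecated in favor of -du -s; a rough modern equivalent, with -h added for human-readable units, would be:

# Total size of everything under the directory, in bytes
hadoop fs -du -s /user/frylock/input

# Same total, but printed in human-readable units (K, M, G)
hadoop fs -du -s -h /user/frylock/input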

Where does HDFS store files locally by default?

You need to look in your hdfs-default.xml configuration file for the dfs.data.dir setting. The default value is ${hadoop.tmp.dir}/dfs/data, and note that ${hadoop.tmp.dir} itself is defined in core-default.xml. The description for this setting is: “Determines where on the local filesystem a DFS data node should store its blocks.” … Read more
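
If you want the blocks somewhere other than the default temp location, a minimal hdfs-site.xml override might look like the sketch below; the /data/1 and /data/2 paths are placeholders, and note that on Hadoop 2.x and later the property is named dfs.datanode.data.dir, while dfs.data.dir is the older 1.x name:

<!-- hdfs-site.xml: comma-separated list of local directories for DataNode block storage -->
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/data/1/dfs/data,/data/2/dfs/data</value>
</property>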