Understanding HDFS commands with examples

  • -d is used to list the directories as plain files.
  • -h is used to print file size in human readable format.
  • -R is used to recursively list the content of the directories.
  • -ignoreCrc option will disable the checksum verification.
  • chgrp command is used to change the group of a file or a path.
  • chown command is used to change the owner of a file or a path.
  • chmod command is used to change the permissions of a file.
  • -R option is used to modify the files recursively.
  • -f overwrites the destination if it already exists.
  • -p preserves access and modification times, ownership, and permissions.
  • -d skips creation of the temporary file with the suffix ._COPYING_.
  • -l allows the DataNode to lazily persist the file to disk and forces a replication factor of 1.
  • -f overwrites the destination if it already exists.
  • -p preserves access and modification times, ownership, and permissions.
  • -d skips creation of the temporary file with the suffix ._COPYING_.
  • -h option shows sizes in human readable format.
  • -q shows quotas. The output columns are QUOTA, REMAINING_QUOTA, SPACE_QUOTA, REMAINING_SPACE_QUOTA, DIR_COUNT, FILE_COUNT, CONTENT_SIZE, PATHNAME.
  • -u limits the output to show quotas and usage only. The output is QUOTA, REMAINING_QUOTA, SPACE_QUOTA, REMAINING_SPACE_QUOTA, PATHNAME.
  • -v option displays a header line.
  • -s option will result in an aggregate summary of file lengths being displayed, rather than the individual files. Without the -s option, calculation is done by going 1-level deep from the given path.
  • -h option will format file sizes in a “human-readable” fashion (e.g., 64.0m instead of 67108864).
  • -v option will display the names of columns as a header line.
  • -nl option is used to add a new line after end of each file.
  • -skip-empty-file can be used to avoid unwanted newline characters in case of empty files.
  • The rm command on its own removes only files; directories cannot be deleted without the -r option.
  • -skipTrash option bypasses the trash, if enabled, and deletes the source immediately.
  • -f option suppresses the diagnostic message and leaves the exit status unchanged if the file does not exist.
  • -r option (equivalently -R) recursively deletes a directory and everything under it.
  • -safely option will require safety confirmation before deleting directory with total number of files greater than hadoop.shell.delete.limit.num.files (in core-site.xml, default: 100). It can be used with -skipTrash to prevent accidental deletion of large directories.
  • -w option requests that the command wait for the replication to complete. This can potentially take a very long time.
  • -R option is accepted for backwards compatibility. It has no effect.
  • -a option changes only the access time.
  • -m option changes only the modification time.
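  • -t option specifies a timestamp to use instead of the current time. Recent Hadoop releases expect the format yyyyMMdd:HHmmss; check hadoop fs -help touch for the format your version accepts.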
  • -c option prevents the file from being created if it does not exist.
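
To show how some of these options combine on a command line, here is a minimal sketch. It assumes the options listed above belong to the `hdfs dfs` subcommands ls, cat, chown, chmod, put, count, du, getmerge, rm, setrep and touch, which is how they are described in the Hadoop FileSystem Shell documentation. The directory /user/demo, the owner alice:analysts and the file names are hypothetical. The `run` helper is also an illustration only: it prints each command instead of executing it unless RUN=1 is set, so the script can be read or tried without a cluster.

```shell
#!/bin/sh
# Dry-run helper (illustrative, not part of Hadoop):
# prints the command line unless RUN=1 is set in the environment.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

DEMO=/user/demo   # hypothetical HDFS directory

# ls: show the directory itself as a plain entry, sizes human readable
run hdfs dfs -ls -d -h "$DEMO"
# ls: list the directory contents recursively
run hdfs dfs -ls -R "$DEMO"
# cat: print a file with checksum verification disabled
run hdfs dfs -cat -ignoreCrc "$DEMO/data.csv"
# chown / chmod: change owner and permissions recursively
run hdfs dfs -chown -R alice:analysts "$DEMO"
run hdfs dfs -chmod -R 750 "$DEMO"
# put: overwrite the destination and preserve times, ownership, permissions
run hdfs dfs -put -f -p data.csv "$DEMO/data.csv"
# count: quotas, human-readable sizes, and a header line
run hdfs dfs -count -q -h -v "$DEMO"
# du: one aggregate summary line with a header
run hdfs dfs -du -s -h -v "$DEMO"
# getmerge: newline after each file, skipping empty files
run hdfs dfs -getmerge -nl -skip-empty-file "$DEMO/parts" merged.txt
# rm: recursive delete that bypasses trash but asks for confirmation on large directories
run hdfs dfs -rm -r -safely -skipTrash "$DEMO/tmp"
# setrep: set replication factor to 2 and wait for it to complete
run hdfs dfs -setrep -w 2 "$DEMO/data.csv"
# touch: update only the modification time, do not create the file if missing
run hdfs dfs -touch -c -m "$DEMO/data.csv"
```

Running the script as is only prints the command lines. Setting RUN=1 executes them against whatever cluster the local `hdfs` client is configured for, so the rm line in particular should be reviewed before doing that.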


Data Engineer

Karthik Sharma
