Hadoop - HDFS

HDFS Concepts

  • Cooperation
    HDFS uses several types of nodes that cooperate to handle big data.


    image.png
  • Resiliency
    HDFS stores multiple copies of its data on different Data Nodes, so you can still retrieve your data even if one node burns down. The Name Node's metadata is often backed up to local disk and to NFS for resiliency. There is also a secondary Name Node that serves as a backup, and ZooKeeper tracks which Name Node is active.
  • Federation
    With federation, each Name Node manages a specific namespace volume, so the cluster's metadata can be split across several Name Nodes, each serving its corresponding Data Nodes.
  • Use of HDFS


    image.png
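The resiliency behavior described above can be inspected from the command line. A minimal sketch, assuming a running cluster; the file path and target replication factor here are illustrative, not taken from the setup below:

```shell
# List files; the second column of the output is each file's replication factor.
hadoop fs -ls /user/maria_dev
# Print just the replication factor for one (hypothetical) file.
hadoop fs -stat %r /user/maria_dev/u.data
# Raise the replication factor to 3; -w waits until the extra copies exist.
hadoop fs -setrep -w 3 /user/maria_dev/u.data
```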

HDFS Usage

  • First, launch the Hortonworks sandbox in Docker:


    image.png

1) Web Interface

  • Then log into Ambari at 127.0.0.1:8080, which takes us to the HDFS web interface on the virtual machine:


    image.png
  • Switch to the Files View, where we can create folders, upload data, and view data:


    image.png

    image.png

2) Command Line Interface

  • Log in using the command:
    ssh "your user name"@127.0.0.1 -p 2222

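Files can also be copied from your host machine into the sandbox before loading them into HDFS. A sketch, assuming the same user and SSH port as above (the local filename is hypothetical):

```shell
# Copy a local file into the sandbox user's home directory.
# Note that scp uses an uppercase -P for the port, unlike ssh.
scp -P 2222 u.data "your user name"@127.0.0.1:~/
```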
  • The commands are similar to the Linux shell:

hadoop fs -ls                                     # list files in your HDFS home directory
hadoop fs -mkdir ml-100k                          # create a directory in HDFS
wget http://media.sundog-soft.com/hadoop/ml-100k/u.data   # download to the local filesystem
hadoop fs -copyFromLocal u.data ml-100k/u.data    # copy the local file into HDFS
hadoop fs -rm ml-100k/u.data                      # delete the file from HDFS
hadoop fs -rmdir ml-100k                          # remove the (now empty) directory
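Before deleting the file, you can verify the upload and copy the data back out of HDFS. A sketch using the same paths:

```shell
# Print the first few lines of the file stored in HDFS.
hadoop fs -cat ml-100k/u.data | head -5
# Copy it back from HDFS to the local filesystem under a new name.
hadoop fs -copyToLocal ml-100k/u.data u.data.copy
```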

You can also list all available commands by typing:

hadoop fs
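For details on a single command, `hadoop fs -help` takes a command name; for example:

```shell
# Show usage and options for the -ls command.
hadoop fs -help ls
```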

To exit the SSH session:

exit
