<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[HDFS Basic Operations]]></title><description><![CDATA[<p dir="auto">Source: "Expert Hadoop Administration: Managing, Tuning, and Securing Spark, YARN, and HDFS"<br />
List the root directory</p>
<pre><code>$ hdfs dfs -ls /
</code></pre>
<p dir="auto">List the contents of one or more directories</p>
<pre><code>$ hdfs dfs -ls /user/hadoop/testdir1 /user/hadoop/testdir2
</code></pre>
<p dir="auto">List the directory entry itself rather than its contents</p>
<pre><code>$ hdfs dfs -ls -d /user/alapati
</code></pre>
<p dir="auto">Show details for a file, or list a directory by its full URI</p>
<pre><code>$ hdfs dfs -ls /user/hadoop/testdir1/test1.txt
$ hdfs dfs -ls hdfs://&lt;hostname&gt;:9000/user/hadoop/dir1/
</code></pre>
<p dir="auto">Show file status with a format string</p>
<pre><code>$ hdfs dfs -stat "%n" /user/alapati/messages
# Format specifiers:
#   %b  Size of file in bytes
#   %F  Will return "file", "directory", or "symlink" depending on the type of inode
#   %g  Group name
#   %n  Filename
#   %o  HDFS block size in bytes (128 MB by default)
#   %r  Replication factor
#   %u  Username of owner
#   %y  Formatted mtime of inode
#   %Y  UNIX epoch mtime of inode
</code></pre>
<p dir="auto">Create a directory</p>
<pre><code>$ hdfs dfs -mkdir hdfs://nn1.example.com/user/hadoop/dir
# -p also creates any missing parent directories
$ hdfs dfs -mkdir -p /user/hadoop/dir1
</code></pre>
<p dir="auto">Delete a directory</p>
<pre><code>hdfs dfs -rm -R /user/alapati
</code></pre>
<p dir="auto">After deletion, a directory is kept in the trash for a while:</p>
<pre><code>$ hdfs dfs -ls /user/sam/.Trash
</code></pre>
<p dir="auto">Empty the trash</p>
<pre><code>$ hdfs dfs -expunge
</code></pre>
<p dir="auto">Change the owner (and group)</p>
<pre><code>$ hdfs dfs -chown sam:produsers /data/customers/names.txt
</code></pre>
<p dir="auto">Change the group</p>
<pre><code>$ sudo -u hdfs hdfs dfs -chgrp marketing /users/sales/markets.txt
</code></pre>
<p dir="auto">Change permissions</p>
]]></description><link>http://an.forum.genostack.com/topic/270/hdfs基本操作</link><generator>RSS for Node</generator><lastBuildDate>Sat, 13 Jun 2026 12:34:58 GMT</lastBuildDate><atom:link href="http://an.forum.genostack.com/topic/270.rss" rel="self" type="application/rss+xml"/><pubDate>Thu, 01 Apr 2021 08:05:34 GMT</pubDate><ttl>60</ttl><item><title><![CDATA[Reply to HDFS Basic Operations on Thu, 01 Apr 2021 09:26:54 GMT]]></title><description><![CDATA[<p dir="auto">Setting quotas:<br />
Space quotas: Allow you to set a ceiling on the amount of space used for an individual directory</p>
<pre><code>$ hdfs dfsadmin -setSpaceQuota &lt;N&gt; &lt;dirname&gt;...&lt;dirname&gt;
# Clear the space quota
$ hdfs dfsadmin -clrSpaceQuota /user/alapati
</code></pre>
<p dir="auto">Name quotas: Let you specify the maximum number of file and directory names in the tree rooted at a directory<br />
Set the maximum number of names</p>
<pre><code>$ hdfs dfsadmin -setQuota &lt;max_number&gt; &lt;directory&gt;
</code></pre>
<p dir="auto">Check quotas:<br />
$ hdfs dfs -count -q &lt;directory&gt;</p>
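<p dir="auto">A minimal sketch of reading that output, assuming the usual eight-column layout of <code>hdfs dfs -count -q</code> (QUOTA, REMAINING_QUOTA, SPACE_QUOTA, REMAINING_SPACE_QUOTA, DIR_COUNT, FILE_COUNT, CONTENT_SIZE, PATHNAME), where <code>none</code> and <code>inf</code> mean that no quota is set. The helper name <code>quota_summary</code> is made up for illustration, and the column order should be checked against your Hadoop version.</p>

```shell
#!/usr/bin/env bash
# quota_summary (hypothetical helper): reads `hdfs dfs -count -q` lines on
# stdin and prints the remaining name and space quota for each path.
# Assumed column order: QUOTA REMAINING_QUOTA SPACE_QUOTA
# REMAINING_SPACE_QUOTA DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME
# Paths that contain spaces are not handled by this sketch.
quota_summary() {
  awk 'NF >= 8 {
    printf "%s names_left=%s space_left=%s\n", $8, $2, $4
  }'
}

# Usage on a real cluster (not run here):
#   hdfs dfs -count -q /user/alapati | quota_summary
```

<p dir="auto">Note that the space quota is charged against replicated bytes, so a 30g quota with a replication factor of 3 leaves room for roughly 10 GB of file data.</p>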
]]></description><link>http://an.forum.genostack.com/post/535</link><guid isPermaLink="true">http://an.forum.genostack.com/post/535</guid><dc:creator><![CDATA[anneng]]></dc:creator><pubDate>Thu, 01 Apr 2021 09:26:54 GMT</pubDate></item><item><title><![CDATA[Reply to HDFS Basic Operations on Thu, 01 Apr 2021 09:17:22 GMT]]></title><description><![CDATA[<p dir="auto">Check whether a path exists:</p>
<pre><code>$ hdfs dfs -test -e /users/alapati/test
</code></pre>
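<p dir="auto">The <code>-test</code> command prints nothing and reports its result through the exit status, so it is mostly useful inside scripts. A small sketch under that assumption, using <code>-test -d</code> to check for a directory; the helper name <code>ensure_hdfs_dir</code> is hypothetical:</p>

```shell
#!/usr/bin/env bash
# ensure_hdfs_dir (hypothetical helper): create a directory in HDFS only if
# `hdfs dfs -test -d` reports that it is not already a directory.
# -test prints nothing; the result is carried in the exit status.
ensure_hdfs_dir() {
  local dir="$1"
  if hdfs dfs -test -d "$dir"; then
    echo "exists: $dir"
  else
    hdfs dfs -mkdir -p "$dir" || return 1
    echo "created: $dir"
  fi
}

# Usage on a real cluster (not run here):
#   ensure_hdfs_dir /users/alapati/test
```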
<p dir="auto">Create an empty file</p>
<pre><code>$ hdfs dfs -touchz /user/alapati/test3.txt
</code></pre>
]]></description><link>http://an.forum.genostack.com/post/534</link><guid isPermaLink="true">http://an.forum.genostack.com/post/534</guid><dc:creator><![CDATA[anneng]]></dc:creator><pubDate>Thu, 01 Apr 2021 09:17:22 GMT</pubDate></item><item><title><![CDATA[Reply to HDFS Basic Operations on Thu, 01 Apr 2021 09:16:30 GMT]]></title><description><![CDATA[<p dir="auto">Show file system capacity and usage</p>
<pre><code># hdfs dfs -df
Filesystem                     Size             Used        Available Use%
hdfs://hadoop01-ns 2068027170816000 1591361508626924  476665662189076  77%
#
</code></pre>
<p dir="auto">Show space used under a path<br />
$ hdfs dfs -du URI</p>
<p dir="auto">Add new DataNode storage directories in hdfs-site.xml</p>
<pre><code>&lt;property&gt;
&lt;name&gt;dfs.datanode.data.dir&lt;/name&gt;
&lt;value&gt;/u01/hadoop/data,/u02/hadoop/data,/u03/hadoop/data&lt;/value&gt;
&lt;/property&gt;
</code></pre>
]]></description><link>http://an.forum.genostack.com/post/533</link><guid isPermaLink="true">http://an.forum.genostack.com/post/533</guid><dc:creator><![CDATA[anneng]]></dc:creator><pubDate>Thu, 01 Apr 2021 09:16:30 GMT</pubDate></item><item><title><![CDATA[Reply to HDFS Basic Operations on Thu, 01 Apr 2021 09:07:05 GMT]]></title><description><![CDATA[<p dir="auto">Permission control<br />
hdfs-site.xml</p>
<pre><code>&lt;property&gt;
&lt;name&gt;dfs.permissions.enabled&lt;/name&gt;
&lt;value&gt;true&lt;/value&gt;
&lt;/property&gt;
</code></pre>
<p dir="auto">HDFS uses a symbolic notation (r, w) to denote the read and write permissions, just as a Linux operating system does.</p>
<p dir="auto">When a client accesses a directory, if the client is the same as the directory’s owner, Hadoop tests the owner’s permissions.</p>
<p dir="auto">If the group matches the directory’s group, then Hadoop tests the user’s group permissions.</p>
<p dir="auto">If neither the owner nor the group names match, Hadoop tests the “other” permission of the directory.</p>
<p dir="auto">If none of the permissions checks succeed, the client’s request is denied.</p>
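<p dir="auto">The check order described above can be sketched in shell. This is only an illustration of the owner, then group, then other selection, not Hadoop's actual implementation; the function names <code>perm_class</code> and <code>can_access</code> and the nine-character mode string are assumptions made for the example.</p>

```shell
#!/usr/bin/env bash
# perm_class USER USER_GROUPS OWNER GROUP (hypothetical helper)
#   Prints which permission class applies to the client.
#   USER_GROUPS is a space-separated list of the client's groups.
perm_class() {
  local user="$1" user_groups="$2" owner="$3" group="$4" g
  if [ "$user" = "$owner" ]; then
    echo owner
    return
  fi
  for g in $user_groups; do
    if [ "$g" = "$group" ]; then
      echo group
      return
    fi
  done
  echo other
}

# can_access MODE CLASS WANT (hypothetical helper)
#   MODE is a 9-character string such as rwxr-x---; WANT is r, w, or x.
#   Succeeds only if the selected class grants the wanted permission.
can_access() {
  local mode="$1" class="$2" want="$3" range triad
  case "$class" in
    owner) range=1-3 ;;
    group) range=4-6 ;;
    *)     range=7-9 ;;
  esac
  triad=$(printf '%s' "$mode" | cut -c "$range")
  case "$triad" in
    *"$want"*) return 0 ;;
    *)         return 1 ;;
  esac
}
```

<p dir="auto">Only one class is ever tested: an owner whose own bits deny access is refused even if the group or other bits would allow it.</p>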
<p dir="auto">Change permissions<br />
$ hdfs dfs -chmod -R 755 /user</p>
<p dir="auto">HDFS does not maintain its own users and groups:<br />
1. Simple authentication mode relies on the operating system's users and groups<br />
2. Kerberos mode</p>
<p dir="auto">Add a user</p>
<pre><code>$ groupadd analysts
$ useradd -g analysts alapati
$ passwd alapati
</code></pre>
<pre><code>&lt;!-- core-site.xml needs this property --&gt;
&lt;property&gt;
  &lt;name&gt;hadoop.tmp.dir&lt;/name&gt;
  &lt;value&gt;/tmp/hadoop-${user.name}&lt;/value&gt;
&lt;/property&gt;
</code></pre>
<pre><code>$ hdfs dfs -chmod -R 777 /tmp/hadoop-alapati
</code></pre>
<pre><code>$ hdfs dfs -mkdir /user/alapati
</code></pre>
<pre><code>$ su hdfs
$ hdfs dfs -chown -R alapati:analysts /user/alapati
$ hdfs dfs -ls /user/
drwxr-xr-x   - alapati   analysts      0 2016-04-27 12:40 /user/alapati
</code></pre>
<pre><code>$ hdfs dfsadmin -refreshUserToGroupMappings
</code></pre>
<pre><code>$ hdfs dfsadmin -setSpaceQuota 30g /user/alapati
</code></pre>
]]></description><link>http://an.forum.genostack.com/post/532</link><guid isPermaLink="true">http://an.forum.genostack.com/post/532</guid><dc:creator><![CDATA[anneng]]></dc:creator><pubDate>Thu, 01 Apr 2021 09:07:05 GMT</pubDate></item><item><title><![CDATA[Reply to HDFS Basic Operations on Thu, 01 Apr 2021 08:49:39 GMT]]></title><description><![CDATA[<p dir="auto">Show cluster information<br />
$ hdfs dfsadmin -report</p>
<p dir="auto">Configured Capacity: 2068027170816000 (1.84 PB)                   #A<br />
Present Capacity: 2068027170816000 (1.84 PB)<br />
DFS Remaining: 562576619120381 (511.66 TB)                        #A<br />
DFS Used: 1505450551695619 (1.34 PB)                              #B<br />
DFS Used%: 72.80%                                                 #B<br />
Under replicated blocks: 1                                        #C<br />
Blocks with corrupt replicas: 0<br />
Missing blocks: 1<br />
Missing blocks (with replication factor 1): 9                     #C</p>
<hr />
<p dir="auto">Live datanodes (54):                                              #D</p>
<p dir="auto">Name: 10.192.0.78:50010 (hadoop02.localhost)                      #E<br />
Hostname: hadoop02.localhost.com<br />
Rack: /rack3                                                      #E<br />
Decommission Status : Normal                                      #F<br />
Configured Capacity: 46015524438016 (41.85 TB)                    #G<br />
DFS Used: 33107988033048 (30.11 TB)<br />
Non DFS Used: 0 (0 B)<br />
DFS Remaining: 12907536404968 (11.74 TB)<br />
DFS Used%: 71.95%<br />
DFS Remaining%: 28.05%                                            #G<br />
Configured Cache Capacity: 4294967296 (4 GB)                      #H<br />
Cache Used: 0 (0 B)<br />
Cache Remaining: 4294967296 (4 GB)<br />
Cache Used%: 0.00%<br />
Cache Remaining%: 100.00%                                         #H<br />
Xceivers: 71<br />
Last contact: Fri May 01 15:15:59 CDT 2015<br />
#A Configured capacity for HDFS in this cluster<br />
#B HDFS used storage statistics<br />
#C Shows if there are any under-replicated, corrupt or missing blocks<br />
#D Shows how many DataNodes in the cluster are alive and available<br />
#E The hostname and rack name<br />
#F Status of the DataNode (decommissioned or not)<br />
#G Configured and used capacity for this DataNode<br />
#H Cache usage statistics (if configured)</p>
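<p dir="auto">For scripted checks, a single field can be pulled out of the report. The sketch below assumes the plain "Label: value" line layout shown above (without the #A to #H annotations, which are not part of the real output); the helper name <code>report_field</code> is hypothetical.</p>

```shell
#!/usr/bin/env bash
# report_field LABEL (hypothetical helper): reads `hdfs dfsadmin -report`
# output on stdin and prints the value of the first line whose label
# (the text before the first ": ") equals LABEL.
# The first match is the cluster-wide summary, since the per-DataNode
# sections come later in the report.
report_field() {
  awk -v label="$1" '{
    i = index($0, ": ")
    if (i == 0) next
    if (substr($0, 1, i - 1) == label) { print substr($0, i + 2); exit }
  }'
}

# Usage on a real cluster (not run here):
#   hdfs dfsadmin -report | report_field "DFS Used%"
```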
<p dir="auto">Refresh node information<br />
$ hdfs dfsadmin -refreshNodes<br />
Dump more detailed information to a file<br />
$ hdfs dfsadmin -metasave &lt;filename&gt;</p>
]]></description><link>http://an.forum.genostack.com/post/531</link><guid isPermaLink="true">http://an.forum.genostack.com/post/531</guid><dc:creator><![CDATA[anneng]]></dc:creator><pubDate>Thu, 01 Apr 2021 08:49:39 GMT</pubDate></item></channel></rss>