In earlier posts I covered configuring the NameNode and DataNode; this time it's the Secondary NameNode. For availability, Hadoop supports configuring a checkpoint backup of the NameNode, so that when the primary NameNode dies, its data can be recovered from the Secondary. You can think of it roughly as MySQL's master and slave, with one important difference: Hadoop's Secondary cannot be used directly as a NameNode; most of the time it serves as the source for recovering the NameNode's data.
The Secondary's configuration is actually quite similar to a DataNode's.
Let's look at the configuration files on the secondary node.
core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://hadoopmaster-177.tj:9000</value>
    <!-- points at the master (NameNode) address -->
  </property>
  <property>
    <name>fs.checkpoint.dir</name>
    <value>/opt/data/hadoop1/hdfs/namesecondary,/opt/data/hadoop2/hdfs/namesecondary</value>
    <!-- this one really matters: recovery depends entirely on it -->
  </property>
  <property>
    <name>fs.checkpoint.period</name>
    <value>1800</value>
  </property>
  <property>
    <name>fs.checkpoint.size</name>
    <value>33554432</value>
  </property>
  <property>
    <name>io.compression.codecs</name>
    <value>org.apache.hadoop.io.compress.DefaultCodec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec</value>
  </property>
  <property>
    <name>io.compression.codec.lzo.class</name>
    <value>com.hadoop.compression.lzo.LzoCodec</value>
  </property>
</configuration>
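With these values, the Secondary pulls a checkpoint from the NameNode every 1800 seconds (fs.checkpoint.period), or sooner once the accumulated edit log reaches 32 MB (fs.checkpoint.size is in bytes). As a quick sanity check, here is a sketch for forcing a checkpoint by hand on a Hadoop 0.20/1.x cluster; it assumes you run it on hadoopslave-189.tj with the secondary daemon stopped first, since a manual run would otherwise collide with the daemon on the HTTP port:

# stop the daemon, then run one forced checkpoint in the foreground
bin/hadoop-daemon.sh stop secondarynamenode
bin/hadoop secondarynamenode -checkpoint force

# each fs.checkpoint.dir should now hold a fresh image under current/
ls /opt/data/hadoop1/hdfs/namesecondary/current
ls /opt/data/hadoop2/hdfs/namesecondary/current

# bring the daemon back up afterwards
bin/hadoop-daemon.sh start secondarynamenode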
hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/opt/data/hadoop1/hdfs/name,/opt/data/hadoop2/hdfs/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/opt/data/hadoop1/hdfs/data,/opt/data/hadoop2/hdfs/data</value>
  </property>
  <property>
    <name>dfs.http.address</name>
    <value>hadoopmaster-177.tj:50070</value>
  </property>
  <property>
    <name>dfs.secondary.http.address</name>
    <value>hadoopslave-189.tj:50090</value>
    <!-- note this one -->
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.datanode.du.reserved</name>
    <value>1073741824</value>
  </property>
  <property>
    <name>dfs.block.size</name>
    <value>134217728</value>
  </property>
</configuration>
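The dfs.secondary.http.address line is what ties the Secondary role to hadoopslave-189.tj; the NameNode fetches merged images back from that address. A minimal sketch for bringing the daemon up on that host and confirming it is listening (assumes the standard Hadoop 0.20/1.x scripts under bin/):

# on hadoopslave-189.tj
bin/hadoop-daemon.sh start secondarynamenode

# verify the process and the configured HTTP port
jps | grep SecondaryNameNode
curl -s -o /dev/null http://hadoopslave-189.tj:50090/ && echo "secondary is up"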
mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>hadoopmaster-177.tj:9001</value>
  </property>
  <property>
    <name>mapred.local.dir</name>
    <value>/opt/data/hadoop1/mapred/mrlocal</value>
    <final>true</final>
  </property>
  <property>
    <name>mapred.system.dir</name>
    <value>/opt/data/hadoop1/mapred/mrsystem</value>
    <final>true</final>
  </property>
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>12</value>
    <final>true</final>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>4</value>
    <final>true</final>
  </property>
  <property>
    <!-- merged into one property: declaring mapred.child.java.opts twice
         makes the later definition silently override the earlier one -->
    <name>mapred.child.java.opts</name>
    <value>-Xmx1536M -Djava.library.path=/opt/hadoopgpl/native/Linux-amd64-64</value>
  </property>
  <property>
    <name>mapred.compress.map.output</name>
    <value>true</value>
  </property>
  <property>
    <name>mapred.map.output.compression.codec</name>
    <value>com.hadoop.compression.lzo.LzoCodec</value>
  </property>
</configuration>
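Nothing in mapred-site.xml is Secondary-specific, but since mapred.child.java.opts points task JVMs at the hadoop-gpl native directory, it is worth confirming the LZO native library actually exists at that path on every TaskTracker. A quick check; libgplcompression is the name the hadoop-gpl-compression build normally produces, so treat it as an assumption if your build differs:

# on each TaskTracker node
ls /opt/hadoopgpl/native/Linux-amd64-64/libgplcompression*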
Then the masters file on the secondary node should also point to the master NameNode's hostname, which here is hadoopmaster-177.tj.
The slaves file still just lists the DataNodes' hostnames.
On the actual primary NameNode, by contrast, the conf/masters file should contain the secondary NameNode's hostname:
hadoopslave-189.tj
slaves stays unchanged.
In other words, the primary NameNode's masters file holds the Secondary's hostname, and the Secondary's masters file holds the primary's hostname.
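Concretely, the two files end up mirroring each other:

# on hadoopmaster-177.tj (primary)
$ cat conf/masters
hadoopslave-189.tj

# on hadoopslave-189.tj (secondary)
$ cat conf/masters
hadoopmaster-177.tj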
With that, the chain from NameNode to DataNode to Secondary is complete, and you can start running real MapReduce jobs. If the NameNode ever fails, its data can be restored from the path the Secondary's fs.checkpoint.dir points at.
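Here is a sketch of that recovery on a Hadoop 0.20/1.x cluster, using the stock -importCheckpoint mode. It assumes the replacement NameNode has the same core-site.xml and hdfs-site.xml as above, and that the namesecondary directories are copied over from hadoopslave-189.tj if the new NameNode is a different machine:

# 1. dfs.name.dir must exist but contain no valid image
rm -rf /opt/data/hadoop1/hdfs/name/* /opt/data/hadoop2/hdfs/name/*

# 2. make sure fs.checkpoint.dir holds the secondary's checkpoint data
scp -r hadoopslave-189.tj:/opt/data/hadoop1/hdfs/namesecondary /opt/data/hadoop1/hdfs/
scp -r hadoopslave-189.tj:/opt/data/hadoop2/hdfs/namesecondary /opt/data/hadoop2/hdfs/

# 3. start the NameNode in import mode: it loads the image from
#    fs.checkpoint.dir, writes it into dfs.name.dir, and keeps serving
bin/hadoop namenode -importCheckpoint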