HDFS – File System implementations in HDFS Java API

As discussed in the previous post, files in HDFS can be accessed in several ways. One of them is the HDFS Java API, a natural fit since Hadoop itself is written in Java. Here are the file system implementations available through that API.

org.apache.hadoop.fs.FileSystem

FileSystem is an abstract class in Hadoop that serves as a generic file system representation. Note that it is a class and not an interface, so concrete implementations extend it rather than implement it. It comes in several flavors, such as distributed and local. Some of the concrete FileSystem implementations include:

org.apache.hadoop.fs.LocalFileSystem

It represents the good old native file system on local disks. HDFS does not need to be started or running to access it. We generally use this file system to debug code in a developer environment.
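As a minimal sketch of that debugging use case, the snippet below writes a small file through the Hadoop API and reads it back with no HDFS daemons running. It assumes the hadoop-common library is on the classpath. The class name LocalFsRoundTrip, the method roundTrip and the scratch file name are made up for this illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocalFileSystem;
import org.apache.hadoop.fs.Path;

class LocalFsRoundTrip {

    /** Writes text to a local file through the Hadoop API and reads it back. */
    static String roundTrip(String text) {
        try {
            Configuration conf = new Configuration();
            // getLocal returns the local implementation; no HDFS service is contacted.
            LocalFileSystem fs = FileSystem.getLocal(conf);

            // Hypothetical scratch file under the JVM temp directory.
            Path file = new Path(System.getProperty("java.io.tmpdir"),
                    "localfs-demo-" + System.nanoTime() + ".txt");

            byte[] expected = text.getBytes(StandardCharsets.UTF_8);
            try (FSDataOutputStream out = fs.create(file, true)) {
                out.write(expected);
            }

            byte[] actual = new byte[expected.length];
            try (FSDataInputStream in = fs.open(file)) {
                in.readFully(actual);
            } finally {
                // Clean up the scratch file (non-recursive delete).
                fs.delete(file, false);
            }
            return new String(actual, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello from LocalFileSystem"));
    }
}
```

Because the code only talks to the FileSystem API, the same create and open calls should work against another implementation once the FileSystem instance is obtained differently.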

org.apache.hadoop.hdfs.DistributedFileSystem

Represents a fault-tolerant, highly available distributed file system, known as the Hadoop Distributed File System (HDFS).

org.apache.hadoop.hdfs.HftpFileSystem

Using this file system we can access HDFS in read-only mode over HTTP.

org.apache.hadoop.fs.ftp.FTPFileSystem

Used to access a file system on an FTP server.

org.apache.hadoop.fs.s3.S3FileSystem

For accessing files in cloud storage such as Amazon S3. More information at

https://wiki.apache.org/hadoop/AmazonS3

org.apache.hadoop.fs.kfs.KosmosFileSystem

The Kosmos file system is backed by Cloud Store. More details can be found here:

https://code.google.com/p/kosmosfs/
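The pattern running through this post, one abstract base class with several concrete flavors chosen by URI scheme, can be sketched in plain Java. This is a simplified model and not Hadoop's actual code: MiniFileSystem, InMemoryFileSystem, DiskFileSystem, forUri and the "mem" scheme are all hypothetical names invented for the illustration.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

class AbstractFsSketch {

    /** Hypothetical stand-in for an abstract file system base class. */
    public abstract static class MiniFileSystem {
        /** Storage-specific read, supplied by each subclass. */
        public abstract byte[] read(String path) throws IOException;

        /** Shared behaviour written once in the base class: true when read succeeds. */
        public boolean exists(String path) {
            try {
                read(path);
                return true;
            } catch (IOException e) {
                return false;
            }
        }
    }

    /** Flavor 1: files held in a map (illustration only). */
    public static class InMemoryFileSystem extends MiniFileSystem {
        private final Map<String, byte[]> files = new HashMap<>();

        public void put(String path, String text) {
            files.put(path, text.getBytes(StandardCharsets.UTF_8));
        }

        @Override
        public byte[] read(String path) throws IOException {
            byte[] data = files.get(path);
            if (data == null) {
                throw new FileNotFoundException(path);
            }
            return data;
        }
    }

    /** Flavor 2: files on the local disk via java.nio. */
    public static class DiskFileSystem extends MiniFileSystem {
        @Override
        public byte[] read(String path) throws IOException {
            return Files.readAllBytes(Paths.get(path));
        }
    }

    /** Picks a flavor from the URI scheme; the scheme mappings are made up. */
    public static MiniFileSystem forUri(URI uri) {
        String scheme = uri.getScheme();
        if ("mem".equals(scheme)) {
            return new InMemoryFileSystem();
        }
        if ("file".equals(scheme)) {
            return new DiskFileSystem();
        }
        throw new IllegalArgumentException("No file system for scheme: " + scheme);
    }

    public static void main(String[] args) {
        MiniFileSystem fs = forUri(URI.create("mem:///"));
        ((InMemoryFileSystem) fs).put("/a.txt", "hello");
        System.out.println(fs.getClass().getSimpleName() + " has /a.txt: " + fs.exists("/a.txt"));
    }
}
```

An abstract class, unlike a bare interface with only method signatures, lets shared logic such as exists live in one place while each flavor supplies only its storage-specific part.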

References

http://hadoop.apache.org/docs/current/api/org/apache/hadoop/fs/FileSystem.html

http://hadoop.apache.org/docs/current/api/
