Storage Elasticity for a Hadoop DataNode using LVM | Resize a Static Partition without Data Loss

This article shows how we can change a Hadoop data node's storage on the fly, without data loss. We'll achieve this by integrating Hadoop with LVM (Logical Volume Manager).

What I’ll do
1. Configure a Hadoop master with a data node.
2. Create an LVM partition.
3. Mount this partition to the data node directory.
4. Resize the partition and see the change reflected in Hadoop.

As simple as it is…

Here, I have configured my Hadoop cluster on AWS EC2.

My Hadoop master config is as follows…

Data Node Configs…

Here, the data node contributes its storage to the cluster from the “/dn” directory. We will now create an LVM partition and mount it to “/dn”.
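For reference, a data node points at its storage directory via the `dfs.datanode.data.dir` property. A minimal sketch of the relevant part of hdfs-site.xml, assuming a standard Hadoop setup (the /dn path matches the directory used in this article):

```xml
<!-- hdfs-site.xml on the data node (sketch) -->
<configuration>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/dn</value>
  </property>
</configuration>
```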

Before LVM, I had a static partition of 10 GiB mounted to the /dn directory.

LVM: Logical Volume Manager

LVM is an advanced alternative to normal (static) partitioning.
With a normal partition, if we create a 10 GiB partition, we can’t resize it later without recreating it.
With LVM, the storage comes from a pool of volumes: if we create a 10 GiB partition and later need to store 15 GiB of data, the partition can be increased dynamically.

Steps To Create an LVM Partition

  1. Create a Physical Volume (PV) from our storage disk.
  2. Create a Volume Group (VG) from one or more PVs.
  3. Create a Logical Volume (LV) from the Volume Group.
  4. Format the Logical Volume.
  5. Mount it.
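The five steps above can be sketched end-to-end as a shell session (run as root; /dev/xvdf and the names hadoopelas/hadooplv match the ones used later in this article):

```shell
# 1. Create a Physical Volume from the disk
pvcreate /dev/xvdf

# 2. Create a Volume Group from that PV
vgcreate hadoopelas /dev/xvdf

# 3. Carve a 10 GiB Logical Volume out of the VG
lvcreate --size 10G --name hadooplv hadoopelas

# 4. Format the LV with ext4
mkfs.ext4 /dev/mapper/hadoopelas-hadooplv

# 5. Mount it on the data node directory
mkdir -p /dn
mount /dev/mapper/hadoopelas-hadooplv /dn
```

Each step is explained individually below.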

Creating an LVM partition:

To demonstrate this, I have attached an extra 20 GiB disk.

The device name is /dev/xvdf. We'll be using this to create an LVM partition.

Creating a Physical Volume( PV )

To create a PV, use pvcreate /dev/xvdf
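To confirm the PV was created, pvdisplay (or the shorter pvs) lists it:

```shell
pvcreate /dev/xvdf    # initialize the disk as a Physical Volume
pvdisplay /dev/xvdf   # verify: shows the PV size and, later, the VG it belongs to
```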

Creating a Volume Group( VG )

To create a VG, use vgcreate hadoopelas /dev/xvdf
Here, hadoopelas is the name of the VG (it can be anything).
If we have multiple PVs, we can add them to the same VG.
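If you later attach a second disk (say /dev/xvdg, a hypothetical device name for illustration), it can be folded into the same VG with vgextend, which is exactly what makes the pool growable:

```shell
vgcreate hadoopelas /dev/xvdf   # create the VG from one PV

# later, with a second disk (hypothetical /dev/xvdg):
pvcreate /dev/xvdg
vgextend hadoopelas /dev/xvdg   # the VG now pools both disks
vgdisplay hadoopelas            # verify the total and free size
```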

Creating an LVM partition using VG

To create an LV, use lvcreate --size 10G --name hadooplv hadoopelas
Here, hadooplv is the LV name and hadoopelas is the VG name.
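After creating the LV, lvs (or lvdisplay) shows it inside the VG:

```shell
lvcreate --size 10G --name hadooplv hadoopelas
lvs hadoopelas   # lists hadooplv with its 10 GiB size
```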

fdisk output after creating the LV.

“/dev/mapper/hadoopelas-hadooplv” is the name of the partition that we have to mount to our data node directory. But before that, we have to format it.

Use mkfs.ext4 /dev/mapper/hadoopelas-hadooplv

Then we’ll mount it to our data node directory.

Use mount /dev/mapper/hadoopelas-hadooplv /dn
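Putting the format and mount together, with a df check, and (optionally) an /etc/fstab entry so the mount survives a reboot:

```shell
mkfs.ext4 /dev/mapper/hadoopelas-hadooplv   # format the LV
mkdir -p /dn
mount /dev/mapper/hadoopelas-hadooplv /dn
df -h /dn                                   # verify ~10 GiB mounted on /dn

# optional: persist the mount across reboots
echo '/dev/mapper/hadoopelas-hadooplv /dn ext4 defaults 0 0' >> /etc/fstab
```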

Everything done… We now have an elastic-storage DataNode.

To resize the partition, we have to resize the LV (hadooplv).

Let’s add 5 GiB to our DataNode.

Use lvextend --size +5G /dev/mapper/hadoopelas-hadooplv
This will increase the LV size by 5 GiB.
But this extra 5 GiB is not yet usable by the filesystem. To extend the filesystem over it, we’ll use resize2fs (it grows the ext4 filesystem to cover only the newly added space, without touching existing data).

resize2fs /dev/mapper/hadoopelas-hadooplv
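The two resize steps can also be combined: lvextend's -r/--resizefs flag runs the filesystem resize for you. Either way, the ext4 filesystem grows online, with /dn still mounted:

```shell
# two-step, as above:
lvextend --size +5G /dev/mapper/hadoopelas-hadooplv
resize2fs /dev/mapper/hadoopelas-hadooplv

# or in one step (-r resizes the filesystem too):
# lvextend -r --size +5G /dev/mapper/hadoopelas-hadooplv

df -h /dn   # verify the extra 5 GiB
```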

Now, this extra 5 GiB will be reflected in the Hadoop cluster.
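You can confirm the new capacity from the master with a dfsadmin report (a standard Hadoop command; the exact figures will depend on your cluster):

```shell
# "Configured Capacity" for the data node should now include the extra 5 GiB
hdfs dfsadmin -report
```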

We can now increase the partition size on the fly (decreasing is also possible with lvreduce, though an ext4 filesystem must be unmounted and shrunk first). The storage is now elastic.

Thanks For Reading 🙏

Connect With Me:
LinkedIn: Here

Tech. Explorer