CephFS - cephfs

Ceph is an open-source storage platform that stores its data in a storage cluster based on RADOS. It is highly scalable and, as a distributed system without a single point of failure, very reliable.

Tip

If you want to quickly set up a basic Ceph cluster, check out MicroCeph.

Ceph provides different components for block storage and for file systems.

CephFS is Ceph’s file system component. It provides a robust, fully featured, POSIX-compliant distributed file system. Internally, it maps files to Ceph objects and stores file metadata (for example, file ownership, directory paths, access permissions) in a separate pool.

Terminology

Ceph uses the term object for the data that it stores. The daemon that is responsible for storing and managing data is the Ceph OSD. Ceph’s storage is divided into pools, which are logical partitions for storing objects. They are also referred to as data pools, storage pools or OSD pools.

A CephFS file system consists of two OSD storage pools, one for the actual data and one for the file metadata.

cephfs driver in LXD

Note

The cephfs driver can only be used for custom storage volumes with content type filesystem.

For other storage volumes, use the Ceph driver. That driver can also be used for custom storage volumes with content type filesystem, but it implements them through Ceph RBD images.

Unlike other storage drivers, this driver does not set up the storage system but assumes that you already have a Ceph cluster installed.

You can either create the CephFS file system that you want to use beforehand and specify it through the source option, or specify the cephfs.create_missing option to automatically create the file system and the data and metadata OSD pools (with the names given in cephfs.data_pool and cephfs.meta_pool).
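The two approaches can be sketched as command lines. The following Python sketch only assembles the arguments for lxc storage create and prints them; it does not run anything. The pool names (pool1, pool2), the file system name (my-filesystem), the OSD pool names, and the helper function itself are placeholders chosen for illustration.

```python
import shlex


def cephfs_pool_command(pool, source, create_missing=False,
                        data_pool=None, meta_pool=None):
    """Build the argv for creating an LXD storage pool with the cephfs driver.

    Hypothetical helper: it only assembles arguments, it does not call lxc.
    """
    argv = ["lxc", "storage", "create", pool, "cephfs", f"source={source}"]
    if create_missing:
        # Ask LXD to create the file system and any missing OSD pools.
        argv.append("cephfs.create_missing=true")
        if data_pool:
            argv.append(f"cephfs.data_pool={data_pool}")
        if meta_pool:
            argv.append(f"cephfs.meta_pool={meta_pool}")
    return argv


# Case 1: the CephFS file system already exists.
existing = cephfs_pool_command("pool1", "my-filesystem")

# Case 2: let LXD create the file system and its two OSD pools.
created = cephfs_pool_command(
    "pool2", "my-filesystem",
    create_missing=True, data_pool="my-data", meta_pool="my-metadata",
)

print(shlex.join(existing))
print(shlex.join(created))
```

In the second case, the data and metadata OSD pool names are only passed along because cephfs.create_missing is set.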

This driver also behaves differently than other drivers in that it provides remote storage. As a result and depending on the internal network, storage access might be a bit slower than for local storage. On the other hand, using remote storage has big advantages in a cluster setup, because all cluster members have access to the same storage pools with the exact same contents, without the need to synchronize storage pools.

LXD assumes that it has full control over the OSD storage pool. Therefore, you should never maintain any file system entities that are not owned by LXD in an LXD OSD storage pool, because LXD might delete them.

The cephfs driver in LXD supports snapshots if snapshots are enabled on the server side.

Configuration options

The following configuration options are available for storage pools that use the cephfs driver and for storage volumes in these pools.

Storage pool configuration

cephfs.cluster_name

Name of the Ceph cluster that contains the CephFS file system

Key: cephfs.cluster_name
Type: string
Default: ceph

cephfs.create_missing

Automatically create the CephFS file system

Key: cephfs.create_missing
Type: bool
Default: false

Use this option if the CephFS file system does not exist yet. LXD will then automatically create the file system and the missing data and metadata OSD pools.

cephfs.data_pool

Data OSD pool name

Key: cephfs.data_pool
Type: string

This option specifies the name for the data OSD pool that should be used when creating a file system automatically.

cephfs.fscache

Enable use of kernel fscache and cachefilesd

Key: cephfs.fscache
Type: bool
Default: false

cephfs.meta_pool

Metadata OSD pool name

Key: cephfs.meta_pool
Type: string

This option specifies the name for the file metadata OSD pool that should be used when creating a file system automatically.

cephfs.osd_pg_num

Number of placement groups when creating missing OSD pools

Key: cephfs.osd_pg_num
Type: string

This option specifies the number of OSD pool placement groups (pg_num) to use when creating a missing OSD pool.

cephfs.path

The base path for the CephFS mount

Key: cephfs.path
Type: string
Default: /

cephfs.user.name

The Ceph user to use

Key: cephfs.user.name
Type: string
Default: admin

source

Existing CephFS file system or file system path to use

Key: source
Type: string

volatile.pool.pristine

Whether the CephFS file system was empty at creation time

Key: volatile.pool.pristine
Type: string
Default: true

Tip

In addition to these configurations, you can also set default values for the storage volume configurations. See Configure default values for storage volumes.

Storage volume configuration

security.shared

Enable volume sharing

Key: security.shared
Type: bool
Default: same as volume.security.shared or false
Condition: custom block volume

Enabling this option allows sharing the volume across multiple instances despite the possibility of data loss.

security.shifted

Enable ID shifting overlay

Key: security.shifted
Type: bool
Default: same as volume.security.shifted or false
Condition: custom volume

Enabling this option allows attaching the volume to multiple isolated instances.

security.unmapped

Disable ID mapping for the volume

Key: security.unmapped
Type: bool
Default: same as volume.security.unmapped or false
Condition: custom volume

size

Size/quota of the storage volume

Key: size
Type: string
Default: same as volume.size
Condition: appropriate driver

snapshots.expiry

When snapshots are to be deleted

Key: snapshots.expiry
Type: string
Default: same as volume.snapshots.expiry
Condition: custom volume

Specify an expression like 1M 2H 3d 4w 5m 6y (1 minute, 2 hours, 3 days, 4 weeks, 5 months, 6 years).
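As an illustration of how such an expression breaks down, the following Python sketch splits it into unit/value pairs. It assumes that the unit letters stand for minutes (M), hours (H), days (d), weeks (w), months (m) and years (y), and that tokens are separated by spaces. The function is hypothetical and is not LXD's actual parser.

```python
import re

# Assumed meaning of each unit letter (the case matters: M != m).
UNITS = {
    "M": "minutes",
    "H": "hours",
    "d": "days",
    "w": "weeks",
    "m": "months",
    "y": "years",
}


def parse_expiry(expression):
    """Split an expiry expression into a {unit: amount} dictionary."""
    result = {}
    for token in expression.split():
        match = re.fullmatch(r"(\d+)([MHdwmy])", token)
        if match is None:
            raise ValueError(f"invalid expiry token: {token!r}")
        amount, unit = match.groups()
        result[UNITS[unit]] = int(amount)
    return result


parsed = parse_expiry("1M 2H 3d 4w 5m 6y")
print(parsed)
```

Note the difference between the uppercase M (minutes) and the lowercase m (months), which is easy to mix up when writing an expression by hand.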

snapshots.pattern

Template for the snapshot name

Key: snapshots.pattern
Type: string
Default: same as volume.snapshots.pattern or snap%d
Condition: custom volume

You can specify a naming template that is used for scheduled snapshots and unnamed snapshots.

The snapshots.pattern option takes a Pongo2 template string to format the snapshot name.

To add a time stamp to the snapshot name, use the Pongo2 context variable creation_date. Make sure to format the date in your template string to avoid forbidden characters in the snapshot name. For example, set snapshots.pattern to {{ creation_date|date:'2006-01-02_15-04-05' }} to name the snapshots after their time of creation, down to the precision of a second.
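To see what a name produced by this pattern looks like, here is a small Python analog. Pongo2's date filter takes a Go-style reference layout, so the strftime format below is an assumed equivalent, not something LXD evaluates. The timestamp is made up.

```python
from datetime import datetime

# Assumed strftime equivalent of the Go layout '2006-01-02_15-04-05'.
STRFTIME_EQUIVALENT = "%Y-%m-%d_%H-%M-%S"

# Hypothetical creation time of a snapshot.
creation_date = datetime(2024, 3, 5, 14, 7, 9)

snapshot_name = creation_date.strftime(STRFTIME_EQUIVALENT)
print(snapshot_name)  # 2024-03-05_14-07-09
```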

Another way to avoid name collisions is to use the placeholder %d in the pattern. For the first snapshot, the placeholder is replaced with 0. For subsequent snapshots, the existing snapshot names are taken into account to find the highest number at the placeholder’s position. This number is then incremented by one for the new name.
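The numbering rule for %d can be sketched in Python as follows. This is a minimal illustration of the behavior described above, not LXD's implementation; the function name and the sample snapshot names are hypothetical.

```python
import re


def next_snapshot_name(pattern, existing_names):
    """Return the next snapshot name for a pattern containing one %d."""
    if "%d" not in pattern:
        raise ValueError("pattern must contain the %d placeholder")
    prefix, _, suffix = pattern.partition("%d")
    # Match only names that have a number at the placeholder's position.
    name_re = re.compile(re.escape(prefix) + r"(\d+)" + re.escape(suffix))
    numbers = []
    for name in existing_names:
        match = name_re.fullmatch(name)
        if match:
            numbers.append(int(match.group(1)))
    # First snapshot gets 0; otherwise use the highest number plus one.
    index = max(numbers) + 1 if numbers else 0
    return f"{prefix}{index}{suffix}"


first = next_snapshot_name("snap%d", [])
later = next_snapshot_name("snap%d", ["snap0", "snap3", "manual-backup"])
print(first, later)  # snap0 snap4
```

Names that do not fit the pattern, such as manual-backup in the example, are ignored when looking for the highest number.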

snapshots.schedule

Schedule for automatic volume snapshots

Key: snapshots.schedule
Type: string
Default: same as volume.snapshots.schedule
Condition: custom volume

Specify either a cron expression (<minute> <hour> <dom> <month> <dow>), a comma-separated list of schedule aliases (@hourly, @daily, @midnight, @weekly, @monthly, @annually, @yearly), or leave empty to disable automatic snapshots (the default).
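The three accepted forms can be told apart with a few lines of Python. This sketch only checks the overall shape of the value (it does not validate the individual cron fields), and the function is a hypothetical illustration, not the check that LXD performs.

```python
SCHEDULE_ALIASES = {
    "@hourly", "@daily", "@midnight", "@weekly",
    "@monthly", "@annually", "@yearly",
}


def classify_schedule(value):
    """Classify a snapshots.schedule value as disabled, aliases or cron."""
    value = value.strip()
    if not value:
        # An empty value disables automatic snapshots.
        return "disabled"
    if value.startswith("@"):
        aliases = [part.strip() for part in value.split(",")]
        unknown = [alias for alias in aliases if alias not in SCHEDULE_ALIASES]
        if unknown:
            raise ValueError(f"unknown schedule alias: {unknown[0]!r}")
        return "aliases"
    if len(value.split()) == 5:
        # <minute> <hour> <dom> <month> <dow>
        return "cron"
    raise ValueError(f"not a valid schedule: {value!r}")


print(classify_schedule(""))                # disabled
print(classify_schedule("@hourly,@daily"))  # aliases
print(classify_schedule("0 6 * * *"))       # cron
```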

volatile.uuid

The volume’s UUID

Key: volatile.uuid
Type: string
Default: random UUID