Version: 4.0.8

SaunaFS documentation overview

Introduction

About this document

This SaunaFS documentation, as of 29th December 2023, is in an early draft stage and primarily builds upon pre-existing documentation. This may lead to certain details being absent, incorrect, or outdated. In cases of confusion or questions, reaching out to the development team is advised.

Despite its preliminary status, this document aims to provide sufficient information for the initial setup and operation of SaunaFS. It includes essential details and a quick start guide to facilitate basic configurations.

Feedback is highly valued at this stage. It not only aids in enhancing this document for future reference but also assists others in their SaunaFS setup process.

Please note that SaunaFS software, Windows Client software, and this documentation are all licensed separately under different licensing formats. For more information about the licensing terms for each component, please see the Licensing section of this documentation.

We appreciate your choice to use SaunaFS.

Architectural overview of SaunaFS

Important notes

Hardware recommendations

There are no fixed hardware requirements, although for better results it is recommended to have:

  • 10 or 25 GbE networking
  • Bonding (e.g., MC-LAG) across the switches for redundant setup
  • Nodes with unified hardware configurations

Sample hardware configuration for a node:

  • 1x Intel® Xeon® Silver or higher CPU (or AMD equivalent)
  • 4x 16GB DDR4 ECC Registered DIMM
  • 2x 240GB Enterprise SSD
  • 10x Enterprise HDD
  • 1x Network Interface Card 25GbE Dual-Port

If unified nodes cannot be provided, at least the Master/Shadow nodes should have a hardware configuration similar to the sample setup above.

Alternatively, you can use the hardware sizer application to calculate hardware requirements for your specific needs: https://diaway.com/saunafs#calc

The recommendation for a unified hardware configuration anticipates future updates that will introduce setups with multiple master servers. We are currently working on a distributed-metadata architecture that circumvents the limitation imposed by the RAM capacity of a single node within a namespace and also enables parallel access to metadata.

The following table gives the estimated amount of RAM occupied by metadata, correlated with the number of files in SaunaFS.

Number of files    Overhead of all data structures
1                  500 B
1 000              500 KB
1 000 000          500 MB
100 000 000        50 GB
1 000 000 000      500 GB
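
As a rule of thumb, these figures come to roughly 500 bytes of metadata RAM per file. A minimal shell sketch of the arithmetic (the per-file figure is read off the table above; actual usage varies with path lengths and directory layout):

  files=100000000                              # 100 million files
  bytes=$((files * 500))                       # ~500 B of metadata RAM per file
  echo "$((bytes / 1024 / 1024 / 1024)) GiB"   # prints 46 GiB, in line with the ~50 GB row above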

Manual pages

This document does not elaborate on every command or configuration option. For comprehensive information, manual (man) pages are provided in the Debian packages for both commands and configuration files.

To view the man page for a command, e.g., saunafs-admin:

  • man saunafs-admin

To view the man page for a configuration file, e.g., sfsmaster.cfg:

  • man sfsmaster.cfg

Systemd services

The Debian packages include systemd services for initiating various SaunaFS services. This document assumes the use of these services in its examples.

For systems without systemd, or those choosing not to use it, it is recommended to examine the service files to derive a custom setup or to run the commands directly.
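
As an illustration, the master service could be managed as follows (a sketch only; the exact unit names are an assumption based on the package naming, so verify them with systemctl list-unit-files | grep saunafs):

  systemctl enable --now saunafs-master.service   # assumed unit name: start the master and enable it at boot
  systemctl status saunafs-master.service         # check that the service is running
  systemctl cat saunafs-master.service            # inspect the unit file when deriving a custom setup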

Quick overview of SaunaFS

SaunaFS is a distributed POSIX file system inspired by the Google File System, comprising Metadata Servers (Master, Shadows, Metaloggers), Data Servers (Chunkservers), and Clients (supporting multiple operating systems and NFS). It employs a chunk-based storage architecture, segmenting files into 64 MiB chunks subdivided into 64 KiB blocks, each with 4 bytes of CRC (Cyclic Redundancy Check) for data integrity.
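
The chunk layout described above can be illustrated with a short shell sketch (derived purely from the figures in this paragraph, not from the on-disk format):

  chunk_kib=$((64 * 1024))            # one chunk is 64 MiB = 65536 KiB
  block_kib=64                        # blocks within a chunk are 64 KiB
  blocks=$((chunk_kib / block_kib))   # 1024 blocks per full chunk
  crc_bytes=$((blocks * 4))           # 4 B of CRC per block -> 4096 B of checksums per chunk
  echo "$blocks blocks, $crc_bytes bytes of CRC"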

The write process in SaunaFS involves clients requesting suitable Chunkservers for file chunk storage from the Master server. Data is transferred directly to Chunkservers in 64 KiB blocks, with CRC verification by the Chunkservers and subsequent metadata updates. SaunaFS utilizes Reed-Solomon erasure coding for redundancy, enhancing data integrity and availability. For instance, with EC(4,2), a 64 MiB chunk is segmented into four 16 MiB data parts and two parity parts.
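
The redundancy arithmetic for that EC(4,2) example, sketched in shell (the comparison with triple replication is for illustration and is not a statement about SaunaFS defaults):

  data=4; parity=2                     # EC(4,2): four data parts, two parity parts
  part_mib=$((64 / data))              # a 64 MiB chunk yields 16 MiB parts
  overhead=$((100 * parity / data))    # 50% extra storage; triple replication would cost 200%
  echo "${part_mib} MiB per part, ${overhead}% storage overhead"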

The system also prioritizes data resiliency through data scrubbing and CRC32 checksum verification. Additional features include instant copy-on-write snapshots, efficient metadata logging, and hardware integration without downtime.