Release notes for Gluster 5.0

This is a major release that includes a range of code improvements and stability fixes among a few features as noted below.

A selection of the key features and changes are documented on this page. A full list of bugs that have been addressed is included further below.

Announcements

  1. Releases that receive maintenance updates post release 5 are, 4.1 (reference)

NOTE: 3.12 long term maintenance release, will reach end of life (EOL) with the release of 5.0. (reference)

  1. Release 5 will receive maintenance updates around the 10th of every month for the first 3 months post release (i.e Nov'18, Dec'18, Jan'18). Post the initial 3 months, it will receive maintenance updates every 2 months till EOL. (reference)

Major changes and features

Features are categorized into the following sections,

Management

GlusterD2

IMP: GlusterD2 in Gluster-5 is still considered a preview and is experimental. It should not be considered ready for production use. Users should still expect some breaking changes even though all efforts would be taken to ensure that these can be avoided. As GD2 is still under heavy development, new features can be expected throughout the Gluster 5 release.

The following major changes have been committed to GlusterD2 since v4.1.0.

  1. Volume snapshots : Most snapshot operations are available including create, delete, activate, deactivate, clone and restore.

  2. Volume heal: Support for full heal and index heal for replicate volumes has been implemented.

  3. Tracing with Opencensus: Support for tracing distributed operations has been implemented in GD2, using the Opencensus API. Tracing instrumentation has been done for volume create, list and delete operations. Other operations will follow subsequently.

  4. Portmap refactoring: Portmap in GlisterD2 no longer selects a port for the bricks to listen on, instead leaving the choice upto the bricks. Portmap only saves port information provided by brick during signin.

  5. Smartvol API merged with volume create API: The smart volume API which allows user to create a volume by just specifying a size has been merged with the normal volume create API.

  6. Configure GlusterD2 with environment variables: In addition to CLI flags, and the config file, GD2 configuration options can be set using environment variables.

In addition to the above, many changes have been merged for minor bug-fixes and to help with testing.

Refer to the user documentation section for details on how to get started with GlusterD2.

Standalone

1. Entry creation and handling, consistency is improved

The dentry serializer feature was introduced in Gluster 4.0, to strengthen the consistency handling of entry operations in the Gluster stack. Entry operations refer to creating, linking, renaming and unlinking of files and directory names into the filesystem space.

When this feature was first introduced (in 4.0) it was optional, with this release this feature is enabled by default.

2. Python code in Gluster packages is Python 3 ready

3. Quota fsck script to correct quota accounting

See usage documentation here

4. Added noatime option in utime xlator

Enabling the utime and ctime feature, enables Gluster to maintain consistent change and modification time stamps on files and directories across bricks.

The utime xlator is enhanced with a noatime option and is set by default to enabled, when the utime feature is enabled. This helps to ignore atime updates for operations that change may trigger an atime update on the file system objects.

To enable the feature use,

# gluster volume set <volname> features.utime on
# gluster volume set <volname> features.ctime on

5. Added ctime-invalidation option in quick-read xlator

Quick-read xlator by default uses mtime (files last modification time) to identify changes to file data. However, there are applications, like rsync, which explicitly set mtime making it unreliable for the purpose of identifying changes to the file content.

Since ctime (files last status change time) also changes when content of a file changes and cannot be set explicitly by applications, it becomes a more reliable source to identify staleness of cached data.

The ctime-invalidation option makes quick-read to prefer ctime over mtime to validate staleness of its cache.

To enable this option use,

# gluster volume set <volname> ctime-invalidation on

NOTE: Using ctime can result in false positives as ctime is updated even on attribute changes, like mode bits, without changes to file data. As a result this option is recommended in situations where mtime is not reliable.

6. Added shard-deletion-rate option in shard xlator

The shard-deletion-rate option is introduced, to configure the number of shards to delete in parallel when a file that is sharded is deleted.

The default value is set at 100, but can be increased to delete more shards in parallel for faster space reclamation.

To change the defaults for this option use,

# gluster volume set <volname> shard-deletion-rate <n>

NOTE: The upper limit is unbounded, use it with caution as a very large number will cause lock contention on the bricks. As an example, during testing, an upper limit of 125000 was enough to cause timeouts and hangs in the gluster processes due to lock contention.

7. Removed last usage of MD5 digest in code, towards better FIPS compliance

In an effort to ensure that Gluster can be installed and deployed on machines that are compliant with the requirements for FIPS, remaining uses of MD5 digest is removed from the code base.

Addressing this feature's requirements was initiated during the 4.0 release, at which point enabling user space snapshots, which still used MD5 for certain needs, broke the FIPS compliance requirements. This limitation is now addressed in this release.

8. Code improvements

Over the course of this release, the contributors have been active in addressing various Coverity issues, GCC and clang warnings, clang formatting of the code base, micro improvements to GLibC API usage and memory handling around string handling and allocation routines.

The above are ongoing efforts, but major strides were made during this release to actively address code quality in these areas.

Major issues

  1. The following options are removed from the code base and require to be unset before an upgrade from releases older than release 4.1.0,
  2. features.lock-heal
  3. features.grace-timeout

To check if these options are set use,

# gluster volume info

and ensure that the above options are not part of the Options Reconfigured: section in the output of all volumes in the cluster.

If these are set, then unset them using the following commands,

# gluster volume reset <volname> <option>

NOTE: Failure to do the above may result in failure during online upgrades, and the reset of these options to their defaults needs to be done prior to upgrading the cluster.

Bugs addressed

Bugs addressed since release-4.1.0 are listed below.

  • #853601: working-directory should be protected from being a brick
  • #1312832: tests fail because bug-924726.t depends on netstat
  • #1390050: Elasticsearch get CorruptIndexException errors when running with GlusterFS persistent storage
  • #1405147: glusterfs (posix-acl xlator layer) checks for "write permission" instead for "file owner" during open() when writing to a file
  • #1425325: gluster bash completion leaks TOP=0 into the environment
  • #1437780: don't send lookup in fuse_getattr()
  • #1455872: [Perf]: 25% regression on sequential reads on EC over SMB3
  • #1492847: core (named threads): flood of -Wformat-truncation warnings with gcc-7.
  • #1512691: PostgreSQL DB Restore: unexpected data beyond EOF
  • #1524323: No need to load ctr xlator if user has not configured tiering
  • #1526780: ./run-tests-in-vagrant.sh fails because of disabled Gluster/NFS
  • #1533000: Quota crawl regressed
  • #1537602: Georeplication tests intermittently fail
  • #1543279: Moving multiple temporary files to the same destination concurrently causes ESTALE error
  • #1545048: [brick-mux] process termination race while killing glusterfsd on last brick detach
  • #1546103: run-tests-in-vagrant.sh should return test status
  • #1558574: Coverity: Warning for singlton array..
  • #1558921: Gluster volume smb share options are getting overwritten after restating the gluster volume
  • #1561332: merge ssl infra with epoll infra
  • #1564071: directories are invisible on client side
  • #1564149: Agree upon a coding standard, and automate check for this in smoke
  • #1564419: Client side memory leak in encryption xlator (crypt.c).
  • #1568521: shard files present even after deleting vm from ovirt UI
  • #1569345: Need COMMITMENT from community for GPL Cure.
  • #1569399: glusterfsd should be able to start without any other arguments than a single volfile.
  • #1570538: linux untar errors out at completion during disperse volume inservice upgrade
  • #1570962: print the path of the corrupted object in scrub status
  • #1574421: Provide a way to get the hashed-subvol for a file
  • #1575381: gluster volume heal info prints extra newlines
  • #1575490: [geo-rep]: Upgrade fails, session in FAULTY state
  • #1575587: Leverage MDS subvol for dht_removexattr also
  • #1575716: gfapi: broken symbol versions
  • #1575742: Change op-version of master to 4.2.0 for future options that maybe added
  • #1575858: quota crawler fails w/ TLS enabled
  • #1575864: glusterfsd crashing because of RHGS WA?
  • #1575887: Additional log messages in dht_readdir(p)_cbk
  • #1575910: DHT Log flooding in mount log "key=trusted.glusterfs.dht.mds [Invalid argument]"
  • #1576179: [geo-rep]: Geo-rep scheduler fails
  • #1576392: Glusterd crashed on a few (master) nodes
  • #1576418: Warning messages generated for the removal of extended attribute security.ima flodding client logs
  • #1576767: [geo-rep]: Lot of changelogs retries and "dict is null" errors in geo-rep logs
  • #1576842: cloudsync: make plugins configurable
  • #1577574: brick crash seen while creating and deleting two volumes in loop
  • #1577627: [Geo-rep]: Status in ACTIVE/Created state
  • #1577672: Brick-mux regressions failing for over 8+ weeks on master
  • #1577731: [Ganesha] "Gluster nfs-ganesha enable" commands sometimes gives output as "failed" with "Unlocking failed" error messages ,even though cluster is up and healthy in backend
  • #1577744: The tool to generate new xlator template code is not upto date
  • #1578325: Input/Output errors on a disperse volume with concurrent reads and writes
  • #1578650: If parallel-readdir is enabled, the readdir-optimize option even when it is set to on it behaves as off
  • #1578721: Statedump prints memory usage statistics twice
  • #1578823: Remove EIO from the dht_inode_missing macro
  • #1579276: rpc: The gluster auth version is always AUTH_GLUSTERFS_v2
  • #1579769: inode status command is broken with distributed replicated volumes
  • #1579786: Thin-arbiter: Provide script to start and run thin arbiter process
  • #1579788: Thin-arbiter: Have the state of volume in memory
  • #1580020: ctime: Rename and unlink does not update ctime
  • #1580238: Fix incorrect rebalance log message
  • #1580269: [Remove-brick+Rename] Failure count shows zero though there are file migration failures
  • #1580352: Glusterd memory leaking in gf_gld_mt_linebuf
  • #1580529: posix/ctime: Access time is not updated for file with a hardlink
  • #1580532: posix/ctime: The first lookup on file is not healing the gfid
  • #1581035: posix/ctime: Mtime is not updated on setting it to older date
  • #1581345: posix unwinds readdirp calls with readdir signature
  • #1581735: bug-1309462.t is failing reliably due to changes in security.capability changes in the kernel
  • #1582051: Fix failure of readdir-ahead/bug-1439640.t in certain cases
  • #1582516: libgfapi: glfs init fails on afr volume with ctime feature enabled
  • #1582704: rpc_transport_unref() called for an unregistered socket fd
  • #1583018: changelog: Changelog is not capturing rename of files
  • #1583565: [distribute]: Excessive 'dict is null' errors in geo-rep logs
  • #1583583: "connecting" state in protocol client is useless
  • #1583937: Brick process crashed after upgrade from RHGS-3.3.1 async(7.4) to RHGS-3.4(7.5)
  • #1584098: 'custom extended attributes' set on a directory are not healed after bringing back the down sub-volumes
  • #1584483: afr: don't update readables if inode refresh failed on all children
  • #1584517: Inconsistent access permissions on directories after bringing back the down sub-volumes
  • #1584864: sometime messages
  • #1584981: posix/ctime: EC self heal of directory is blocked with ctime feature enabled
  • #1585391: glusteshd wrong status caused by gluterd big lock
  • #1585585: Cleanup "connected" state management of rpc-clnt
  • #1586018: (f)Setxattr and (f)removexattr invalidates the stat cache in md-cache
  • #1586020: [GSS] Pending heals are not getting completed in CNS environment
  • #1586342: Refactor the distributed test code to make it work for ipv4
  • #1586363: Refactor rebalance code
  • #1589253: After creating and starting 601 volumes, self heal daemon went down and seeing continuous warning messages in glusterd log
  • #1589691: xdata is leaking in server3_3_seek
  • #1589782: [geo-rep]: Geo-replication in FAULTY state - CENTOS 6
  • #1589842: [USS] snapview server does not go through the list of all the snapshots for validating a snap
  • #1590193: /usr/sbin/gcron.py aborts with OSError
  • #1590385: Refactor dht lookup code
  • #1590655: Excessive logging in posix_check_internal_writes() due to NULL dict
  • #1590710: Gluster Block PVC fails to mount on Jenkins pod
  • #1591193: lookup not assigning gfid if file is not present in all bricks of replica
  • #1591580: Remove code duplication in protocol/client
  • #1591621: Arequal checksum mismatch on older mount
  • #1592141: Null pointer deref in error paths
  • #1592275: posix/ctime: Mdata value of a directory is different across replica/EC subvolume
  • #1592509: ctime: Self heal of symlink is failing on EC subvolume
  • #1593232: CVE-2018-10841 glusterfs: access trusted peer group via remote-host command [glusterfs upstream]
  • #1593351: mount.glusterfs incorrectly reports "getfattr not found"
  • #1593548: Stack overflow in readdirp with parallel-readdir enabled
  • #1593562: Add new peers to Glusto
  • #1593651: gnfs nfs.register-with-portmap issue with ipv6_default
  • #1595174: Found an issue on using lock before init in md-cache
  • #1595190: rmdir is leaking softlinks to directories in .glusterfs
  • #1595320: gluster wrongly reports bricks online, even when brick path is not available
  • #1595492: tests: remove tarissue.t from BAD_TEST
  • #1595726: tests/geo-rep: Add test case for symlink rename
  • #1596020: Introduce database group profile
  • #1596513: glustershd crashes when index heal is launched before graph is initialized.
  • #1596524: 'replica 3 aribiter 1' is not a industry standard way of telling 2-way replicate with arbiter.
  • #1596789: Update mount-shared-storage.sh to automatically include all enabled glusterfs mounts in fstab
  • #1597156: Need a simpler way to find if a replica/ec subvolume is up
  • #1597247: restart all the daemons after all the bricks
  • #1597473: introduce cluster.daemon-log-level option
  • #1597512: Remove contrib/ipaddr-py
  • #1597540: tests/geo-rep: Add test cases for rsnapshot use case
  • #1597563: [geo-rep+tiering]: Hot and Cold tier brick changelogs report rsync failure
  • #1597568: Mark brick online after port registration even for brick-mux cases
  • #1597627: tests/bugs/core/bug-1432542-mpx-restart-crash.t is generated crash
  • #1597662: Stale entries of snapshots need to be removed from /var/run/gluster/snaps
  • #1597776: br-state-check.t crashed while brick multiplex is enabled
  • #1597805: Stale lock with lk-owner all-zeros is observed in some tests
  • #1598325: Replace the BROKEN_TESTS environment variable value
  • #1598345: gluster get-state command is crashing glusterd process when geo-replication is configured
  • #1598390: Remove extras/prot_filter.py
  • #1598548: Disabling iostats diagnostics.stats-dump-interval (set to 0) does not terminate the dump thread
  • #1598663: Don't execute statements after decrementing call count in afr
  • #1598884: [geo-rep]: [Errno 2] No such file or directory
  • #1598926: Misleading error messages on bricks caused by lseek
  • #1598977: [geo-rep]: geo-replication scheduler is failing due to unsuccessful umount
  • #1599219: configure fails complaining absence of libxml2-devel
  • #1599250: bug-1432542-mpx-restart-crash.t takes a lot of time to complete cleanup
  • #1599628: To find a compatible brick ignore diagnostics.brick-log-level option while brick mux is enabled
  • #1599783: _is_prefix should return false for 0-length strings
  • #1600405: [geo-rep]: Geo-replication not syncing renamed symlink
  • #1600451: crash on glusterfs_handle_brick_status of the glusterfsd
  • #1600687: fuse process segfault when use resolve-gids option
  • #1600812: A new volume set option to for GD2 quota integration
  • #1600878: crash seen while running regression, intermittently.
  • #1600963: get the failed test details into gerrit output itself
  • #1601166: performance.read-ahead causes huge increase in unnecessary network traffic
  • #1601390: Distributed testing: Fix build environment
  • #1601423: memory leak in get-state when geo-replication session is configured
  • #1601683: dht: remove useless argument from dht_iatt_merge
  • #1602070: [SNAPSHOT] snapshot daemon crashes if a fd from a deleted snapshot is accessed
  • #1602121: avoid possible glusterd crash in glusterd_verify_slave
  • #1602236: When reserve limits are reached, append on an existing file after truncate operation results to hang
  • #1602866: dht: Crash seen in thread dht_dir_attr_heal
  • #1603063: ./tests/bugs/glusterd/validating-server-quorum.t is generated core
  • #1605056: [RHHi] Mount hung and not accessible
  • #1605077: If a node disconnects during volume delete, it assumes deleted volume as a freshly created volume when it is back online
  • #1607049: Excessive logging in posix_set_parent_ctime()
  • #1607319: Remove uuid from contrib/
  • #1607689: Memory leaks on glfs_fini
  • #1607783: Segmentation fault while using gfapi while getting volume utilization
  • #1608175: Skip hash checks in dht_readdirp_cbk if dht has a single child subvol.
  • #1608564: line coverage tests failing consistently over a week
  • #1608566: line coverage tests: glusterd crash in ./tests/basic/sdfs-sanity.t
  • #1608568: line coverage tests: bug-1432542-mpx-restart-crash.t times out consistently
  • #1608684: Change glusto ownership to reflect current reality
  • #1608991: Remove code duplication in socket
  • #1609126: Fix mem leak and smoke failure for gcc8 in cloudsync
  • #1609207: thin arbiter: set notify-contention option to yes
  • #1609337: Remove argp-standalone from contrib/
  • #1609551: glusterfs-resource-agents should not be built for el6
  • #1610236: [Ganesha] Ganesha crashed in mdcache_alloc_and_check_handle while running bonnie and untars with parallel lookups
  • #1610256: [Ganesha] While performing lookups from two of the clients, "ls" command got failed with "Invalid argument"
  • #1610405: Geo-rep: Geo-rep regression times out occasionally
  • #1610726: Fuse mount of volume fails when gluster_shared_storage is enabled
  • #1611103: online_brick_count check in volume.rc should ignore bitrot and scrubber daemons
  • #1611566: tests/bitrot: tests/bitrot/bug-1373520.t fails intermittently
  • #1611692: Mount process crashes on a sharded volume during rename when dst doesn't exist
  • #1611834: glusterfsd crashes when SEEK_DATA/HOLE is not supported
  • #1612017: MAINTAINERS: Add Xavier Hernandez as peer for shard xlator
  • #1612037: Entry will be present even if the gfid link creation inside .glusterfs fails
  • #1612054: Test case bug-1586020-mark-dirty-for-entry-txn-on-quorum-failure.t failure
  • #1612418: Brick not coming up on a volume after rebooting the node
  • #1612750: gfapi: Use inode_forget in case of unlink/rename objects
  • #1613098: posix-acl: skip acl_permits check when the owner setting GF_POSIX_ACL_xxxx
  • #1613807: Fix spurious failures in tests/basic/afr/granular-esh/replace-brick.t
  • #1614062: Provide/preserve tarball of retried tests
  • #1614088: kill_brick function needs to wait for brick to be killed
  • #1614124: glusterfsd process crashed in a multiplexed configuration during cleanup of a single brick-graph triggered by volume-stop.
  • #1614142: Fix the grammar error in the rpc log
  • #1614168: [uss]snapshot: posix acl authentication is not working as expected
  • #1614654: Potential fixes for tests/basic/afr/add-brick-self-heal.t failure
  • #1614662: ./tests/bugs/replicate/bug-1448804-check-quorum-type-values.t
  • #1614718: Fix spurious failures in tests/bugs/index/bug-1559004-EMLINK-handling.t
  • #1614730: Test case bug-1433571-undo-pending-only-on-up-bricks.t failure
  • #1614799: Geo-rep: Few workers fails to start with out any failure
  • #1615037: Multiplex tests use a cleanup pattern that results in empty tarballs on failure
  • #1615078: tests/bugs/replicate/bug-1408712.t fails.
  • #1615092: tests/bugs/shard/configure-lru-limit.t spurious failure
  • #1615096: ./tests/bugs/quick-read/bug-846240.t fails spuriously
  • #1615239: Fix ./tests/basic/afr/replace-brick-self-heal.t failure
  • #1615331: gfid-mismatch-resolution-with-fav-child-policy.t is failing
  • #1615474: Rebalance status shows wrong count of "Rebalanced-files" if the file has hardlinks
  • #1615582: test: ./tests/basic/stats-dump.t fails spuriously not finding queue_size in stats output for some brick
  • #1615703: [Disperse] Improve log messages for EC volume
  • #1615789: Come up with framework to test thin-arbiter
  • #1618004: [GSS] glusterd not starting after upgrade due to snapshots error in RHEV + RHGS
  • #1619027: geo-rep: Active/Passive status change logging is redundant
  • #1619423: cli: Command gluster volume statedump <volname> dumps core
  • #1619475: NetBSD memory detection issue
  • #1619720: posix_mknod does not update trusted.pgfid.xx xattr correctly
  • #1619843: Snapshot status fails with commit failure
  • #1620544: Brick process NOT ONLINE for heketidb and block-hosting volume
  • #1621981: dht: File rename removes the .glusterfs handle for linkto file
  • #1622076: [geo-rep]: geo-rep reverse sync in FO/FB can accidentally delete the content at original master incase of gfid conflict in 3.4.0 without explicit user rmdir
  • #1622422: glusterd cli is showing brick status N/A even brick is consumed by a brick process
  • #1622549: libgfchangelog: History API fails
  • #1622665: clang-scan report: glusterfs issues
  • #1622821: Prevent hangs while increasing replica-count/replace-brick for directory hierarchy
  • #1623408: rpc: log fuse request ID with gluster transaction ID
  • #1623759: [Disperse] Don't send final version update if non data fop succeeded
  • #1624244: DHT: Rework the virtual xattr to get the hash subvol
  • #1624440: Fail volume stop operation in case brick detach request fails
  • #1625089: CVE-2018-10911 glusterfs: Improper deserialization in dict.c:dict_unserialize() can allow attackers to read arbitrary memory
  • #1625095: CVE-2018-10930 glusterfs: Files can be renamed outside volume
  • #1625096: CVE-2018-10923 glusterfs: I/O to arbitrary devices on storage server
  • #1625097: CVE-2018-10907 glusterfs: Stack-based buffer overflow in server-rpc-fops.c allows remote attackers to execute arbitrary code
  • #1625102: CVE-2018-10913 glusterfs: Information Exposure in posix_get_file_contents function in posix-helpers.c
  • #1625106: CVE-2018-10904 glusterfs: Unsanitized file names in debug/io-stats translator can allow remote attackers to execute arbitrary code
  • #1625643: Use CALLOC in dht_layouts_init
  • #1626319: DH ciphers disabled errors are encountered on basic mount & unmount with ssl enabled setup
  • #1626346: dht: Use snprintf in dht_filter_loc_subvol_key
  • #1626394: dht_create: Create linkto files if required when using dht_filter_loc_subvol_key
  • #1626787: sas workload job getting stuck after sometime
  • #1627044: Converting to replica 2 volume is not throwing warning
  • #1627620: SAS job aborts complaining about file doesn't exist
  • #1628668: Update op-version from 4.2 to 5.0
  • #1629877: GlusterFS can be improved (clone for Gluster-5)
  • #1630673: geo-rep: geo-rep config set fails to set rsync-options
  • #1630804: libgfapi-python: test_listdir_with_stat and test_scandir failure on release 5 branch
  • #1633015: ctime: Access time is different with in same replica/EC volume
  • #1633242: 'df' shows half as much space on volume after upgrade to RHGS 3.4
  • #1633552: glusterd crash in regression build
  • #1635373: ASan (address sanitizer) fixes - Blanket bug
  • #1635972: Low Random write IOPS in VM workloads
  • #1635975: Writes taking very long time leading to system hogging
  • #1636162: [SNAPSHOT]: with brick multiplexing, snapshot restore will make glusterd send wrong volfile
  • #1636842: df shows Volume size as zero if Volume created and mounted using Glusterd2
  • #1638159: data-self-heal in arbiter volume results in stale locks.
  • #1638163: split-brain observed on parent dir
  • #1639688: core: backport uuid fixes
  • #1640392: io-stats: garbage characters in the filenames generated