MDEV-38936 Proactive handling of InnoDB tablespace full condition #4721

Open
FarihaIS wants to merge 1 commit into MariaDB:main from FarihaIS:mdev-38936

@FarihaIS

@FarihaIS FarihaIS commented Mar 2, 2026

Contributor

Description

InnoDB write failures occur when tablespace files exceed filesystem size limits (e.g. 16TB on ext4, 2TB on ext3 - varies by filesystem). Current behavior logs errors but continues accepting transactions, causing repeated failures, user disruption, and potential data integrity issues.

Add proactive monitoring by emitting warnings when InnoDB tablespaces approach a configurable size threshold.

Key features:

  • Two new system variables:
    • innodb_tablespace_size_warning_threshold (default 0, disabled): Maximum tablespace size in bytes before warnings begin
    • innodb_tablespace_size_warning_pct (default 85%): Percentage of threshold at which to start emitting warnings
  • Warning frequency:
    • Below warning_pct: No warnings
    • At or above warning_pct: Every 1% increase (85%, 86%, 87%, etc.)
  • Per-tablespace tracking with automatic reset on TRUNCATE/DROP or threshold/percentage changes
  • Zero overhead when threshold is 0
  • Progressive warnings capped at 100%

Implementation adds fil_space_t::extend() which consolidates file extension, size_in_header update, and size warning checks. Per-tablespace warning state is tracked in fil_space_t (m_last_size_warning_pct, m_last_warning_threshold, m_last_warning_pct).
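
As a reading aid, the warning rules listed above can be sketched as a small decision function. This is a minimal sketch and not the code from the PR: the names `SizeWarningState`, `size_warning_pct()` and `reset_size_warning()` are hypothetical, sizes are taken in bytes, and the multiplication assumes sizes well below 2^57 bytes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-tablespace state (the PR keeps comparable fields in
// fil_space_t; the layout here is an assumption for illustration).
struct SizeWarningState {
  uint32_t last_warned_pct = 0;  // 0 means no warning has been emitted yet
};

// Returns the percentage to report, or 0 when no warning is due.
// threshold_bytes == 0 disables the check, as in the documented default.
inline uint32_t size_warning_pct(SizeWarningState &state, uint64_t size_bytes,
                                 uint64_t threshold_bytes,
                                 uint32_t warning_pct)
{
  assert(warning_pct <= 100);
  if (threshold_bytes == 0)
    return 0;

  uint32_t pct;
  if (size_bytes >= threshold_bytes) {
    pct = 100;  // progressive warnings are capped at 100%
  } else {
    assert(size_bytes <= UINT64_MAX / 100);  // keep size_bytes * 100 in range
    pct = static_cast<uint32_t>(size_bytes * 100 / threshold_bytes);
  }

  // Below the start percentage, or no new whole percent since the last
  // warning: stay silent.
  if (pct < warning_pct || pct <= state.last_warned_pct)
    return 0;

  state.last_warned_pct = pct;
  return pct;
}

// Stand-in for the reset on TRUNCATE/DROP or on a settings change.
inline void reset_size_warning(SizeWarningState &state)
{
  state = SizeWarningState{};
}
```

With a 10485760-byte threshold and a start percentage of 70, a 7340032-byte tablespace yields 70 on the first call and 0 on a repeated call at the same size.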

Release Notes

Added proactive InnoDB tablespace size monitoring to prevent filesystem size limit failures. Two new system variables enable configurable warning thresholds with incremental warning frequency:

  • innodb_tablespace_size_warning_threshold (default 0, disabled): Maximum size before warnings
  • innodb_tablespace_size_warning_pct (default 85%): When to start warnings

Warning frequency:

  • Below configured percentage: no warnings
  • At or above configured percentage: every 1% increase
  • Threshold set to 0: warnings disabled

How can this PR be tested?

Execute the innodb.tablespace_size_warning test in mysql-test-run. This commit adds a test in the innodb suite.

The test validates:

  1. Both system variables are visible and have correct default values
  2. Basic warning emission when tablespace exceeds configured percentage
  3. Configurable warning percentage (tests both 70% and 80% thresholds)
  4. Threshold set to 0 disables warnings, re-enabling with a nonzero threshold resumes them
  5. TRUNCATE TABLE resets warning state
  6. Behavior when tablespace exceeds 100% of threshold (warnings cap at 100%)

Expected warning behavior in error log:

  • Below innodb_tablespace_size_warning_pct (default 85%): No warnings

  • At or above innodb_tablespace_size_warning_pct: Every 1% increase

    Example: [Warning] InnoDB: Tablespace 'test/t1' size 7340032 bytes reached 70% of configured threshold of 10485760 bytes
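
The arithmetic behind the example line can be checked with a short sketch: 7340032 bytes is 7 MiB, the threshold 10485760 bytes is 10 MiB, so the integer percentage is 70. The helper name `format_size_warning()` is hypothetical, and the `[Warning]` prefix is left to the logging layer; only the message wording is taken from the example above.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Builds the message body shown in the example. Hypothetical helper,
// not part of the PR; the server would use its own logging call.
inline std::string format_size_warning(const std::string &name,
                                       uint64_t size_bytes,
                                       uint64_t threshold_bytes)
{
  assert(threshold_bytes != 0);            // 0 means the feature is disabled
  assert(size_bytes <= UINT64_MAX / 100);  // keep size_bytes * 100 in range
  uint64_t pct = size_bytes * 100 / threshold_bytes;  // integer percentage
  if (pct > 100)
    pct = 100;                             // reported value is capped at 100%
  return "InnoDB: Tablespace '" + name + "' size " +
         std::to_string(size_bytes) + " bytes reached " +
         std::to_string(pct) + "% of configured threshold of " +
         std::to_string(threshold_bytes) + " bytes";
}
```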

Basing the PR against the correct MariaDB version

  • This is a new feature, and the PR is based against the main branch.

Copyright

All new code of the whole pull request, including one or several files that are either new files or modified ones, are contributed under the BSD-new license. I am contributing on behalf of my employer Amazon Web Services, Inc.

@CLAassistant

CLAassistant commented Mar 2, 2026

CLA assistant check
All committers have signed the CLA.

@grooverdan added the External Contribution label (All PRs from entities outside of MariaDB Foundation, Corporation, Codership agreements.) Mar 2, 2026
@mikegriffin

Contributor

Feature request: allow additional use cases (for example, a tablespace that becomes larger than expected):

  • Make a configurable warning percent (tablespace_size_warning_pct) so warnings can start earlier without changing the byte threshold
  • Replace hard-coded 90 with a named constant (high_resolution_pct)
  • No change above high_resolution_pct: print on every 1% increase
  • Between tablespace_size_warning_pct and high_resolution_pct: print at most twice per 10% (for example, 70%, 77%, 81%, 89%, 90%, 91%, 92%)
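
One possible reading of this request is sketched below. It is only an illustration of the proposal: the PR description lists warnings on every 1% step, and the names `TieredWarningState` and `should_warn_tiered()`, as well as the per-decade counter, are assumptions made for this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Named constant suggested in the request (value 90 as in the comment).
constexpr uint32_t high_resolution_pct = 90;

// Hypothetical state for the tiered scheme.
struct TieredWarningState {
  uint32_t last_pct = 0;   // last percentage that produced a warning
  uint32_t decade = 0;     // which 10% band the counter below refers to
  uint32_t in_decade = 0;  // warnings already printed in that band
};

// Returns true when a warning should be printed for the observed pct.
inline bool should_warn_tiered(TieredWarningState &s, uint32_t pct,
                               uint32_t warning_pct)
{
  assert(warning_pct <= 100);
  if (pct > 100)
    pct = 100;
  if (pct < warning_pct || pct <= s.last_pct)
    return false;

  if (pct < high_resolution_pct) {
    // Low-resolution band: allow at most two warnings per 10%.
    const uint32_t decade = pct / 10;
    if (decade != s.decade) {
      s.decade = decade;
      s.in_decade = 0;
    }
    if (s.in_decade >= 2)
      return false;
    ++s.in_decade;
  }
  // At or above high_resolution_pct every new whole percent warns.
  s.last_pct = pct;
  return true;
}
```

Fed the observations 70, 77, 81, 89, 90, 91, 92 from the example, every call returns true; a denser series such as 70, 71, 72 would print for 70 and 71 and stay silent for 72.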

@FarihaIS FarihaIS marked this pull request as ready for review March 2, 2026 23:10
@Thirunarayanan Thirunarayanan requested review from dr-m and iMineLink March 3, 2026 05:01
@FarihaIS FarihaIS force-pushed the mdev-38936 branch 3 times, most recently from 64ab2ed to 5bd3d38 Compare March 3, 2026 17:30
@FarihaIS

FarihaIS commented Mar 3, 2026

Contributor Author

@mikegriffin I have just pushed some new changes. Could you please take a look and confirm whether the new implementation addresses the additional use cases you mentioned above? Thank you!

@iMineLink iMineLink left a comment

Contributor

Thanks for your contribution!
I left a few comments on the feature.
Since it's only adding logs and not solving actual bugs related to excessive InnoDB tablespace size (like the recently discovered MDEV-38898), please also wait for @dr-m's comments.
Nevertheless, it's fair to say that the feature, when disabled, seems to have a small runtime cost (checking a variable in an ATTRIBUTE_COLD function, new members of fil_space_t, whose footprint may be further reduced by reordering to avoid padding, or eliminated by storing only high 32 bits of threshold + reorder).
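
The footprint remark can be illustrated with a standalone sketch. The three structs below are hypothetical stand-ins, not the real fil_space_t layout; the member names loosely follow the PR description and the member types are assumptions. On common 64-bit ABIs the first struct typically occupies 24 bytes, the second 16 and the third 8, but only the ordering of sizes is asserted here.

```cpp
#include <cassert>
#include <cstdint>

// A 64-bit member between two small members forces padding on both sides.
struct WarningStateUnordered {
  uint8_t last_size_warning_pct;
  uint64_t last_warning_threshold;
  uint8_t last_warning_pct;
};

// Placing the widest member first lets the small members share one slot.
struct WarningStateReordered {
  uint64_t last_warning_threshold;
  uint8_t last_size_warning_pct;
  uint8_t last_warning_pct;
};

// Keeping only the high 32 bits of the threshold shrinks the struct further.
struct WarningStateHigh32 {
  uint32_t last_warning_threshold_high;
  uint8_t last_size_warning_pct;
  uint8_t last_warning_pct;
};

// Extracts the high 32 bits of a 64-bit byte threshold.
inline uint32_t high32(uint64_t value)
{
  return static_cast<uint32_t>(value >> 32);
}

static_assert(sizeof(WarningStateReordered) <= sizeof(WarningStateUnordered),
              "reordering must not grow the struct");
static_assert(sizeof(WarningStateHigh32) <= sizeof(WarningStateReordered),
              "a 32-bit threshold must not grow the struct");
```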

Comment thread storage/innobase/fsp/fsp0fsp.cc Outdated
Comment thread storage/innobase/fsp/fsp0fsp.cc Outdated
Comment thread storage/innobase/fsp/fsp0fsp.cc Outdated
Comment thread storage/innobase/fsp/fsp0fsp.cc Outdated
Comment thread storage/innobase/include/fil0fil.h
Comment thread mysql-test/suite/innodb/t/tablespace_size_warning.test
Comment thread storage/innobase/handler/ha_innodb.cc
Comment thread storage/innobase/fsp/fsp0fsp.cc Outdated
Comment thread storage/innobase/fsp/fsp0fsp.cc
Comment thread mysql-test/suite/innodb/r/tablespace_size_warning.result Outdated
@FarihaIS FarihaIS force-pushed the mdev-38936 branch 3 times, most recently from 5a0d8ee to 7c0e2a0 Compare March 13, 2026 18:30
@FarihaIS

Contributor Author

@iMineLink Thank you for the detailed review! I have addressed all your comments and updated the PR description to reflect the latest version of the feature. Please let me know if I have missed anything, thank you.

I will wait for @dr-m's review in the meantime.

@FarihaIS FarihaIS requested a review from iMineLink March 13, 2026 21:28

@iMineLink iMineLink left a comment

Contributor

Thanks for addressing the previous review points. I have just a couple more points, then it's good for me.

Comment thread storage/innobase/include/fil0fil.h Outdated
Comment thread mysql-test/suite/innodb/t/tablespace_size_warning.test
@FarihaIS

Contributor Author

@iMineLink thank you for the suggestions again, I've addressed all the new comments as well!

Please let me know if you have any other thoughts while we wait for @dr-m's review.

@FarihaIS FarihaIS requested a review from iMineLink March 17, 2026 21:15

@iMineLink iMineLink left a comment

Contributor

Thank you @FarihaIS, the changes look good to me!

As a note, the feature in the current state will be enabled by default.

Please wait for @dr-m's review, thanks.

@dr-m dr-m left a comment

Contributor

What is this good for? Do you have an example of already implemented external monitoring that would react when some warning messages appear in the server error log?

Could we have something that would better integrate with event handlers and other existing mechanisms?

Comment thread storage/innobase/fsp/fsp0fsp.cc Outdated
Comment thread storage/innobase/srv/srv0srv.cc Outdated

@gkodinov gkodinov left a comment

Member

This is a preliminary review. LGTM. Please keep working with Marko on his review.

@FarihaIS FarihaIS force-pushed the mdev-38936 branch 2 times, most recently from 59cedaa to 1f4e0c0 Compare April 9, 2026 19:35
@FarihaIS

FarihaIS commented Apr 9, 2026

Contributor Author

@dr-m thank you for your feedback. I have addressed the two code changes you requested above now. Please let me know if these changes look okay or if they need further modification.

As for the questions you asked above:

> What is this good for? Do you have an example of already implemented external monitoring that would react when some warning messages appear in the server error log?

These warnings would be helpful for external monitoring tools, for example, AWS RDS, which monitors the error log for operational alerts. This follows the same pattern as existing InnoDB warnings (undo truncation, system tablespace full, etc.).

> Could we have something that would better integrate with event handlers and other existing mechanisms?

Could you please help guide me to the kind of integration you're looking for? I'm not entirely sure what the new approach would look like, but I'm happy to make the changes once I have a clearer understanding.

@FarihaIS FarihaIS requested review from dr-m and iMineLink April 9, 2026 23:44
@grooverdan

Member

> > Could we have something that would better integrate with event handlers and other existing mechanisms?
>
> Could you please help guide me to the kind of integration you're looking for? I'm not entirely sure what the new approach would look like, but I'm happy to make the changes once I have a clearer understanding.

I think @dr-m is after best practices in log messages in general and tooling integration. So perhaps MDEV-27147 JSON Error log to STDERR/STDOUT as an option, and perhaps https://opentelemetry.io/docs/specs/otel/logs/data-model/#events

Comment thread storage/innobase/include/fil0fil.h Outdated
Comment thread storage/innobase/include/fil0fil.h
Comment thread storage/innobase/include/fil0fil.h Outdated
Comment thread storage/innobase/fsp/fsp0fsp.cc Outdated
Comment thread storage/innobase/fsp/fsp0fsp.cc
Comment thread storage/innobase/fsp/fsp0fsp.cc
Comment thread storage/innobase/handler/ha_innodb.cc Outdated
Comment on lines +19144 to +19146
"Threshold in bytes for tablespace size warnings (0 = disabled)",
NULL, NULL,
17592186044416ULL, /* Default setting */

Contributor

Can this remain disabled by default?

Contributor Author

Sure, I've disabled it by default.

@FarihaIS

Contributor Author

> > > Could we have something that would better integrate with event handlers and other existing mechanisms?
> >
> > Could you please help guide me to the kind of integration you're looking for? I'm not entirely sure what the new approach would look like, but I'm happy to make the changes once I have a clearer understanding.
>
> I think @dr-m is after best practices in log messages in general and tooling integration. So perhaps MDEV-27147 JSON Error log to STDERR/STDOUT as an option, and perhaps https://opentelemetry.io/docs/specs/otel/logs/data-model/#events

@grooverdan Thanks for the pointers! Since this uses sql_print_warning(), it would automatically benefit from structured output once MDEV-27147 (JSON error log) lands, right? Is there any special handling needed on our side? And what about OpenTelemetry integration?

@FarihaIS

FarihaIS commented Apr 24, 2026

Contributor Author

@dr-m thank you for the detailed feedback! I've addressed/responded to all your comments above - could you please take a look and see if there are any other changes needed?

@dr-m dr-m left a comment

Contributor

Sorry for the delay. I think that as part of this, we must pay back some maintenance debt of the fil_space_extend() function.

Comment on lines +4 to +12
SET @old_threshold = @@global.innodb_tablespace_size_warning_threshold;
SET @old_pct = @@global.innodb_tablespace_size_warning_pct;
# Test system variables
SHOW VARIABLES LIKE 'innodb_tablespace_size_warning_threshold';
Variable_name Value
innodb_tablespace_size_warning_threshold 0
SHOW VARIABLES LIKE 'innodb_tablespace_size_warning_pct';
Variable_name Value
innodb_tablespace_size_warning_pct 85

Contributor

There is no point in saving and restoring the old values if the test fails when run with non-default values:

mysql-test/mtr --mysqld=--innodb-tablespace-size-warning-threshold=4 --mysqld=--innodb-tablespace-size-warning-pct=42 innodb.tablespace_size_warning
innodb.tablespace_size_warning           [ fail ]
        Test ended at 2026-06-05 12:33:20

CURRENT_TEST: innodb.tablespace_size_warning
--- /mariadb/main/mysql-test/suite/innodb/r/tablespace_size_warning.result	2026-06-05 12:27:30.660602135 +0300
+++ /mariadb/main/mysql-test/suite/innodb/r/tablespace_size_warning.reject	2026-06-05 12:33:20.040125327 +0300
@@ -6,10 +6,10 @@
 # Test system variables
 SHOW VARIABLES LIKE 'innodb_tablespace_size_warning_threshold';
 Variable_name	Value
-innodb_tablespace_size_warning_threshold	0
+innodb_tablespace_size_warning_threshold	4
 SHOW VARIABLES LIKE 'innodb_tablespace_size_warning_pct';
 Variable_name	Value
-innodb_tablespace_size_warning_pct	85
+innodb_tablespace_size_warning_pct	42
 # Test basic warning emission
 SET GLOBAL innodb_tablespace_size_warning_threshold = 10485760;
 SET GLOBAL innodb_tablespace_size_warning_pct = 70;

Result content mismatch

I don’t think there is a way to check the built-in default values in the regression test suite.

Contributor Author

Makes sense, I removed the SHOW VARIABLES checks and save/restore logic. The test now sets explicit values before each test block and resets to defaults at the end.

Comment on lines +30 to +36
--disable_query_log
let $i = 10;
while ($i) {
eval INSERT INTO t1 (data) VALUES (REPEAT('a', 1024*1024));
dec $i;
}
--enable_query_log

Contributor

This can be written in a single line:

INSERT INTO t1(data) SELECT REPEAT('a',1024*1024) FROM seq_1_to_10;

For this to work, we will need the following at the start of the test file:

--source include/have_sequence.inc

Contributor Author

Thank you for the pointer, I've replaced the affected lines with your suggested rewrite above!

--enable_query_log

let SEARCH_FILE=$MYSQLTEST_VARDIR/log/mysqld.1.err;
let SEARCH_PATTERN=Tablespace 'test/t1' size [^\n]* bytes reached [^\n]*% of configured threshold;

Contributor

A more appropriate pattern for matching a string of digits would be \d+.

We seem to issue exact numbers. Therefore, I would look for exact messages, instead of filtering out the numbers.
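
The difference between the two suggestions can be sketched outside the test framework. The snippet uses `std::regex` (ECMAScript syntax) purely as a stand-in for the test suite's pattern search, and the function names and the expected sizes are assumptions based on the example message in the PR description.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Digit-based pattern: \d+ accepts any size and any percentage.
inline bool matches_digits(const std::string &line)
{
  static const std::regex re(
      R"(Tablespace 'test/t1' size \d+ bytes reached \d+% of configured threshold)");
  return std::regex_search(line, re);
}

// Exact message: only the specific size, percentage and threshold match.
inline bool matches_exact(const std::string &line)
{
  static const std::string expected =
      "Tablespace 'test/t1' size 7340032 bytes reached 70% of "
      "configured threshold of 10485760 bytes";
  return line.find(expected) != std::string::npos;
}
```

A line reporting 8388608 bytes at 80% satisfies `matches_digits()` but not `matches_exact()`, which is why the exact form would catch a warning emitted at an unexpected size.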

Contributor Author

Makes sense, I've updated the test with exact numbers as requested above, thank you.

Comment thread storage/innobase/include/fil0fil.h Outdated
Comment on lines +431 to +432
/** Threshold value used for the last warning */
ulonglong m_last_warning_threshold{0};

Contributor

Do we really have to allocate 64 bits for this? The files should grow by extents of FSP_EXTENT_SIZE, which is 1MiB, or 64 pages, whichever is greater (2MiB or 4MiB for the two largest innodb_page_size). At least 20 of the least significant bits would be constantly 0 in a byte counter.

Could we use a uint32_t counter of pages here? After all, a page is the smallest unit that we work with.
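
To make the arithmetic concrete: a 16 TiB byte threshold (17592186044416, the value quoted in the earlier revision) divided by a 16 KiB page is 2^30 pages, which fits in 32 bits. The sketch below shows one way such a conversion might look. The function names are hypothetical, the page size is passed as a shift (12 to 16 for 4 KiB to 64 KiB pages), and saturating at `UINT32_MAX` is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Converts a byte threshold to a page count, saturating at UINT32_MAX.
inline uint32_t threshold_in_pages(uint64_t threshold_bytes,
                                   uint32_t page_size_shift)
{
  assert(page_size_shift >= 12 && page_size_shift <= 16);
  const uint64_t pages = threshold_bytes >> page_size_shift;
  return pages > UINT32_MAX ? UINT32_MAX : static_cast<uint32_t>(pages);
}

// Percentage of the threshold in whole percent, capped at 100.
// With 32-bit page counts the 64-bit product cannot overflow.
inline uint32_t pct_of_threshold_pages(uint32_t size_pages,
                                       uint32_t threshold_pages)
{
  assert(threshold_pages != 0);
  const uint64_t pct = uint64_t{size_pages} * 100 / threshold_pages;
  return pct > 100 ? 100 : static_cast<uint32_t>(pct);
}
```

For the 10485760-byte threshold used in the test, 16 KiB pages give 640 pages, and a 448-page tablespace sits at 70%.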

Contributor Author

Good point, changed to uint32_t page counter!

Comment on lines +666 to +667
const ulonglong threshold= fil_system.tablespace_size_warning_threshold;
const uint warning_pct= fil_system.tablespace_size_warning_pct;

Contributor

Inside fil_space_extend(), which we are calling before entering here, we are acquiring and releasing fil_system.mutex. Hence, we should be able to read these fields from fil_system as normal data members, not Atomic_relaxed. Can you refactor the logic? I think that fil_space_extend would best be replaced with a member function fil_space_t::extend(uint32_t, mtr_t *mtr), which would include this warning logic. Each caller is going to assign size_in_header and invoke mtr->write<4,mtr_t::FORCED>. Therefore, that logic can be part of the replacement function fil_space_t::extend() itself.

Contributor Author

I see, I refactored fil_space_extend() usage into a new fil_space_t::extend(uint32_t, buf_block_t*, mtr_t*) member function that handles the file extension, size_in_header update, mtr write, and size warning check. Both callers now use space->extend(), and I also removed Atomic_relaxed since fil_system.mutex protects access.
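
The shape of that refactoring can be illustrated with a toy model. This is a simplified sketch and not the InnoDB code: `ToySystem` and `ToySpace` are hypothetical, a `std::mutex` stands in for fil_system.mutex, the file extension and the mini-transaction write are reduced to a single assignment, and the lock is held for the whole call for brevity.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

// Stand-in for the global settings, protected by one mutex.
struct ToySystem {
  std::mutex mutex;
  uint32_t warning_threshold_pages = 0;  // 0 disables warnings
  uint32_t warning_pct = 85;
};

// Stand-in for a tablespace with its per-tablespace warning state.
struct ToySpace {
  uint32_t size_pages = 0;
  uint32_t last_warned_pct = 0;

  // Grows the space and returns the percentage to warn about, or 0.
  uint32_t extend(ToySystem &sys, uint32_t new_size_pages)
  {
    std::lock_guard<std::mutex> guard(sys.mutex);
    if (new_size_pages <= size_pages)
      return 0;                   // nothing to extend
    size_pages = new_size_pages;  // extension + header update in one place

    // Plain reads are sufficient here because the mutex is held.
    const uint32_t threshold = sys.warning_threshold_pages;
    if (threshold == 0)
      return 0;
    uint64_t pct = uint64_t{size_pages} * 100 / threshold;
    if (pct > 100)
      pct = 100;
    if (pct < sys.warning_pct || pct <= last_warned_pct)
      return 0;
    last_warned_pct = static_cast<uint32_t>(pct);
    return last_warned_pct;
  }
};
```

Keeping the size update and the warning check in one member function means every caller gets the same behavior, which is the consolidation the review asked for.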

InnoDB write failures occur when tablespace files exceed filesystem size
limits. Current behavior logs errors but continues accepting
transactions, causing repeated failures and potential data integrity
issues.

Add proactive monitoring by emitting warnings when InnoDB tablespaces
approach a configurable size threshold.

Key features:
- Two new system variables:
  * innodb_tablespace_size_warning_threshold (default 0, disabled):
    Maximum tablespace size in bytes before warnings begin
  * innodb_tablespace_size_warning_pct (default 85%): Percentage of
    threshold at which to start emitting warnings
- Warning frequency:
  * Below warning_pct: No warnings
  * At or above warning_pct: Every 1% increase (85%, 86%, 87%, etc.)
- Per-tablespace tracking with automatic reset on TRUNCATE/DROP or
  threshold/percentage changes
- Zero overhead when threshold is 0
- Progressive warnings capped at 100%

Implementation adds fil_space_t::extend() which consolidates file
extension, size_in_header update, and size warning checks.
Per-tablespace warning state is tracked in fil_space_t
(m_last_size_warning_pct, m_last_warning_threshold, m_last_warning_pct).

All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the
BSD-new license. I am contributing on behalf of my employer Amazon Web
Services, Inc.
@FarihaIS

FarihaIS commented Jun 8, 2026

Contributor Author

@dr-m thank you for the detailed feedback again! I've addressed all your comments above - could you please take a look and see if there are any other changes needed?

Labels

External Contribution All PRs from entities outside of MariaDB Foundation, Corporation, Codership agreements.

7 participants