Add formulas for calculating diskless SBD timeouts (#370)
* Add formulas for calculating diskless SBD timeouts

bsc#1219972
jsc#DOCTEAM-1289

* Add warning about diskless SBD timeout misconfiguration
tahliar committed Mar 4, 2024
1 parent 7311201 commit 89ddbb9
Showing 1 changed file with 29 additions and 9 deletions.
38 changes: 29 additions & 9 deletions xml/ha_storage_protection.xml
@@ -347,8 +347,9 @@
<para>
This timeout is set in the CIB as a global cluster property. If not set
explicitly, it defaults to <literal>0</literal>, which is appropriate for
-using SBD with one to three devices. For use of SBD in diskless mode, see <xref
-linkend="pro-ha-storage-protect-confdiskless"/> for more details.</para>
+using SBD with one to three devices. For SBD in diskless mode, this timeout
+must <emphasis>not</emphasis> be <literal>0</literal>. For details, see
+<xref linkend="pro-ha-storage-protect-confdiskless"/>.</para>
</listitem>
</varlistentry>
</variablelist>
@@ -995,8 +996,10 @@ SBD_WATCHDOG_TIMEOUT=5</screen>
properties on the &crmshell;:</para>
<screen>&prompt.crm.conf;<command>property</command> stonith-enabled="true" <co
xml:id="co-ha-sbd-stonith-enabled"/>
-&prompt.crm.conf;<command>property</command> stonith-watchdog-timeout=10 <co
-xml:id="co-ha-sbd-diskless-watchdog-timeout"/></screen>
+&prompt.crm.conf;<command>property</command> stonith-watchdog-timeout=10<co
+xml:id="co-ha-sbd-diskless-watchdog-timeout"/>
+&prompt.crm.conf;<command>property</command> stonith-timeout=15<co
+xml:id="co-ha-sbd-diskless-stonith-timeout"/></screen>
<calloutlist>
<callout arearefs="co-ha-sbd-stonith-enabled">
<para>
@@ -1007,12 +1010,29 @@ SBD_WATCHDOG_TIMEOUT=5</screen>
<callout arearefs="co-ha-sbd-diskless-watchdog-timeout">
<para>For diskless SBD, this parameter must not equal zero.
It defines after how long it is assumed that the fencing target has already
-self-fenced. Therefore its value needs to be &gt;= the value of
-<varname>SBD_WATCHDOG_TIMEOUT</varname> in <filename>/etc/sysconfig/sbd</filename>.
-Starting with &productname; 15, if you set <parameter>stonith-watchdog-timeout</parameter>
-to a negative value, Pacemaker will automatically calculate this timeout
-and set it to twice the value of <parameter>SBD_WATCHDOG_TIMEOUT</parameter>.
+self-fenced. Use the following formula to calculate this timeout:
+</para>
+<screen>stonith-watchdog-timeout &gt;= (SBD_WATCHDOG_TIMEOUT * 2)</screen>
+<para>
+If you set <parameter>stonith-watchdog-timeout</parameter>
+to a negative value, Pacemaker automatically calculates this timeout
+and sets it to twice the value of <parameter>SBD_WATCHDOG_TIMEOUT</parameter>.
</para>
+</callout>
+<callout arearefs="co-ha-sbd-diskless-stonith-timeout">
+<para>
+This parameter must allow sufficient time for fencing to complete.
+For diskless SBD, use the following formula to calculate this timeout:
+</para>
+<screen>stonith-timeout &gt;= stonith-watchdog-timeout + 20%</screen>
+<important>
+<title>Diskless SBD timeouts</title>
+<para>
+With diskless SBD, if the <literal>stonith-timeout</literal> value is smaller than the
+<literal>stonith-watchdog-timeout</literal> value, failed nodes can become stuck
+in an <literal>UNCLEAN</literal> state and block failover of active resources.
+</para>
+</important>
</callout>
</calloutlist>
</step>
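
The two formulas added in this commit can be checked with a short calculation. The following Python sketch is illustrative only and is not part of the commit: the function names are hypothetical, and rounding the 20% margin up to whole seconds is an assumption, since the documentation does not say how to round. It treats a negative stonith-watchdog-timeout as "twice SBD_WATCHDOG_TIMEOUT", as described in the callout text.

```python
def min_stonith_watchdog_timeout(sbd_watchdog_timeout: int) -> int:
    """stonith-watchdog-timeout >= SBD_WATCHDOG_TIMEOUT * 2"""
    return sbd_watchdog_timeout * 2


def min_stonith_timeout(stonith_watchdog_timeout: int) -> int:
    """stonith-timeout >= stonith-watchdog-timeout + 20%.

    Integer arithmetic for ceil(value * 1.2); rounding up to whole
    seconds is an assumption, not something the documentation states.
    """
    return (stonith_watchdog_timeout * 6 + 4) // 5


def check_diskless_sbd_timeouts(sbd_watchdog_timeout: int,
                                stonith_watchdog_timeout: int,
                                stonith_timeout: int) -> list[str]:
    """Return a list of problems; an empty list means both formulas hold."""
    problems = []
    required_swt = min_stonith_watchdog_timeout(sbd_watchdog_timeout)

    if stonith_watchdog_timeout == 0:
        # For diskless SBD this value must not be 0.
        problems.append("stonith-watchdog-timeout must not be 0")
        return problems

    if stonith_watchdog_timeout < 0:
        # Negative value: Pacemaker uses twice SBD_WATCHDOG_TIMEOUT.
        effective_swt = required_swt
    else:
        effective_swt = stonith_watchdog_timeout
        if effective_swt < required_swt:
            problems.append(
                f"stonith-watchdog-timeout should be >= {required_swt}")

    required_st = min_stonith_timeout(effective_swt)
    if stonith_timeout < required_st:
        problems.append(f"stonith-timeout should be >= {required_st}")
    return problems


if __name__ == "__main__":
    # Values from the example in the diff:
    # SBD_WATCHDOG_TIMEOUT=5, stonith-watchdog-timeout=10, stonith-timeout=15
    print(check_diskless_sbd_timeouts(5, 10, 15))
```

With the values used in the diff (5, 10, and 15), both formulas hold and the check reports no problems.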