HBASE-28158 Decouple RIT list management from TRSP #7375
base: master
af75dcb to 99caa2c
💔 -1 overall. This message was automatically generated.
...erver/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionInTransitionTracker.java
    return regionInTransitionTracker.isRegionInTransition(regionInfo);
  }

  public int getOngoingTRSPCount() {
Call it getInTransitCount()?
"Transit" means something different than "in transition". Javadoc the explanation.
How about scheduledRegionTransition?
hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
  }

- public boolean isInTransition() {
+ public boolean isOngoingTRSP() {
How about isInTransit?
And javadoc that "transit" means something different than "in transition".
How about changing it to isTransitionScheduled?
hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
    removeRegionInTransition(regionInfo);
  }

  private List<RegionState.State> getExceptedRegionStates(RegionStateNode regionStateNode) {
I think this method should be renamed and the code should be documented with comments. You are not really "excepting" regions.
You are basically waiting for the assignment state machine to complete and reach the desired terminal state by checking the region's current state against the table's state (ENABLED vs. DISABLED) to determine if the region is in the terminal state. If not, the region is added to the RIT list; otherwise, it is removed.
Your method naming and comments should communicate this theory of operation.
Relatedly, there may be a minor race condition here. Consider the case where the table's state changes (e.g., from ENABLED to DISABLING) at the same time as a region of that table reports a state change. I think the tracker can momentarily use a stale table state here. However, this is self-correcting: as the region proceeds through its state transitions, each subsequent call to the tracker re-evaluates its status, and the region is eventually removed from RIT correctly.
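The theory of operation described in this comment can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `RitTrackerSketch`, `SketchTableState`, `SketchRegionState`, and `terminalStatesFor` are hypothetical names, and the choice of terminal states (OPEN for an enabled table, CLOSED for a disabled one) is an assumption made for the example.

```java
import java.util.EnumSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-ins for illustration only; these are not the HBase types.
enum SketchTableState { ENABLED, DISABLED }

enum SketchRegionState { OFFLINE, OPENING, OPEN, CLOSING, CLOSED }

final class RitTrackerSketch {
  // Names of regions that have not yet reached a terminal state.
  private final Set<String> regionsInTransition = ConcurrentHashMap.newKeySet();

  // Assumption: OPEN is terminal for an enabled table, CLOSED for a disabled one.
  static Set<SketchRegionState> terminalStatesFor(SketchTableState tableState) {
    return tableState == SketchTableState.ENABLED
      ? EnumSet.of(SketchRegionState.OPEN)
      : EnumSet.of(SketchRegionState.CLOSED);
  }

  // Re-evaluated on every reported state change, so a decision made against
  // a stale table state is corrected by a later call.
  void onRegionStateChange(String regionName, SketchRegionState current,
      SketchTableState tableState) {
    if (terminalStatesFor(tableState).contains(current)) {
      regionsInTransition.remove(regionName);
    } else {
      regionsInTransition.add(regionName);
    }
  }

  boolean isRegionInTransition(String regionName) {
    return regionsInTransition.contains(regionName);
  }
}
```

In this sketch, a region reported as OPEN against a stale ENABLED table state is dropped from the set, but the next report (for example CLOSING against DISABLED) puts it back, and a final CLOSED report removes it again. That is the self-correcting behaviour described above.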
Correct me if I am wrong, but I think this race condition might not occur: before changing the state of any table or region (or running any procedure), we first take a lock (setState using HBCK may be an exception), and this lock might save us from this case.
...erver/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionInTransitionTracker.java
  }

  public void clear() {
    regionInTransition.clear();
When the list is cleared this way there are no log lines indicating the regioninfo was removed from the list. Is that a problem?
Renamed the method to stop; we call it when the master/assignment manager is stopping.
Added a log line.
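For illustration, a stop method that logs what it drops before clearing might look like the sketch below. Everything here is assumed rather than taken from the PR: the class name `StoppableRitListSketch`, the message text, and the use of a `Consumer<String>` in place of a real logger.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Hypothetical sketch; class, method, and message text are assumptions, not the PR's code.
final class StoppableRitListSketch {
  private final Set<String> regionsInTransition = ConcurrentHashMap.newKeySet();
  private final Consumer<String> log; // stand-in for a real logger

  StoppableRitListSketch(Consumer<String> log) {
    this.log = log;
  }

  void add(String regionName) {
    regionsInTransition.add(regionName);
  }

  int size() {
    return regionsInTransition.size();
  }

  // Called when the owner is stopping: snapshot what is dropped, clear, then log once.
  void stop() {
    List<String> dropped = new ArrayList<>(regionsInTransition);
    regionsInTransition.clear();
    log.accept("Stopping; dropped " + dropped.size() + " region(s) in transition: " + dropped);
  }
}
```

Taking the snapshot before clearing keeps the log line consistent with what was actually removed, which addresses the concern that a bare clear leaves no trace of the removed entries.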
  static void removeNonDefaultReplicas(MasterProcedureEnv env, Stream<RegionInfo> regions,
    int regionReplication) {
    // Remove from in-memory states
    // TODO should we not confirm here that replica region are closed or not ?
@apurtell can you help me here? I was curious why we are not confirming whether the replica regions are closed.
I am not sure about this either.
0535506 to 1ebf4f7
💔 -1 overall. This message was automatically generated.
No description provided.