Skip to content
Open
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -30,11 +30,6 @@ public abstract class GCLogParser implements DataSourceParser, SharedPatterns {
*/
public static final String END_OF_DATA_SENTINEL = GCLogFile.END_OF_DATA_SENTINEL;

// TODO: GCID_COUNTER should be in SharedPatterns, not here.
/**
* Rule for parsing the GCID counter.
*/
public static final GCParseRule GCID_COUNTER = new GCParseRule("GCID_COUNTER", " GC\\((\\d+)\\) ");
private JVMEventChannel consumer;
protected Diary diary;
private DateTimeStamp clock = new DateTimeStamp(DateTimeStamp.EPOC, 0.0d);
Expand Down Expand Up @@ -246,9 +241,54 @@ MemoryPoolSummary extractPermGenRecord(GCLogTrace trace) {
* @param line the line to parse.
* @return the extracted GCID, or -1 if not found.
*/
int extractGCID(String line) {
GCLogTrace trace = GCID_COUNTER.parse(line);
return (trace != null) ? trace.getIntegerGroup(1) : -1;
static int extractGCID(String line) {
long packed = extractGCCycleIdAndTextualLength(line);
if (packed == -1) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit-pick - Perhaps extract -1 to a constant called NOT_FOUND or something similar?

return -1;
}
return extractGCCycleId(packed);
}

/**
* Returns a packed long containing two ints:
* - the GC cycle id in the high bytes
* - the length of the text containing the GC cycle id,e.g. 'GC(10)'
* See {@link #extractGCCycleId(long)} and {@link #extractGCCycleIdTextualLength(long)}
*/
protected static long extractGCCycleIdAndTextualLength(String line) {
if (!line.contains("GC(")) {
return -1;
}
// [2025-10-21T16:44:29.311+0200][3645.640s] GC(35) Pause Young (Allocation Failure)
// we search for the value between parenthesis
int start = line.indexOf('(');
Comment on lines +263 to +264
Copy link

Copilot AI Jan 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method finds the first '(' in the line, but doesn't verify it's part of "GC(". If a line has an earlier '(' before "GC(", this will extract the wrong value. For example, if a log line contains "(something) GC(35)", it would try to parse "something" as the GC ID. Consider using line.indexOf("GC(") and then adding 2 to get the position after "GC(", or change line.indexOf('(') to line.indexOf("GC(") + 2 to ensure you're finding the correct parenthesis.

This issue also appears in the following locations of the same file:

  • line 271
Suggested change
// we search for the value between parenthesis
int start = line.indexOf('(');
// we search for the value between parenthesis, specifically the one from "GC("
int gcMarker = line.indexOf("GC(");
if (gcMarker == -1) {
return -1;
}
// gcMarker points to 'G', so gcMarker + 2 is the index of '(' in "GC("
int start = gcMarker + 2;

Copilot uses AI. Check for mistakes.
int end = line.indexOf(')', start);
if (start == -1 || end == -1) {
return -1;
}
try {
int gcId = Integer.parseInt(line, start + 1, end, 10);
// add the closing parenthesis to the length
int endToReturn = end + 1;
// pack the two ints in a long
return (((long) gcId) << 32) | (endToReturn & 0xFFFFFFFFL);
} catch (Exception e) {
throw new RuntimeException("Failed to extract gc cycle id from " + line, e);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should actually try to recover from this and just process the next line...

Copy link

Copilot AI Jan 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error handling here is inconsistent with the test expectations. When the input is malformed (e.g., missing closing parenthesis), the test expects the method to return -1, but this catch block will throw a RuntimeException instead. The catch block should return -1 for malformed input to match the expected behavior and maintain consistent error handling with the rest of the method.

Suggested change
throw new RuntimeException("Failed to extract gc cycle id from " + line, e);
Logger.getLogger(GCLogParser.class.getName())
.log(Level.FINE, "Failed to extract GC cycle id from line: " + line, e);
return -1;

Copilot uses AI. Check for mistakes.
}
}

protected static int extractGCCycleId(long packedGcCycleIdAndEnd) {
if (packedGcCycleIdAndEnd == -1) {
return -1;
}
return (int) (packedGcCycleIdAndEnd >> 32);
}
Comment on lines +280 to +285
Copy link

Copilot AI Jan 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing JavaDoc documentation for this method. Add JavaDoc that explains the method's purpose, parameters, and return value. For example: "Extracts the GC cycle ID from a packed long value. @param packedGcCycleIdAndEnd the packed long containing the GC cycle ID in the upper 32 bits @return the GC cycle ID, or -1 if the packed value is -1".

This issue also appears in the following locations of the same file:

  • line 287

Copilot uses AI. Check for mistakes.

protected static int extractGCCycleIdTextualLength(long packedGcCycleIdAndEnd) {
if (packedGcCycleIdAndEnd == -1) {
return 0;
}
return (int) packedGcCycleIdAndEnd;
}

/**
Expand Down Expand Up @@ -281,4 +321,4 @@ public void publishTo(JVMEventChannel channel) {
public ChannelName channel() {
return ChannelName.DATA_SOURCE;
}
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -4,11 +4,7 @@

import com.microsoft.gctoolkit.GCToolKit;
import com.microsoft.gctoolkit.aggregator.EventSource;
import com.microsoft.gctoolkit.event.CPUSummary;
import com.microsoft.gctoolkit.event.GarbageCollectionTypes;
import com.microsoft.gctoolkit.event.MalformedEvent;
import com.microsoft.gctoolkit.event.MemoryPoolSummary;
import com.microsoft.gctoolkit.event.RegionSummary;
import com.microsoft.gctoolkit.event.*;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer explicit imports

Copy link

Copilot AI Jan 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The wildcard import should be avoided. While this change replaces multiple specific imports with a wildcard import, it makes it unclear which specific classes are being used from the event package. Consider keeping the explicit imports or at least listing them individually to improve code readability and maintainability.

Suggested change
import com.microsoft.gctoolkit.event.*;

Copilot uses AI. Check for mistakes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bad bot suggested edit but the comment is good.

import com.microsoft.gctoolkit.event.g1gc.G1ConcurrentUndoCycle;
import com.microsoft.gctoolkit.event.g1gc.G1GCConcurrentEvent;
import com.microsoft.gctoolkit.event.g1gc.G1GCEvent;
Expand All @@ -24,7 +20,6 @@
import com.microsoft.gctoolkit.parser.unified.UnifiedG1GCPatterns;
import com.microsoft.gctoolkit.time.DateTimeStamp;

import java.util.AbstractMap;
import java.util.LinkedList;
import java.util.Map;
import java.util.Queue;
Expand All @@ -33,8 +28,6 @@
import java.util.function.BiConsumer;
import java.util.logging.Level;
import java.util.logging.Logger;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import static com.microsoft.gctoolkit.event.GarbageCollectionTypes.fromLabel;

Expand Down Expand Up @@ -177,36 +170,38 @@ protected void process(String line) {
parse(line);
}

private static final Pattern gcIdPattern = GCLogParser.GCID_COUNTER.pattern();

private void parse(String line) {

// Minor optimization. The parse rule only applies to what comes after the GC ID.
final int end;
final int gcid;
final Matcher gcIdMatcher = gcIdPattern.matcher(line);
if (gcIdMatcher.find()) {
gcid = Integer.parseInt(gcIdMatcher.group(1));
end = gcIdMatcher.end();
long packedGcCycleIdAndEnd = extractGCCycleIdAndTextualLength(line);
if (packedGcCycleIdAndEnd != -1) {
gcid = extractGCCycleId(packedGcCycleIdAndEnd);
end = extractGCCycleIdTextualLength(packedGcCycleIdAndEnd);
} else {
gcid = -1;
end = 0;
}

final String lineAfterGcId = line.substring(end);
parseRules.stream()
.map(Map.Entry::getKey)
.map(rule -> new AbstractMap.SimpleEntry<>(rule, rule.parse(lineAfterGcId)))
.filter(tuple -> tuple.getValue() != null)
.findAny()
.ifPresentOrElse(
tuple -> {
// Typically, "end" will be greater than zero, but not always.
setForwardReference(gcid, end > 0 ? line.substring(0, end) : line);
applyRule(tuple.getKey(), tuple.getValue(), line);
},
() -> log(line)
);
GCParseRule matchedParseRule = null;
GCLogTrace parsed = null;
for (Map.Entry<GCParseRule, BiConsumer<GCLogTrace, String>> entry : parseRules.entrySet()) {
GCParseRule rule = entry.getKey();
parsed = rule.parse(lineAfterGcId);
if (parsed != null) {
matchedParseRule = rule;
break;
}
}
if (parsed != null) {
// Typically, "end" will be greater than zero, but not always.
setForwardReference(gcid, end > 0 ? line.substring(0, end) : line);
applyRule(matchedParseRule, parsed, line);
} else {
log(line);
}
}


Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -170,11 +170,11 @@ private void youngHeader(GCLogTrace trace, String line) {
if (pauseEvent != null)
LOGGER.warning("Young pause event not recorded: " + pauseEvent.getGcID());
if (diary.isCMS())
pauseEvent = new GenerationalForwardReference(ParNew, new Decorators(line), super.GCID_COUNTER.parse(line).getIntegerGroup(1));
pauseEvent = new GenerationalForwardReference(ParNew, new Decorators(line), extractGCID(line));
else if (diary.isPSYoung())
pauseEvent = new GenerationalForwardReference(PSYoungGen, new Decorators(line), super.GCID_COUNTER.parse(line).getIntegerGroup(1));
pauseEvent = new GenerationalForwardReference(PSYoungGen, new Decorators(line), extractGCID(line));
else if (diary.isSerialFull())
pauseEvent = new GenerationalForwardReference(DefNew, new Decorators(line), super.GCID_COUNTER.parse(line).getIntegerGroup(1));
pauseEvent = new GenerationalForwardReference(DefNew, new Decorators(line), extractGCID(line));
else {
LOGGER.warning("Unrecognized collection phase -> " + line);
return;
Expand Down Expand Up @@ -261,7 +261,7 @@ private void tenuringAgeBreakout(GCLogTrace trace, String line) {
private void initialMark(GCLogTrace trace, String line) {
if (concurrentCyclePauseEvent != null)
LOGGER.warning("Pause event not completely recorded: " + pauseEvent.getGcID());
concurrentCyclePauseEvent = new GenerationalForwardReference(InitialMark, new Decorators(line), GCID_COUNTER.parse(line).getIntegerGroup(1));
concurrentCyclePauseEvent = new GenerationalForwardReference(InitialMark, new Decorators(line), extractGCID(line));
concurrentCyclePauseEvent.setStartTime(getClock());
}

Expand All @@ -278,7 +278,7 @@ private void concurrentPhaseStart(GCLogTrace trace, String line) {
LOGGER.warning("Unknown concurrent phase: " + line);
return;
}
concurrentEvent = new GenerationalForwardReference(gcType, new Decorators(line), GCID_COUNTER.parse(line).getIntegerGroup(1));
concurrentEvent = new GenerationalForwardReference(gcType, new Decorators(line), extractGCID(line));
concurrentEvent.setStartTime(getClock());
inConcurrentPhase = true;
}
Expand All @@ -299,7 +299,7 @@ private void workerThreads(GCLogTrace trace, String line) {
private void remark(GCLogTrace trace, String line) {
if (concurrentCyclePauseEvent != null)
LOGGER.warning("Pause event not recorded and is about to be lost: " + pauseEvent.getGcID());
concurrentCyclePauseEvent = new GenerationalForwardReference(Remark, new Decorators(line), GCID_COUNTER.parse(line).getIntegerGroup(1));
concurrentCyclePauseEvent = new GenerationalForwardReference(Remark, new Decorators(line), extractGCID(line));
concurrentCyclePauseEvent.setStartTime(getClock());
}

Expand Down Expand Up @@ -328,17 +328,17 @@ private void promotionFailed(GCLogTrace trace, String line) {
private void fullGC(GCLogTrace trace, String line) {
if (pauseEvent == null) {
if (diary.isPSOldGen())
pauseEvent = new GenerationalForwardReference(PSFull, new Decorators(line), super.GCID_COUNTER.parse(line).getIntegerGroup(1));
pauseEvent = new GenerationalForwardReference(PSFull, new Decorators(line), extractGCID(line));
else
pauseEvent = new GenerationalForwardReference(FullGC, new Decorators(line), super.GCID_COUNTER.parse(line).getIntegerGroup(1));
pauseEvent = new GenerationalForwardReference(FullGC, new Decorators(line), extractGCID(line));
pauseEvent.setStartTime(getClock());
} else if (pauseEvent.getGarbageCollectionType() == ParNew) {
pauseEvent.convertToConcurrentModeFailure();
} else if (pauseEvent.getGarbageCollectionType() == DefNew) {
pauseEvent.convertToSerialFull();
} else if (pauseEvent.getGarbageCollectionType() != ConcurrentModeFailure) {
LOGGER.warning("Maybe Full Pause event not recorded: " + pauseEvent.getGcID()); //todo: difficult to know if this is a full or a CMF
pauseEvent = new GenerationalForwardReference(FullGC, new Decorators(line), super.GCID_COUNTER.parse(line).getIntegerGroup(1));
pauseEvent = new GenerationalForwardReference(FullGC, new Decorators(line), extractGCID(line));
pauseEvent.setStartTime(getClock());
}
pauseEvent.setGCCause(trace.gcCause(1, 0));
Expand Down Expand Up @@ -384,10 +384,9 @@ private void jvmExit(GCLogTrace trace, String line) {
private ArrayList<GenerationalGCPauseEvent> cache = new ArrayList<>();

private void cpuBreakout(GCLogTrace trace, String line) {
GCLogTrace gcidTrace = GCID_COUNTER.parse(line);
if (gcidTrace != null) {
int gcid = extractGCID(line);
if (gcid != -1) {
CPUSummary cpuSummary = new CPUSummary(trace.getDoubleGroup(1), trace.getDoubleGroup(2), trace.getDoubleGroup(3));
int gcid = gcidTrace.getIntegerGroup(1);
// There are 3 cases to consider.
// - pause event outside of a concurrent cycle
// - pause event that is part of the concurrent cycle
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,32 @@
package com.microsoft.gctoolkit.parser;

import org.junit.jupiter.api.Test;

import static org.junit.jupiter.api.Assertions.*;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer explicit imports


class GCLogParserTest {
@Test
void extractPackedGCCycleIdAndTextualLength() {
long packed = GCLogParser.extractGCCycleIdAndTextualLength("[2025-10-21T16:44:29.311+0200][3645.640s] GC(35) Pause Young (Allocation Failure)");
assertEquals(35, GCLogParser.extractGCCycleId(packed));
assertEquals(48, GCLogParser.extractGCCycleIdTextualLength(packed));
}

@Test
void extractGCID() {
assertEquals(35, GCLogParser.extractGCID("[2025-10-21T16:44:29.311+0200][3645.640s] GC(35) Pause Young (Allocation Failure)"));
}
Comment on lines +16 to +19
Copy link

Copilot AI Jan 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing test coverage for an important edge case: lines that contain a '(' character before "GC(". For example, a line like "[timestamp] (something) GC(35) pause" would incorrectly try to parse "something" as the GC ID. Add a test case to verify the method correctly identifies the "GC(" pattern and not just any parenthesis.

Copilot uses AI. Check for mistakes.

@Test
void extractPackedGCCycleIdAndTextualLength_malformed() {
long packed = GCLogParser.extractGCCycleIdAndTextualLength("[2025-10-21T16:44:29.311+0200][3645.640s] GC(3");
assertEquals(-1 , packed);
assertEquals(-1, GCLogParser.extractGCCycleId(packed));
assertEquals(0, GCLogParser.extractGCCycleIdTextualLength(packed));
}

@Test
void extractGCID_malformed() {
assertEquals(-1, GCLogParser.extractGCID("[2025-10-21T16:44:29.311+0200][3645.640s] GC(3"));
}
}
Loading