feat(sequence-scatter-background): add categorical background drawing… #3759

Open
wants to merge 1 commit into
base: feat/sequence-scatter

EchoChenGithub

…(in progress)


🤔 This is a ...

  • New feature
  • Bug fix
  • TypeScript definition update
  • Bundle size optimization
  • Performance optimization
  • Enhancement feature
  • Refactoring
  • Update dependency
  • Code style optimization
  • Test Case
  • Branch merge
  • Release
  • Site / documentation update
  • Demo update
  • Workflow
  • Other (about what?)

🔗 Related issue link

#3574

🔗 Related PR link

🐞 Bugserver case id

💡 Background and solution

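  1. Problem and scenario:

A first implementation of categorical background drawing: to illustrate the results of machine-learning clustering algorithms, the chart dynamically shows how tightly data points with different labels cluster at each iteration.

  1. Solution and implementation:

Created the SequenceScatterChartSpecTransformer class, which converts the sequence-scatter-specific spec into a general VChart spec.

It mainly covers:

  • Data transformation: the processSequenceData function processes the raw sequence data and converts it into a format VChart can consume.
  • KDE density estimation: kernel density estimation (KDE) is used to compute the density of the data points, and a customMark overlays the density on the scatter plot. Density is encoded in the fillOpacity of symbol marks: the higher the density, the more opaque the mark and the more visually prominent it is.
  • Animation support: added support for the player configuration, so sequence data can be played back as an animation.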
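To make the KDE step above concrete, here is a minimal sketch of a 2D Gaussian product-kernel estimate, following the `density / (points.length * h * h)` normalization visible in the diff. Apart from `gaussKernel`, every name here (`Point`, `kdeAt`, `densityToFillOpacity`) is an assumption for illustration, and the clamp to [0, 1] is only one plausible way to map a raw density to fillOpacity; the PR's actual code may differ.

```typescript
// Illustrative sketch only; not the PR's actual implementation.
interface Point {
  x: number;
  y: number;
}

// Standard one-dimensional Gaussian kernel.
function gaussKernel(u: number): number {
  return Math.exp(-0.5 * u * u) / Math.sqrt(2 * Math.PI);
}

// Density at (x, y) using a product of two 1D Gaussian kernels with a
// shared bandwidth h, normalized by n * h^2.
function kdeAt(x: number, y: number, points: Point[], h: number): number {
  let density = 0;
  for (const p of points) {
    density += gaussKernel((x - p.x) / h) * gaussKernel((y - p.y) / h);
  }
  return density / (points.length * h * h);
}

// Assumed mapping: use the raw density as fillOpacity, clamped to [0, 1].
function densityToFillOpacity(density: number): number {
  return Math.min(1, Math.max(0, density));
}
```

A grid cell close to many points of a label gets a higher density, and therefore a more opaque background mark, than a cell far from them.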
  1. Results
    Figure 1 shows the result for training set 1, and Figure 2 shows the result for training set 2.
    training1
    training2

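  2. Future improvements

KDE parameter tuning: a future change will add an option to set the bandwidth manually. The bandwidth controls how smooth the kernel density estimate is: the larger the bandwidth, the smoother the estimate, because each data point influences a wider area and more neighboring points are taken into account. This gives users finer control over the KDE result.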
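The effect of the bandwidth can be seen in a small one-dimensional sketch. The function and sample values below are made up for illustration and are not taken from the PR; the example compares the estimated density at a cluster center with the density in the gap between clusters.

```typescript
// Illustrative 1D Gaussian KDE; names and sample data are hypothetical.
function gaussianKde1d(x: number, samples: number[], h: number): number {
  let sum = 0;
  for (const s of samples) {
    const u = (x - s) / h;
    sum += Math.exp(-0.5 * u * u) / Math.sqrt(2 * Math.PI);
  }
  return sum / (samples.length * h);
}

// Three points clustered near 0.2 and one isolated point at 3.
const bandwidthSamples = [0, 0.2, 0.4, 3];

// Ratio of the density at the cluster center to the density in the gap.
// A large ratio means a spiky estimate; a ratio near 1 means a smooth one.
const peakToValley = (h: number): number =>
  gaussianKde1d(0.2, bandwidthSamples, h) / gaussianKde1d(1.7, bandwidthSamples, h);

console.log(peakToValley(0.2) > peakToValley(2)); // narrow bandwidth is far spikier
```

With a small bandwidth the gap between the clusters is almost empty, while a large bandwidth spreads each point's influence so the estimate becomes nearly flat.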

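Normalization strategy: normalization of the KDE density values is temporarily commented out, so the raw density values currently map directly to opacity. This preserves the relative magnitude of KDE values across labels, but without a unified density range the visual presentation of density may be inconsistent across datasets. A more general strategy is needed. (The current strategy generates the same 100*100 = 10000 marks for each label, so the background of tightly clustered points shows no visible opacity differences, as with the purple group in Figure 1 of the results.)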
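For reference, one candidate strategy is per-label min-max normalization to [0, 1], which matches the commented-out step mentioned in the diff. The sketch below is an assumption about how that could be written: the `KdeCell` shape mirrors the `kdeResult` element type shown in the review, while `normalizeByLabel` is a hypothetical helper. Note that it trades away the cross-label comparability that the paragraph above describes.

```typescript
// Shape mirrors the kdeResult entries in the diff; the helper is hypothetical.
interface KdeCell {
  x: number;
  y: number;
  kde: number;
  label: string;
}

// Rescale each label's densities to [0, 1] independently.
function normalizeByLabel(cells: KdeCell[]): KdeCell[] {
  const ranges = new Map<string, { min: number; max: number }>();
  for (const c of cells) {
    const r = ranges.get(c.label);
    if (!r) {
      ranges.set(c.label, { min: c.kde, max: c.kde });
    } else {
      r.min = Math.min(r.min, c.kde);
      r.max = Math.max(r.max, c.kde);
    }
  }
  return cells.map(c => {
    const r = ranges.get(c.label)!;
    const span = r.max - r.min;
    // A label whose densities are all equal has no range; map it to 0.
    return { ...c, kde: span > 0 ? (c.kde - r.min) / span : 0 };
  });
}
```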

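Performance optimization: the computation in the calculateKDE function will be optimized to improve performance on large datasets.

Customization: more configuration options will be provided so users can customize the color, shape, and other properties of the KDE density marks to suit their needs.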

📝 Changelog

Language Changelog
🇺🇸 English
🇨🇳 Chinese

☑️ Self-Check before Merge

⚠️ Please check all items below before requesting a review. ⚠️

  • Doc is updated/provided or not needed
  • Demo is updated/provided or not needed
  • TypeScript definition is updated/provided or not needed
  • Changelog is provided or not needed

🚀 Summary

copilot:summary

🔍 Walkthrough

copilot:walkthrough

xile611 requested a review from skie1997 on February 25, 2025, 09:24
const xExtent = { min: xMin - xExpand, max: xMax + xExpand };
const yExtent = { min: yMin - yExpand, max: yMax + yExpand };

const xStep = (xExtent.max - xExtent.min) / bins;
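Contributor

The step determines the size of each background fill block, and that size should stay constant.
The current logic effectively keeps the number of fill blocks per group constant, rather than keeping the block width constant.
With a fixed value (for example, 0.1), the fill blocks of the bottom-right group no longer pile up on top of each other.
After the change:
image

Before the change:
image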
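A rough sketch of the fixed-step idea from this comment: instead of dividing each group's extent by a fixed bin count, walk the extent with a constant step so every fill block has the same width. `GRID_STEP` and `gridCoordinates` are hypothetical names, and 0.1 is simply the example value from the comment.

```typescript
// Hypothetical constant; 0.1 is the example value from the review comment.
const GRID_STEP = 0.1;

// Grid coordinates from min to max with a constant step. The count is
// computed first and each coordinate derived from its index, which avoids
// accumulating floating-point drift from repeated addition.
function gridCoordinates(min: number, max: number, step: number = GRID_STEP): number[] {
  // The small epsilon guards against results such as 0.3 / 0.1 = 2.9999999999999996.
  const count = Math.floor((max - min) / step + 1e-9) + 1;
  const coords: number[] = [];
  for (let i = 0; i < count; i++) {
    coords.push(min + i * step);
  }
  return coords;
}

console.log(gridCoordinates(0, 1).length); // 11 blocks for a compact group
console.log(gridCoordinates(0, 5).length); // 51 blocks for a spread-out group
```

With a constant step, a compact group gets fewer blocks of the same size rather than the same number of smaller blocks.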

symbolType: 'rect',
x: (datum: any, ctx: any) => {
// Get the region position
// const regionStartPoint = ctx.chart.getAllRegions()[0].getLayoutStartPoint();
Contributor

The commented-out code can be removed.

{
id: 'colorScale',
type: 'ordinal',
specified: {
Contributor

Suggest extracting the color palette into a constant.


const kdeResult: Array<{ x: number; y: number; kde: number; label: string }> = [];

const expandRatio = 0.2; // expansion ratio
Contributor

Make this a constant, const EXPAND_RATIO = 0.2, and move it into constant.ts in the same directory.
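As a sketch of what this could look like: the constant is named as the reviewer suggests, while `expandExtent` and the way the padding is derived (range times ratio) are assumptions, since the diff does not show how xExpand and yExpand are computed.

```typescript
// Would live in constant.ts and be exported from there, per the review comment.
const EXPAND_RATIO = 0.2;

// Hypothetical helper: pad a data extent on both sides by a fraction of its range.
// Assumes the expansion is (max - min) * ratio, which the diff does not confirm.
function expandExtent(
  min: number,
  max: number,
  ratio: number = EXPAND_RATIO
): { min: number; max: number } {
  const expand = (max - min) * ratio;
  return { min: min - expand, max: max + expand };
}

console.log(expandExtent(0, 10)); // { min: -2, max: 12 }
```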

}
]
});
});
return result;
}

// KDE-related utility functions
function gaussKernel(x: number) {
Contributor

Utility functions can go in a utils.ts file in the same directory.

xField: spec.xField,
yField: spec.yField
dataIndex: 0,
xField: 'x',
Contributor

Read spec.xField, spec.yField, and spec.seriesField from the config.

}
density = density / (points.length * h * h);
densities.push(density); // stash the density value for now
kdeResult.push({ x, y, kde: density, label }); // also store it in kdeResult for now
Contributor

Keep the data fields consistent with the spec: [spec.xField]: x, [spec.yField]: y....
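The suggestion relies on computed property keys, so each KDE datum uses whatever field names the spec declares. A minimal sketch, in which `FieldSpec` and `toKdeDatum` are hypothetical names and only the xField, yField, and seriesField properties come from the review:

```typescript
// Hypothetical minimal view of the spec; only these three fields are assumed.
interface FieldSpec {
  xField: string;
  yField: string;
  seriesField: string;
}

// Build one KDE datum whose keys follow the spec's field names.
function toKdeDatum(
  spec: FieldSpec,
  x: number,
  y: number,
  kde: number,
  label: string
): Record<string, number | string> {
  return {
    [spec.xField]: x,
    [spec.yField]: y,
    [spec.seriesField]: label,
    kde
  };
}

const exampleDatum = toKdeDatum({ xField: "px", yField: "py", seriesField: "cluster" }, 1, 2, 0.5, "a");
console.log(exampleDatum); // keys are px, py, cluster, kde
```

This keeps the background marks and the scatter series reading the same field names, so changing the spec does not require touching the KDE code.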

}
}

// // Normalize each label's KDE density values to the [0, 1] range
Contributor

maxDensity and minDensity can be tracked by direct comparison while calculateKDE runs, as in 👇, which avoids the extra performance cost.
image
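Since the screenshot is not reproduced here, the following is only a guess at the pattern being suggested: update the running minimum and maximum inside the same loop that computes each density, instead of scanning the densities array again afterwards. `collectDensities` and its callback parameter are hypothetical.

```typescript
interface DensityRange {
  densities: number[];
  minDensity: number;
  maxDensity: number;
}

// Compute densities for gridSize cells and track min/max in the same pass.
// densityAt stands in for the per-cell KDE computation.
function collectDensities(gridSize: number, densityAt: (index: number) => number): DensityRange {
  const densities: number[] = [];
  let minDensity = Infinity;
  let maxDensity = -Infinity;
  for (let i = 0; i < gridSize; i++) {
    const d = densityAt(i);
    densities.push(d);
    if (d < minDensity) minDensity = d;
    if (d > maxDensity) maxDensity = d;
  }
  return { densities, minDensity, maxDensity };
}
```

Compared with calling Math.min(...densities) and Math.max(...densities) afterwards, this avoids two extra passes and the argument-count limits of spreading a large array.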
