
Commit 99c57a1

GRN feature selection usage example & documentation
1 parent e9bcaaf commit 99c57a1

3 files changed

Lines changed: 23 additions & 2 deletions


autox/autox_competition/feature_selection/Feature_selction.py renamed to autox/autox_competition/feature_selection/GRN_feature_selection.py

Lines changed: 1 addition & 1 deletion
@@ -328,7 +328,7 @@ def train_fn(dataloaders, device, cat_num_classes, real_num):
 
     return weights
 
-class Feature_selection():
+class GRN_feature_selection():
     def __init__(self):
         self.new_columns = []
         # self.new_df = None

autox/autox_competition/feature_selection/README.md

Lines changed: 21 additions & 0 deletions
@@ -15,3 +15,24 @@
 ### Usage examples
 - [elo_AdversarialValidation_AutoX](https://www.kaggle.com/code/poteman/elo-adversarialvalidation-autox)
 
+# GRN Feature Selection
+### Description
+Select the features that rank highest by importance.
+How it works:
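+```
+Step 1: Prepare a dataset that contains at least one continuous-valued target column, together with its column definition. The column definition has the following format:
+column_definition = {
+    'target': [target column name],
+    'num': [continuous feature column 1, continuous feature column 2, ..., continuous feature column N],
+    'cat': [categorical feature column 1, categorical feature column 2, ..., categorical feature column N]
+}
+Step 2: Using the column definition, take the N num columns and the N cat columns, preprocess the num columns with MinMaxScaler, split the data into a training set and a validation set,
+and wrap them as Dataloaders that feed a model made up of a GRN and a single-layer nn;
+Step 3: Inside the model, embed the cat inputs and concatenate them with the num inputs, giving a 2*N input that is passed to the GRN;
+Step 4: The GRN computes a weight for each of the 2*N features and multiplies the feature inputs by these weights; the result is passed to the single-layer nn, which maps it to 1 dimension to compute the loss against the target, and the weights are updated by backpropagating that loss;
+Step 5: After each training iteration the model computes the loss on the validation set once; after 8 training iterations, the feature weights from the iteration with the best validation score are taken as the final result;
+Step 6: Given the required final number of features, select the features that rank highest by weight as the output, and extract the corresponding columns from the original dataset as the new dataset.
+```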
+### Usage examples
+- [ubiquant_GRNFeatureSelection_AutoX](https://www.kaggle.com/hengwdai/grn-featureselection-autox)
+
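The bookkeeping around the model in the README steps (scaling in step 2, picking the best iteration in step 5, ranking features in step 6) can be sketched in plain Python. This is a minimal illustration, not the code in `GRN_feature_selection.py`: the helper names `min_max_scale`, `best_epoch_weights` and `select_top_features`, the column names, and all weights and losses are made up for the example, and the GRN itself and its training loop are not implemented here.

```python
# Illustrative sketch only: helper names, column names, weights and losses
# are hypothetical and are not taken from GRN_feature_selection.py.

def min_max_scale(values):
    """Rescale a list of numbers to [0, 1], the default range of MinMaxScaler."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant column: map everything to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def best_epoch_weights(history):
    """Return the weights from the iteration with the lowest validation loss.

    `history` is a list of (validation_loss, weights) pairs, one per iteration.
    """
    _, weights = min(history, key=lambda pair: pair[0])
    return weights


def select_top_features(names, weights, k):
    """Return the k feature names with the largest weights, best first."""
    ranked = sorted(zip(names, weights), key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ranked[:k]]


# Column definition in the format described in step 1 (hypothetical names).
column_definition = {
    "target": ["target"],
    "num": ["num_a", "num_b"],
    "cat": ["cat_a", "cat_b"],
}
feature_names = column_definition["num"] + column_definition["cat"]

# Step 2: scale one continuous column.
scaled = min_max_scale([10.0, 20.0, 30.0])

# Step 5: pretend three iterations produced these validation losses and weights.
history = [
    (0.90, [0.25, 0.25, 0.25, 0.25]),
    (0.40, [0.10, 0.50, 0.30, 0.10]),
    (0.55, [0.05, 0.60, 0.30, 0.05]),
]
weights = best_epoch_weights(history)

# Step 6: keep the two highest-weighted features.
selected = select_top_features(feature_names, weights, k=2)
print(scaled)
print(selected)
```

With a pandas DataFrame, the final extraction in step 6 would then amount to indexing the original data by the `selected` column names.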
Lines changed: 1 addition & 1 deletion
@@ -1,2 +1,2 @@
-from .Feature_selction import Feature_selection
+from .GRN_feature_selection import GRN_feature_selection
 from .adversarial_validation import AdversarialValidation
