
ID increasing #43

Open
Young-eng opened this issue Feb 8, 2025 · 15 comments

@Young-eng commented Feb 8, 2025

When I trained on my custom dataset, I saw the IDs keep increasing over frames during evaluation, resulting in poor precision and accuracy.

Image

Image

Objects in my dataset have only one class, and the number of objects in every frame is fixed, so this is strange.

Are there any parameters I need to change? Thanks so much!

Here are my parameters in config:

SUPER_CONFIG_PATH:

MODE: train # "train" or "eval" or "submit", for the main.py script.

# System config, like CPU/GPU

NUM_CPU_PER_GPU: # number of CPU per GPU
NUM_WORKERS: 10
DEVICE: cuda
AVAILABLE_GPUS: 0,

# Git version:

GIT_VERSION: # you should input the git version here, if you are using wandb to log your experiments.

# Datasets:

DATASETS: [DanceTrack] # for joint training, there may be multiple datasets, like: [CrowdHuman, MOT17]
DATASET_SPLITS: [train] # and corresponding splits, like: [train, val]
DATA_ROOT: /home/Projects/MOTIP/data/dataset/mot_final # datasets root

# Sampling settings:

SAMPLE_STEPS: [0]
SAMPLE_LENGTHS: [40]
SAMPLE_MODES: [random_interval]
SAMPLE_INTERVALS: [4]

# Data augmentation settings:

AUG_RESIZE_SCALES: [480, 512, 544, 576, 608, 640, 672, 704, 736, 768, 800]
AUG_MAX_SIZE: 1333
AUG_RANDOM_RESIZE: [400, 500, 600]
AUG_RANDOM_CROP_MIN: 384
AUG_RANDOM_CROP_MAX: 600
AUG_OVERFLOW_BBOX: False
AUG_REVERSE_CLIP: 0.0
AUG_RANDOM_SHIFT_MAX_RATIO: 0.06 # Only for static images

# Model settings:

NUM_ID_VOCABULARY: 50
NUM_CLASSES: 1
MAX_TEMPORAL_LENGTH: 40
ID_LOSS_WEIGHT: 1
ID_LOSS_GPU_AVERAGE: True
ID_DECODER_LAYERS: 6
SEQ_HIDDEN_DIM: 256
SEQ_DIM_FEEDFORWARD: 512
SEQ_NUM_HEADS: 8

# Backbone:

BACKBONE: resnet50
DILATION: False

# About the DETR framework:

DETR_NUM_QUERIES: 300
DETR_NUM_FEATURE_LEVELS: 4
DETR_AUX_LOSS: True
DETR_WITH_BOX_REFINE: True
DETR_TWO_STAGE: False
DETR_MASKS: False
DETR_HIDDEN_DIM: 256
DETR_PE: sine
DETR_ENC_LAYERS: 6
DETR_DEC_LAYERS: 6
DETR_NUM_HEADS: 8
DETR_DIM_FEEDFORWARD: 1024
DETR_DROPOUT: 0.0
DETR_DEC_N_POINTS: 4
DETR_ENC_N_POINTS: 4
DETR_CLS_LOSS_COEF: 2.0
DETR_BBOX_LOSS_COEF: 5.0
DETR_GIOU_LOSS_COEF: 2.0
DETR_FOCAL_ALPHA: 0.25
DETR_PRETRAIN: ./pretrains/r50_deformable_detr_coco_dancetrack.pth
DETR_FRAMEWORK: Deformable-DETR

# Training settings:

TRAIN_STAGE: joint
SEED: 42
USE_DISTRIBUTED: False
DETR_NUM_TRAIN_FRAMES: 4

# The two parameters below are for memory-optimized DETR training:

DETR_CHECKPOINT_FRAMES: 2
SEQ_DECODER_CHECKPOINT: False

# Training augmentation:

TRAJ_DROP_RATIO: 0.5
TRAJ_SWITCH_RATIO: 0.3

# Training scheduler:

EPOCHS: 40
LR: 1.0e-4
LR_BACKBONE_NAMES: [backbone.0]
LR_BACKBONE_SCALE: 0.1
LR_LINEAR_PROJ_NAMES: [reference_points, sampling_offsets]
LR_LINEAR_PROJ_SCALE: 0.05
LR_WARMUP_EPOCHS: 1
WEIGHT_DECAY: 0.0005
CLIP_MAX_NORM: 0.1
SCHEDULER_TYPE: MultiStep
SCHEDULER_MILESTONES: [8, 12]
SCHEDULER_GAMMA: 0.1
BATCH_SIZE: 1
ACCUMULATE_STEPS: 2
RESUME_MODEL: "/home/Projects/MOTIP/outputs/checkpoint_13.pth"
RESUME_OPTIMIZER: True
RESUME_SCHEDULER: True
RESUME_STATES: True

# Inference:

INFERENCE_MODEL: "/home/Projects/MOTIP/outputs/checkpoint_15.pth"
INFERENCE_ONLY_DETR: False
INFERENCE_DATASET: DanceTrack
INFERENCE_SPLIT: val
INFERENCE_CONFIG_PATH: # mostly, you don't need to set this parameter. See submit_engine.py L34 for more details.
INFERENCE_GROUP:
INFERENCE_MAX_SIZE: 1333
INFERENCE_ENSEMBLE: 0

# Thresholds:

ID_THRESH: 0.2
DET_THRESH: 0.3
NEWBORN_THRESH: 0.6
AREA_THRESH: 100

# Outputs:

OUTPUTS_DIR: /home/Projects/MOTIP/outputs/
OUTPUTS_PER_STEP: 100
SAVE_CHECKPOINT_PER_EPOCH: 1
USE_TENSORBOARD: True
USE_WANDB: False
PROJECT_NAME: MOTIP
EXP_NAME: r50_deformable_detr_motip_dancetrack
EXP_GROUP: default
EXP_OWNER:

# Settings used to reduce the memory usage of the DETR criterion.
# Too many objects (such as in CrowdHuman) may cause an OOM error.

MEMORY_OPTIMIZED_DETR_CRITERION: False
AUTO_MEMORY_OPTIMIZED_DETR_CRITERION: False
CHECKPOINT_DETR_CRITERION: False

@HELLORPG (Collaborator) commented Feb 8, 2025

What's the id_loss value during training?

@Young-eng (Author)

The id_loss looks like this:

Image

@HELLORPG (Collaborator) commented Feb 8, 2025

The id_loss seems unusual (too large). Have you pre-trained the DETR model on your custom dataset (as shown in our doc)?

@Young-eng (Author)

No, I directly used your pretrained model, r50_deformable_detr_coco_dancetrack.pth. I thought it would work. :(

@HELLORPG (Collaborator) commented Feb 8, 2025

Excuse me, what is the category of the targets in your custom dataset?

@Young-eng (Author)

One category: Drosophila (fruit flies).

@HELLORPG (Collaborator) commented Feb 8, 2025

Our pre-trained DETR weights are tailored for human tracking. It appears that the categories you're working with differ significantly from our application scenario, so you might need to re-train the DETR on your custom dataset.

@Young-eng (Author) commented Feb 8, 2025

I re-trained DETR and set NUM_ID_VOCABULARY: 20, but the metrics still look weird:

Image

The object IDs from inference keep increasing:

Image

@HELLORPG (Collaborator) commented Feb 9, 2025

How many GPUs are you using for this training?

@Young-eng (Author)

1 GPU, RTX 3090

@HELLORPG (Collaborator) commented Feb 9, 2025

In our experiments, we use 8 GPUs to ensure a batch size of 8. Your configuration (1 GPU with a batch size of 1) deviates from our default setup. Too small a batch size can result in unreliable gradients, which in turn can affect the convergence of the model.

If you only have one GPU, you may need to adjust the hyperparameters carefully. For a related discussion, you can refer to #24 (currently only in Chinese) or #22. Specifically, I recommend:

  1. Increasing ACCUMULATE_STEPS by 8x.
  2. Correspondingly increasing EPOCHS (and related settings) to keep enough total optimization steps (sketched below).
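
A minimal sketch of that adjustment applied to the config posted above (my own illustration, not a tuned recipe; roughly, effective batch size = BATCH_SIZE x ACCUMULATE_STEPS x number of GPUs):

# single-GPU sketch, illustrative values only
BATCH_SIZE: 1          # per-GPU batch size, unchanged
ACCUMULATE_STEPS: 16   # was 2; x8 to approximate the 8-GPU effective batch size
EPOCHS: 50             # was 40; raise it so there are still enough optimization steps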

@Young-eng (Author) commented Feb 10, 2025

Thanks!!!

I set ACCUMULATE_STEPS=16 and EPOCHS=50, and it performs better than before, but it is still not what I expected. Maybe I need to fine-tune some parameters.

Image

@HELLORPG (Collaborator)

You also need to modify some hyperparameters related to EPOCHS so that the learning rate schedule is proportionally scaled.

For example, the settings below should be scaled, in my opinion:

SCHEDULER_MILESTONES: [8, 12]
LR_WARMUP_EPOCHS: 1
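
In other words (my own reading of "proportionally scaled", not a rule from the repo): keep the warmup length and the milestone positions at roughly the same fractions of the total schedule when EPOCHS changes. Schematically:

# schematic rule of thumb, not a tuned recipe:
# if a reference recipe uses EPOCHS: E, LR_WARMUP_EPOCHS: w, SCHEDULER_MILESTONES: [m1, m2],
# then for a new total EPOCHS: E_new, set roughly
#   LR_WARMUP_EPOCHS: w * E_new / E
#   SCHEDULER_MILESTONES: [m1 * E_new / E, m2 * E_new / E]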

Additionally, please note that your dataset is roughly 30x smaller than DanceTrack. Therefore, you may need to further increase the training epochs to ensure there are enough iterations for the model to converge. Honestly, this relies heavily on past experience. If you have any further questions, feel free to continue the discussion. I'm also looking forward to seeing how our method generalizes across different datasets.

@Young-eng (Author)

Thanks for your patience and guidance!

I tuned these two factors and set the epochs to 50, and the results improved.

Image

Image

@HELLORPG (Collaborator) commented Feb 12, 2025

That's good news.

However, too many learning-rate drop steps may not bring improvements. For my part, I suggest:

# from your settings:
EPOCHS: 50
LR_WARMUP_EPOCHS: 5
# my suggestion based on the above settings:
SCHEDULER_MILESTONES: [30, 45]
