Summary
The idea is to replace the entire model with a diffusion-based model. Please see below for details.
Motivation
This model is intended to be more of a "wild idea": it uses a completely custom architecture instead of building on anything from the literature. This is a high-risk experiment that, if successful, could land an impactful paper at an ML conference (such as ICML).
Affected Area
This proposal replaces the ENTIRE model, reusing no portion of the model from Aurora/ClimaX, including the channel reduction. Although #52 is also diffusion-related, this suggestion replaces the entire model instead of just the predictor.
Proposed Approach
The following is a sample of how to run inference with this model. Only one timestep (or however many are generated at the same time) is shown for simplicity:
- Generate the channel label embeddings for the input channels (see [Code Idea]: Generate smart channel label embeddings #53, for example).
- Start diffusion. Please note: there are significant differences between this and normal diffusion.
```python
from random import shuffle  # shuffle pseudorandomly reorders a list in place

init_e = []    # some array representing the embedding of the first/previous timestep
last_e = ...   # embedding of the last sample produced; open question whether this
               # should be a fixed value (such as zero), random, or init_e
output_ch = []  # the list of output channels

for i in range(1000):
    shuffle(output_ch)  # shuffle mutates in place and returns None, so call it first
    for ch in output_ch:
        # run step i of the diffusion to generate the output for channel ch,
        # with last_e as the embedding input (init_e could also be used here)
        last_e = diffuse(i, ch, last_e)

# retrieve final output
```
Alternatives Considered
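As a rough, self-contained sketch of the loop above: `diffuse` is stubbed out as a hypothetical per-channel denoising step, the channel names and step count are made up for illustration, and embeddings are plain lists rather than tensors.

```python
from random import seed, shuffle

# Hypothetical stand-in for one diffusion step; the real signature and
# behavior are open design questions, not part of this proposal.
def diffuse(step, channel, embedding):
    # A real implementation would denoise `embedding` conditioned on the
    # channel label embedding; here we just record the (step, channel) visit.
    return embedding + [(step, channel)]

seed(0)  # reproducible shuffle order for the sketch
init_e = []                        # embedding of the first/previous timestep
last_e = init_e                    # placeholder choice; zeros or random also possible
output_ch = ["t2m", "u10", "v10"]  # example output channel labels (assumed)

n_steps = 4  # small stand-in for the 1000 diffusion steps
for i in range(n_steps):
    order = list(output_ch)
    shuffle(order)                 # visit channels in a random order each step
    for ch in order:
        last_e = diffuse(i, ch, last_e)

# one diffuse call per (step, channel) pair
print(len(last_e))  # → 12
```

The per-step reshuffle is what distinguishes this from running an independent diffusion per channel: every channel's update can condition on the most recent embedding from any other channel.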
There are few alternatives, given the need to support a variable number of channels.