
Plans for supporting higher dimensional data? #708

Closed
chengchingwen opened this issue Mar 27, 2019 · 6 comments · Fixed by #1405

@chengchingwen
Member

Currently most of Flux's models and functions are built for 2-dimensional data (input dimension × batch size), with convolution being the exception. And given the slogan ("Flux is the ML library that doesn't make you tensor"), I guess the intent is to avoid higher-dimensional data (tensors). However, there are lots of models that need those higher-dimensional operations, if only for performance. So I'm wondering what Flux's plan is here: will we eventually end up somewhere like the other DL frameworks?

@MikeInnes
Member

We don't exactly avoid higher-dimensional tensors -- all the conv operations take them, for example. What we don't do is, e.g., automatically reshape inputs to Dense layers. This is more of a convenience issue than a performance one, though, since you can already do the equivalent reshape by hand.

I guess an example of what we might do differently here would be helpful.
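For what it's worth, a minimal sketch of the manual reshape mentioned above: fold the extra dimensions into the batch dimension, apply the layer, then restore the trailing dimensions. The shapes here are illustrative, not from this thread.

    using Flux

    d = Dense(4, 8)                  # operates on (features, batch) matrices
    x = rand(Float32, 4, 10, 32)     # features × sequence × batch

    # Collapse everything after the feature dimension into one batch
    # dimension, apply the layer, then restore the original shape.
    y = reshape(d(reshape(x, 4, :)), 8, 10, 32)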

@pshashk
Contributor

pshashk commented Mar 27, 2019

Some of Flux's functions are designed for a specific data layout. For example, Dense doesn't support data with more than 2 dimensions (#282), and softmax and crossentropy don't have a dims argument (FluxML/NNlib.jl#77).

Some limitations can be worked around by manually iterating over tensor slices, but that would require a more powerful eachslice in Base (it currently works over only a single dimension) and sparse gradients for views (#589) to avoid unnecessary allocations of zeros.
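As an illustration of the softmax limitation, a hedged sketch of computing a softmax along an arbitrary dimension with plain broadcasting; the function name softmax_dims is made up for illustration, it is not NNlib API:

    # Numerically stable softmax over a chosen dimension, as a stop-gap
    # while NNlib's softmax has no `dims` keyword.
    function softmax_dims(x::AbstractArray; dims = 1)
        m = maximum(x; dims = dims)    # subtract the max for stability
        e = exp.(x .- m)
        return e ./ sum(e; dims = dims)
    end

    x = randn(10, 5, 32)
    p = softmax_dims(x; dims = 1)      # slices along dim 1 sum to 1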

@dehann

dehann commented Apr 28, 2020

Hi, I thought it might be useful to add a related use case I had recently... I had to convert a TensorFlow model into Flux and ended up with some hackery to feed higher-dimensional input data into a Dense layer and then maxpool and flatten. The flatten was easily done with a reshape, but Dense and maxpool were a bit more difficult. I did not find any documentation on the current Flux.Dense taking 2-dimensional data, and I think a bit more detail on 2D input (even an example) in the docs would go a long way; perhaps there already is an example and I just missed it?

For reference, here is what I ended up doing: the input data is 25x4 -- as in modjl(randn(25,4))

    modjl = Chain(
        x -> (x * W1)' .+ b1 .|> relu,
        x -> reshape(x', 25, 8, 1),        # needed to use maxpool in a 'normal' way
        x -> maxpool(x, PoolDims(x, 4)),
        x -> reshape(x[:, :, 1]', :),      # the TF flatten layer
        Dense(48, 8, relu),
        Dense(8, 2),
    )

    # size(W1) = (4, 8)
    # size(b1) = (8,)

Here I'm forced to use a closure around W1 and b1. Perhaps Flux.Dense and maxpool are already capable of doing this, but I was having some trouble trying to use the API.
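(For comparison, a hedged sketch of how roughly the same computation might be written with built-in layers instead of closures over W1 and b1, assuming a recent Flux with the MaxPool layer and Flux.flatten. The shapes follow the 25×4 single-sample input above; the flattening order may differ from TF's flatten, and this has not been checked against the original TF model.)

    using Flux

    # Sketch only: permute so features come first (as Dense expects), pool
    # with the MaxPool layer, and flatten with Flux.flatten.
    modjl2 = Chain(
        x -> permutedims(x),                      # 25×4 -> 4×25
        Dense(4, 8, relu),                        # -> 8×25
        x -> reshape(permutedims(x), 25, 8, 1),   # (length, channels, batch) for pooling
        MaxPool((4,)),                            # -> 6×8×1
        Flux.flatten,                             # -> 48×1
        Dense(48, 8, relu),
        Dense(8, 2),
    )

    modjl2(randn(Float32, 25, 4))                 # -> 2×1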

Similarly for saving and loading models, the BSON method suggested in the Flux docs did not work well enough yet, since I was using a novel layer. Exporting from TF was also tricky, since TF.Dense does not seem to work well with the ONNX format. In the end, I did it all by manually working with the weights of the Chain object.

@CarloLucibello
Member

Are you aware that the batch dimension in Julia libraries is the last one, while in Python it is typically the first one?
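A small sketch of the convention difference (shapes are illustrative, not from the thread):

    using Flux

    d = Dense(4, 8)
    x = randn(Float32, 4, 32)   # (features, batch): the batch is the *last* dimension
    size(d(x))                  # (8, 32); a Python framework would typically use 32×4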

@CarloLucibello
Member

Also, BSON saving should be fine as long as you save and load models at the top level of your script. Try it and file an issue if it doesn't work.
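For reference, a minimal sketch of the BSON workflow the docs describe, run at the top level of a script (the file name is illustrative):

    using Flux, BSON

    model = Chain(Dense(4, 8, relu), Dense(8, 2))
    BSON.@save "model.bson" model      # save at top level
    BSON.@load "model.bson" model      # later, load back into `model`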

@dehann

dehann commented May 1, 2020

Hi @CarloLucibello , thanks for the feedback although I don't quite follow:

the batch dimension in Julia libraries is the last one

Did you mean there is a simplification to the modjl above that I missed?

bson saving should be fine

I'll check it again, but I'm pretty sure the loaded model was different from the original: a sort of silent error, which can be quite disconcerting.

bors bot added a commit that referenced this issue Nov 30, 2020
1405: support multiple batch dimensions in Dense layer r=DhairyaLGandhi a=CarloLucibello

Since most deep learning frameworks support it, we should as well.

I can't find a corresponding issue. #282 is slightly related. 
After this, we should close #708 

### PR Checklist

- [x] Tests are added
- [x] Entry in NEWS.md
- [x] Documentation, if applicable
- [ ] Final review from `@dhairyagandhi96` (for API changes).


Co-authored-by: Carlo Lucibello <[email protected]>
Co-authored-by: Dhairya Gandhi <[email protected]>
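(Not part of the PR text: a hedged sketch of the behaviour #1405 adds, where Dense treats every dimension after the first as a batch dimension.)

    using Flux

    d = Dense(4, 8)
    x = rand(Float32, 4, 10, 32)   # features × extra batch dimensions
    size(d(x))                     # expected (8, 10, 32) after #1405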
bors bot closed this as completed in 075f42b Nov 30, 2020