Skip to content

Component based builds #5427

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

theobat
Copy link
Contributor

@theobat theobat commented Nov 11, 2020

Component based builds

This is a work in progress following #4745.

@theobat theobat changed the title wip Component based builds Nov 11, 2020
@theobat theobat marked this pull request as draft November 11, 2020 09:57
@theobat theobat force-pushed the component-based-build branch from 7aa1f1a to b50005a Compare November 11, 2020 18:52
@qrilka
Copy link
Contributor

qrilka commented Nov 15, 2020

@theobat I didn't do a thorough review but the diff looks to be quite small - is it already able to do something or it's still very early WIP? Do you want to discuss some details of it?

@theobat
Copy link
Contributor Author

theobat commented Nov 15, 2020

It's a work in progress, I'll ping when I feel it's ready. At this point I'm still in a very exploratory phase. I'm documenting what I understand as I understand it. I'm focusing on the ConstructPlan phase and work around the M monad in there (which I feel is a very badly named for such a central stuff in there, same for the W datatype which I renamed).

@qrilka
Copy link
Contributor

qrilka commented Nov 15, 2020

Great, I agree that M and W are not best names but those were added way before I started working on this project so unfortunately can't give any extra comments. I suppose switching to something better would be a clear improvement.

@theobat
Copy link
Contributor Author

theobat commented Nov 15, 2020

Also if anything substantial or any interrogation arise, I'll raise it here or in the issue so that we can discuss it before implementing it :) I'll probably have more questions but not right now.

@theobat
Copy link
Contributor Author

theobat commented Nov 22, 2020

So I've stumbled upon something of interest while refactoring 'ConstructPlan'. There is the idea that we can build everything "in one pass" currently in stack (as stated here in the Task object for instance). I suppose this means building all components of a package at once such as a library an executable and tests. The direction taken by using NamedComponents as the build unit will prevent this as-is, is that within the scope of changes expected by the initial issue ? Or is there something to discuss here ? Should be "reunite" the NamedComponents into a single task after the plan is constructed or is that pointless/dangerous ?

The thing is, if we actually want that, the Task datatype is going to be a bit redundant (every task has a TaskType, which itself can be a LocalPackage or a Package. In LocalPackage we can have a Set of NamedComponents to build. Following the current design of Component addressed maps, this will only be a single thing then ?

@qrilka
Copy link
Contributor

qrilka commented Nov 22, 2020

Yes, "building at once" is one of the things which obviously need to be touched when introducing component-based builds. I suppose that building multiple things in 1 go could save some time. My idea about this was to use only components as build graph vertices as a 1st attempt an check out if it's possible to do "component grouping" as an extra optimization step. As for data type restructuring I don't see any problem with changing any of them if it's necessary.

@theobat
Copy link
Contributor Author

theobat commented Dec 3, 2020

I have a few questions for you @qrilka,

  • Do we want to keep track of which components of a package are already installed ? And what does "install" mean in stack's terminology, is it just downloading the files, building the package (as a whole seemingly..?) ?
    My intuition here tells me that, we should track which components are built, but not which components are downloaded (as we have to dowload the whole package anyway AFAIK).

  • Once the ConstructPlanMonad (former M monad) has finished its job, every Task is ready to do something, but we can reunite them by PackageName in the Plan. Do we need component based planTasks and planFinals ? I don't think so, but I'm not sure, and I have not had the time to explore much after plan construction.
    If it's not important afterwards, we can define some kind of wrapper around Tasks such as PartialTask and reunite them to tasks once we finished to gather dependencies (much like you suggested twice before) ?

  • In Package.hs

  | HasLibraries !(Set Text) -- ^ the foreign library names, sub libraries get built automatically without explicit component name passing

Is this something we want to keep ? I don't see where this materialize, but maybe that's an "after plan is constructed" thing ?

  • In the ResolveResult definition, I changed the rrComponent to something more explicit, because I'll need it later in the ConstructPlan:
-- | Necessary to enable component base Builds. 
data SingleOrAllComponent
  = SingleComponent !NamedComponent
  | AllComponent !(Set NamedComponent)
  -- ^ Nothing was specified so we infer that we should build possible
  -- target (including sub-libraries).
  | AllComponentUnresolved
  -- ^ This needs to be here for very peculiar cases where we don't know
  -- how to gather the list of dependencies (TO BE REFINED later).

data ResolveResult = ResolveResult
  { rrName :: !PackageName
  , rrRaw :: !RawInput
  , rrComponent :: !SingleOrAllComponent
  -- ^ Was a concrete component specified?
  -- Otherwise explicitely list all the components to build.
  -- This is part of the Component Based builds initiative.
  -- Implicit naming is not possible if we want to resolve circular deps & all. 
  -- (see https://github.com/commercialhaskell/stack/issues/4745)
  , rrAddedDep :: !(Maybe PackageLocationImmutable)
  -- ^ Only if we're adding this as a dependency
  , rrPackageType :: !PackageType
  -- ^ Is it a dependency or a project package ?
  }

But I have a few situations where I don't know how to gather all the components to build:

Target.hs line 352
Target.hs line 365
Target.hs line 377
Target.hs line 391

These all seem like situations where we don't have the cabal file at hand, but I didn't fully understand yet.

  • In addDep, there are 2 comments which seem like they should be addressed:
                       -- TODO look up in the package index and see if there's a
                        -- recommendation available
-- ....
                        Just (PIOnlyInstalled loc installed) -> do
                            -- FIXME Slightly hacky, no flags since
                            -- they likely won't affect executable
                            -- names. This code does not feel right.

Can I do something about it or is it too much out of the scope of the component based builds ?

@qrilka
Copy link
Contributor

qrilka commented Dec 3, 2020

  1. I suppose your intuition is quite close to what we have in the code base. The notion of being "installed" is mostly about packages being built and registered in a local package db. I think Cabal (or ghc-pkg?) doesn't track such information about executables for example and that leads to some hacks in the Stack's code base. Tracking such information is one of Stack's ideas how to exclude unnecessary recompilation so I think we need to do that but I guess there could be problems with non-library components.
  2. Just use whatever you see necessary. planFinals as I think were a bit of a hack to allow some component get built after package library gets built. In case when a plan nodes will contain one component (or a set of components for a package) such a construction should be expressed in plan's structure itself.
  3. Maybe we could get around that in Stack.Ghci and Stack.Coverage but I can't say for sure at the moment - I'd say leave it as it is and add a TODO if this could be done.
  4. That get's converted into TargetAll in combineResolveResults and concrete components get determined by build flags in Stack.Build.Source.loadLocalPackage (where there is a cabal file at hand). Hope this gives you some info
  5. I'd say it's better to get some POC working first (with no extra code touched) but if you see some low-hanging fruits after introducing component-based builds - putting that into some (public?) list to go through later would be very good.

@theobat
Copy link
Contributor Author

theobat commented Dec 6, 2020

I'm closing this because my branch has gone a bit wild: I mixed docs, refactorings and actual drastic changes. Besides I took a bad design decision by changing all the Map PackageName to maps with named components, it's not required. Even worse, I missed the whole changes required to the Execute module (which probably are the most importants).

I'll reopen a new pull request which, will only change docs, add comments and perhaps add very simple rafactorings. In this new pull request I'll detail the core changes I envision for component based builds with bullet points so that everything is clearer before I start actual coding (I wanted to do this here initially, but I've had a few moments of false sense of understanding, which led to this closing initiative).

@theobat theobat closed this Dec 6, 2020
@qrilka
Copy link
Contributor

qrilka commented Dec 6, 2020

I'm glad that you'e still continuing this effort @theobat
And please raise your voice if you need some help

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants