GHC's base libraries: Combining stability with innovation #51
OK, so the governance structure is:
- `base`: governed by CLC and CLC proposals
- `ghc-internals`: governed solely by GHC team, no proposal necessary
- `ghc-experimental`: governed by GHC proposal process

The win here is that the GHC team can unilaterally make changes to `ghc-internals` without asking anyone, which allows them to iterate quickly. GHC proposals can go in `ghc-experimental` without involving CLC directly. And CLC has less extraneous work. Feels like a win-win-win to me.

Something I'm always wary of here is the tendency for things to go stale - not quite popular enough to move to `base`, but useful enough to stay in `ghc-{experimental,internal}`. Is there room for a policy on how long something should be "unstable" before getting shelved or promoted?

I agree with your concern -- and it also applies to accepted-but-not-implemented GHC proposals.
For present purposes I think we should avoid scope creep for this document, and leave it as a matter for the GHC Steering Committee
100% agree that `ghc-experimental` should have a clear lifecycle, and ideally an "up-or-out" policy of some kind.

This part of the proposal feels orthogonal to the rest of the proposal, and I think it can be removed without impacting the benefits of the proposal.
If `ghc-internals` makes a breaking change, then per PVP, it will do a major version bump. This means that depending on `ghc-internals` has the same pain points as any other library - lax version bounds and a new release may break your code, or strict version bounds and then you have to make lots of Hackage revisions. I don't think providing a special case for `ghc-internals` makes much sense here, since we do not do that for `ghc`, `ghc-prim`, or any other package.

I also don't feel like this part of the proposal is well-motivated - specifically why `ghc-internals` should be treated separately from a potential `text-internals` or `bytestring-internals`.

From my perspective, writing `build-depends: ghc-internals` is just as prone to pain as any other unbounded dependency. And `build-depends: ghc-internals == 9.6.4.*` should be perfectly safe, unless PVP is violated. But the point of this proposal is that we can make "breaking changes" to `ghc-internals` as a major version bump without incurring a major version bump for `base` - in other words, so we can be PVP compliant.
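To make the version-bound reasoning above concrete, here is a minimal Haskell sketch. It assumes versions can be modelled as plain lists of integers; the names `Version`, `satisfiesWildcard`, `pvpMajor` and `isMajorBump` are made up for illustration and are not part of Cabal or any library. It shows why a bound like `== 9.6.4.*` can never admit a PVP major bump.

```haskell
module Main (main) where

import Data.List (isPrefixOf)

-- A version is modelled as a list of components, e.g. 9.6.4.1 is [9,6,4,1].
-- (Illustrative type only; real tooling would use Cabal's own version type.)
type Version = [Int]

-- | Does a version satisfy a wildcard bound such as @== 9.6.4.*@?
-- The bound is passed as its fixed prefix, here [9,6,4].
satisfiesWildcard :: Version -> Version -> Bool
satisfiesWildcard prefix version = prefix `isPrefixOf` version

-- | Under the PVP the first two components (A.B) form the major version.
pvpMajor :: Version -> Version
pvpMajor = take 2

-- | Breaking changes are only allowed in a release that bumps the major version.
isMajorBump :: Version -> Version -> Bool
isMajorBump old new = pvpMajor new > pvpMajor old

main :: IO ()
main = do
  print (satisfiesWildcard [9,6,4] [9,6,4,1])  -- a patch release within the bound
  print (satisfiesWildcard [9,6,4] [9,8,1,0])  -- a later major release, outside it
  print (isMajorBump [9,6,4,0] [9,8,1,0])      -- 9.6 -> 9.8 is a major bump
```

Any version that satisfies the wildcard shares the `9.6` major version, so under the PVP it should not contain breaking changes relative to `9.6.4`.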
It's really a response to a legitimate CLC concern that if it's too easy to depend on `ghc-internals` then that's what people will do. And if that happens we might grow an ecosystem of libraries many of which depend, perhaps by accident, on `ghc-internals`. So every new release of GHC would force updates to all of those libraries. You could say "if you depend on `ghc-internals` then that's what you get", but the pain is real and not mitigated by saying "I told you so". To put it another way, it risks entirely bypassing the stability efforts of the CLC.

So that is, I think, the reason for this text. You are right that all the same issues apply to any `-internals` package, and one might wonder about mechanisms to discourage depending on them. But to avoid scope creep in this document we just stuck to `ghc-internals`.

Discouragement is well motivated. Prohibition isn't.
We have to educate users more generally about the pattern we're trying to introduce here and how it relates to PVP, API stability, etc.
I could imagine that we try to summarize the result of these efforts for end users and library maintainers in a more distilled way and explain the pattern there in more depth.
I don't think we have to be particularly scared about the ecosystem degrading.

@simonpj I think that makes good sense. I want GHC to be able to innovate freely and iterate quickly, and I wouldn't want CLC to start encroaching on the `ghc-internals` package for this reason. Where I think I'd draw the line is at describing the problem, noting it is a known ecosystem-wide problem that deserves a good solution, and somehow putting a reminder on the package to use that solution when it is implemented.

There are a few perspectives here, and I think it'd be good to be really clear about which perspective a bullet is trying to satisfy. This one appears to be trying to satisfy both "industrial users want fewer breaking changes when upgrading major versions of GHC" and "CLC wants to be able to promote stability for the ecosystem" - which are overlapping but subtly different concerns.

As an industry user and OSS contributor, the primary pain point of upgrading GHC is that there are breaking changes that require a large chunk of the ecosystem to be touched. `base` is usually a relatively small part of this pain, with `ghc-prim` and `template-haskell` causing far more issues. I don't expect that `ghc-internals` is going to meaningfully impact the work that gets done here.

If `ghc-internals` is banned by `cabal-install`, then I don't see how it is different from un-exposed modules in `base`. If `ghc-internals` is banned on Hackage, then industrial users will need to vendor code in order to share libraries that depend on it - which increases the pain of a major upgrade by requiring applications to depend on `ghc-internals` directly, rather than sharing a library which can abstract over multiple versions of `ghc-internals` via CPP or other techniques. If there are warnings, they will be ignored for good and bad reasons, and maintainers will have another package to consider when doing ecosystem upgrades, but the overall workload doesn't change much.

As a CLC member, `ghc-internals` makes my life easier by having fewer changes to `base`, and changes to `base` that can be backed up by real ecosystem use in `ghc-internals`. But if using `ghc-internals` is heavily discouraged, then it's difficult to identify whether or not something in that package is satisfying the needs of the ecosystem.

Summarizing, `ghc-internals` and writing docs that it'll be a major version bump with breaking changes every GHC release.
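The idea mentioned above, a shared library that abstracts over several versions of a dependency via CPP, might look roughly like the sketch below. Since the layout of `ghc-internals` is not specified in this thread, the sketch uses `base` as a stand-in: `foldl'` has been exported from `Prelude` since `base-4.20`, so the explicit import is only needed on older versions. The function `sumStrict` is a made-up example, and a shim for `ghc-internals` would presumably use the analogous `MIN_VERSION_` macro for that package.

```haskell
{-# LANGUAGE CPP #-}
module Main (main) where

-- Compatibility shim: only import foldl' from Data.List on versions of base
-- where Prelude does not export it yet. This keeps the module free of
-- redundant-import warnings across the whole supported version range.
#if !MIN_VERSION_base(4,20,0)
import Data.List (foldl')
#endif

-- | Strict left fold over a list of Ints (illustrative helper).
sumStrict :: [Int] -> Int
sumStrict = foldl' (+) 0

main :: IO ()
main = print (sumStrict [1, 2, 3])
```

Downstream code depends only on the shim, so the version-specific branching lives in one place rather than in every application.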
The more I think about it, the more I am convinced that "all possible social and technical means" is too strong. It is obviously not literal. We are not going to hire a crack commando unit who survive as soldiers of fortune to go to your house if you depend on it.
Better language might be "develop both social and technical mechanisms to discourage..."
@parsonsmatt thanks, but I'm still not quite getting this. (It's surprisingly hard to explain all this accurately, as we are all finding!)

I do think we should discourage import of `ghc-internals`. For example, if the same function is available through `base` you should definitely use it from there. Why? Because the GHC team might simply move it around to a different module in `ghc-internals` -- it's "just" an implementation matter, after all. But that would break your code.

I think we all agree that "discourage" does not mean "prevent" or "ban".

I'm entirely open to changing the words that express "discourage". The "all social and technical means" is a direct quote from @Bodigrim, who may have a view here.

At the very least "discourage" must include "cannot happen except by conscious choice", so that uninformed users don't casually depend on a function from `ghc-internals`. E.g. if it's also available from `base`, that option should be presented more prominently somehow.

I think that one process you have in mind is this:
- [...] `GHC.Foo( wombat )` from `ghc-internals`.
- [...] `ghc-internals` is discouraged, they still push through that pain barrier, because there is no alternative to `wombat`.
- [...] `wombat` into `base`.

I think that's a great plan. But it's entirely compatible with (indeed somewhat based on) discouraging use of `ghc-internals`.

Would you like to propose some alternative concrete form of words? (We could move the discussion of possible mechanisms into Section 6, i.e. plainly rumination around the theme rather than part of the core plan.)

This sounds excellent. Much like "partial", "internal" seems like a fairly natural warning category that other people in the ecosystem might want to use. And it's easy for a user to say "I know what I'm doing" by just turning off that warning category in the module where they use internals.
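A minimal sketch of what such a module-level warning could look like, assuming GHC 9.8 or later (which supports custom warning categories whose names start with `x-`). The category name `x-internal`, the module name `WombatInternal` and the function `wombat` are made up for illustration; nothing here is an agreed convention.

```haskell
module WombatInternal (wombat) where

-- | Hypothetical internal helper that may move or change between releases.
wombat :: Int -> Int
wombat = (* 2)

-- Attach a warning in a custom category to the exported name. Modules that
-- import and use 'wombat' get this warning; the defining module does not.
{-# WARNING in "x-internal" wombat
      "wombat is an internal API and may change without notice" #-}
```

A module that knowingly uses internals could then say "I know what I'm doing" with `{-# OPTIONS_GHC -Wno-x-internal #-}` at the top of the file, leaving the warning enabled everywhere else.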
There's the `category` field in cabal package files which is pretty unstructured and all over the map. We could easily initiate a convention of the `internal` category actually meaning something -- e.g. triggering an extra textual notice on Hackage pages listing the package, and extending cabal so it could maybe optionally filter or warn on such stuff.

That seems like more work for less fine control? If we already have the mechanism in GHC at a module level, that seems sufficient to me.
Hackage itself cannot and should not read module-level pragmas, nor can cabal. This proposal is about having some packages be internal and some not. So we need package-level mechanisms for marking things internal, such that both hackage and cabal can take appropriate action (perhaps additional html, like a big red warning box on hackage, and for cabal the ability to warn on using internals as a direct rather than indirect dependency, etc)
From what I understood people are also worried about indirect dependencies (coming from things other than the original package), because they may easily lag behind upgrades and bubble up churn.

Yes, that would be another thing that would be nice to address. The `category` field does not address that, but seems useful nonetheless, and I would encourage its consideration regardless.

(Sorry, I realised that what I was saying was ambiguous based on the line I was commenting on: I'm in favour of a module-level INTERNAL warning, added manually to source files, not magically added to packages by cabal or Hackage)
Really all? Including GHC/Cabal simply refusing to allow packages called `ghc-internals` if the package currently built isn't `base` or another internal whitelisted package?

But it says “discourage”, not “prevent”, which sounds reasonable.

Maybe more realistic: Hackage could reject uploads of packages that depend on `ghc-internals`?

Yes that is key. If people want to tinker with the internals of some open source software, we should not stop them. But we should make sure they are very aware they are stepping outside the stable safe zone. And likewise we ought to someday make anyone that uses the tinker's software (transitively!) also aware they are exposed to instability.
(The difference between `ghc-prim` and `ghc-internal` also isn't clear. But one answer is [...] Another answer is [...])

We can't say in the same document that we should use "all means necessary" to avoid `ghc-internals` dependencies, and also recommend people to import a function from there.

I would expect `ghc-internals` to mainly contain things one cannot define locally? I would expect also some zany integrate-with-ghc-so-tightly-someone-turns-blue projects to always need to use `ghc-internals` too. I think it's less important to deny that category exists, than emphasize that the vast majority of projects are not in that category.

Put another way, I'd expect those projects to know who they are and not reach out to GHC devs or the CLC, and I'd expect almost everyone who does reach out to be told "actually, you don't need to do things that way". Is that what you were thinking too? Or were you imagining we might sometimes move something to `ghc-internals` (without a reexport elsewhere) that shouldn't be there, by mistake?

If such projects are relying on functionality from `base` that GHC developers and the CLC both agree should be moved to `ghc-internals`, they ought to reach out and say "hey, we use this for such-and-such", and make a case for it to be put in `ghc-prim` instead.

The point here is to make, FAIAP, a "hidden" library containing code in the strictest intersection of "compiler bootstrap" and "necessary to allow general-purpose programming": stuff like the guts of IO, datatype representations, and the like. There should be nothing in `ghc-internals` that is both:
- useful for general programming, and
- not exported from `base`, `ghc-prim`, or `ghc-experimental`.

I agree with the main "There should be nothing in ghc-internals that is both: useful for general programming, and not exported from..." point, but do note that I don't think anyone intends `ghc-prim` to be any more user-facing than `ghc-internals`. Perhaps the proposal should make this more explicit?

This should be "not exported from `base` or `ghc-experimental`". `ghc-prim` is every bit as "internal" as `ghc-internals`. Anything in `ghc-prim` that is useful for general programming should be exposed by `base` or `ghc-experimental`. Indeed maybe we should make that clearer.

This tool would be amazing to have for other libraries - I'd love to have something here so I could know for sure if a new release is versioned appropriately.
I think this already exists as https://kowainik.github.io/posts/policeman-bristol - or is what you're looking for something else?
`policeman` is a good approximation but when I examined it for use in GHC I found it had a few shortcomings that made it challenging to use in GHC:
- [...] `MINIMAL` declarations of classes, and the like are not accounted for in the PVP assessment
- [...] `base` is (unfortunately) such a package.
- [...] `hie` files, which we currently don't produce in the GHC build

Happily, dumping the declarations of a package is quite straightforward. I suspect someone could turn GHC's test into a useful Hackage package (or fold it into `policeman`).

IMO a lot of this would be helped by tracking the import/re-export provenance more directly. "Imported from A and originally defined in E" is nice but it's also often very nice to know the exact chain of imports and re-exports that brought something into scope. GHC could then have a flag to trim this provenance information to the direct dependencies of a package. That way, if you find yourself depending on `ghc-internals:SecretType`, you can see exactly how it got introduced into your codebase.

In asking ghc-proposals/ghc-proposals#595 (comment) I was wondering something similar --- ideally we would know the entire provenance of imports and (re)exports, and there are a number of things we can do with it.

Wouldn't you need to import the relevant class to have these instances be visible? And if it's only exposed in `ghc-internals`, wouldn't that require an explicit import of the class?

This may just be an artifact of how instance visibility works across component boundaries, though.

Not necessarily. The primary way that this exposure might happen is via Haddocks. For instance, imagine that we have a class, exposed via `ghc-experimental` (and not in `base`) but defined in `ghc-internals`, which some type exposed via `base` is an instance of. Under `haddock`'s current logic, this instance would be shown in the documentation for `base`, despite the class itself not being visible in `base`.

This is a technical limitation of the way that `base` and GHC work, and would in principle be fixed by allowing a GHC version to work with multiple versions of `base`, right?

Are you proposing (eventually) to continue to ship the `base` with the bug and a new one without the bug? That would work. But I think the point in this case was that some bugs are not a matter of implementation but inherent to the interface itself.

(I remain very pro multiple `base` versions with one GHC, to be clear :).)

I think I'm confused - can a released GHC use a different version of `base`? I was under the impression that the `base <-> GHC` relationship was fixed, and every version of GHC can only use a single version of `base`. That means that GHC X.Y.Z must always use the same `base` version, and a new `base` version means a new release of GHC.

(Going on a tangent. This is not what this proposal is about.)
I think there is much confusion around this topic because people confuse GHC the software with GHC the official bindist.
The answer is Yes, and it's not even a new feature. See https://gitlab.haskell.org/ghc/ghc/-/blob/master/compiler/GHC/Unit/Types.hs#L562.
The short version is that one is liable to shoot themselves in the foot if they have more than one `base` in a package database, but one is always free to have multiple package databases that will not mix together. In each "parallel universe" of a separate package database, there is only one `base`, and no potential for confusion.

GHC in fact has no way of knowing whether this is the "official base" that was shipped with the bindist, or one's own. How could it know? (It isn't shipped with trusted ABI hashes for `base`, for example; on the contrary, it expects `base` to have no ABI hashes; it is maximally trusting of whatever `base` you give it!)

For a variety of reasons, no one has yet proposed we take advantage of this. But we can. There is no technical limitation, just the ergonomic/human problem of it being easy to get our package databases with their separate `base`s mixed up.

(I believe @alt-romes might be working on getting all wired-in packages ABI hashes to solve the humans-getting-confused problem, but I am not sure about that.)

Indeed -- I think there is a consensus in favour of that!
But, to be clear, achieving that goal is not part of the current document. So this thread is somewhat "by the way".
Still, the plan outlined in this document should make the goal of a reinstallable base much more achievable. GHC knows (wired into its binary) where, say, `GHC.Base.map` is defined. If we move that definition from one module to another, everything will stop working. I think that is one reason that `base` and GHC are so tightly bound together. But if `GHC.Base.map` was in `ghc-internals` then GHC and `ghc-internals` would remain tightly bound together, but `base` would be much more loosely coupled. I don't yet see any technical obstacle to installing a fresh `base` without changing `ghc`.

I could be wrong; but all that matters for this conversation is that the plan here is a step in the right direction for the reinstallable-base goal.

Fantastic, thanks for the clarification!
Allowing `base` to iterate without incurring a GHC release cycle would be a big win for the ecosystem, for sure.