Skip to content

Conversation

@osa1
Copy link
Member

@osa1 osa1 commented Sep 26, 2025

Questions and TODOs:

  • Do we want to allow higher-kinded types now, or later? Do we have any immediate use cases?

  • Should we represent kinds with types? (i.e. type-in-type)

  • In the interpreter we implement Hash for Ty, because we represent predicate sets as HashSet<Pred> and Pred refers to Tys for the type arguments of traits. We can't do it automatically in Fir yet, so we'll have to manually implement a lot of Hashs. Also, FunArgs is currently using a HashSet. We may have to make it a sorted Vec[(name: Id, ty: Ty)] (for named arguments).

    But I wonder if there's a better representation of predicate sets than HashSet[Pred]? It would be simpler if we could avoid hashing here, at least initially.

    Reminder: pred sets are not just for function contexts [Iterator[iter, item, exn], Eq[item]] etc. they also hold generated predicates, to be resolved later.

    Allowing duplicates could potentially generate a lot of redundant predicates, wasting time in the resolving phase. For example, every == in a function can generate a Eq[t] predicate.

    I think using a Vec[Pred] and linear search when adding may be good enough for now. TODO: Check the max. pred set size in the interpreter when running the programs in the repo.

    Predicate sets will be vectors, at least for now. See Predicate sets contain duplicate predicates #229 for the discussion.

  • In TyEnv (in the interpreter) we currently maintain two maps, one for type variables, one for type constructors. But type variables are also treated as type constructors (just with no known constructors) in most places. AFAICS only in conversions we use the type var map to preserve sharing when the type AST mentions the same type variable.

    In the compiler we use TyId for type constructors and LocalId for type variables. So the map will have a key type with the union of these.

    I wonder if it would make sense to have another map (three in total):

    • cons: ScopeMap[TyId, TyCon]: for mapping named types to type constructors
    • vars: ScopeSet[LocalId]: type variables in scope. In the compiler, TyCons for these don't have any information anyway, so a set should be OK.
    • varConversions: ScopeMap[LocalId, Ty]: to be able to convert type variables in scope without breaking sharing.

    I think this will require adding one more variant to Ty, to be able to distinguish type constructors from rigid type variables. E.g.:

    ## A type constructor, e.g. `Vec`, `Option`, `U32`.
    Con(
        id: TyId,
        kind: Kind,
    )
    
    ## A rigid type variable.
    RVar(
        id: LocalId,
    )
    

@osa1
Copy link
Member Author

osa1 commented Sep 30, 2025

I think the Ty type is currently not quite right.

  • Because we can't have a type variable with kind other than * or row, and we don't allow partial type applications (because you wouldn't be able to bind anything to partial apps), Con cannot be Vec, Option etc. It needs to be fully applied.

    So the examples in the documentation are not right, Vec and Option cannot be Cons.

    Because both rigid type variables and constructors are represented as Con, Cons currently can have row kinds.

    If we separate rigid type variables and constructors to different Ty variants (see comment below), then Cons will only have kind * as row types are represented as Anonymous with the isRow flag set.

  • Because anonymous types are not Apps, I think App doesn't need a kind field. An App is always a *.

@osa1
Copy link
Member Author

osa1 commented Sep 30, 2025

Thinking about this more: a quantified type variable becomes

  • A rigid type variable (let's call this rvar) when the qvar is in a function signature and we're checking the function's body.
  • A unification variable (let's call this uvar) when the qvar is in a scheme of a function we're calling.

A type variable is only on eo of these (qvar, rvar, uvar) at any given time. It cannot be more than one of these.

So we would be able to use just one Ty variant for all of these, but currently we update uvar contents as we unify them (link them to other types), so they have different fields.

We could either have a map to map unification variables to their linked types, or keep using different variants.

If we use different variants, it might make sense to have a variant for rvars too.

If we do that, I think that also solves the problem with the TyEnvs mentioned in the PR description.

@osa1
Copy link
Member Author

osa1 commented Oct 27, 2025

TODO: Environments should keep (in addition to schemes etc.) unique top-level definition indices/ids to be able to import the same top-level thing multiple times, via different imports.

E.g. a library re-exports A/B/T (the type T) and I also import A/B/T directly.

This is also a language design question, but we may also want to allow importing different things under the same name, and only fail when we use that thing. (instead of when we import it)

If we design the data structures to allow this we may choose to not allow it later easily. So I think it makes sense to design for this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants