Modules Divided: Interface and Implement

Posted on 2011 October 3 by dmbarbour

In a recent article, I described how anonymous modules avoid an entanglement problem, and sketched an approach to achieving this. Unfortunately, the approach I sketched seems verbose and would tend towards high search costs. While those disadvantages can be mitigated by discipline and idiom, dependency on discipline and idiom stinks of design failure.

I would prefer that the path of least resistance avoid verbosity. And, while those expensive constraint-solving searches should be available (because they’re a very nice feature for adaptability, specialization, and meta-programming), I believe that extra verbosity should be the cost of opting in to search rather than the cost of opting out.

By ‘avoid verbosity’ I mean:

import statements should be short one-liners; complex assertions or requirements should be separate.
module headers should be a short one-liner, even if there are a lot of parameters.
export should be implicit in the common case

I’ve been thinking about how to achieve this, and it occurs to me that limited use of names would be appropriate – i.e. to achieve these short one-liners, it would be convenient to simply provide a unique name. The trick will be to borrow the benefits associated with names while avoiding problematic entanglements with the namespace.

Named Interfaces and Anonymous Implementations

What I propose is to partition modules into two distinct sets: interfaces and implementations. Implementations are anonymous. Interfaces are uniquely identified by reference or name.

There are constraints on the relationships between these sets:

Interfaces do not depend on implementations.
Dependencies between interfaces are linear and acyclic, e.g. based on a single-inheritance relationship (extends, weakens, adjusts).
Implementations each declare one interface they implement.
Dependency between implementations is indirect via ‘import’ of an interface. However, there are no limits on the number of imports. A linker must find an implementation for each imported interface.

This helps disentangle modules to some degree. To reuse an implementation-module M initially from project R in project Q, a developer would need to do the following:

copy module M to the new project.
copy the interface M implements to the new project.
copy the interfaces M imports to the new project.
copy the interface inheritance chains to the new project.
rename interfaces from project R whose names conflict with those already in project Q.
write implementation modules for any new interfaces M imports.

Fortunately, the first five steps would be finite, predictable, and simple. Each interface has a finite chain of dependencies. Renaming interfaces can be painful, but tends to be a one-time pain and should be easy to automate with mechanisms such as a refactoring browser or a simple import rule between independently maintained DVCS repositories.
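The first five steps are mechanical enough to sketch. This is a minimal Python illustration, assuming interfaces are recorded in a hypothetical `parents` map from each interface name to the single interface it inherits from (`None` for a root). The function names are invented for this example and are not part of the design.

```python
def inheritance_chain(name, parents):
    """Return `name` followed by every ancestor, following single-parent links."""
    seen = []
    while name is not None:
        if name in seen:
            raise ValueError(f"cyclic inheritance at {name}")
        seen.append(name)
        name = parents.get(name)
    return seen


def interfaces_to_copy(implements, imports, parents):
    """Interfaces that must travel with a module: the one it implements,
    the ones it imports, and all of their inheritance chains."""
    needed = []
    for root in (implements, *imports):
        for iface in inheritance_chain(root, parents):
            if iface not in needed:
                needed.append(iface)
    return needed


def name_conflicts(needed, target_names):
    """Copied names already used in the target project (rename candidates)."""
    return [name for name in needed if name in target_names]


# Hypothetical source project: every chain ends at the root interface Void.
parents = {"Void": None, "Seq": "Void", "Stack": "Seq", "Log": "Void"}
to_copy = interfaces_to_copy("Stack", ["Log"], parents)
clashes = name_conflicts(to_copy, {"Log", "Widget"})
```

Because each interface has exactly one parent, the walk is linear in the length of the chain, which is what makes the copy list finite and predictable.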

The sixth step is, of course, to provide for M’s dependencies. If projects R and Q are already close, this might require only adding a few implementation modules as ‘glue’ between interfaces (saying how to take one set of interfaces and emit another). If not, developers of Q could grab a bit more of R’s implementation and repeat the process until closer to a common set of dependencies.

The advantage of these constraints is that each step is simple, predictable, and under relatively easy control. Developers can decide just how much of one project to bring into another.
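The linker's side of these constraints, finding an implementation for every imported interface, might look roughly like the following Python sketch. It assumes a naive first-match backtracking search and treats cyclic imports among implementations as a failure, which the design above does not specify. The `Implementation` record, its `label` field, and `link` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Implementation:
    implements: str        # the one interface this module implements
    imports: tuple = ()    # interface names only, never other implementations
    label: str = ""        # display aid for this sketch; real modules are anonymous


def link(wanted, impls, chosen=None, pending=()):
    """Return {interface name: Implementation} covering `wanted` and its
    transitive imports, or None if no consistent choice exists."""
    chosen = dict(chosen or {})
    if wanted in chosen:
        return chosen
    if wanted in pending:          # assumption: import cycles are rejected
        return None
    for impl in impls:
        if impl.implements != wanted:
            continue
        attempt = dict(chosen)
        for dep in impl.imports:
            attempt = link(dep, impls, attempt, pending + (wanted,))
            if attempt is None:
                break              # this candidate fails; try the next one
        else:
            attempt[wanted] = impl
            return attempt
    return None


impls = [
    Implementation("Parser", ("Lexer", "Missing"), "parser-a"),
    Implementation("Parser", ("Lexer",), "parser-b"),
    Implementation("Lexer", (), "lexer"),
]
solution = link("Parser", impls)   # parser-a fails on 'Missing'; parser-b is used
```

The search never mentions an implementation by name: candidates are found only through the interface they declare, which is the indirection the constraints require.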

Parameterized Modules

Parameters offer developers more control and awareness of dependencies and relationships between modules. With implementation and interface divided, and with the new goals to reduce verbosity, parameterization needs a slight change in vision.

To avoid redundant declaration of parameters in each interface module, interfaces must be able to ‘inherit’ a useful set of parameters.
To avoid redundant declaration of parameters in each implementation module, parameters must be a property of the interface being implemented.
To avoid redundant specification of parameters at ‘import’, interfaces must provide default values for parameters.
To avoid redundant specification of common non-default parameters, developers should be able to create new interfaces that override the defaults, yet use the existing implementation.
To avoid redundant plumbing of parameters at ‘import’, developers shouldn’t need to name each parameter to propagate them.

These requirements direct me towards a design based on records and keywords:

An interface may declare named parameters, and optionally some default values.
At each ‘import’, developers provide a record of keyword parameters, along with the interface name.
Within an implementation module, parameters are accessible as a record. This makes it easy to delegate parameters, or provide the same parameters to multiple imports.
Interface inheritance can override default parameters. If default parameters are the only property affected, implementations of the parent interface remain usable for the new one.
Within an interface module, parameters are accessible by name – e.g. when describing assertions or invariants. It is easy to constrain that one parameter is greater than another, for example. Since the same parameters are available to the implementations, it is possible for modules to ‘adapt’ the implementation to the parametric requirements.

Interfaces don’t do much with parameters – no transforming them, for example. Most complexity is left to the implementation modules.
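To make the record-based design concrete, here is a small Python sketch of how a parameter record might be assembled at ‘import’. It assumes interfaces are plain dictionaries with hypothetical `parent`, `require`, and `default` keys; none of these names or the `resolve_params` function come from the design itself.

```python
def chain(name, interfaces):
    """Return the interface records from `name` up to the root, nearest first."""
    records = []
    while name is not None:
        record = interfaces[name]
        records.append(record)
        name = record.get("parent")
    return records


def resolve_params(name, interfaces, given):
    """Build the parameter record an implementation of `name` would receive."""
    records = chain(name, interfaces)
    params = {}
    # Apply defaults root-first so inheriting interfaces override ancestors.
    for record in reversed(records):
        params.update(record.get("default", {}))
    params.update(given)  # keyword arguments supplied at 'import' win
    required = {p for record in records for p in record.get("require", ())}
    missing = required - set(params)
    if missing:
        raise KeyError(f"missing parameters: {sorted(missing)}")
    return params


interfaces = {
    "Buffer": {"parent": None, "require": ["size", "growth"],
               "default": {"size": 64}},
    # Changes one default and nothing else.
    "BigBuffer": {"parent": "Buffer", "default": {"size": 4096}},
}

args = resolve_params("BigBuffer", interfaces, {"growth": 2})
# The whole record can be forwarded to another import without naming each field.
inner = resolve_params("Buffer", interfaces, args)
```

Forwarding `args` as a single value is the point of exposing parameters as a record: an implementation can delegate them without plumbing each name by hand.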

Encapsulation, Implementation Hiding, and Exports

An interface can declare exports. Outside of these declared exports, nothing defined in the implementation module is visible. Thus, there is no need to declare exports per implementation module. The exports list is also inherited, which avoids redundancy between interfaces.

It should be easy for an implementation module to ‘delegate’ exports to another module, i.e. importing the names then forwarding most of them as-is. The goal is a simple idiom for indirect ‘implementation inheritance’ by use of import.
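As an illustration of interface-declared exports and the delegation idiom, the following Python sketch filters an implementation's final environment down to the names its interface (and ancestors) declare. The dictionary layout, the `provide` key, and the `seal` function are assumptions made for this example, and the sketch ignores exports removed by a ‘weakens’ relationship.

```python
def declared_exports(name, interfaces):
    """Collect exported names from `name` and every interface it inherits from."""
    names = set()
    while name is not None:
        record = interfaces[name]
        names.update(record.get("provide", ()))
        name = record.get("parent")
    return names


def seal(env, name, interfaces):
    """Keep only the declared exports; everything else stays hidden."""
    wanted = declared_exports(name, interfaces)
    missing = wanted - set(env)
    if missing:
        raise KeyError(f"implementation lacks exports: {sorted(missing)}")
    return {key: env[key] for key in sorted(wanted)}


interfaces = {
    "Seq": {"parent": None, "provide": ["empty", "size"]},
    "Stack": {"parent": "Seq", "provide": ["push", "pop"]},
}

# Stand-in for the exports obtained by importing some implementation of Seq.
seq = {"empty": (), "size": len}

# Delegation: forward the imported names as-is, then add the new ones.
env = {**seq,
       "push": lambda s, x: s + (x,),
       "pop": lambda s: s[:-1],
       "helper": lambda s: s}      # not declared by any interface, so hidden
stack = seal(env, "Stack", interfaces)
```

The implementation never lists its exports; the interface chain decides what is visible.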

Interface Inheritance

I’ve several times mentioned the possibility of interface inheritance. My goals with this are to:

avoid verbosity from copy-and-paste programming of interfaces.
allow the structure of interfaces to help direct a search for valid implementations.
keep it simple and sufficient, and preferably familiar

A possible simple, sufficient model is to allow three forms of inheritance: extends/weakens/adjusts. Assuming interface Q inherits from interface P:

extends – Q is ‘a kind of’ P, but can have more functions. When the linker is searching for an implementation of interface P, it might decide to use an implementation of Q.
weakens – Q is ‘an abstraction of’ P; that is, P is ‘a kind of’ Q. This allows developers to relax constraints and lose exports or parameters. When the linker is searching for an implementation of Q, it can use a P.
adjusts – Q is ‘similar to’ P. The operations from extension and weakening are both allowed. The linker won’t relate Q and P by interface name, though it will often be feasible for developers to create ‘implementation glue’ modules that can import P and export Q or vice versa.

I think most OO developers would also be familiar with these concepts.

The critical bit, I think, is that each constraint (assertion, invariant, etc.) will need to be named if we are to allow relaxation of them.
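The way a linker might use these three relationships to widen its search can be sketched in Python. The sketch assumes the relationships compose transitively (an assumption, not something stated above) and represents each interface as a hypothetical `(relation, parent)` pair; `can_serve` is an invented name.

```python
def can_serve(candidate, wanted, interfaces):
    """True if an implementation of `candidate` may stand in for `wanted`."""
    # Build directed 'X can serve Y' edges from the declared relationships.
    serves = {}
    for name, (relation, parent) in interfaces.items():
        if relation == "extends":      # Q extends P: a Q can serve as a P
            serves.setdefault(name, set()).add(parent)
        elif relation == "weakens":    # Q weakens P: a P can serve as a Q
            serves.setdefault(parent, set()).add(name)
        # 'adjusts' adds no edge: the linker does not relate Q and P by name
    stack, seen = [candidate], set()
    while stack:
        current = stack.pop()
        if current == wanted:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(serves.get(current, ()))
    return False


interfaces = {
    "Seq": ("extends", "Void"),
    "Stack": ("extends", "Seq"),   # Stack is a kind of Seq
    "Bag": ("weakens", "Seq"),     # Bag is an abstraction of Seq
    "Ring": ("adjusts", "Seq"),    # Ring is merely similar to Seq
}

stack_for_seq = can_serve("Stack", "Seq", interfaces)
seq_for_bag = can_serve("Seq", "Bag", interfaces)
ring_for_seq = can_serve("Ring", "Seq", interfaces)
```

Under these assumptions an ‘adjusts’ interface can only be reached through an explicit glue implementation, which keeps that kind of substitution under developer control.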

Module Versioning

Modules can easily be versioned, often with extends/weakens inheritance or a little implementation glue from a previous version. I think it feasible that interfaces could even be immutable after construction – i.e. named with their version or secure hash – or perhaps accessible in those terms.
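One way to picture ‘named with their secure hash’ is the Python sketch below. It assumes an interface is identified by the SHA-256 digest of its whitespace-normalized source text; the normalization rule and the `name@digest` format are hypothetical choices for this example.

```python
import hashlib


def interface_id(source: str) -> str:
    """Digest of the interface text, ignoring trailing whitespace differences."""
    lines = [line.rstrip() for line in source.strip().splitlines()]
    normalized = "\n".join(lines)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def versioned_name(name: str, source: str) -> str:
    """A hypothetical immutable reference: readable name plus a short digest."""
    return f"{name}@{interface_id(source)[:12]}"


v1 = "interface Foo extends Void\nprovide bar\n"
v2 = "interface Foo extends Void\nprovide bar\nprovide baz\n"
same_interface = interface_id(v1) == interface_id(v1 + "   \n")
changed = interface_id(v1) != interface_id(v2)
```

Any edit to the interface text yields a new identifier, so an import that names the digest always refers to the same specification.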

Constraint-Based Linkers

There are several similarities to the prior system:

Implementation modules are anonymous, ambiguous, subject to search.
A search can fail due to failing an import or not finding a solution that meets invariants.
Searches will be a function of parameters.
The amount of search tends to grow with the number of modules.
There is no way for an individual module to prevent use of search.
Search based on preference heuristics is feasible (though not described in this article).
Easy support for multiple versions of a module.

And there are also many advantages:

Much less verbose. No need to increase verbosity to reduce ambiguity.
Interface names will provide a decent record of human intent. It is unlikely that a search system will substitute one implementation module for another in a ‘bad way’, based on similar structure.
Creating a new interface name is an easy way to control searches.
Weakly ‘opt-in’ – developers can easily avoid importing interfaces known for heavy search.
Effective indexing – interface names provide a common attribute for the majority of indexing requirements, which is much less ad-hoc.

I’m not sure whether this is the right balance of search and specification in any absolute sense, but I’m inclined to favor this two-tiered interface and implementation approach over the purely anonymous variation.

User-Defined Syntax

In an earlier article on user-defined syntax, I noted that use of parametric modules or search makes ‘import’ of a language module an impossibility (assuming a requirement to statically parse the modules).

The module system described here has an interesting property whereby every module either inherits or implements exactly one interface module. This opens the possibility of defining the syntax in the interface.

I have not explored this possibility very thoroughly, but my first inclination is rejection: the language of implementation is not, properly, a concern of the interface being implemented.

At the moment, I continue to favor that each module header is actually two lines: one to specify the language module, one to specify the semantic structure of the module (e.g. interface vs. implement).

Wiki Integration?

When I gave up on names a while back, I cried a tear because my vision of a wiki-based IDE was dead in the water. Granted, I’ve had much time to re-envision that concept, and have some nifty ideas for distributed development that don’t rely much on names.

The clean mix of named interfaces and anonymous implementations, however, will serve very nicely and can bring back a lot of wiki-like features. Each interface could provide an automated set of backlinks to the implementations and related interfaces. Each implementation would provide links to interfaces.

Grammar?

Here’s a sketch, ignoring user-defined syntax.


Start = Interface | Implement
InterfaceName = Name('.'Name)*
Name = [A-Za-z][a-zA-Z0-9_]*
Interface = 'interface' InterfaceName IFRest
IFRest = 'weakens' InterfaceName (WeakensDecl*)
       | 'extends' InterfaceName (ExtendsDecl*)
       | 'adjusts' InterfaceName (WeakensDecl | ExtendsDecl)*
ExtendsDecl = '\n' (ProvideDecl | RequireDecl | DefaultDecl | SpecifyDecl)
WeakensDecl = '\n' (ExcludeDecl)
RequireDecl = 'require' Name 
DefaultDecl = 'default' Name VExpr 
ProvideDecl = 'provide' Name 
SpecifyDecl = 'specify' Name SpecRule 
ExcludeDecl = 'exclude' Name 

Implement = 'implement' InterfaceName Args? (ImplementDecl*) Export?
Args = '(' Name ')' -- optional, to capture args as record
ImplementDecl = '\n' (Define | Import)
Import = 'import' InterfaceName VExpr? 'from' VExpr ('as' Name)?
Define = 'define' Name VExpr
Export = 'export' VExpr

Notes on the grammar:

As the grammar is defined, interface inheritance is not optional. I’m assuming an empty root interface ‘Void’ will be provided for bootstrap. (i.e. interface Foo extends Void)
There is no enforced relationship between interface names and interface inheritance. Interface names are not namespaces. The dotted structure is only meaningful to developers.
Uniqueness of interface names, and acyclic inheritance, would be externally enforced (by compilers, IDEs, databases, etc.). Within each registry, interface names should be unique.
`require` specifies a parameter, and `provide` specifies an export. These are simply names.
Parameters, and possibly exports too, can be set with a `default` value. A value expression is allowed, and an interesting possibility is to specify defaults as an expression from other parameters or exports. In Awelon, values are acyclic and I’d probably require redefining any inherited defaults that would introduce a cycle based on the new definition.
An implementation module is required even if the interface defaults everything. It could be very trivial, though: the simplest implementation module is simply `implement InterfaceName`.
Specifications cover assertions, types, heuristic preferences, and similar. Specifications are named mostly so they can be weakened (excluded).
Within an implementation module, I’m assuming name pollution is controlled by the 'as' Name option. Developers will probably receive a warning in case of name shadowing.
The 'from' VExpr in imports is for matchmakers or search-spaces. I described these in the earlier article. I still need to consider how matchmakers or search spaces will interact with the interface names. Different search spaces might act as ‘true’ namespaces, unlike interface names.
The Args is to capture parameters in a named record. Otherwise the names will be provided as part of the initial environment – which isn’t very convenient when forwarding those parameters to another import.
The Export option is to specify exports via record. Otherwise, the names would be taken from the final environment. Export offers a lot of precise control.
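To show how the interface half of the grammar reads in practice, here is a small Python recognizer for it. Since `VExpr` and `SpecRule` are not defined in the sketch, the code keeps them as opaque strings, and it assumes tokens are separated by single spaces; `parse_interface` and the sample interface are invented for this example.

```python
import re

NAME = r"[A-Za-z][A-Za-z0-9_]*"
DOTTED = rf"{NAME}(?:\.{NAME})*"
HEADER = re.compile(rf"interface ({DOTTED}) (weakens|extends|adjusts) ({DOTTED})$")

EXTENDS_DECLS = {"provide", "require", "default", "specify"}
WEAKENS_DECLS = {"exclude"}
ALLOWED = {
    "extends": EXTENDS_DECLS,
    "weakens": WEAKENS_DECLS,
    "adjusts": EXTENDS_DECLS | WEAKENS_DECLS,
}


def parse_interface(text):
    """Parse one interface module into a dict; raise ValueError if malformed."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    header = HEADER.match(lines[0]) if lines else None
    if header is None:
        raise ValueError("bad interface header")
    name, relation, parent = header.groups()
    decls = []
    for line in lines[1:]:
        keyword, _, rest = line.partition(" ")
        target, _, body = rest.partition(" ")
        if keyword not in ALLOWED[relation] or not re.fullmatch(NAME, target):
            raise ValueError(f"unexpected declaration: {line}")
        decls.append((keyword, target, body))  # body: unparsed VExpr or SpecRule
    return {"name": name, "relation": relation, "parent": parent, "decls": decls}


sample = """
interface Data.Stack extends Void
provide push
require capacity
default capacity 64
"""
parsed = parse_interface(sample)
```

The table of allowed declarations mirrors the `IFRest` production: an ‘extends’ interface may only add, a ‘weakens’ interface may only exclude, and ‘adjusts’ may do both.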

This entry was posted in Language Design, Modularity and tagged constraint, entanglement, grammar, inheritance, matchmaker, modularity, namespace, path-of-least-resistance, verbosity, wiki ide.

10 Responses to Modules Divided: Interface and Implement

Kevin says:

2011 October 7 at 9 am

Yay for design over discipline! 🙂

“The critical bit, I think, is that each constraint (assertion, invariant, etc.) will need to be named if we are to allow relaxation of them.”

You could also match on (even partial) constraints. e.g. “exclude [x < 35]” could relax the “x < 35” constraint. You could even assign a name to the constraint outside of the original definition, though maybe that would not benefit you at this stage without formal semantics.

I just realized that "content-based" matching probably refers to single format (syntactic / structural) matching rather than multi-format (semantic) matching modulo equivalence relations. Is that right?

Since I'm interested in full semantics, I'd flesh out the interface system with semantic name (model) integration using equivalence relations, etc. In fact, I view everything as named, and your implementation modules simply as increasingly grounded interfaces. i.e. an implementation module is still only partially specified because it requires the right MatchMaker(s) to link and compile it for a specific platform, just like an interface.

I see programmers as really only working with constraints (composing interfaces) in whatever vocabulary they choose, with semantic model/name integration into the system occurring automatically as a normalization preface to further MatchMaking.

- dmbarbour says:
  
  2011 October 7 at 1 pm
  
  I consider ‘full semantics’ to be an unrealistic goal. I will never have full semantics or specifications for sensors, actuators, foreign services, operating systems, or the real world. There will always be much validation and verification that is performed empirically, outside the prerogative of a programming language. There will always be much implicit ‘intent’ that is captured informally by shared symbols, shared names, shared vocabulary, shared interpretation, shared culture, and comments or other documentation.
  
  I design module systems with the understanding that expressing and protecting hidden semantics is a non-trivial concern. In the earlier article, I was using shared symbols like ‘toRomanNumeral’ – something that wouldn’t be included by accident in a module. With named interfaces, I can use interface names to similar purpose, and benefit from the extends/weakens relationships as appropriately extending/weakening the implicit semantics. (And also benefit from the centralized documentation.) Even when “interface Baz extends Bar” is the sum total of the Baz interface, the implicit semantics are extended. The declaration ‘implement Baz’ is a pretty clear indication that one intends also to implement the implicit semantics of Baz and Bar.
  
  Separation of interface and implement is separation of specification and integration. My goal is not to ‘refine’ an implementation, but rather to describe a solution separate from constructing one.
  
  I agree that there are issues with content-based matching, since it is difficult to compare semantics for equality (especially if some of them are implicit…).
  
  - Kevin says:
    
    2011 October 8 at 3 pm
    
    I’m sorry I was sloppy — I meant that the ability to define semantics should be complete, not that all semantics must be known. There will always be unknowns. But it is useful to minimize them to enable automated reasoning.
    
    “Protect hidden semantics” is a little vague — does “hidden” mean a lack of dependency or a lack of knowledge? Interfaces allow independence from the additional semantics of implementations, but they do not require that the semantics be unknown. And, at the very least, we need those semantics in order to automate system adaptation to new constraints (including optimization) through the safe transformation of interfaces (common standards) and implementations (instances).
    
    If “interface Baz extends Bar” is the sum total of the Baz interface, then why are you creating Baz? You can do it, but by withholding or diluting or obfuscating your intention, you are limiting future automation. e.g. perhaps multiple interfaces are semantically identical, but without that knowledge you prevent automatic merges, bridges, refactorings, etc.
    
    “Separation of interface and implement is separation of specification and integration. My goal is not to ‘refine’ an implementation, but rather to describe a solution separate from constructing one.”
    
    It is fine to describe a solution separately from describing processes for constructing that solution, but both are descriptions of solutions — the latter just describes a solution to construction. In both cases, you refine models and MatchMakers realize them.
    
    I love examples, so let’s consider a model of the complete semantics of toRomanNumeral(n) for n <= 5:
    
    toRomanNumeral 1 -> "I"
    toRomanNumeral 2 -> "II"
    toRomanNumeral 3 -> "III"
    toRomanNumeral 4 -> "IV"
    toRomanNumeral 5 -> "V"
    
    Is that a model of interface constraints? Or an implementation? Maybe it’s a unit test? Is the similarity between these a coincidence?
    
    Mathematically, you could say that the model is grounded (there are no free variables in the construction of the function, at least for 1-5) but it is abstract relative to what we want. We want an instance of the function which we can call in a specific environment and get results, and for that there are tons of free variables that must be specified before we reach an actual implementation.
    
    e.g. What’s the format of input and output? Do you care if the function is implemented using a hash map or an array or using “I”*n for 1 <= n <= 3 ? Not really. MatchMakers should answer those questions using overall efficiency and compatibility constraints, amongst others.
    
    I appreciate your talking this over with me. Most of the time this all just stays in my head, so maybe I'm missing something.
  - dmbarbour says:
    
    2011 October 8 at 5 pm
    
    Assume interface Qux is imported by interface Bar to support specification. It might be possible to implement Bar without using a Qux. With ‘full semantics’, you could implement Bar without Qux, then validate that all of Qux’s relevant properties are implemented. With hidden semantics, however, this is no longer feasible: the only way to validate that an object implements a Qux (with all its hidden quirks) is to use a module that declares itself to implement Qux. Essentially, saying we use a Qux in an interface becomes a constraint on the dependencies utilized by the implementation.
    
    By “hidden” I mean a lack of knowledge – understanding or expression of behavior not available to a static interpreter or compiler of the code. Examples include FFI, comments, how a foreign service reacts internally to a given message. There are many reasons one might create “interface Baz extends Bar”, and one might describe their reasons in comments or blog articles. An application can easily ‘depend’ on semantics that are hidden or not expressed to the compiler, which is precisely why it is important to protect them.
    
    You are right that protecting hidden semantics can hinder some automation. For example, it would be a bad thing for a linker to ‘automatically’ bridge two similar interfaces. But a great deal of automation is still feasible. One could always provide a developer agent to search code (including comments) and make suggestions to developers for possible bridging and refactoring. Developers can easily bridge similar interfaces by writing an implementation module. Clever use of search can allow all of logic programming, e.g. to automatically build a lambda expression that meets various requirements.
    
    I understand a ‘description’ as being a value I can compare to an object to determine whether the description fits the object or not. My proposed implementation modules are not ‘descriptions’ in any practical sense: you cannot reference them for comparison! However, interfaces are descriptions in a very practical sense. By rejecting the ability for one interface to name another, I achieve effective separation between interface and implement in purpose and practice. An interface literally cannot construct a solution, and an implementation module literally cannot describe something to be implemented in other code.
    
    I grant, there is no coincidence that an implementation of ‘toRomanNumeral’ would be similar to a specification of the same. Redundancy is a requirement for verification, after all, and pure functions on a finite domain are something of a trivial case.
Kevin says:

2011 October 8 at 11 pm

“An application can easily ‘depend’ on semantics that are hidden or not expressed to the compiler, which is precisely why it is important to protect them.”

I don’t doubt that an application can indirectly “depend” on semantics unavailable to the compiler (i.e. how interfaces are traditionally used to hide implementations); instead, I question why we wouldn’t want those semantics to be available to the compiler. Worst case scenario, we don’t use them, just as if they were unavailable.

Any semantic failure or rigidity can be solved by developer intervention; I just question why we shouldn’t automate it if we can.

“I understand a ‘description’ as being a value I can compare to an object to determine whether the description fits the object or not. My proposed implementation modules are not ‘descriptions’ in any practical sense: you cannot reference them for comparison!”

Descriptions (models, types) can also be used to create objects, which is presumably what your implementation modules are used for. You describe it like a type that can be used for constructing instances but not for testing/detecting instances.

“An interface literally cannot construct a solution, and an implementation module literally cannot describe something to be implemented in other code.”

An interface coupled with a MatchMaker does return a solution, right? (assuming there exist available implementations of that interface in the current system). Similarly, an implementation coupled with a “compiler” also returns a solution. I see a parallel here.

Moreover, imports within implementation modules abstracts them. At import-time, if no instances of the interface are matched by the MatchMaker, your implementation module is abstract. It is not realizable. It cannot be instantiated / constructed.

- dmbarbour says:
  
  2011 October 9 at 1 am
  
  In an argument about whether full semantics is ‘realistic’, I do not believe that ‘want’ is a relevant factor. Developers are able to refine a model before implementing it, if that is their desire and the situation allows.
  
  “Descriptions (models, types) can also be used to create objects, which is presumably what your implementation modules are used for.”
  
  Recipes can also be used to create objects. If you create an object purely by recipe – i.e. without a description of how the finished product should behave – you’ll never know whether you got it right. My implementation modules are more analogous to recipes than to refined ‘models’.
  
  “imports within implementation modules abstracts them”
  
  Yes. Similarly, recipes might be called ‘abstract’ because they have named ingredients.
  
  - Kevin says:
    
    2011 October 9 at 9 am
    
    “Recipes can also be used to create objects. If you create an object purely by recipe – i.e. without a description of how the finished product should behave – you’ll never know whether you got it right. My implementation modules are more analogous to recipes than to refined ‘models’.”
    
    A recipe is a model of a construction process. It is refined just like any other model.
    
    Semantics can be captured by describing detection (rewriting from the object to the model) or by describing construction (rewriting from the model to the object). Ideally, both. They are different, but both are a matter of rewriting under equivalence, and both can use the same model as a meeting point. That’s the power of semantics.
    
    “Yes. Similarly, recipes might be called ‘abstract’ because they have named ingredients.”
    
    Not only are they abstract because the ingredients need to be matched, they are abstract because the sub-processes are abstract and need to be matched.
    
    I’m not making a dent, am I? 🙂 Ah well, I think your import system is nevertheless a step in the right direction.
  - dmbarbour says:
    
    2011 October 9 at 1 pm
    
    My inner-pedant agrees with what you’re saying :D. Anything you can write down is a ‘model’ in some weak sense. Nonetheless, a model of a construction process is a significant level of abstraction from a model of what is being constructed, and a separation of interface and implement will protect the distinction. I envision this level of indirection as a joint to inject FFI and foreign services into the registry of modules. The matchmaker becomes a font of authority, state resources, and so on.
Kevin says:

2011 October 9 at 6 pm

I basically agree with you to the extent you are going. Separating interface and implementation is important in any system. My focus on semantics is primarily useful for automatically adapting interfaces and implementations to new constraints. i.e. the evolution of systems.

dmbarbour says:

2011 November 10 at 3 pm

[Addendum] It seems awkward that an interface module would `require` a matchmaker parameter. The matchmaker is an implementation detail. It seems, to avoid this awkwardness, I will need parameters to modules that are not relevant to the interface. Second, parameters to an interface can be used only to abstract a specification. But abstract specifications are difficult to implement, and developers will only ever implement a finite subset of specializations. So rather than using parameters to turn one spec into many, it seems reasonable that developers simply create an `extends` interface module for each distinct specialization they create. This will be simple and sufficient to select a specialization.

Responsibility for parameters should be moved from interface modules into implementation modules. That is, parameters are still made available to implementation modules and provided at `import`, but are invisible to the interface. If a required parameter is missing, the linker can reject that implementation module and search for another. It’s okay to provide excess arguments to import a module.
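The revised rule is easy to state as code. In this Python sketch each candidate implementation is a hypothetical `(label, required_parameters)` pair and `select` is an invented name. The linker skips candidates whose required parameters are not all supplied and ignores any excess arguments.

```python
def select(candidates, args):
    """Return the first candidate whose required parameters are all supplied."""
    for label, required in candidates:
        if set(required).issubset(args):   # excess arguments are simply ignored
            return label
    return None                            # search fails: nothing is satisfiable


candidates = [
    ("pooled", {"matchmaker", "pool_size"}),
    ("simple", {"matchmaker"}),
]

# 'pooled' is rejected for lack of pool_size; 'verbose' is an unused extra.
choice = select(candidates, {"matchmaker": "local", "verbose": True})
```

With parameters declared per implementation, a matchmaker argument no longer has to appear in any interface.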
