Using Signatures to Improve Smalltalk Productivity and Reuse
Steve Burbeck
IBM Research
sburbeck@us.ibm.com
Revision History:
First Review Draft - 1/23/95
Second Review Draft - 2/26/95
Third Draft - 6/22/95
HTML version - 10/29/97
Non Confidential Document
TABLE OF CONTENTS
1. OVERVIEW
2. A KEY PROBLEM: MISSING DESIGN
3. METHOD SIGNATURES
4. QUALIFIERS
5. SIGNATURE EXAMPLES
6. FOCUS ON DESIGN INTENT
7. NEXT STEPS
8. REFERENCES
9. APPENDIX: QUICK REFERENCE TO SIGNATURES
1. Overview
1.1 A Blueprint for Change
Change is coming to the way we develop object oriented systems. Present-day
development of object systems relies on a craft culture: each component
is hand crafted by highly skilled programmers to fit the unique needs of
each project. This craft culture is already hard pressed to meet
the needs of current projects. There are too few skilled object technology
artisans to meet the growing demand. Even when enough skilled artisans
can be found, a craft culture offers few opportunities for economies of
scale. Bigger projects simply require more artisans; the larger the
staff, the higher the management overhead, the lower the productivity of
the technical staff, and the greater the risks due to lack of effective
communication. Moreover, a craft culture tends to resist reuse; a
highly skilled artisan would usually rather build a custom object than
reuse one that doesn’t quite fit. In doing so, the professional
skills of the artisan improve but accumulation of reuseful and reusable
software components is slow at best. Large scale use and reuse of
object technology requires a new culture.
As with all cultural transitions, the transition to a reuse culture
will take some time and will progress in fits and starts. However,
the transition is likely to pass through three phases.
- Phase one --- characterized by one-off development in unique projects.
- Phase two --- characterized by harvesting and ad hoc reuse.
- Phase three --- characterized by large scale reuse in asset-based projects.
Most current OO projects are in the first phase. Each phase will
have its unique challenges. The challenges in phase one are to improve
the productivity of the artisans and to pave the way for harvesting.
We must grow the skills base, provide processes and tools that maximize
the productivity of OO developers, and maximize the harvestability of the
one-off components developed in phase one projects. The sooner the
necessary skills base, practices, and tools are in place to repeatably
complete successful OO projects, the sooner an organization can enter phase
two. Therefore one important mission of the North American Object
Foundry is to institute the practices and provide the tools that most quickly
and effectively address these phase one issues.
1.2 A First Step
A reuse culture cannot be based upon the reuse of code alone -- reuse requires
an understanding of fine grain design decisions upon which the code is
based. This paper describes a process for capturing an important
component of that design information. In brief, we propose that developers
add a formatted comment, called a signature comment, into
the text of each method. The signature characterizes the object that
is returned from the method and the objects intended as arguments, if any.
These object type descriptors, called qualifiers, provide
information to the reader of the method that otherwise must be deduced
by reading the code and/or browsing the clients of the code. This
added information improves communication within a project team and helps
those who later seek to harvest and reuse the documented components.
It thereby addresses all three of the key phase one challenges.
- It helps inexperienced developers learn from the experts and allows the
  experts to better guide and critique the work of the inexperienced developers;
- It facilitates the communication of design information within project teams;
- It facilitates the communication of design from project teams to harvesting
  teams.
We recommend that this process be taught to beginning Smalltalk programmers
and be adopted as standard practice in all Smalltalk projects.
2. A Key Problem: Missing Design
Object-oriented design involves just a handful of basic elements.
One of the most important for understanding, harvesting and reusing an
OO system is collaboration -- the client-server relationship between objects.
OO designers document collaborations with a variety of techniques (examples
include CRC cards, message flow diagrams, object interaction diagrams,
and scenario transcripts). These approaches differ in the level of
granularity at which they describe collaboration. Some techniques,
CRC for instance, deal only with class-to-class or subsystem-to-subsystem
collaborations. Others, such as object interaction diagrams,
document collaborations at the method level.
Once the system is implemented, Smalltalk code becomes the fine grain
embodiment of design; each message sent within a method is
a method level collaboration. To the extent that the fine grain design
is complete, there is a one-to-one relationship between messages described
in code and fine grain method level collaborations described in the design.
However, the fine grain design is seldom either complete or accurate.
In many projects the senior programmers design on a white-board, produce
the program, then produce a retrospective design to satisfy management.
Even when relatively complete designs are available, the design seldom
matches the actual implementation. Changes made to the code are often
not accompanied by appropriate changes to the design because design resides
elsewhere and programmers seldom take the trouble to find it and edit it.
Programmers and managers familiar with these realities consider the only
trustworthy description of the design to be the code itself.
Code may be trustworthy but it is not a very readable description of
design. Code describes the interactions that will take place between
the objects involved in its execution but it does not explicitly specify
the identity of these objects or their placement in the inheritance hierarchy.
In terms of theatrical scenarios, the code precisely defines the dialog
of the scene but says little about the actors. Yet the designer's
intent in casting the actors in method-level collaborations forms a substantial
portion of the knowledge needed to understand, reuse, or modify an object-oriented
system.
The problem of missing design is further compounded when inexperienced
developers are part of the development team or are expected to maintain
the resulting system. Commercial Smalltalk systems provide unusually
rich tools to help developers deduce collaborations from code. There
are tools for browsing senders and implementors, stepping through code
in a debugger, and inspecting arguments at runtime. One of the skills
common to experienced Smalltalk developers is the ability to use those
tools (together with accumulated knowledge of Smalltalk base classes) to
quickly and effectively understand existing collaborations. Experienced
developers also tend to follow good coding conventions and style which
helps readers of their code to understand the collaborations. Inexperienced
developers, on the other hand, are unfamiliar with the existing base classes,
unaware of many telltale conventions, and not skilled with the investigative
tools. To compound this problem, inexperienced developers usually
fail to adhere to good design and coding conventions and good style.
The components they develop are more difficult for both novice and experienced
developers to fathom. So the larger the project and the less experienced
the developers, the more vital it is to capture and maintain the fine-grain
design decisions that are otherwise only embodied in the code.
2.1 Know the Cast by their Signatures
Rather than asking downstream users of Smalltalk code to deduce this fine
grain collaboration information each time they browse a method, developers
should include that information with their code. In other words,
for each little scene (i.e., each method), the programmer should identify
the actors.
This paper proposes that developers add a formatted comment, called
a signature comment (or simply a signature), to each method.
The signature characterizes the object intended by the design to be returned
from the method and the objects intended as arguments, if any.
Signatures document method-level collaborations in a standard and portable
way that will enhance the effectiveness of project development staff.
- Project development is done by teams of programmers who all too often communicate
  effectively with one another only through the medium of Smalltalk code.
  Fine grain design information embedded in the code will improve communication.
- Maintenance is usually done by people other than the designer or initial
  programmer. Detailed and accurate design information can reduce the
  necessity for maintenance programmers to deduce the method level collaborations
  by browsing and reading code.
- Standard portable documentation techniques will also reduce the "spin up"
  time of new staff who move between projects and dialects.
Improved method-level documentation will pay additional dividends when
code is harvested and subsequently reused.
- The most time consuming part of the refactoring, generalization, and refinement
  that takes place during harvesting is the need to deduce the collaboration
  web. The proposed documentation reduces the need for such deductions.
- Better documentation will also help in assessing the quality of candidates
  for harvesting.
- Better documentation will help potential reuse consumers to assess the
  suitability of candidates for reuse in later projects.
- The consumers of the harvested components in later projects need to become
  familiar with the design and implementation of the new components.
  Accurate method level documentation (in this case provided by the foundry
  harvesting projects) helps speed this process.
Since the proposed signature process requires no special tools it can be
quickly adopted in projects and in training programs. And since the
benefits of using signatures begin accumulating immediately, we recommend
that it be standard practice to include a signature in every method.
3. Method Signatures
The term signature refers to design information about a method.
What is proposed in this paper is the simplest sort of signature -- one
that defines in a formal way the kind of object intended to be returned
from the method and, if the method has arguments, the kind of objects intended
to occupy the arguments. The "kind of object" is specified
by a qualifier (described in the next section). Some
discussion about extensions to signature information appears at the end
of this paper.
A signature is an organized collection of qualifiers that is entered
into each method as a formatted comment. It should be inserted between
the method’s message pattern and the normal method comment (see examples
below).
A signature for a unary method (i.e., one without arguments) simply
documents the qualification of the object returned from the method.
Its format is:
<^ qualifier>
For methods with arguments, the qualifier for each argument is prefaced
by the name of the argument with an appended colon. The argument
qualifiers are separated by commas.
<arg1Name: qualifier, arg2Name: qualifier, ..., ^ qualifier>
The following examples illustrate signatures for methods with and without
arguments. These signatures also illustrate some common kinds of
qualifiers: those that place an object in an inheritance hierarchy,
those that specify certain special objects (e.g., true) and those that
include a list of alternatives.
3.1 Example -- Appraisal>>asText
asText
    "<^hierarchyOf String>"
    "Answer myself rendered as text."
    | stream |
    stream := WriteStream on: String new.
    self presentAsTextOn: stream.
    ^stream contents
The signature for this unary method asserts that the method returns some
kind of String. Experienced Smalltalk programmers who are familiar
with the idiom of Streams would have little trouble deducing that this
method returns a string. Novices who are not familiar with Stream
protocol would not necessarily reach that conclusion quickly even though
the method has only three lines of code. In any case, it is quicker
to read the signature than it is to deduce the return qualifier from the
code.
3.2 Example -- Magnitude>>between:and:
between: min and: max
    "<min: hierarchyOf Magnitude,
      max: hierarchyOf Magnitude,
      ^(true | false)>"
    ^(min <= self) and: [self <= max]
The signature for this method asserts that min and max are expected to
be kinds of magnitudes and the method returns true or false. Note
here that the readability of signatures is as important as that of code.
Long signatures should be folded onto multiple lines.
4. Qualifiers
A Qualifier characterizes the objects that are qualified to occupy
a variable given the role the variable plays in the design of the method.
As such, the system of qualifiers proposed here is an OO type system from
the clients’ viewpoint [Den91]. We use the
term qualifier rather than type to avoid some of the confusion and debate
about just what is an OO type.
Several OO type systems have been published (some of which will be discussed
in a later section). They differ as much in the characterization
of the problem they are attempting to solve as they do in the proposed
solution. Among the different goals these systems address are:
- Documentation of design intent -- characterizing the objects intended
  by design to participate in collaborations.
- Documentation of the effect of the code -- characterizing the assumptions
  about objects that are inherent in the messages actually sent to them and
  the results actually expected of them.
- Type safety -- characterizing the objects with the goal of preventing
  the occurrence of doesNotUnderstand.
- Component substitutability and reuse -- characterizing the objects,
  classes, hierarchies, and frameworks so that their ability to participate
  in a component architecture can be accurately assessed.
- Formal proof of correctness -- characterizing the classes and methods
  with the rigor necessary to allow formal reasoning systems to assess the
  "correctness" of the system.
The distinctions between these goals are not necessarily sharp and most
of the published OO type systems address more than one. In general
though, the above list is in order of increasing focus on formal methods.
Increasing formality, rigor, and abstraction tends to be accompanied by
decreasing usefulness for human-to-human communication. The qualifiers
proposed here address the human side of the human/machine interaction.
Their primary goal is to document design intent in real-world Smalltalk
applications.
The most commonly encountered OO type system is the one in C++.
C++ types are relatively simple because messages may be sent only to objects
that inherit from a common class, collections may hold only objects that
inherit from a common class, and classes don't exist at runtime.
Smalltalk is fully object-oriented and fully polymorphic. In Smalltalk
any object that understands the necessary messages may take part in a collaboration
without regard for inheritance, Smalltalk collections can accommodate elements
of arbitrary (and varying) types, and Smalltalk collaborators may be classes
themselves. All of these issues are handled by the proposed qualifier
system.
The syntax of qualifiers and signatures is intended to balance the competing
issues of human readability, verbosity, expressiveness, and machine parsability.
Most frequently the intended collaborators and returned objects are accurately
described as domain objects that inherit from a given class. Also
common are the familiar pseudo variables: true, false, nil, or self.
These simple cases can be qualified simply and easily. When
the idiosyncrasies and complexities found in actual Smalltalk usage exceed
the reach of these simple qualifiers, composite qualifiers -- qualifiers
built from groups of other qualifiers -- can be used.
To reduce the burden of entering the qualifiers, the most frequently
used qualifiers have alternate short forms (for example, hOf for
hierarchyOf and iOf for instanceOf). However, qualifiers should generally
be spelled out unless the resulting signature itself becomes unwieldy.
4.1 Class Qualifiers
The most common case is one in which the designer specifies acceptable
objects in terms of their membership in a class or in a hierarchy of classes.
The name of the class (exactly as it would appear in code) follows an indicator
of how the qualified object relates to that class.
- hierarchyOf AClassName -- This indicates
  the most common and usually most desirable case in which acceptable objects
  are instances of a given class or any of its subclasses.
  EXAMPLE -- hierarchyOf Collection (or hOf Collection).
  This indicates that any collection will do.
- instanceOf AClassName -- This indicates
  that the only valid objects are instances of precisely the given class.
  This should be used only when the designer knows or intends that instances
  of subclasses are not acceptable (e.g., when subclasses have subtractive
  inheritance or when the semantics of subclasses do not satisfy the needs
  of the method).
  EXAMPLE -- instanceOf Set (or iOf Set). Nothing
  but a set will do.
- class AClassName -- This indicates the valid object is the given
  class itself. It may be used in combination with hierarchyOf
  to indicate that the class or any of its subclasses is valid. When
  used in this combination, hierarchyOf appears before class.
  EXAMPLE -- class PushButton is that unique class object
  and no other.
  EXAMPLE -- hierarchyOf class Window specifies the
  Window class object or any of its subclasses.
- any -- This indicates that any object is valid. In a
  single rooted system, any is equivalent to hierarchyOf Object, but
  many systems are no longer single rooted, e.g., VisualAge.
- none -- This indicates that no object is valid. It is
  used to indicate that the method, by design, does not return, e.g., signals
  a walkback or an exception (see also the discussion about block qualifiers).
  Note: none may be used as a member of a group of possible return
  qualifiers if the message invokes an error under certain circumstances
  but otherwise returns an object. In that case the other alternative(s)
  should specify what is returned when an error does not occur.
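As a minimal sketch of how none can be combined with another return qualifier, consider a hypothetical PriceList class whose instance variable prices is assumed to hold an OrderedCollection of numbers. The class, the selector, and the instance variable are invented for illustration; error: is the conventional way to raise an error in most Smalltalk dialects.

```smalltalk
priceAt: anIndex
    "<anIndex: hierarchyOf Integer, ^(hierarchyOf Number | none)>"
    "Answer the price stored at anIndex. If anIndex is out of
     range, signal an error; in that case the method, by design,
     does not return a price."
    (anIndex between: 1 and: prices size)
        ifFalse: [^self error: 'index out of range'].
    ^prices at: anIndex
```

The alternative hierarchyOf Number documents what the caller receives in the normal case, while none warns the reader that some calls are not expected to return at all.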
4.2 Special Qualifiers
Some common cases, determined both by the definition of Smalltalk and by
conventional usage, deserve special qualifiers to reflect that usage.
- nil, true, false -- these usually appear in return qualifiers. They
  indicate that the specific unique object (nil, true, or false) is expected.
  Note that hierarchyOf Boolean means either true or false.
- self, myClass -- these qualifiers denote specific objects
  that are defined by the context in which they appear. The self
  qualifier indicates the receiver of the message; myClass indicates
  the class of the receiver. The myClass qualifier can appear in a
  qualifier wherever a class itself may appear, e.g., hierarchyOf myClass.
  Note that myClass and instanceOf myClass do not mean the
  same thing. An object qualified by myClass is a class.
  An object qualified by instanceOf myClass is an instance of that
  class, not the class itself. And instanceOf myClass is not
  the same as self -- self refers to the same object as the
  receiver whereas instanceOf myClass refers to a new object that
  is the same class as the receiver. Examples:
  Code                          | Appropriate Qualifier
  ^self                         | ^self
  ^self new (in a class method) | ^hierarchyOf self
  ^self class                   | ^myClass
  ^self class new               | ^hierarchyOf myClass
- arg1, arg2, ... -- these qualifiers refer to arguments of the method
  in which the signature appears (see example in a later section).
  They are typically used in return qualifiers to indicate that one of the
  input arguments is returned. A formally accurate but less specific
  return argument could simply be the qualifier for the appropriate argument.
  Using the argN form adds the extra information that the same object
  is being returned, not merely an object with the same qualifier.
  It also avoids retyping the argument qualifier.
4.3 Qualifier Aspects
In some cases an object cannot be adequately characterized without also
characterizing other objects that it "contains" or "controls." To
qualify a collection, for instance, one usually must not only specify the
type of collection, but also the type(s) of objects contained in the collection.
In such cases we allow a comma separated list of aspect qualifiers,
delimited by curly brackets, to be appended to the qualifier.
qualifier {aspectName1: qualifier, aspectName2: qualifier, ... }
The predefined aspects in common usage are given in the following table.
The default aspect is to be assumed in qualifiers in which the aspect is
not explicitly given.
Qualified Object        | Predefined Aspects | Default Aspect Qualifier
Collection              | of:                | any
Dictionary, Association | key:, value:       | any
Point                   | x:, y:             | hierarchyOf Number
Stream                  | on:                | hierarchyOf String
EXAMPLES
hierarchyOf OrderedCollection {of: hOf MortgageBackedSecurity}
instanceOf Dictionary {key: hOf CustomerSite, value: hOf SalesTerritory}
The default aspect qualifiers for Points and Streams cover most normal
usage. Only rarely do domain designs use points to hold pairs of
objects other than numbers, and when they do, Associations would usually
be a better pairing device. Streams on collections other than Strings
can be very useful, but that usage is relatively rare. So aspects
are seldom needed for these classes. Collections, Dictionaries and
Associations, on the other hand, typically are not used to hold or associate
arbitrary objects. The default in these cases is not chosen
because it is the typical case but because it is all that is reasonable
to assume if the aspect qualifier is missing. Good practice dictates
the use of aspects on all Collections, Dictionaries and Associations even
if the default happens to be correct. Points and Streams should specify
their aspects only if the default is not correct.
Syntactically, aspects are arbitrary annotations to qualifiers.
The user can make up new aspects if needed.
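To show how an aspect-qualified return might look inside a method, here is a small sketch of a lazily initialized getter. The receiver's class and its instance variable territoriesBySite are assumptions made for illustration; CustomerSite and SalesTerritory are the class names used in the example above.

```smalltalk
territoriesBySite
    "<^instanceOf Dictionary {key: hOf CustomerSite, value: hOf SalesTerritory}>"
    "Answer the dictionary that maps each customer site to its
     sales territory, creating an empty one on first use."
    territoriesBySite isNil
        ifTrue: [territoriesBySite := Dictionary new].
    ^territoriesBySite
```

Without the key: and value: aspects the reader would learn only that a Dictionary comes back, and would have to browse the senders of at:put: to discover what it holds.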
4.4 Behavioral Qualifiers
4.4.1 Block Qualifier
Blocks may be received as arguments and less frequently returned from methods.
When used in that way, they are essentially unnamed methods. To describe
them adequately requires the same information about the block as is provided
in a method signature. So block qualifiers resemble method signatures.
Block qualifiers differ from method signatures in that they are enclosed
in square brackets (as are the blocks they describe) and their arguments,
if any, are named by position rather than argument name. The format
is:
[ :blockArg1 qualifier, :blockArg2 qualifier, ..., ^qualifier ]
For instance, the block qualifier that describes the argument to the Collection>>#select:
method would be
[ :blockArg1 any, ^ hierarchyOf Boolean]
which indicates that the block accepts any object as its argument and returns
a Boolean.
Since a block may return one of its block arguments, the return qualifier
inside a block may be one of its arguments, i.e., b1 or blockArg1.
Note that the block return qualifier qualifies the result of the block
(i.e., the object returned when the block is evaluated); it does not describe
what is returned from the method containing the block, even if the block
contains an explicit return (^). If the evaluation of a block
simply causes the method to return some value, then the block return
qualifier should be none.
EXAMPLE -- the block [^ 'some string'] would be qualified
as [^none].
Note, however, that method returns from within blocks passed as arguments
involve subtle semantics. The appearance of none in a block
qualifier ought to warrant a second look by the designer or programmer.
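The following sketch shows block qualifiers inside a full method signature, for a method that takes both a one-argument block and a zero-argument block. It follows the way detect:ifNone: is commonly written in Smalltalk class libraries, but it is an illustration rather than any vendor's actual source.

```smalltalk
detect: aBlock ifNone: exceptionBlock
    "<aBlock: [:blockArg1 any, ^(true | false)],
      exceptionBlock: [^any],
      ^any>"
    "Answer the first element for which aBlock evaluates to true.
     If there is no such element, answer the result of evaluating
     exceptionBlock."
    self do: [:element |
        (aBlock value: element) ifTrue: [^element]].
    ^exceptionBlock value
```

Note that the signature qualifies only the two blocks passed in as arguments. The block handed to do: inside the method body is an implementation detail and does not appear in the signature.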
4.4.2 Signature Qualifier
Signature qualifiers describe objects in terms of the methods they support
without restricting their class. A signature qualifier states that
the object is qualified if it implements a method with the given signature.
We use a format different from that of the method signature itself because
the signature qualifier must specify the selector of the message and because
the names of the arguments to the method are irrelevant. So a signature
qualifier looks much like the message to be sent with the arguments, if
any, replaced by their qualifier. There are three types of signature
qualifier corresponding to the three types of message -- unary, binary,
and keyword. Keyword messages may have multiple keywords. In
that case, the keyword-qualifier pairs are separated from one another by
commas.
unary -- < selector, ^ qualifier >
EXAMPLE -- <asString, ^hierarchyOf String>
binary -- < binarySelector qualifier, ^ qualifier >
EXAMPLE -- < <= hierarchyOf Number, ^hierarchyOf Point>
keyword -- < firstKeyword: qualifier, secondKeyword: qualifier, ..., ^ qualifier >
EXAMPLE -- <copyFrom: hOf Integer, to: hOf Integer, ^hierarchyOf String>
A candidate clearly qualifies if the method it implements or inherits
has a signature that exactly matches the required signature.
Otherwise a candidate object qualifies if the argument and return qualifiers
of its method signature satisfy the following relationships to those of the
signature qualifier:
- all of the arguments of the candidate’s method signature are at least as
  general as those of the signature qualifier, that is, they match or are
  superclasses of the corresponding signature qualifier class, and
- the candidate’s return qualifier is at least as specific, that is, it matches
  or is a subclass of the signature qualifier class.
Signature qualifiers are needed or useful only for highly polymorphic cases,
e.g., frameworks, where classes are expected to be added about which all
that is known is that they will implement the signature. Because
signature qualifiers provide very little information about the kind of
object expected, they should be used only in the rare cases where a high
degree of polymorphism and only a high degree of polymorphism is explicitly
intended.
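The matching rule can be checked by hand with ordinary class queries. The workspace sketch below assumes a hypothetical candidate method whose signature is <start: hierarchyOf Number, stop: hierarchyOf Number, ^hierarchyOf Symbol> and tests it against the required signature qualifier <copyFrom: hOf Integer, to: hOf Integer, ^hierarchyOf String>. It relies on inheritsFrom:, which most dialects provide, and on Symbol being a subclass of String, which holds in many dialects but not all.

```smalltalk
| argumentIsGeneralEnough returnIsSpecificEnough |

"Argument rule: the candidate's argument class must match, or be
 a superclass of, the class required by the signature qualifier."
argumentIsGeneralEnough :=
    Number == Integer or: [Integer inheritsFrom: Number].

"Return rule: the candidate's return class must match, or be
 a subclass of, the class required by the signature qualifier."
returnIsSpecificEnough :=
    Symbol == String or: [Symbol inheritsFrom: String].

"The candidate qualifies only if both rules hold."
argumentIsGeneralEnough & returnIsSpecificEnough
```

The two rules run in opposite directions: a qualifying method may accept more than the signature qualifier demands, but it may not return less.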
4.5 Composition of Qualifiers
4.5.1 Alternative Qualifier
Alternative (also known as union) qualifiers express the fact that an object
meets at least one of a set of qualifications. The set of alternatives
is enclosed in parentheses and separated by vertical bars to indicate a
logical or:
( qualifier | qualifier | ... )
This construct is useful when the alternative classes are in different
portions of the hierarchy yet still implement the desired protocol.
For example suppose we are interested in searching through a collection
of products in a mail-order company model. The method that does the
search may work equally well for searching a catalog of products or a warehouse
of products even though Catalog and Warehouse classes may
have no inheritance relationship to each other. The alternative qualifier
would describe that fact thus:
(hierarchyOf Catalog | hierarchyOf Warehouse)
Another common usage is (true | false) instead of hierarchyOf
Boolean for easier readability.
It is a hallowed tradition in Smalltalk to receive or return nil
instead of the otherwise expected object as a flag for exceptional circumstances.
For instance a method that normally returns a price list may return nil
as an indicator that there is no price list:
^(hierarchyOf PriceList | nil)
Note: modern exception handling mechanisms often serve better for
such purposes.
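A method documented with this alternative qualifier might look like the sketch below. The Customer class, the selector, and the instance variable priceLists (assumed to be a Dictionary keyed by customer) are hypothetical.

```smalltalk
priceListFor: aCustomer
    "<aCustomer: hierarchyOf Customer, ^(hierarchyOf PriceList | nil)>"
    "Answer the price list negotiated with aCustomer, or nil if
     there is none."
    ^priceLists at: aCustomer ifAbsent: [nil]
```

The nil alternative in the signature tells every caller, before any code is read, that the result must be tested before it is sent price list messages.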
4.5.2 Conjunction Qualifier
At times an object must meet all of a list of qualifications. That
calls for a conjunction (or intersection) qualifier. The parenthesized
list in a conjunction qualifier is separated by ampersands to indicate
a logical and:
(qualifier & qualifier & ... )
One situation that may call for a conjunction qualifier is when a class
hierarchy qualifier needs an added behavioral requirement, e.g.,
( hierarchyOf Collection & <addAll: hOf Collection, ^ hOf Collection> )
which specifies only those collections that understand addAll:.
The need to use this kind of qualifier usually stems from anomalies in
the structure of a hierarchy. When possible these hierarchy anomalies
should be corrected rather than papered over with complex conjunction qualifiers.
A conjunction qualifier may also be used to specify a set of signatures
that the qualified object must understand. This should be avoided
except in the rare case in which it is the explicit design intent of the
method that a set of signature qualifiers is all that properly characterizes
candidate objects.
Note: the alternative indicator (|) and the conjunction indicator (&)
may not be mixed within a single list. In the very rare case
where such a complex qualifier is needed, use explicit parenthesized groups.
5. Signature Examples
First, a general note on style: As with code, signatures should be written
with consistent style. Long signatures should be laid out with the
same care as long Smalltalk statements, i.e., placed on multiple lines
with indenting to clarify the groupings. The most common qualifiers
have both a long and an abbreviated form - the long form should be preferred
for simple qualifiers. When the length of a signature or the complexity
of a qualifier becomes a problem, abbreviations may aid readability.
5.1 Use of instanceOf versus hierarchyOf
Most classes inherit the behavior of their superclasses in a compatible
manner, so the most common qualifier is hierarchyOf. But methods
may appear in an abstract superclass that are not intended to be used by
all subclasses. Or a subclass may override a method in a way that
is semantically incompatible with its siblings and superclasses.
The Digitalk Collection hierarchy, for example, has some examples of subclasses
that do not (and in fact cannot) properly implement some of the methods
they inherit. Dictionaries are subclasses of Set that do not behave like
sets. So Dictionaries are clearly not acceptable replacements for
an expected Set argument. When the programmer wishes to pass a set
as an argument, the qualifier should be instanceOf Set to indicate
that Sets are acceptable, but subclasses of Set are not. For instance:
Set>>intersect:
intersect: aSet
    "<aSet: instanceOf Set, ^instanceOf Set>"
    "Return intersection of self and aSet"
    ^self select: [ :element | aSet includes: element ]
Similar problems crop up in the Smalltalk-80 Collection hierarchy [Coo92].
Note: if you find it necessary to use instanceOf with classes of
your own design, you should consider refactoring the classes to eliminate
that necessity.
5.2 Use of argN
It is not uncommon for a method to return one of the arguments it received.
For instance:
Collection>>addAll:
addAll: aCollection
    "<aCollection: hierarchyOf Collection, ^arg1>"
    "Answer aCollection. Add each element of
     aCollection to the elements of the receiver."
    aCollection do: [ :element | self add: element].
    ^aCollection
Here aCollection is both the argument and the object returned by
the method. So the return qualifier could simply be the same as the
argument qualifier and be technically correct. However, if it is
qualified as arg1 rather than hierarchyOf Collection, the return
qualifier retains the identity information. This is an important
point to document and future static analysis tools will be able to make
use of such information.
5.3 Use of class qualifier
A common design pattern, known as the FactoryMethod pattern [GHJV94], relies
on the class of an object being determined at runtime. The method
that decides the class must return that class. For example a multimedia
window may have some (usually private) service method to provide the class.
In that case, the return qualifier specifies a class rather than an instance.
Viewer>>viewerClassForDocument:
viewerClassForDocument: aDocument
    "<aDocument: hierarchyOf Document,
      ^hierarchyOf class DocumentViewer>"
    "Answer the proper class to view this kind of document."
    ^ViewerClasses at: aDocument ifAbsent: [GenericViewer]
5.4 Use of a block qualifier, myClass, any, true, false
The collection hierarchy, in part because it supports such a wide range
of use and in part because it contains some of the oldest classes in the
Smalltalk image, requires some of the most complex kinds of qualifiers.
Abstract behavior intended to be inherited in a very general way can be
difficult to qualify. Consider:
Collection>>select:

    select: aBlock
        "<aBlock: [blockArg1: any, ^(true | false)],
          ^instanceOf myClass>"
        "For each element in the receiver, evaluate
         aBlock with that element as the argument.
         Answer a new collection containing those elements
         of the receiver for which aBlock evaluates to true."
        | answer |
        answer := self species new.
        self do: [ :element |
            (aBlock value: element)
                ifTrue: [answer add: element]].
        ^answer
The argument, aBlock, must be a one argument block whose value
is true or false. Nothing can be said of the
argument to the block; it can be any object. The return qualifier
presents a problem that is not completely representable with the present
qualification system. The method returns a new instance of a collection
that is almost always of the same class as the receiver. The return
qualifier is therefore instanceOf myClass. However, the collection
isn't created by 'self class new'; it is created by 'self species
new'. For almost all classes, species returns the same
thing as class. The exceptions in the Digitalk VOS/2 image
are: Interval for which species returns Array, Symbol which becomes
String, SymbolSet which becomes Set, and DoubleByteSymbol which becomes
DoubleByteString. For the current focus of documenting design intent,
these exceptions are too rare to warrant an additional qualifier type (e.g.,
mySpecies).
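The Interval exception is easy to observe in a workspace. This is a small sketch, assuming an image in which Interval>>species answers Array (as the text reports for the Digitalk image, and as is also the case in Squeak and Pharo); the variable names are illustrative.

```smalltalk
| interval selected |
interval := 1 to: 10.
interval class.       "Interval"
interval species.     "Array"

"select: builds its answer from species, so the result is an Array."
selected := interval select: [ :each | each even ].
selected class.       "Array"
```

For this receiver, instanceOf myClass is therefore only an approximation of what select: actually answers.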
5.5 Use of a qualifier modified by an aspect
MethodArtifact>>comments

    comments
        "< ^instanceOf OrderedCollection {of: hierarchyOf String}>"
        "lazy getter so that comments are extracted only once."
        comments isNil
            ifTrue:
                [comments := OrderedCollection new.
                self extractComments].
        ^comments
Note that comments is an instance variable of the MethodArtifact
class. Signatures for getter and setter methods serve also to document
the qualification of the associated instance variable. When proper
tool support is available, instance variables should also have qualifiers.
Until then, qualifiers in the setter and getter methods document the same
information. Note also that instanceOf OrderedCollection is
used rather than hierarchyOf because the OrderedCollection class
is referenced explicitly so we know its class exactly.
Lazy getters, such as the example above, guarantee to initialize
the instance variable with the proper kind of object. If the variable
has not been initialized, it will be nil, which will trigger the
initialization code contained in the method. With other methods of
initialization that are less reliable, there may be a question of
what the proper return qualifier for a getter method should be.
From the perspective of type-safety, the fact that the method might return
nil could tempt one to add nil as an alternative to the getter’s
return qualifier. However, bugs, no matter how common, are not a part
of intended design. From the perspective of documenting design intent,
nil should be included as an alternative only if it is the
intent of the designer that nil be a legitimate return value from the getter.
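When nil is an intended answer, the alternative qualifier can say so explicitly. The following sketch shows a getter, parent, for a hypothetical TreeNode class whose parent instance variable is deliberately nil for the root node. Neither the class nor the method comes from the examples in this paper.

```smalltalk
parent
    "< ^(hierarchyOf TreeNode | nil) >"
    "Answer the parent node, or nil if the receiver is the root.
     Here nil is part of the design, so it appears in the qualifier."
    ^parent
```

Contrast this with the comments getter above, where nil can only be answered by accident and is therefore left out of the signature.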
5.6 Use of a signature qualifier in a conjunction
A signature qualifier can be used in conjunction with a hierarchy qualifier
to restrict the qualified object to those classes that support the required
behavior. The need for this is very rare but one case might occur
in methods that accept arguments qualified by hierarchyOf Collection.
Not all collection classes in the Digitalk collection hierarchy
support collect: even though the method is implemented in Collection
and therefore inherited by all subclasses.
This is not at all obvious from reading the code for collect:
which is quite general. The subtle problem is that collect: requires
that the receiver (or more technically, the species of the receiver) implement
add: in a way that any object can be accepted as the argument.
Yet Dictionaries and their subclasses require that the objects being added
must be Associations (or more precisely that they understand key
and value).
In most real-world applications, a method would not be qualified to
accept any arbitrary collection and, therefore, this problem would not
occur. In the rare case where the problem might occur, the argument
qualifier could specify that the argument is some kind of collection
but that it also must support add: with any argument.
That is expressed with a conjunction qualifier:
    (hierarchyOf Collection & <add: any, ^arg1>)
An object whose add: method would be qualified as <add: hOf Association,
^arg1> does not meet the requirement imposed by the qualifier <add:
any, ^arg1>. hOf Association is more specific, not less
specific, than any (see the discussion in the signature qualifier section).
5.7 Use of none when the method does not return
Methods that simply signal an error are most often found in abstract classes.
For example:
FixedSizeCollection>>add:

    add: anObject
        "<anObject: any, ^none>"
        "Add anObject to the receiver. This method reports
         an error since fixed size collections cannot grow."
        ^self invalidMessage
More commonly, methods may signal an error under certain circumstances
and return in others. Such methods may use the none qualifier as
one member of an alternative qualifier.
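As an illustration of none inside an alternative qualifier, consider a pop method for a hypothetical Stack class that keeps its contents in an instance variable named elements holding an OrderedCollection. The class, the variable, and the error text are assumptions made for this sketch and are not drawn from any vendor image.

```smalltalk
pop
    "< ^(any | none) >"
    "Answer and remove the top element. If the stack is empty,
     signal an error; in that case the method does not return."
    elements isEmpty
        ifTrue: [ ^self error: 'stack is empty' ].
    ^elements removeLast
```

The qualifier tells the caller both outcomes: an arbitrary object in the normal case, and no return at all when the error is signaled.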
6. Focus on Design Intent
The signature and qualification system presented here borrows from a number
of prior systems for OO typing. For instance, Suzuki [Suz81]
first proposed alternative (or union) qualifiers. We borrow most
from Borning & Ingalls [BI82], who pioneered
the notions behind what are here called special qualifiers (in particular,
class, self, and argN), aspect qualifiers, signature qualifiers, and block
qualifiers.
Type systems differ in detail and in the purpose for which the systems
were developed. Borning & Ingalls focus on documentation of design
with some support for compile time type safety. Johnson and Graver
[Joh86], [GJ90] focus on
type safety and compiler code optimization. Thomson [Tho93]
focuses on documenting the actual effect of the code (for portability and
reverse engineering of existing code). Wills [Wil91]
focuses on substitutability of components and formal provability of correctness.
Cook [Coo92] focuses on documentation and factoring
of existing Smalltalk Collection classes. The Trellis/Owl system
[SCB86] focuses on documentation and type safety.
America [Ame90] focuses on behavioral description
and formal correctness. It is important to recognize that the various
goals of OO type systems have differing degrees of relevance and value
at different stages in the life cycle of OO development, harvesting, and
reuse.
Lifecycle Stage | Important Type System Issues
first stage development | documentation of design intent
refactoring | documentation of actual use, type safety
harvesting and reuse | design intent, type safety, component substitutability
certification for reuse | type safety and proof of correctness
The system proposed here is intended to document design intent and to
support some aspects of type safety. We believe that the burden of
developing reusable code cannot realistically be placed on designers or
developers in the initial project. A phase one project necessarily
focuses on modeling the domain and getting a complete running application.
Generality and the needs of reuse are difficult to foresee in early projects.
The well known rule-of-thumb is that components (classes, hierarchies,
frameworks, and subsystems) must be refactored in light of other contexts
two or three times before they become generally reusable, assuming, as Adams
points out [Ada92], that they are reuseful
in the first place. If signature documentation required the designer
or programmer to understand in advance how the components will be reused,
the qualification system might languish unused.
Design intent, although forgotten all too quickly, is known when the
method is written and, if captured then, becomes increasingly valuable
at each later stage of the lifecycle. Yet design intent may conflict
with the actual code. When that occurs, the conflict must be resolved.
Either or both of the design or the code may be incomplete or inaccurate.
Only the designer/programmer can make the call. Automated type checking
and type inference systems can provide input to the human referee when
mismatches are detected but cannot replace the need for documenting design
intent.
This is not to say that we are forever done once design intent has been
documented, just that the process cannot properly begin without capturing
design intent. Each successive stage needs to refine and augment preexisting
design information. Once project components
are harvested, the appropriate focus of a type system will change to one
of describing highly polished reusable classes. As components
are harvested, refactored, and mature into more general forms, issues of
correctness, reusability, and substitutability may become as important
as the documentation of design intent. It is not clear at
this time whether that additional information can be captured in a format
similar to the signatures and qualifiers presented here or will require
a qualitatively different format.
7. Next Steps
7.1 Adoption by Smalltalk Projects
We believe that signature documentation will provide immediate benefits
in new and ongoing projects. Project leaders, designers, programmers,
and the customer will benefit by the capture of design information that
otherwise would be lost. Signatures in everyday use will also improve
communication between designers and programmers, between teams of designers
and programmers, and between individuals on a team. Smalltalk programmers
spend far more time reading code than writing code. Of the time spent
reading code, a not inconsiderable portion is spent deducing the requirements
of the arguments and deciding what kind of object is returned. Inexperienced
Smalltalk programmers spend proportionally more time trying to understand
the code. Where accurate signatures are provided, programmers
will find immediate productivity improvements.
Adoption of signature documentation does, however, require some changes
to everyday work practices. And these changes may be seen, especially
by experienced Smalltalkers, as an unwarranted burden. The new burdens
placed on the programmers include the thought required to provide good
qualifiers, the additional typing to enter them, and the need to change
the signature if the method changes in ways that invalidate the signature
information. The need to enter a signature for every method is indeed
a burden, though usually a small one. Some simple tool support should
be available in 1995 to minimize that burden (see later section).
The thought required to decide the proper qualifiers may be perceived as
a burden, but experience has shown that it is a benefit in disguise; typically
the extra thought improves the design. Maintaining accuracy when
methods are edited is the key issue. Eventually tool support will
be available to signal mismatches between the signatures and the code.
Until that time, programmers will have to be motivated to update their
signatures by management oversight, peer pressure (both ad hoc and during
code reviews), and professional pride.
Newly trained Smalltalk programmers can and should learn to use signatures
(see next section). Experienced Smalltalkers will need to learn on
the job, perhaps in the midst of a project in which they already seem swamped.
We believe that a transition to the use of signatures can be made in mid-project
with little impact and that the improved documentation of design will improve
productivity enough to repay the cost of entering the signatures.
Nonetheless, the engineers, together with their project leaders,
will have to weigh the trade-offs in light of the details of their project.
7.2 Adoption in Training Programs
Training in signature documentation better prepares students to use signatures
when they take their place in Smalltalk projects. Signatures can
also provide immediate pedagogical benefits in the classroom. Signatures
can clarify an issue for trainees that is often confusing for those new
to Smalltalk: just what objects are returned from methods.
Also, the ability to clearly specify and reason about fine grain design
will facilitate the teaching, discussion and review of fine grain design
issues.
7.3 Evaluation and Evolution of the Signature Process
The effectiveness of this system in real use needs to be assessed in the
first few months of deployment. The completeness, expressiveness,
usefulness, and correctness of the qualification system can only be judged
after real-world use in the trenches. We expect to modify and enhance
the system, and to provide better guidance about standard usage patterns,
after signatures have seen some use in projects.
We already are aware of a number of areas that can and will receive
attention in future iterations of the signature system. Additional
information may be included in signatures, signatures may be used in additional
ways, and additional tools will be needed:
-
Variables other than arguments (i.e., instance variables, class variables,
pool variables, and globals) need to be qualified. The
issue is where and how to maintain the information.
-
Tools are needed to ensure the accurate correspondence between signatures
and code. Static analysis tools can compare signatures to the code
at compile time and runtime tools can compare actual arguments and returned
objects during execution to those specified by the signatures.
-
The effort to enter signatures must be minimized.
-
The presence of a signature embedded in the code may tend to be distracting.
When signatures are fully supported by tools, they may appear as separate
entities.
-
Especially in ongoing projects, the backlog of prior code needs signatures.
-
A project to provide accurate signatures for the basic code already in
the base Smalltalk image from each vendor would pay immediate dividends.
We expect help from several groups in this project.
-
A number of method-level design issues other than qualification could be
captured in signatures. Examples are: pre/post conditions, exceptions
raised, and side effects.
-
Tools should extract signatures into separate design documentation, i.e., export batch
interface documentation from signatures.
-
Tools should provide feedback on the quality and plausibility of the signatures without
doing full static analysis, e.g., flag questionable practices such as the use of signature
qualifiers or the appearance of instanceOf qualifiers in argument qualifiers.
The most important longer term issue is support for assuring the accuracy
of the signatures. The qualifier syntax is designed to be parsed
by future tools that can provide a static analysis of the match between
the collaborators described by the qualifiers and the collaborations implicit
in the method's code. Tools may also make use of signature information
to extract larger scale design -- for example to roll-up method collaborations
into class or even subsystem collaborations, and to present these collaborations
in interaction diagrams and collaboration nets.
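Because signatures live in an ordinary method comment, even very simple tooling can pull them out of the source text. The workspace sketch below is a deliberately naive illustration and not the tool support described above: it assumes the method source is available as a String and that the signature contains no nested signature qualifiers. The variable names are illustrative.

```smalltalk
| source start stop signature |
source := 'intersect: aSet
    "<aSet: instanceOf Set, ^instanceOf Set>"
    ^self select: [ :element | aSet includes: element ]'.

"Naive scan: locate the first < and the first > in the source.
 Nested signature qualifiers would need a real parser."
start := source indexOf: $<.
stop := source indexOf: $>.
signature := source copyFrom: start to: stop.
signature.   "'<aSet: instanceOf Set, ^instanceOf Set>'"
```

A batch documentation export could apply the same idea to every method in a class, while a static analysis tool would go on to parse the extracted text into qualifiers.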
8. References
-
[Ada92] Sam Adams. Software reuse and the enterprise.
In Software Development ‘92 Spring Proceedings. pp. 7 - 13, 1992.
-
[Ame90] Pierre America. Designing an Object-Oriented
programming language with behavioral subtyping. In Foundations of
Object-Oriented Languages, LNCS. pp. 60 - 90, 1990.
-
[BI82] A. H. Borning and D. H. H. Ingalls. A
type declaration and inference system for Smalltalk. In Conference
Record of the Ninth Annual ACM Symposium on Principles of Programming Languages,
pp. 133 - 139, 1982.
-
[Coo92] W. R. Cook. Interfaces and specifications
for the Smalltalk-80 Collection classes. In Proceedings of OOPSLA ‘92,
pp. 1 - 15, November 1992.
-
[Den91] Richard J. DeNatale. Types from the
client’s viewpoint. IBM Technical Report no. TR29.1246, September,
1991.
-
[GHJV94] Erich Gamma, Richard Helm, Ralph Johnson,
and John Vlissides. Design Patterns. Addison-Wesley, 1994.
-
[GJ90] Justin O. Graver and Ralph E. Johnson.
A type system for Smalltalk. In 17th Annual ACM Symposium on Principles
of Programming Languages, pp. 136 - 150, 1990.
-
[Joh86] Ralph E. Johnson. Type-checking Smalltalk.
In Proceedings of OOPSLA ‘86, pp. 315 - 321. Printed as SIGPLAN Notices,
21(11). November 1986.
-
[OP92] Nicholas Oxhoj, Jens Palsberg and Michael I.
Schwartzbach. Making type inference practical. In Proceedings
of ECOOP ‘92, pp. 329 - 349. 1992.
-
[SCB86] Craig Schaffert, Topher Cooper, Bruce Bullis,
Mike Kilian, and Carrie Wilpolt. An introduction to Trellis/Owl.
In Proceedings of OOPSLA ‘86, pp. 9 - 16, November 1986. Printed as SIGPLAN
Notices, 21(11).
-
[Suz81] Norihisa Suzuki. Inferring types in
Smalltalk. In Conference Record of the Eighth Annual ACM Symposium
on Principles of Programming Languages, pp. 187 - 199, 1981.
-
[Tho93] David G. Thomson. Believable Specifications:
Organizing and Describing Object Interfaces Using Protocol Conformance.
Master’s Thesis, School of Computer Science, Carleton University, 1993.
-
[Wil91] Alan Wills. Capsules and types in Fresco.
In Proceedings of ECOOP ‘91, pp. 59 - 76. 1991.
9. Appendix: Quick Reference to Signatures
9.1 Method Signatures
For unary methods (which do not have arguments) only the return qualifier
is specified:
-
< ^ qualifier>
For methods with one or more arguments, the signature adds a qualifier
for each argument keyed by the argument’s name:
-
< arg1Name: qualifier, arg2Name: qualifier, ..., ^ qualifier>
9.2 Qualifiers
9.2.1 Class Qualifiers
-
hierarchyOf AClass -- instances of AClass
or one of its subclasses
-
instanceOf AClass -- instances of AClass
only
-
hierarchyOf class AClass -- AClass or
one of its subclasses
-
class AClass -- AClass itself
-
any -- any object is valid
-
none -- no object is valid
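The class qualifiers correspond closely to standard reflection messages, which is one way a runtime checking tool might test them. The following workspace sketch assumes only isKindOf:, isMemberOf:, and inheritsFrom:, which are available in most Smalltalk images. The block names are illustrative, and the last block assumes its first argument is a class.

```smalltalk
| hierarchyOf instanceOf hierarchyOfClass |
"hierarchyOf AClass: instances of AClass or of any subclass."
hierarchyOf := [ :object :aClass | object isKindOf: aClass ].
"instanceOf AClass: instances of AClass only."
instanceOf := [ :object :aClass | object isMemberOf: aClass ].
"hierarchyOf class AClass: AClass itself or one of its subclasses."
hierarchyOfClass := [ :candidate :aClass |
    candidate == aClass or: [ candidate inheritsFrom: aClass ] ].

hierarchyOf value: 3 value: Number.              "true"
instanceOf value: 3 value: Number.               "false"
instanceOf value: Set new value: Set.            "true"
hierarchyOfClass value: Integer value: Number.   "true"
hierarchyOfClass value: Number value: Integer.   "false"
```

The class AClass qualifier needs no block of its own: it is a plain identity test against AClass.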
9.2.2 Special Qualifiers
-
nil, true, false -- these qualifiers refer to the exact objects
specified
-
self, myClass -- these qualifiers denote specific objects
that are defined by the context in which they appear. The self
qualifier indicates the receiver of the message; myClass indicates
the class of the receiver. Examples:
Code | Appropriate Qualifier
^self | ^self
^self new (in a class method) | ^hierarchyOf self
^self class | ^myClass
^self class new | ^hierarchyOf myClass
-
arg1, arg2, ... -- these qualifiers refer to arguments of the method
in which the signature appears.
9.2.3 Qualifier Aspects
-
qualifier {aspectName1: qualifier, aspectName2: qualifier, ... }
The predefined aspects are:
Qualified Object | Predefined Aspects | Default Aspect Qualifier
Collection | of: | any
Dictionary, Association | key:, value: | any
Point | x:, y: | hierarchyOf Number
Stream | on: | hierarchyOf String
9.2.4 Alternative Qualifier
-
( qualifier | qualifier | ... ) -- one or more of the qualifiers
must apply
9.2.5 Conjunction Qualifier
-
( qualifier & qualifier & ... ) -- all of the qualifiers
must apply
9.2.6 Block Qualifier
-
[ :blockArg1 qualifier, :blockArg2 qualifier, ..., ^ qualifier ]
Since the return result of a block may be one of its block arguments, the
return qualifier inside a block may refer to its arguments, e.g., ^
b1 or ^ blockArg1. Note: the return qualifies the result
of the block, not a method return.
9.2.7 Signature Qualifier
-
unary -- < selector, ^ qualifier >
-
binary -- < selector qualifier, ^ qualifier >
-
keyword -- < firstKeyword: qualifier, secondKeyword: qualifier,
..., ^ qualifier >