Self-assembly can create complex structures at many scales, including molecular assemblies, proteins, mesoscale objects [2, 15] and collections of tiny robots [1, 3, 7]. This technique can also reconfigure such structures in response to changes in their environments. Self-assembly is particularly appropriate when fabrication technology is unable to directly place components in specific locations or when the precise location of each component needed to achieve desired functionality is not known a priori due to unmodelled variations in the environment or components. How can a self-assembly process be designed to be robust with respect to these variations? This presentation addresses this question with a general design principle based on the characteristics of the statistical distributions of self-assembled structures.
To describe this principle, note that, fundamentally, self-assembly operates on a collection of components to produce a variety of global structures. For example, for proteins the components are a sequence of amino acids and the global structures are the possible folded shapes. Rather than requiring operations that can precisely place components into their final positions, self-assembly relies on a statistical exploration of many potential structures. The particular global structure produced from given components is determined by biases in this exploration, as determined by interactions among the components. The strength of the interaction between components depends on their relative locations in the structure. In the context of designing a self-assembly process, these interactions can reflect constraints on the desirability of a component being near its neighbors in the global structure. In most interesting cases, the constraints will be somewhat frustrated in that not all can be simultaneously satisfied. The interactions combine to define, for each global structure, a measure of the extent to which the constraints are violated, which can be viewed as an energy for that structure. From this viewpoint, each set of components will tend to assemble into that global structure with the minimum energy for that set.
In this context, we can ask how many different component collections assemble to the same global structure. We refer to this count as the designability of the structure. A given assembly process can then be characterized by a distribution of designability, i.e., the number of global structures with various designability values. A schematic example of one such distribution is shown in Fig. 1.
Figure 1: Schematic distribution of designability, i.e., the number of different component configurations producing a given global structure. Each point on the curve indicates the number of global structures with a given designability. The long tail of the distribution indicates a few global structures are highly designable, i.e., far more designable than typical cases.
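As a minimal sketch of how such a distribution could be tabulated, the following toy model enumerates every component collection and counts how many have each structure as their unique minimum-energy structure. Everything specific here is an assumption made for illustration and is not taken from the text: components and structures are both bit strings, the energy is the number of positions where they disagree (the number of violated constraints), the candidate structures are a random sample, and collections with a degenerate minimum are simply left uncounted. The names `energy`, `ground_state` and `designability` are hypothetical.

```python
from collections import Counter
from itertools import product
import random


def energy(components, structure):
    # Assumed toy energy: number of positions where the component
    # preferences disagree with the structure (violated constraints).
    return sum(c != s for c, s in zip(components, structure))


def ground_state(components, structures):
    # Return (index of the minimum-energy structure, energy gap), or None
    # when the two lowest energies are equal (no unique minimum).
    ranked = sorted((energy(components, s), i) for i, s in enumerate(structures))
    (e0, best), (e1, _) = ranked[0], ranked[1]
    if e0 == e1:
        return None
    return best, e1 - e0


def designability(structures, n):
    # For each structure, count the component collections that have it
    # as their unique minimum-energy structure.
    counts = Counter()
    for components in product((0, 1), repeat=n):
        result = ground_state(components, structures)
        if result is not None:
            counts[result[0]] += 1
    return counts


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 10
all_structures = list(product((0, 1), repeat=n))
structures = rng.sample(all_structures, 20)  # hypothetical candidate set

counts = designability(structures, n)
# Distribution of designability: how many structures have each value.
histogram = Counter(counts[i] for i in range(len(structures)))
for d in sorted(histogram):
    print(d, histogram[d])
```

Each printed pair is a designability value followed by the number of structures with that value, i.e., one point on a curve like the one in Fig. 1. Whether the tail is long depends on the energy function and the set of candidate structures; this toy is only meant to make the counting procedure concrete.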
Significantly, the distribution of designability illustrated here is extremely skewed: a few structures are much more designable than most others. These highly designable structures can be formed in relatively many ways in response to interactions among the component parts. Such structures therefore tolerate errors in the choice of components, and hence in their precise interactions, better than typical structures do, which gives one form of robustness for self-assembly.
A second property of self-assembled structures is their energy gap, that is, the difference in energy, due to interactions among the components, between the global structure with the smallest energy for those components and the structure with the second smallest energy. Structures with relatively large energy gaps will be more robust with respect to environmental noise than those with smaller gaps. More precisely, we can define the robustness of a structure as the average of the energy gaps over all component configurations that produce that global structure.
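These definitions can be written compactly. The notation below is introduced here only for illustration and does not appear in the original: $c$ denotes a component configuration, $s$ a global structure, $E(c, s)$ the energy of structure $s$ for components $c$, and $s^*(c)$ the minimum-energy structure for $c$, assumed unique.

```latex
% Assumed notation: c = component configuration, s = global structure
s^*(c) = \arg\min_{s} E(c, s)

% Energy gap for components c: second-smallest minus smallest energy
\Delta(c) = \min_{s \neq s^*(c)} E(c, s) \; - \; E\bigl(c, s^*(c)\bigr)

% Designability of s: number of configurations that assemble to s
D(s) = \bigl|\{\, c : s^*(c) = s \,\}\bigr|

% Robustness of s: average gap over those configurations
R(s) = \frac{1}{D(s)} \sum_{c \,:\, s^*(c) = s} \Delta(c)
```

Under this notation, $D(s)$ is a count over component configurations, while $\Delta(c)$ compares structures for one fixed configuration; $R(s)$ combines the two by averaging the gap over the configurations counted in $D(s)$.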
Designability reflects the behavior of a given global structure with respect to different sets of components. Thus it characterizes the effect of errors or other changes in the set of components. By contrast, robustness characterizes a given set of components with respect to the different global structures that set could form.
Interestingly, self-assembly processes with skewed distributions of designability can also produce relatively large energy gaps for the highly designable structures, as illustrated in Fig. 2.
Figure 2: Schematic illustration of how robustness of global structures varies with their designability.
This association is easily understood from the fact that small changes in the configuration of components will usually result in only small changes in the energies of global structures. If a particular structure has a large energy gap, small changes in the energies are likely to leave that structure as still having the minimum energy. Conversely, small changes in a structure with a small gap are fairly likely to change the minimum energy structure.
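The argument can be checked numerically in a small sketch. The setup is an assumption made for illustration: a "small change" is modelled as independent uniform noise of at most `eps` added to each structure's energy, and the two energy lists are invented examples. With noise bounded by `eps`, a gap larger than `2 * eps` guarantees that the minimum-energy structure is unchanged, while a smaller gap gives no such guarantee.

```python
import random


def ground_and_gap(energies):
    # Index of the minimum-energy structure and the gap to the runner-up.
    order = sorted(range(len(energies)), key=energies.__getitem__)
    return order[0], energies[order[1]] - energies[order[0]]


def survives(energies, eps, rng):
    # Perturb every energy by uniform noise in [-eps, eps] (assumed noise
    # model) and report whether the same structure is still the minimum.
    best, _ = ground_and_gap(energies)
    noisy = [e + rng.uniform(-eps, eps) for e in energies]
    return ground_and_gap(noisy)[0] == best


def survival_rate(energies, eps, trials, rng):
    # Fraction of random perturbations that leave the minimum unchanged.
    return sum(survives(energies, eps, rng) for _ in range(trials)) / trials


rng = random.Random(1)  # fixed seed so the sketch is repeatable
eps = 0.1
wide = [0.0, 0.5, 0.6, 0.9]     # hypothetical energies, gap 0.5 > 2 * eps
narrow = [0.0, 0.05, 0.6, 0.9]  # hypothetical energies, gap 0.05 < 2 * eps

rate_wide = survival_rate(wide, eps, 1000, rng)
rate_narrow = survival_rate(narrow, eps, 1000, rng)
print("large gap:", rate_wide)
print("small gap:", rate_narrow)
```

The large-gap case always keeps its minimum under this noise model, whereas the small-gap case loses it in a noticeable fraction of trials. This is the same reasoning that links large energy gaps to high designability, with perturbations of the components playing the role of the noise.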
Figure 3: Schematic illustration of energies associated with various global structures, ordered so that neighbors differ by small changes. In this case the energy gap is determined by a local minimum shown by the solid arrow. The gray arrow indicates the gap due to a small change in
Combining these properties, self-assembly processes whose designability distribution includes highly designable structures are likely to be particularly robust, both with respect to errors in the specification of the components and with respect to environmental noise. Thus we have a general design principle for robust self-assembly: select the components, interactions and possible global structures so that the types of structures desired for a particular application are highly designable.
Applying this principle requires two capabilities. The first is finding processes leading to highly designable structures of the desired forms. Identifying such processes is largely an open statistical problem for large systems, since it relies on properties of the tails of statistical distributions, which are more difficult to characterize than those of the central parts of distributions. However, some specific examples have been identified, e.g., lattice-based models of protein folding, which suggest evolution has taken advantage of this design principle. Whether or not such simplified models accurately capture the behavior of natural protein folding, they do show that such distributions exist in systems with fairly simple interactions and components. Furthermore, this framework is well suited to the use of genetic algorithms to find appropriate processes, where possible interactions and global structures correspond to genotypes and phenotypes, respectively.
A related issue is the possibility of energy barriers in the formation of the minimum-energy structure, which could cause the system to take a long time to settle into the final structure or to remain stuck in local minima. Statistically, the extent of this problem is determined by the typical fraction of initial conditions that lead readily to the energy minimum. Again, the protein lattice models suggest this need not be a severe limitation for simple interactions. In more complex cases where this could be a problem, using computational markets to determine the interactions could provide one remedy by allowing the system to exploit locally coordinated groups of changes that move around these barriers.
The second requirement for applying this design principle is the ability to create the necessary interactions among the components. For simple components, the range of possible interactions may be fairly limited and hence further restrict the search for suitable processes. More complex components, such as tiny robots, can have arbitrary interactions programmed into the components, subject only to restrictions on the timely availability of required information, which would tend to enforce the use of local interactions. For example, the interactions could include arbitrage opportunities in market-based systems [4, 11]. The ability to program desired interactions is particularly significant in allowing designed systems to more accurately reflect any simplifications in the models than would be the case for describing naturally existing assembly situations. Thus, like the development of computational ecologies, designed self-assembly provides an example of how prescriptive use of simple models relating global to local behaviors can result in simpler and more accurate analyses than their approximate descriptive use for naturally existing systems.
Achieving a general understanding of the conditions that give rise to highly designable structures is largely a computational problem that can be addressed before actual implementations become possible. Thus, developing this principle for self-assembly design is particularly appropriate in situations where exploration of design possibilities takes place well ahead of the necessary technological capabilities. Moreover, statistical distributions of the self-assembly processes could be usefully applied to a range of component complexity, from simple physical structures to small robots operating in market-based systems.
I have benefited from discussions with Don Kimber and Rajan Lukose.