Modularization and cognitive psychology

Carola Lilienthal
Jun 7, 2022

Modularization is discussed frequently, but after a while the participants often realize that they don’t mean the same thing. Over the last fifty years, computer science has given us a number of good explanations of what modularization is about. But is that really enough to arrive at the same conclusions and arguments?

I didn’t learn the real reason why modularization is so important until I started studying cognitive psychology. In this article, I therefore bring modularization and cognitive psychology together and give you the crucial arguments for why modularization actually helps us in software development.

Parnas is still right!

In the last 20 to 30 years, we’ve developed many large software systems in Java, C++, C#, and PHP. These systems hold a lot of business value but they frustrate development teams because they can only be developed with steadily increasing effort. David Parnas’ 50-year-old recipe for finding a way out of this situation is called modularization. It is said that if we have a modular architecture, then we have independent units that can be understood and quickly developed by small teams. Additionally, modular architecture gives us the possibility to deploy individual modules separately, making our architecture scalable. These are precisely the arguments that architects and developers discuss, and yet we always disagree on what exactly we mean by modular, modules, modular architectures, and modularization.

In my doctoral thesis, I dealt with the question of how to structure software systems so that people, or rather our human brains, can find their way around them. This is especially important since development teams spend a lot of time reading and understanding existing code. Fortunately, cognitive psychology has identified various mechanisms that our brain uses to grasp complex structures. One of them provides a perfect explanation for modularization: chunking. With chunking as a basis, we can describe modularization much better than we can with the design principles and heuristics that are often used as justifications [1]. Cognitive psychology also provides two other mechanisms, hierarchies and schemata, which supply further vital cues for modularization.

Chunking ➔ Modularization

To manage the amount of information they are confronted with, people must select partial information and group it into larger units. In cognitive psychology, this step-by-step construction of higher-order units is called chunking (Fig. 1). Storing partial information as higher-order knowledge units frees up short-term memory so that more information can be absorbed.

Fig. 1: Chunking

 

As an example, consider a person working with a telegraph for the first time. At first, they hear transmitted Morse code as short and long tones; they process them as separate units of knowledge. But after a while, they’ll be able to combine the sounds into letters—new units of knowledge—so that they can more quickly understand what’s being transmitted. Some time later, individual letters become words, representing larger units of knowledge, and finally, they become whole sentences.

Developers and architects automatically apply chunking when they’re exposed to new software. They read the program text in detail, group the lines into knowledge units, and retain them in that form. Little by little, they combine these knowledge units into larger ones until they understand the program text and the structures it embodies.

This approach is called bottom-up program comprehension. Development teams typically use it when they’re unfamiliar with a software system and its application domain and need to build up an understanding first. They are more likely to use top-down program comprehension when they already know the application domain and the software system. Top-down program comprehension mainly uses two structure-building processes: forming hierarchies and building schemata. We will introduce these in the following sections.

 

Another form of chunking can be seen in experts. They don’t store new knowledge units individually in short-term memory, but summarize them directly by activating previously stored knowledge units. However, knowledge units can only be built from other knowledge units that fit together in a way that makes sense to the subject. In experiments, experts and novices were presented with word groups from the experts’ field of knowledge. The experts were able to remember five times as many terms as the novices, but only if the word groups contained meaningfully related terms.

These findings were also confirmed for developers and architects. Chunking works for software systems too, but only if the system’s structure consists of meaningfully connected units. Program units that group operations or functions arbitrarily, so that development teams can’t see why they belong together, don’t facilitate chunking. The bottom line is that chunking only works when meaningful relationships exist between the units being grouped.

Modules as coherent units

It’s therefore essential that modular architectures consist of building blocks, such as classes, components, modules, and layers, that each group meaningfully related elements together. Several design principles in computer science aim to satisfy this requirement of coherent units:

  • Information Hiding: In 1972, David Parnas was the first to require that a module hide exactly one design decision and that the data structure for this design decision be encapsulated in the module (encapsulation and locality). Parnas named this principle Information Hiding [2].
  • Separation of Concerns: In his book “A Discipline of Programming” [3], which is still worth reading today, Dijkstra wrote that different parts of a larger task should be represented in different elements of the solution wherever possible. The point is to decompose large knowledge units that bundle multiple tasks. In the refactoring movement, units with too many responsibilities resurfaced as a code smell under the name God Class.
  • Cohesion: In the 1970s, Myers elaborated his ideas about design and introduced a measure for evaluating the cohesion of modules [4]. Coad and Yourdon extended the concept to object orientation [5].
  • Responsibility-driven Design: In the same vein as Information Hiding and cohesion, Rebecca Wirfs-Brock’s heuristics aim to design classes by responsibilities: a class is a design unit that should fulfill exactly one responsibility and combine only one role [6].
  • Single Responsibility Principle (SRP): The first of Robert Martin’s SOLID principles states that each class should perform just one defined task. Only functions that directly contribute to this task should be present in the class. The effect of focusing on one task is that there should never be more than one reason to change a class. At the architectural level, Robert Martin adds the Common Closure Principle: classes that change for the same reason should sit together in the same parent building block, so that a change affects either all classes in a building block or none [7].
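
To make the idea of a coherent unit more tangible, here is a minimal Java sketch of a module that hides exactly one design decision. The class name PriceCatalog, its methods, and the choice of a HashMap are assumptions made purely for illustration; they don’t come from any of the cited works.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical example: one unit, one task, one hidden design decision.
// The decision "prices are kept in a HashMap keyed by article number"
// stays inside the class; callers only see the two operations below.
final class PriceCatalog {
    private final Map<String, Long> priceInCentsByArticle = new HashMap<>();

    // Registers or replaces the price of an article.
    void setPrice(String articleNumber, long priceInCents) {
        if (priceInCents < 0) {
            throw new IllegalArgumentException("price must not be negative");
        }
        priceInCentsByArticle.put(articleNumber, priceInCents);
    }

    // Returns the price, or an empty Optional if the article is unknown.
    Optional<Long> priceOf(String articleNumber) {
        return Optional.ofNullable(priceInCentsByArticle.get(articleNumber));
    }
}
```

Because the map never leaves the class, the data structure could later be swapped for a database lookup without any caller noticing. A reader can store “price catalog” as a single chunk instead of remembering how prices are kept.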

All of these principles promote chunking through a unit’s internal cohesion. But modularity has even more to offer. According to Parnas, a module’s interface should also form a capsule around its inner implementation.

Modules with modular interfaces

Interfaces can support chunking considerably if they, unsurprisingly, form meaningful units themselves. The unit of knowledge needed for chunking can be prepared so well in the module’s interface that development teams no longer need to assemble the chunk by analyzing the inside of the module.

A good, coherent interface results when you apply the principles from the last section not only to the module’s interior but also to its interface [1], [7], [8]:

  • Explicit and encapsulating interface: Modules should make their interfaces explicit. In other words, the module’s task must be clearly identifiable from its interface, and the internal implementation must be abstracted away.
  • Delegating interfaces and the Law of Demeter: Since interfaces are capsules, the services they offer must be designed for delegation. True delegation occurs when the services at an interface take over tasks completely. Services that return internals to the caller, which must then make further calls to reach its goal, violate the Law of Demeter.
  • Explicit dependencies: From a module’s interface, you should be able to recognize directly which other modules it communicates with. If you fulfill this requirement, development teams will know which other modules they need to understand or create to work with the module, without having to look into its implementation. Dependency injection fits this principle directly, since all dependencies are injected into a module via its interface.
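
The following Java sketch shows what a delegating interface with explicit dependencies might look like. All names (PaymentGateway, Order, CheckoutService) are hypothetical and chosen only for illustration; the sketch assumes constructor injection as one possible form of dependency injection.

```java
// Hypothetical collaborator that a checkout module depends on.
interface PaymentGateway {
    boolean charge(String customerId, long amountInCents);
}

final class Order {
    private final String customerId;
    private final long totalInCents;
    private boolean paid;

    Order(String customerId, long totalInCents) {
        this.customerId = customerId;
        this.totalInCents = totalInCents;
    }

    // Delegating service: the order completes the task itself instead of
    // handing its internals (customer, total) to the caller.
    boolean payWith(PaymentGateway gateway) {
        paid = gateway.charge(customerId, totalInCents);
        return paid;
    }

    boolean isPaid() {
        return paid;
    }
}

final class CheckoutService {
    // Explicit dependency: visible in the constructor, injected from outside.
    private final PaymentGateway gateway;

    CheckoutService(PaymentGateway gateway) {
        this.gateway = gateway;
    }

    boolean checkout(Order order) {
        // A single call on a direct collaborator; no chain of getters
        // that walks through the order's internals.
        return order.payWith(gateway);
    }
}
```

A caller only needs to know that CheckoutService requires a PaymentGateway and offers checkout. Neither the fields of Order nor the way payment is performed has to be read to form the chunk.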

The goal of all of these principles is interfaces that support chunking. If they’re met, development teams can grasp a module as a unit of knowledge faster from its interface. If the basic principles of coupling are also met, then we’ve gained a lot for chunking in program comprehension.

Modules with loose coupling

In order to understand and change an architecture’s module, development teams need an overview of the module itself and its neighboring modules. All modules that the target module works together with are important. The more dependencies there are from one module to another (Fig. 2), the more difficult it becomes to analyze individual participants with the limited capacity of short-term memory and to form suitable knowledge units. Chunking is much easier when there are fewer modules and dependencies in play.

Fig. 2: Strongly coupled classes (left) and packages/directories (right)

 

This is where the principle of loose coupling comes in [9], [10], [11]. Coupling refers to the degree of dependency between a software system’s modules. The more dependencies there are in a system, the stronger the coupling. If a system’s modules were developed in accordance with the principles from the previous two sections on units and interfaces, the system should automatically consist of loosely coupled modules. A module performing one coherent task needs fewer other modules than a module performing many different tasks. If the interface is designed in a delegating way according to the Law of Demeter, the caller only needs this one interface. It doesn’t have to move from interface to interface to complete its task, picking up a lot of additional coupling along the way.
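
Coupling in this sense can be made visible with very little code. The following is a minimal sketch, assuming that the dependencies have already been extracted into a map from each module to the modules it uses; the class CouplingReport and the module names in the usage note are hypothetical.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical helper that counts dependencies in a module graph.
final class CouplingReport {

    // Number of distinct modules each module depends on.
    static Map<String, Integer> outgoingDependencies(Map<String, Set<String>> dependsOn) {
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, Set<String>> entry : dependsOn.entrySet()) {
            result.put(entry.getKey(), entry.getValue().size());
        }
        return result;
    }

    // Total number of dependencies in the system; the higher the number,
    // the more a reader has to keep in mind at once.
    static int totalDependencies(Map<String, Set<String>> dependsOn) {
        int total = 0;
        for (Set<String> targets : dependsOn.values()) {
            total += targets.size();
        }
        return total;
    }
}
```

For a system in which ui uses orders and billing, orders uses billing, and billing uses nothing, the sketch counts two outgoing dependencies for ui and three dependencies in total. Plain counts like these say nothing about whether a dependency is meaningful, so they are a starting point for discussion rather than a verdict.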

Up to now, chunking has helped us look at modularization in terms of a module’s inside, its outside, and its relationships. The next cognitive mechanism also plays into understanding modularization.

Modularization through patterns

The most efficient cognitive mechanism people use to structure complex relationships is the schema. A schema can be understood as a concept consisting of a combination of abstract and concrete knowledge. On the abstract level, a schema consists of the typical properties of the relationships it depicts. On the concrete level, it contains a set of examples that represent prototypical manifestations of the schema. For example, each of us has a teacher schema that describes abstract features of teachers and includes images of our own teachers as prototypical examples.

If we have a schema for a situation in our life, we can process the questions and problems we’re dealing with much faster than we could without one. Let’s look at an example. In an experiment, chess masters and beginners were shown positions on a chessboard for about five seconds. When the pieces were arranged in a meaningful game position, the chess masters were able to reconstruct the positions of more than twenty pieces. They recognized schemata of positions they knew and stored those in short-term memory. The weaker players, by contrast, could only reproduce the positions of four or five pieces, because they had to memorize each piece’s position individually. But when the pieces were placed on the board at random, the masters no longer had an advantage. They couldn’t use schemata and therefore couldn’t remember the distribution of the pieces, which was meaningless to them, any more easily than the beginners could.

 

The design and architecture patterns widely used in software development exploit the human brain’s strength in working with schemata. If developers and architects have already worked with a pattern and formed a schema from it, they can recognize and understand program texts and structures designed according to this pattern more quickly. Constructing schemata provides decisive speed advantages for understanding complex structures. This is also why patterns found their way into software development years ago.

Figure 3 shows an anonymized blackboard image I developed with a team to record their patterns. On the right side of the image, the source code is divided into pattern categories in the architecture analysis tool Sotograph. You’ll see that there are a lot of green relationships and a few red ones. The red relationships go from bottom to top, against the layering created by the patterns. The small number of red relationships is a very good result and shows that the development team uses its patterns consistently.

Fig. 3: Class level pattern = pattern language
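
The red relationships can also be found mechanically once each class is assigned to a pattern category and the categories are ordered into layers. The following is only a sketch of that idea and not how Sotograph works internally; the classes ClassDependency and LayerCheck, and the convention that layer 0 is the bottom layer, are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// A usage relationship between two classes (hypothetical type).
final class ClassDependency {
    final String from;
    final String to;

    ClassDependency(String from, String to) {
        this.from = from;
        this.to = to;
    }
}

final class LayerCheck {

    // layerOfClass maps a class name to the layer of its pattern category,
    // with 0 as the bottom layer. A dependency that points from a lower
    // layer to a higher one runs against the layering.
    static List<ClassDependency> upwardDependencies(
            List<ClassDependency> dependencies, Map<String, Integer> layerOfClass) {
        List<ClassDependency> violations = new ArrayList<>();
        for (ClassDependency dependency : dependencies) {
            Integer fromLayer = layerOfClass.get(dependency.from);
            Integer toLayer = layerOfClass.get(dependency.to);
            if (fromLayer == null || toLayer == null) {
                continue; // class not assigned to any pattern category
            }
            if (fromLayer < toLayer) {
                violations.add(dependency);
            }
        }
        return violations;
    }
}
```

Classes that aren’t assigned to any pattern are skipped here. In a real analysis they deserve a closer look of their own, since they reduce the share of the source code that is covered by patterns.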

 

It’s also exciting to see what proportion of the source code can be assigned to patterns and how many patterns the system ultimately contains. If 80% or more of the source code can be assigned to patterns, then I say that the system has a pattern language. Here, the development team created its own language to make it easier to discuss architecture.

Using patterns in the source code is especially important for modular architecture. Remember: for chunking, it’s crucial that we find meaningfully related units that have a common task. How can the modules’ tasks be described if not with patterns? Modularization is deepened and improved with extensive use of patterns if you can recognize which pattern the respective module belongs to and if the patterns are used consistently.

Hierarchies ➔ Modularization

The third cognitive mechanism, hierarchies, also plays an important role in perceiving and understanding complex structures and in storing knowledge. People can absorb, reproduce, and navigate knowledge well if it’s available in hierarchical structures. Research on learning related word categories, organizing learning materials, text comprehension, text analysis, and text reproduction shows that hierarchies are beneficial. When reproducing lists of terms and texts, the subjects’ memory performance was significantly higher when they were offered decision trees with categorical subordination. Subjects learned content significantly faster with the help of hierarchical chapter structures or mind maps. If hierarchical structures were not available, test subjects tried to arrange the text hierarchically themselves. From these studies, cognitive psychology concludes that hierarchically ordered content is easier for people to learn and process, and that content can be retrieved more efficiently from a hierarchical structure.

Programming languages support hierarchy formation through the containment relationship. Classes are contained in packages or directories, packages and directories are contained in other packages and directories, and these in turn are contained in projects or modules and build artifacts. These hierarchies fit our cognitive mechanisms. If the hierarchies are based on the architecture’s patterns, they support us not only through their hierarchical structure but also through the architecture patterns themselves.

Let’s have a look at a bad example and a good example: Imagine that a team specified that a system should consist of four modules, which, in turn, contain some submodules (Fig. 4).

Fig. 4: Architecture with four modules

 

This structure provides the development team with an architectural pattern of four top-level modules, each containing additional modules. Now, imagine that this system is implemented in Java and organized in a single Eclipse project due to its size. In this case, you’d expect that the architectural pattern of four modules with submodules should be reflected in the system’s package tree.

Figure 5 shows the anonymized package tree of a Java system where the development team made this exact statement: “Four modules with submodules, that’s our architecture!”.

The diagram in Figure 5 shows packages and arrows. The arrows go from the parent package to its children.

Fig. 5: A poorly implemented planned architectural pattern

 

In fact, the four modules can be found in the package tree. In Figure 5, they’re marked in the modules’ colors from Figure 4 (green, orange, purple, and blue). However, two of the modules are scattered across the package tree, and some of their submodules sit under parent packages that belong to other modules. This implementation in the package tree isn’t consistent with the pattern that the architecture specified, and it confuses developers and architects. Introducing one package root node each for the orange and the purple module would solve this.
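
A mismatch like this can be detected with a simple check, provided the team writes down which root package each module should live under. The following sketch assumes exactly that; the class PackageTreeCheck and all package names in the usage note are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

final class PackageTreeCheck {

    // rootOfModule:    module name -> root package the module should live under
    // moduleOfPackage: package name -> module the package is meant to belong to
    // Returns, in alphabetical order, all packages that do not sit under
    // the root package of their own module.
    static List<String> misplacedPackages(
            Map<String, String> rootOfModule, Map<String, String> moduleOfPackage) {
        List<String> misplaced = new ArrayList<>();
        for (Map.Entry<String, String> entry : moduleOfPackage.entrySet()) {
            String packageName = entry.getKey();
            String root = rootOfModule.get(entry.getValue());
            boolean underOwnRoot = root != null
                    && (packageName.equals(root) || packageName.startsWith(root + "."));
            if (!underOwnRoot) {
                misplaced.add(packageName);
            }
        }
        Collections.sort(misplaced);
        return misplaced;
    }
}
```

If the module billing has the root app.billing, but one of its submodules is filed as app.orders.refund, the check reports app.orders.refund. Packages of a module without a declared root are reported as well, which matches the situation in which a module has no package root node of its own.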

 

Figure 6 shows a better mapping of the architecture pattern to the package tree. In this system, the architectural pattern is transferred symmetrically to the package tree. Here, developers can quickly navigate using the hierarchical structure and benefit from the architectural pattern.

Fig. 6: A well-implemented architectural pattern

 

If the containment relationship is used correctly, it supports our cognitive mechanism of forming hierarchies. The same isn’t automatically true of other kinds of relationships: we can link arbitrary classes and interfaces in a code base through usage and inheritance relationships. In doing so, we can create intertwined structures (cycles) that aren’t hierarchical in any way. It takes some discipline and effort to use usage and inheritance relationships hierarchically. If the development team pursues this goal from the beginning, the result is usually almost cycle-free. If the value of being cycle-free isn’t clear from the beginning, structures like the one in Figure 7 emerge.

Fig. 7: Cycle of 242 classes

 

But the desire to achieve freedom from cycles is not an end in itself! It’s not about satisfying some technical structure idea of “cycles must be avoided”. Instead, the goal is to design a modular architecture.


If you make sure that individual building blocks in your design are modular (meaning, they are each responsible for just one task) then cycle-free design and architecture usually emerge of their own accord. A module providing basic functionality should never need functionality from the modules that build upon it. If the tasks are clearly distributed, then it’s obvious which module must use which other module to fulfill its task. A reverse, cyclic relationship won’t arise in the first place.
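
Whether a reverse, cyclic relationship has crept in anyway can be checked automatically. The following sketch uses a standard depth-first search on a dependency graph; it assumes the dependencies are available as a map from each module to the modules it uses, and the class name CycleCheck is hypothetical.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class CycleCheck {

    // Returns true if the dependency graph contains at least one cycle.
    static boolean hasCycle(Map<String, Set<String>> dependsOn) {
        Set<String> finished = new HashSet<>(); // fully explored modules
        Set<String> onPath = new HashSet<>();   // modules on the current search path
        for (String module : dependsOn.keySet()) {
            if (visit(module, dependsOn, finished, onPath)) {
                return true;
            }
        }
        return false;
    }

    private static boolean visit(String module, Map<String, Set<String>> dependsOn,
                                 Set<String> finished, Set<String> onPath) {
        if (onPath.contains(module)) {
            return true;  // we came back to a module we are still exploring
        }
        if (finished.contains(module)) {
            return false; // already explored, no cycle through this module
        }
        onPath.add(module);
        Set<String> targets = dependsOn.getOrDefault(module, Collections.<String>emptySet());
        for (String target : targets) {
            if (visit(target, dependsOn, finished, onPath)) {
                return true;
            }
        }
        onPath.remove(module);
        finished.add(module);
        return false;
    }
}
```

The sketch only answers whether a cycle exists; it doesn’t report which modules take part in it. The search is recursive, so for very deep dependency chains an iterative variant would be the safer choice.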

Summary: Modularization rules

The three cognitive mechanisms of chunking, schemata, and hierarchies give us the background knowledge to use modularization in our discussions clearly and unambiguously. A well-modularized architecture contains modules that facilitate chunking, hierarchies, and schemata. In summary, we can establish the following rules. The modules in a modular architecture must:

  1. form a cohesive, coherent whole within them that’s responsible for exactly one clearly defined task (unit as a chunk),
  2. form an explicit, minimal, and delegating capsule to the outside (interface as a chunk),
  3. be designed according to uniform patterns throughout (pattern consistency), and
  4. be coupled with other modules minimally, loosely, and without cycles (coupling for chunk separation and hierarchies).

If these mechanisms and their implementation in architecture are clear to the development team, then an important foundation for modularization has been laid.

 

Links & Literature

[1] This article is a revised excerpt from my book: Lilienthal, Carola: “Durable Software Architectures. Analyzing, Limiting, and Reducing Technical Debt”, dpunkt.verlag, 2019.

[2] Parnas, David Lorge: “On the Criteria to be Used in Decomposing Systems into Modules”; in: Communications of the ACM (15/12), 1972

[3] Dijkstra, Edsger Wybe: “A Discipline of Programming”; Prentice Hall, 1976

[4] Myers, Glenford J.: “Composite/Structured Design”; Van Nostrand Reinhold, 1978

[5] Coad, Peter; Yourdon, Edward: “OOD: Objektorientiertes Design”; Prentice Hall, 1994

[6] Wirfs-Brock, Rebecca; McKean, Alan: “Object Design: Roles, Responsibilities, and Collaborations”; Pearson Education, 2002

[7] Martin, Robert Cecil: “Agile Software Development, Principles, Patterns, and Practices”; Prentice Hall International, 2013

[8] Bass, Len; Clements, Paul; Kazman, Rick: “Software Architecture in Practice”; Addison-Wesley, 2012

[9] Booch, Grady: “Object-Oriented Analysis and Design with Applications”; Addison Wesley Longman Publishing Co., 2004

[10] Gamma, Erich; Helm, Richard; Johnson, Ralph E.; Vlissides, John: “Design Patterns. Elements of Reusable Object-Oriented Software”; Addison-Wesley, 1994

[11] Züllighoven, Heinz: “Object-Oriented Construction Handbook”; Morgan Kaufmann Publishers, 2005
