In this post, I will unpack, for you and for myself, the concept of open-endedness based on the following two papers: "Open-Endedness is Essential for Artificial Superhuman Intelligence" by Edward Hughes, Michael Dennis, Jack Parker-Holder, Feryal Behbahani, Aditi Mavalankar, Yuge Shi, Tom Schaul, and Tim Rocktaschel (Article 1), and "OMNI: Open-Endedness via Models of Human Notions of Interestingness" by Jenny Zhang, Kenneth Stanley, Joel Lehman, and Jeff Clune (Article 2).
Both papers are quite recent, so we can be confident that they reflect the latest state of thinking about open-endedness, or rather the latest confusion about it. The latter calls for proposing a way to clarify what I view as the issues and how to handle them.
What open-endedness is after
According to Article 1, "a system is open-ended from the perspective of the observer O if and only if it generates sequences of artifacts that are both novel and learnable. The novelty aspect ensures the presence of information gain within the system, while learnability guarantees that this information gain holds meaning and is “interesting” to the observer". According to Article 2, "Open-ended algorithms aim to learn new, interesting behaviors".
Both papers mention the difficulty of defining what "novel/new" and "interesting" tasks and artifacts are. There are other difficulties as well. According to Article 1, "the field of open-endedness has faced numerous challenges. Principal among these has been the problem of structuring the search space". Article 2 supports that idea: "That requires a vast environment search space".
Let's address these from the point of view of my theory of intelligence.
What is novel
May we say that "novel" means "unseen before"? Long ago, Heraclitus stated: "No man ever steps in the same river twice. For it's not the same river and he's not the same man". Obviously, this is not what the authors of either paper mean by "novel".
I propose to divide phenomena that are unique in Heraclitus's sense into "interchangeable" and "different". To do that, we need to add one more factor: a purpose or an action. With respect to some purpose or action, some phenomena are interchangeable and some are different. The procedure that establishes which is which is a simple comparison.
Comparison of two phenomena is performed one property at a time. With respect to the chosen property, point-accurate measurements do not allow for interchangeability; range-based ones do. If both phenomena project onto the same range, they are interchangeable; otherwise, they are different.
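This range-based comparison can be sketched in a few lines of code. This is only an illustration under my own assumptions: the "water temperature" property and the ranges relevant to the hypothetical action "brew green tea" are invented for the example.

```python
def interchangeable(value_a, value_b, ranges):
    """Two phenomena are interchangeable with respect to a property
    if both values project onto the same range of that property."""
    def project(value):
        for low, high in ranges:
            if low <= value <= high:
                return (low, high)
        return None  # value falls outside all known ranges
    range_a, range_b = project(value_a), project(value_b)
    return range_a is not None and range_a == range_b

# Illustrative "water temperature" ranges for the action "brew green tea":
# too cool / right / too hot (degrees Celsius)
temp_ranges = [(0, 60), (60, 80), (80, 100)]

print(interchangeable(70, 75, temp_ranges))  # True: same range (60, 80)
print(interchangeable(70, 90, temp_ranges))  # False: different ranges
```

With point-accurate measurements, 70 and 75 degrees would count as different; with ranges tied to a purpose, they are interchangeable.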
Any phenomenon (object, action, scene, etc.) has multiple properties. Considering each property a dimension makes phenomena multi-dimensional. Measuring the value of one property is like projecting a multi-dimensional phenomenon onto the respective axis. Or we may say that measuring is a dimensionality-reduction operation.
Action is another dimensionality-reduction tool. Any action affects only a subset of a phenomenon's properties: "renaming" affects only the "name" property; "moving" affects only the "position" property.
Dimensionality reduction is important because it makes cognitive operations simpler. From the above, one can conclude that cognitive "computation" is essentially comparison. In fact, we will see below that it is comparison-based selection, or search.
What is search space
Since comparisons operate not on phenomena as wholes but on separate properties, comparable properties become the core components of intelligence.
Multiple properties create a multi-dimensional "concept space". By "concept" I mean a multi-dimensional cube, or pocket, of that space defined by the ranges of the relevant properties.
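A concept as a pocket of concept space can be represented directly as a mapping from property names to ranges. The concept and the property names below ("temperature", "volume_ml") are illustrative assumptions, not taken from either paper.

```python
def fits(concept, phenomenon):
    """A phenomenon fits a concept if every property the concept defines
    falls inside the concept's range for that property. Properties the
    concept does not mention are ignored (dimensionality reduction)."""
    for prop, (low, high) in concept.items():
        value = phenomenon.get(prop, float("nan"))  # missing -> never fits
        if not (low <= value <= high):
            return False
    return True

# A hypothetical concept: a pocket of concept space for "cup of tea".
cup_of_tea = {"temperature": (40, 90), "volume_ml": (100, 400)}

# A phenomenon has many more dimensions; irrelevant ones are ignored.
sample = {"temperature": 70, "volume_ml": 250, "color": 3}
print(fits(cup_of_tea, sample))  # True
print(fits(cup_of_tea, {"temperature": 95, "volume_ml": 250}))  # False
```

Note that the concept constrains only two of the sample's three dimensions: checking fit is itself a projection onto the relevant axes.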
Each property carries its own semantics. Height is different from weight; therefore, we cannot treat their numeric values in the same way.
Navigation in the search space of comparable properties is best illustrated by the game of 20 Questions. The game attracted the attention of such prominent scientists as C.S. Peirce and Allen Newell for a reason. Played properly, it teaches us how concepts and their hierarchy are formed.
This hierarchy is important to track because rules are introduced at each level, and exceptions to them are introduced at lower levels.
What is interesting
The rules mentioned in the previous section are known as knowledge-how. They establish a correspondence between the input parameters of some action and its results. Both inputs and outputs are expressed as ranges of some properties. For example, "less oxidized tea leaves" produce tea that tastes different from tea brewed using "more oxidized tea leaves". Here we considered two properties: "oxidation level" and "tea taste".
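Knowledge-how, understood this way, can be sketched as a table of rules from input ranges to result ranges. The oxidation ranges and taste labels below are loose illustrative assumptions built on the post's tea example.

```python
# Each rule maps (input property, input range) -> (result property, result label).
rules = [
    (("oxidation_level", (0.0, 0.3)), ("tea_taste", "grassy, green")),
    (("oxidation_level", (0.3, 0.8)), ("tea_taste", "floral, oolong")),
    (("oxidation_level", (0.8, 1.0)), ("tea_taste", "malty, black")),
]

def predict(prop, value):
    """Return the known result for an input value, or None.
    None marks a gap in knowledge-how: something worth exploring."""
    for (rule_prop, (low, high)), result in rules:
        if rule_prop == prop and low <= value <= high:
            return result
    return None

print(predict("oxidation_level", 0.9))   # ('tea_taste', 'malty, black')
print(predict("fermentation_time", 12))  # None: a gap in knowledge-how
```

The `None` branch is the interesting one: under the view developed below, it is exactly these gaps that are worth treating as rewarding to fill.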
The scientific method requires us to "change one parameter, keeping the others the same, and observe the results". This basic principle also uses dimensionality reduction in the search space proposed above. If we recall Einstein's definition of insanity, we will see familiar motifs.
Exceptions in results play an important role for the ranges of input parameters. We have not yet discussed how the boundaries of ranges are established. This is achieved by respecting the exceptions in results, or rather by what counts as different results. In other words, when some difference is significant with respect to the considered purpose or action, we draw a boundary there. The following quote from the SEP entry on Vagueness states it nicely: "Where there is no perceived need for a decision, criteria are left undeveloped."
The importance of knowledge-how is hard to overestimate. When we know how to affect "tea taste" by manipulating different input parameters, we are confident about our decisions. I understand that "tea taste" is not so "interesting"; you may replace it with "nuclear chain reaction" as the result of "nuclear fuel quantity". Having reliable recipes is important for future encounters with a given task. Because of that, filling the gaps in knowledge-how may be considered the ultimate reward, in RL terms. The authors of Article 1 may find this idea useful because, to them, "A key problem in RL is how to shape exploration towards novel and learnable behaviors in high-dimensional domains".
The role of language
Article 2 proposes to use foundation models, in particular LLMs, to figure out "interesting" tasks, "The insight is that we can utilize foundation models (FMs) as a model of interestingness (MoI), because they already internalize human concepts of interestingness from training on vast amounts of human-generated data, where humans naturally write about what they find interesting or boring." Taking the above into account, I doubt that LLMs will serve that purpose reliably.
I propose instead to use language itself and the knowledge humans have recorded with it. The genius-of-language hypothesis implies that language reflects many cognitive functions. I consider language the only intelligent tool developed by humans so far.
If you consider the content words in any language, you will see that they refer to the ranges of properties mentioned above. Different words may refer to the same range of the same property, which makes those words synonyms. Or the same word may refer to different ranges of different properties, which makes that word polysemous.
If you are interested in how natural languages use ranges to perform their role in communication, I wrote several posts about it - https://alexandernaumenko.substack.com/p/symbolic-communication or https://alexandernaumenko.substack.com/p/the-pointing-role-of-language.
Here, I want to draw your attention to an important fact: natural languages have absorbed all the concepts relevant to humans throughout their history. By analyzing the meanings of words, we may create a list of properties, their ranges, and the actions affecting them. It will be a good starting point for implementing better algorithms that understand language, and also perception algorithms that understand what a machine "senses". The genius of language will help us understand cognition.
The best game for open-ended cognitive science
I have already mentioned the game of 20 Questions. Let me return to it one more time. I consider it the most important game for cognitive scientists. The game teaches us about concepts, their defining features and further properties, generalization, and the core algorithm of cognition.
There are many approaches to object categorization: Wittgenstein's family resemblance theory, prototype theory, or exemplar theory, to mention a few. All three are based on similarities. But ask yourself: do you compare an unknown object to all the known categories? Do those theories determine which properties to use for comparison? Are they suitable for real-time applications?
The procedure of the game of 20 Questions answers the above questions in a spectacular way. Each answer gives you a concept. "Is it tangible?" leaves you with the Tangible or the Abstract category. "Is it animate?" differentiates the Living and Inanimate categories. We get the defining features (properties to compare) of every category, and we get the semantics of each category in terms of what other properties to expect.
Note that at each level a concept is more general than at the levels below. This explains generalization. The usual procedure of the game introduces differences between sibling subclasses of each class. It is important to understand and remember: a concept is defined by its differences from other classes, not by the similarities of its instances. When we forget those differences and move up the concept tree, we generalize. It is possible to generalize along different paths, depending on which differences in properties we decide to ignore.
Defining features are the only properties relevant for categorization. They represent only a tiny subset of a fitting object's properties. This makes categorization another dimensionality-reduction procedure. All of this simplifies cognitive computations.
The above demonstrates why Fodor's idea of atomic concepts is flawed. When we talk about concepts, everything is recognized in comparison.
Finally, let's consider the whole procedure of the game. It starts with the set of known categories. Then it applies semantically appropriate filters based on the given object's properties. After each answer, we get a fitting category at some level of generalization. We may generalize the performed algorithm as follows: the selection of the most fitting option from the available ones, respecting the relevant constraints. I call this the core algorithm of cognition.
There may be several outcomes: multiple results, a single result, or no results at all. The last is the basis for the "I don't know" answer.
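The core algorithm described above, including its three outcomes, fits in a short sketch. The category tree and defining features below are a toy illustration of my own; a real system would use the full hierarchy of comparable properties.

```python
# Toy categories with their defining features (the properties to compare).
categories = {
    "animal": {"tangible": True,  "animate": True},
    "rock":   {"tangible": True,  "animate": False},
    "idea":   {"tangible": False, "animate": False},
}

def select(candidates, constraints):
    """Core algorithm of cognition, sketched: keep the options whose
    defining features satisfy every answered question (constraint)."""
    fitting = sorted(
        name for name, feats in candidates.items()
        if all(feats.get(prop) == val for prop, val in constraints.items())
    )
    if len(fitting) > 1:
        return "multiple", fitting   # keep asking questions
    if len(fitting) == 1:
        return "single", fitting     # recognized
    return "none", fitting           # the basis for "I don't know"

print(select(categories, {"tangible": True}))                   # multiple
print(select(categories, {"tangible": True, "animate": True}))  # single
print(select(categories, {"tangible": False, "animate": True})) # none
```

Each added constraint is one question in the game: it narrows the fitting set until recognition succeeds or honestly fails.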
Tasks
Each task starts with its recognition. The 20 Questions procedure may be perfectly adapted for that. Any task may then have multiple algorithms for its solution, as well as criteria for what counts as a solution. The context may provide constraints.
An open-ended algorithm implemented on the above principles will be able not only to approach any task it encounters but also to re-evaluate the context to decide whether a higher-priority task has emerged and requires switching to it. For example, an FSD car moving from A to B may encounter a wounded person on a sidewalk. It will update its current task to "take the person to a hospital", unless it is transporting a hundred hearts for immediate transplantation. In that case, the decision may be to continue with the current task but, in parallel, to "call 911".
Tools or specialization
I envision at least two ways to advance our knowledge-how.
The first is to explore smaller subpockets of the concept space by looking for other relevant parameters and their effects.
The second is to develop tools that allow us to reach previously unreachable ranges. For example, throwing objects by hand is inferior to using a catapult, and using rockets is superior to both.
Both of the above approaches rely on the use of comparable properties. They also rely on critical observation. If an agent observes that some action affects properties for which it has no rules, that should make it curious about exploring novel opportunities for additional recipes.
Learning or generating novelty
Article 1 requires an open-ended algorithm to generate novel and learnable artifacts. I consider this unreasonable. We may easily push it to the extreme: imagine an ASI that provides each of 8 billion people with a radically novel tool for any task after every use of the previous tool. It would be utterly inefficient. I propose to search for novelty only when there is a serious reason for it. For example, an ASI may decide to look for an alternative to plastic cups upon finding out about the dangers of plastic.
Article 2 expects an open-ended algorithm to learn novel tasks and how to solve them. I can relate to that. Consider a person familiar with deer hunting who arrives in Australia for the first time and encounters kangaroos. Even though "kangaroo hunting" is a novel task, it can easily generalize, together with deer hunting, to "animal hunting". That person will test whether the same methods fit. Any exceptions will lead to the development of updated methods. Upon success, "deer-cooking" methods will be tested on kangaroo meat; again, exceptions will lead to updates. But note that many properties are shared by deer and kangaroos. Even though a kangaroo is a different animal, the situation is not totally unknown.
***
The above explains why comparable properties may be considered the atoms of cognitive complexity and the keys to open-ended algorithms. We need to shift the research focus from objects and actions to comparable properties if we want to advance cognitive science. Book titles such as "How to Do Things with Words" should not hinder our progress in developing intelligent algorithms and understanding the nature of intelligence.
The concept of open-endedness seems too vague to be helpful. It is not a property of a system but a property of the interaction of two systems, one of which plays the role of an observer (researcher). The outcome depends on the observer's ability to detect the artifacts generated by the observed system and its ability to detect novelty. Since the generated artifacts may be internal to the observed system, and the system's design may prevent access to them without destroying the observed system, the situation becomes even more confusing.
An example of an open-ended system that probably does not meet the expectations of the authors of the concept is a slot machine observed by a person; the artifact is the sequence of numbers generated up to a given moment in time. Each new artifact differs from the previous one and can be observed, studied, and so on. What "useful residue" do we have as a result?