NoSQL and SQL Data Modeling: Bringing Together Data, Semantics, and Software

Chapter 9
Object-Oriented Programming Languages

Programming languages have undergone almost continuous evolution since they were first introduced as a way to express through symbols what instructions should be given to computers. A major change in programming occurred when the Simula programming language introduced the idea of objects in the late 1960s. The concepts of objects and classes were further developed in Smalltalk, C++, and other programming languages.

Today, most programming languages in wide use (other than C) are object-oriented and don’t make much of a fuss about it. This chapter will focus on two of the currently most popular object-oriented programming languages, namely Java and C#.

Classes, Objects, Types, and Variables

Chapter 4 in Part I explained the type/class split, where types designate sets and classes describe objects. Chapters 10 and 11 will explore in depth the implications of this split for data modeling. For those readers who have a background in object-oriented programming, we’ll preview those chapters here, and relate them specifically to Java and C#.

As chapter 11 explains, the idea of a “type” in object-oriented programming languages was inherited directly from high-level languages that predated the object-oriented revolution. The definition of a traditional programming-language type is built into the language definition, and specifies two things:

the set of values that can be represented by variables of the type
the amount of storage to be allocated to variables of the type

These two aspects of a traditional programming-language type cannot be separated.
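Java's primitive integer types are a convenient illustration of this coupling. The sketch below uses only constants from the standard `Byte`, `Short`, and `Integer` wrapper classes; the class name `PrimitiveTypeFacts` and its helper methods are hypothetical names chosen for this example. It shows that, for each type, the number of values in the range is exactly the number of bit patterns the fixed storage width allows.

```java
// Sketch: each Java primitive integer type couples a value set with a storage width.
final class PrimitiveTypeFacts {

    // Number of distinct bit patterns available in the given number of bits.
    static long valueCount(int bits) {
        return 1L << bits;
    }

    // Number of integers in the inclusive range min..max.
    static long rangeSize(long min, long max) {
        return max - min + 1;
    }

    public static void main(String[] args) {
        System.out.println("byte:  " + Byte.SIZE + " bits, "
                + Byte.MIN_VALUE + ".." + Byte.MAX_VALUE);
        System.out.println("short: " + Short.SIZE + " bits, "
                + Short.MIN_VALUE + ".." + Short.MAX_VALUE);
        System.out.println("int:   " + Integer.SIZE + " bits, "
                + Integer.MIN_VALUE + ".." + Integer.MAX_VALUE);

        // The range of int fills its 32 bits exactly; prints true.
        System.out.println(rangeSize(Integer.MIN_VALUE, Integer.MAX_VALUE)
                == valueCount(Integer.SIZE));
    }
}
```

A programmer who wants a different range of integers cannot simply ask for one; the choice is among the widths the language provides, and the range comes along with the width.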

When object orientation came along, the programming language construct called “class” gave the programmer the ability to extend the set of types that a compiler knew about and could enforce. A class enabled a programmer to specify two things:

the structure of an object in memory composed of variables and/or other objects
a set of routines exclusively authorized to operate on the components of objects of the class, called methods

Because the built-in types of programming languages had very simple structures, the belief arose that types are always simple or primitive and that classes always describe structures.
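The two things a class specifies can be seen in even a very small Java class. The following is a minimal sketch; the class `Counter` and its members are hypothetical names invented for this illustration. The private fields give the structure of each object in memory, and the methods are the only routines allowed to touch those fields.

```java
// Sketch: a class specifies (1) the structure of its objects and
// (2) the methods exclusively authorized to operate on that structure.
final class Counter {
    // Structure: one variable of a primitive type and one reference to another object.
    private int count;
    private final String label;

    Counter(String label) {
        this.label = label;
        this.count = 0;
    }

    // Methods: the only code that can read or change the private components.
    void increment() {
        count++;
    }

    int current() {
        return count;
    }

    String describe() {
        return label + "=" + count;
    }

    public static void main(String[] args) {
        Counter visits = new Counter("visits");
        visits.increment();
        visits.increment();
        System.out.println(visits.describe()); // prints visits=2
    }
}
```

Code outside `Counter` cannot assign to `count` directly; the compiler enforces that access goes through the methods.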

COMN’s goal is to be able to describe all of the following things in a single notation:

material objects in the real world
concepts
data about real-world objects and concepts
objects in memory whose otherwise meaningless states represent data about real-world objects and concepts

To achieve this, COMN separates the two aspects of type that were inherited from early programming languages. The term type is reserved for designating a set of values or objects that a variable might represent, completely apart from any specification of memory allocation or structure. Thus, a type is completely abstract. Types are not necessarily primitive or simple: a type can be as complex as any class, except that it doesn’t dictate memory allocation.

The term class is reserved for describing the structure and/or behavior (that is, the methods) of computer objects in a computer’s memory or storage. The physical states of the objects of a class, in COMN terms, have no meaning unless the class declares itself to represent a type. If a class represents a type, then each state of each object of the class represents a member of the set designated by that type.
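The claim that a physical state is meaningless until it is tied to a type can be made concrete with a single 32-bit pattern. The sketch below is only an illustration: the class `StateMeaning` and its method names are hypothetical, while `Float.intBitsToFloat` is a standard Java library method. The same bits stand for very different values depending on which type they are taken to represent.

```java
// Sketch: one physical state, different meanings under different types.
final class StateMeaning {

    // Read the 32-bit pattern as a two's complement integer.
    static int asInteger(int bits) {
        return bits;
    }

    // Read the same 32-bit pattern as an IEEE 754 single-precision number.
    static float asFloat(int bits) {
        return Float.intBitsToFloat(bits);
    }

    // Read the low 16 bits of the pattern as a UTF-16 code unit.
    static char asCharacter(int bits) {
        return (char) bits;
    }

    public static void main(String[] args) {
        int state = 0x3F800000;
        System.out.println(asInteger(state)); // prints 1065353216
        System.out.println(asFloat(state));   // prints 1.0

        int smallState = 0x41;
        System.out.println(asInteger(smallState));   // prints 65
        System.out.println(asCharacter(smallState)); // prints A
    }
}
```

Nothing in the bits themselves says which reading is correct. In COMN terms, that information comes from the type that the object's class declares it represents.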

One of the main advantages of this approach is that it takes the “built-in types” that have been part of programming languages from the beginning and makes them quite a bit less special. There is still a necessary set of “predefined” types, but these types are no longer tied to any particular programming language’s assumptions about how they are represented in memory, including such things as endianness. An integer type simply gives the range of integers that the type designates. Whether that type is represented in memory by a little-endian binary two’s complement integer, a string of Unicode decimal digits, or something else is not specified by the type. That kind of information is given (eventually) by a class representing the type, which has all the details of the implementation and encapsulates those details so that programs can’t become coupled to them.
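A rough Java sketch of this idea follows. Java has no construct identical to a COMN type, so an interface is used here as an approximate stand-in; the names `SmallInteger`, `LittleEndianSmallInteger`, and `DecimalStringSmallInteger`, and the chosen range of -32768 to 32767, are all assumptions made for this example. The interface states only the range of integers designated, and two classes represent that same type with entirely different hidden representations.

```java
// Sketch: one abstract "type" (a range of integers), two classes representing it.
interface SmallInteger {
    int MIN = -32768;
    int MAX = 32767;

    // The integer that the object's current state represents.
    int value();

    // Shared range check, so every representing class accepts the same set of values.
    static int requireInRange(int v) {
        if (v < MIN || v > MAX) {
            throw new IllegalArgumentException(v + " is outside " + MIN + ".." + MAX);
        }
        return v;
    }
}

// Representation 1: two bytes, little-endian, two's complement.
final class LittleEndianSmallInteger implements SmallInteger {
    private final byte[] bytes = new byte[2];

    LittleEndianSmallInteger(int v) {
        SmallInteger.requireInRange(v);
        bytes[0] = (byte) (v & 0xFF);          // low-order byte first
        bytes[1] = (byte) ((v >> 8) & 0xFF);   // high-order byte second
    }

    @Override
    public int value() {
        // Reassemble 16 bits; the cast to short restores the sign.
        return (short) ((bytes[0] & 0xFF) | (bytes[1] << 8));
    }
}

// Representation 2: a string of decimal digits with an optional leading minus sign.
final class DecimalStringSmallInteger implements SmallInteger {
    private final String digits;

    DecimalStringSmallInteger(int v) {
        digits = Integer.toString(SmallInteger.requireInRange(v));
    }

    @Override
    public int value() {
        return Integer.parseInt(digits);
    }
}
```

Code written against `SmallInteger` sees the same values from either class, for example `new LittleEndianSmallInteger(-2).value()` and `new DecimalStringSmallInteger(-2).value()` both yield -2, and it cannot become coupled to the byte order or the digit string because both are private.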

This is tremendously powerful for data modeling (and for software design, but this book is not about software design!). A data modeler can first capture descriptions of requirements and data as types, variables, and values, without even thinking about how they will be represented in a computer’s memory or a database. Then, in a separate stage of design, a data modeler can specify exactly how those things should be represented in physical classes and objects that are available for use on the implementation platform.

So, as you read further in the book, keep in mind that, in COMN, a type tells you the range of values a variable can take on or the set of things the variable can represent, but doesn’t give you any clue as to how those things will be represented. And a class can tell you all about a structure in memory and how it can be manipulated through its methods, but is meaningless unless it indicates the type it represents. The next two chapters go into these facts in depth.

Terminology

Java Term	COMN Term
class	class
primitive type	a class representing a simple type
reference type	the class of a pointer or reference to a computer object
variable	a computer object whose class is either a class representing a primitive type or the class of a pointer or reference to a computer object
no Java equivalent	variable: a symbol which may or may not be represented by a computer object in a compiled program
object	computer object
value	value
no Java equivalent	state: the meaningless physical state of a computer object
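
The distinction the table draws between a Java variable of primitive type and one of reference type shows up in ordinary assignment. The sketch below is illustrative only; `VariableKinds` and `Box` are hypothetical names. Copying a primitive variable copies the state of one computer object into another, while copying a reference variable copies only the reference, so both variables then refer to the same computer object.

```java
// Sketch: a primitive variable holds a value directly; a reference variable
// holds a reference to a separate computer object.
final class VariableKinds {

    // A trivial class whose objects are reached through references.
    static final class Box {
        int content;

        Box(int content) {
            this.content = content;
        }
    }

    // b receives a copy of a's value, so changing b leaves a untouched.
    static int copyPrimitive() {
        int a = 1;
        int b = a;
        b = 2;
        return a;
    }

    // b receives a copy of the reference, so a and b denote the same object.
    static int copyReference() {
        Box a = new Box(1);
        Box b = a;
        b.content = 2;
        return a.content;
    }

    public static void main(String[] args) {
        System.out.println(copyPrimitive()); // prints 1
        System.out.println(copyReference()); // prints 2
    }
}
```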

C# Term	COMN Term
class	class
interface	class interface
simple type	a class representing a simple type
enum type	a class representing a simple type whose values are named
struct type	a class without encapsulation
nullable type	a class that includes a representation that a value is unknown
array type	an array class
variable of value type	a computer object whose class represents a simple type, enum type, struct type, or nullable type
variable of class type or of interface type	a computer object whose class is a pointer or reference to a computer object
variable of array type	a computer object whose class is a pointer or reference to a computer object representing an array
no C# equivalent	variable: a symbol which may or may not be represented by a computer object in a compiled program
object	computer object
value	value
no C# equivalent	state: the meaningless physical state of a computer object

Key Points

Object-oriented programming languages inherited types from early programming languages that specified both value sets and memory structure.
COMN separates the designation of a set of values from the description of computer object structure and behavior. Types designate sets without specifying memory structure. Classes describe computer objects in terms of their structure in memory and the routines (methods) exclusively authorized to operate on them. The otherwise meaningless physical states of objects only have meaning if their classes represent types.
