Learning Outcomes

  • Understand the motivation for different programming paradigms: to abstract machine operation into human-understandable and composable programs
  • Understand the difference between syntax, the textual symbols and grammatical rules of a program, and semantics, the meaning of what is computed
  • Understand that there are different models of computation upon which different programming languages are based, including machine models such as Turing Machines and the von Neumann architecture, and the Lambda Calculus, which is based on mathematical functions.

Syntax versus Semantics

As a branch of Computer Science, the theory and practice of programming has grown from a very practical need: to create tools that help us get computers to perform useful and often complex tasks. Programming languages are tools designed and built by people to make life easier in this endeavour. Furthermore, they are rapidly evolving tools that have grown in subtlety and complexity over many decades, taking advantage of changing computer hardware and of different ways to model computation.

An important distinction to make when considering different programming languages is between syntax and semantics. The syntax of a programming language is the set of symbols and the rules for combining them (the grammar) into a correctly structured program. These rules are often arbitrary, chosen for historical reasons, ease of implementation, or even aesthetics. An example of a syntactic design choice is the use of indentation to denote block structure, or of symbols such as “BEGIN” and “END” or “{” and “}”.

Example: functions in Python and C that are syntactically different, but semantically identical:

# Python code
def sumTo(n):
    sum = 0
    for i in range(0, n):
        sum = sum + i
    return sum

// C code
int sumTo(int n) {
    int sum = 0;
    for (int i = 0; i < n; i++) {
        sum += i;
    }
    return sum;
}

By contrast, the “semantics” of a programming language relate to the meaning of a program: how does it structure data? How does it execute or evaluate?

In this course we will certainly be studying the syntax of the different languages we encounter. It is necessary to understand syntax in order to correctly compose programs that a compiler or interpreter can make sense of. The syntactic choices made by language designers are also often interesting in their own right and can have significant impact on the ease with which we can create or read programs. However, arguably the more profound learning outcome from this course should be an appreciation for the semantics of programming and how different languages lend themselves to fundamentally different approaches to programming, with different abstractions for modelling problems and different ways of executing and evaluating. Hence, we will be considering several languages that support quite different programming paradigms.

For example, as we move forward we will see that C programs differ from the underlying machine language mostly syntactically. C abstracts away certain details and particulars of the machine architecture and has much more concise syntax than assembly language, but the semantics of C are not so far removed from assembler, especially modern assembler that supports procedures and conveniences for dealing with arrays.

By contrast, Haskell and MiniZinc (which we will be exploring in the second half of this course) represent quite different paradigms where the underlying machine architecture, and even the mode of execution, is significantly abstracted away. MiniZinc, in particular, is completely declarative in the sense that the programmer’s job is to define and model the constraints of a problem. The approach to finding the solution, and indeed any algorithmic detail, is completely hidden.

One other important concept that we try to convey in this course is that while some languages are engineered to support particular paradigms (such as functional programming or logic programming), the ideas can be brought to many different programming languages. For example, we begin learning about functional programming in JavaScript (actually TypeScript and ECMAScript 2017) and we will demonstrate that a functional programming style is not only possible in this language, but also brings with it a number of benefits.
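
For a first taste of what that style looks like, here is a small sketch in TypeScript (the names compose, double and increment are our own illustrative choices, not from any library):

// TypeScript: a higher-order function that builds new functions by composition
const compose = <A, B, C>(f: (b: B) => C, g: (a: A) => B) =>
  (a: A): C => f(g(a));

const double = (x: number) => x * 2;
const increment = (x: number) => x + 1;

// doubleThenIncrement is assembled from smaller pure functions, with no mutable state
const doubleThenIncrement = compose(increment, double);
console.log(doubleThenIncrement(5)); // 11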

Later, we will pivot (briefly) from JavaScript to PureScript, a Haskell-like language that compiles to JavaScript. We will see that, while PureScript syntax is very different to JavaScript, the JavaScript generated by the PureScript compiler is not so different to the way we implemented functional paradigms manually in JavaScript.

Then we will dive a little more deeply into a language that more completely “buys into” the functional paradigm: Haskell. As well as having a syntax that makes functional programming very clean, the Haskell compiler strictly enforces purity and makes the interesting choice of being lazy by default.
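
Laziness means an expression is not evaluated until its result is actually needed. As a rough sketch of the idea, and certainly not Haskell’s actual mechanism, we can emulate laziness in TypeScript with thunks, using an invented Lazy type:

// TypeScript: emulating lazy evaluation with thunks (functions of no arguments)
type Lazy<T> = () => T;

// Combine two lazy values without forcing either of them
const lazyAdd = (a: Lazy<number>, b: Lazy<number>): Lazy<number> =>
  () => a() + b();

const expensive: Lazy<number> = () => {
  console.log("computing...");
  return 42;
};

const result = lazyAdd(expensive, () => 1);
// Nothing has been computed yet; calling result() forces evaluation:
console.log(result()); // logs "computing..." then 43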

In summary, the languages we will study (in varying degrees of depth) will be Assembler, C/C++, JavaScript (ES2017 and TypeScript), PureScript, Haskell and MiniZinc, with JavaScript/TypeScript and Haskell being the main languages explored in problem sets and assignments. Thus, this course will be a tour through programming paradigms that represent different levels of abstraction from the underlying machine architecture. To begin, we spend just a little time at the level of least abstraction: the hardware itself.

The Machine Level

Conceptually, modern computer architecture deviates little from the von Neumann model proposed in 1945 by Hungarian-American computer scientist John von Neumann. The development of the von Neumann model occurred shortly after the introduction of the theoretical Turing Machine concept.

The von Neumann architecture was among the first to unify the concepts of data and programs. That is, in a von Neumann architecture, a program is just data that can be loaded into memory. A program is a list of instructions that read data from memory, manipulate it, and then write it back to memory. This is much more flexible than earlier computer designs, which stored programs separately in a fixed (read-only) form.

The classic von Neumann model looks like this:

[Figure: von Neumann Architecture]

At a high level, standard modern computer architecture still fits within this model:

[Figure: Modern Computer Architecture]

Programs run on an x86 machine according to the Instruction Execution Cycle (a toy sketch of this loop follows the list below):

  • CPU fetches instruction from instruction queue, increments instruction pointer
  • CPU decodes instruction and decides if it has operands
  • If necessary, CPU fetches operands from registers or memory
  • CPU executes the instruction
  • If necessary, CPU stores result, sets status flags, etc.
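
To make the cycle concrete, here is a minimal sketch of a fetch-decode-execute loop for an invented toy machine in TypeScript. The instruction set, Machine type and step function are purely illustrative (real x86 instructions, operands and status flags are far more complex), but note how the program is just data held in the machine’s memory, as described above:

// TypeScript: a toy machine illustrating the fetch-decode-execute cycle
type Instr =
  | { op: "load"; reg: number; value: number } // reg := value
  | { op: "add"; dst: number; src: number }    // dst := dst + src
  | { op: "halt" };

interface Machine {
  ip: number;        // instruction pointer
  regs: number[];    // register file
  program: Instr[];  // the program is just data held in memory
}

// Performs one cycle; returns false when the machine halts
function step(m: Machine): boolean {
  const instr = m.program[m.ip]; // fetch instruction
  m.ip++;                        // increment instruction pointer
  switch (instr.op) {            // decode, fetch operands, execute, store result
    case "load": m.regs[instr.reg] = instr.value; return true;
    case "add":  m.regs[instr.dst] += m.regs[instr.src]; return true;
    case "halt": return false;
  }
}

const m: Machine = {
  ip: 0,
  regs: [0, 0],
  program: [
    { op: "load", reg: 0, value: 2 },
    { op: "load", reg: 1, value: 3 },
    { op: "add", dst: 0, src: 1 },
    { op: "halt" },
  ],
};
while (step(m)) {}
console.log(m.regs); // [5, 3]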

Registers are locations on the CPU with very low-latency access due to their physical proximity to the execution engine. Modern x86 processors also have 2 or 3 levels of cache memory physically on-board the CPU. Accessing cache memory is slower than accessing registers (with all sorts of very complicated special cases) but still many times faster than accessing main memory. The CPU handles the movement of instructions and data between levels of cache memory and main memory automatically, cleverly and, for the most part, transparently. To cut a long story short, it can be very difficult to predict how long a particular memory access will take, although a small region of memory that is accessed repeatedly is likely to remain in a high level of cache and therefore be fast to access.

Other Models of Computation

Turing Machines are a conceptual model of computation based on a physical analogy: a tape that is read from and written to as it feeds through a machine. The symbols on the tape, together with the machine’s internal state and transition rules, determine its operation. The von Neumann model shares with Turing Machines an imperative paradigm of sequential instruction execution, but unlike the Turing Machine it is a practical model upon which most modern computers base their architecture.
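
As a rough illustration of the idea (a sketch with invented names, not a full formal definition), a Turing machine’s behaviour is captured entirely by its transition table; the tiny machine below flips bits until it reaches a blank cell:

// TypeScript: a minimal Turing machine driven by a transition table
type Sym = "0" | "1" | "_";
interface Rule { write: Sym; move: "L" | "R"; next: string }

// (state, symbol under the head) -> what to write, where to move, next state
const rules: Record<string, Rule> = {
  "flip,0": { write: "1", move: "R", next: "flip" },
  "flip,1": { write: "0", move: "R", next: "flip" },
  "flip,_": { write: "_", move: "R", next: "halt" },
};

function run(tape: Sym[]): Sym[] {
  let state = "flip";
  let head = 0;
  while (state !== "halt") {
    const rule = rules[`${state},${tape[head] ?? "_"}`];
    tape[head] = rule.write;            // write the new symbol
    head += rule.move === "R" ? 1 : -1; // move the head
    state = rule.next;                  // change state
  }
  return tape;
}

console.log(run(["1", "0", "1", "_"]).join("")); // "010_"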

There are other models of computation that are also useful. In particular, we will look at the Lambda Calculus, which was roughly a contemporary of these other models but based on mathematical functions, their application and composition. We will see that while imperative programming languages are roughly abstractions of machine architecture, functional programming languages provide an alternative abstraction built upon the rules of the lambda calculus.
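
To give a flavour of computing with nothing but functions, here is a sketch of Church numerals, a classic lambda calculus encoding of the natural numbers, transcribed into TypeScript (the names zero, succ, add and toNumber are conventional, but the code is ours for illustration):

// TypeScript: Church numerals, numbers encoded purely as functions
type Church = <T>(f: (x: T) => T) => (x: T) => T;

const zero: Church = f => x => x;                         // apply f zero times
const succ = (n: Church): Church => f => x => f(n(f)(x)); // one more application
const add = (m: Church, n: Church): Church => f => x => m(f)(n(f)(x));

// Interpret a Church numeral as an ordinary number by counting applications
const toNumber = (n: Church) => n((x: number) => x + 1)(0);

const two = succ(succ(zero));
const three = succ(two);
console.log(toNumber(add(two, three))); // 5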

Alternative Abstractions

Humans think about programs in a different way to machines. Machines can read thousands or millions of instructions per second and execute them tirelessly and with precision. Humans need to understand programs at a higher level, and relate the elements of the program back to the semantics of the task and the ultimate goal. There is a clear and overwhelmingly agreed-upon need to create human-readable and writable languages which abstract away the details of the underlying computation into chunks that people can reason about. However, there are many ways to create these abstractions.

Comparing the Dominant Programming Paradigms

First, some definitions which help to describe the different families of programming languages that we will be considering in this course. It is assumed that the reader already has experience with the first three paradigms (imperative, procedural and object-oriented), since these are taught in most introductory programming courses. The rest are discussed further in later chapters of these notes.

  • Imperative programs are a sequence of statements that change a program’s state. They follow the Turing machine model described above. This is probably the dominant paradigm for programming languages today. Languages from Assembler to Python are built around this concept, and most modern languages still allow you to program in this style.
  • Procedural languages are basically imperative in nature, but add the concept of named procedures, i.e. named subroutines that may be invoked from elsewhere in the program, with parameterised variables that may be passed in. The result of the procedure may be returned as an explicit return value, or the procedure may mutate (change) the value of variables referenced elsewhere in the program (such a mutation is called a side effect). Again, most modern languages support the definition of such procedures. Sometimes the term function is used synonymously with procedure. Some languages (e.g. Pascal) distinguish between procedures, which mutate state and have no explicit return value, and functions, which do have an explicit return value. Note that the latter is closer to the definition of a function in mathematics.
  • Object-oriented languages are built around the concept of objects, where an object captures the set of data (state) and behaviours (methods) associated with entities in the system. We assume most readers have already studied the basics of object-oriented programming, such as defining classes of objects, inheritance and encapsulation.
  • Declarative languages focus on declaring what a procedure (or function) should do rather than how it should do it. A classic example of a declarative language is SQL, which precisely describes a relational database query (i.e. what should be returned) without having to tell the database how the computation should be performed (e.g. the sequence of steps to carry out the query). HTML is also declarative: it describes what a web page should look like without specifying how to render it. SQL and HTML both avoid defining the how by relying on an engine which already knows how to do the task (the database system for SQL, the browser for HTML). By contrast, Prolog is an example of a language which is declarative but also a complete programming language (able to perform any computation that a Turing machine can, i.e. Turing Complete). Declarative languages like Prolog manage to perform arbitrarily complex computations (e.g. with loops and evolving state) through recursive definitions.
  • Functional languages are built around the concept of composable functions. Such languages support higher-order functions which can take other functions as arguments or return new functions as their result, following the rules of the Lambda Calculus (which we will explore in some depth later). Functional languages such as Haskell (which we are going to explore later) adhere closely to the mathematical definition of functions that are pure, i.e. that have no side effects and whose only result is an explicit return value. Pure functional programs tend to be closer to a declarative definition of the solution, since their manipulation of state is made explicit through function parameters and return types rather than hidden in side effects (a quality which we will see through many examples later). Many modern languages support a mixture of functional style (by supporting higher-order functions) and imperative style (by allowing mutable variables and side effects from functions); a short TypeScript contrast between the two styles appears after this list.
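
To make the contrast concrete, here is the earlier sumTo example in both styles in TypeScript (the function names are our own for illustration): the imperative version mutates local state step by step, while the functional version declares the result as a fold over a range of numbers.

// TypeScript: the same computation, imperative versus functional style
function sumToImperative(n: number): number {
  let sum = 0;
  for (let i = 0; i < n; i++) {
    sum += i; // state changes on every iteration
  }
  return sum;
}

function sumToFunctional(n: number): number {
  // Build [0, 1, ..., n-1] and reduce it with a pure combining function
  return Array.from({ length: n }, (_, i) => i)
              .reduce((acc, x) => acc + x, 0);
}

console.log(sumToImperative(10)); // 45
console.log(sumToFunctional(10)); // 45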

Until relatively recently there was a fairly clear distinction between languages which followed the Functional Programming (FP) paradigm and those which followed the Object-Oriented (OO) paradigm. From the 1990s until now (arguably it still is the case), Object-Oriented programming has been the dominant paradigm. Programmers and system architects have found organising their programs and data into class hierarchies a natural way to model everything from simulations, to games, to business rules. But this focus on creating rigid structures around data is a static view of the world which becomes messy when dealing with dynamic state. Functional programming, by contrast, focuses on functions as transformations of data. These two alternatives, Object-Oriented versus Functional, have been described as a Kingdom of Nouns versus a Kingdom of Verbs, respectively, where nouns are descriptions of data and verbs are functions.

Historically, these were seen as competing paradigms, leading to quite different languages such as C++ and Haskell. However, modern languages like TypeScript, Scala, C++17, Java (version 8 and later), Swift, Rust, etc., provide facilities for both paradigms, for you to use (or abuse) as you please.

Thus, these are no longer competing paradigms. Rather, they are complementary, or at least they can be, depending on the skill of the programmer. They provide programmers with a choice of abstractions to apply as necessary to better manage complexity and achieve robust, reusable, scalable and maintainable code.

These notes focus on introducing programmers who are familiar with the OO paradigm to FP concepts via the ubiquitous multiparadigm language of JavaScript. The following table serves both as a summary and as an overview of the contents of these notes, with links to relevant sections.

                       | Functional                                       | Object-Oriented
  Unit of Composition  | Functions                                        | Objects (classes)
  Programming Style    | Declarative                                      | Imperative
  Control Flow         | Functions, recursion and chaining                | Loops and conditionals
  Polymorphism         | Parametric                                       | Sub-Typing
  Data and Behaviour   | Loosely coupled through pure, generic functions  | Tightly coupled in classes with methods
  State Management     | Treats objects as immutable                      | Favours mutation of objects through instance methods
  Thread Safety        | Pure functions easily used concurrently          | Can be difficult to manage
  Encapsulation        | Less essential                                   | Needed to protect data integrity
  Model of Computation | Lambda Calculus                                  | Imperative (von Neumann/Turing)