Computability

Computability is about solving problems effectively. Computability theory is the study of what problems computers can and cannot solve. It is closely related to complexity theory, which classifies computable problems by their difficulty. Computability theory explores questions such as:

  • Is there an algorithmic solution to every problem?
  • Are there limits to computers?
  • Are there relations between different unsolvable problems?

Complexity theory, by contrast, deals with questions such as:

  • How can we compare the efficiency of different algorithms?
  • What kind of problems can be solved efficiently?
  • How do we measure time and memory requirements?

Before we dive in, there are a few things we need to understand. Computers can solve a lot of problems. They can solve these problems practically or theoretically. A computer may be able to solve a problem theoretically but not practically.

A term that will come up frequently is ‘automaton’ (plural ‘automata’): an abstract machine. Automata theory is the study of such machines.

There are different models of computational devices. A Turing machine is one of them: a mathematical model of computation that defines an abstract machine. Alan Turing introduced it in 1936, and it can be used to determine an output for a given input using a predefined set of rules. Lambda calculus is a formal mathematical system for expressing computation; it can simulate any Turing machine, and vice versa.
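To make the idea of a Turing machine concrete, here is a minimal sketch in Python. The machine, its state names and its transition table are my own toy example (a machine that flips every bit on the tape), not a general-purpose simulator.

```python
# A tiny Turing machine sketch. The transition table maps
# (state, symbol) -> (new state, symbol to write, head move).
def run_turing_machine(tape, rules, state="start", blank="_"):
    tape = list(tape)
    head = 0
    while state != "halt":
        symbol = tape[head] if head < len(tape) else blank
        state, write, move = rules[(state, symbol)]
        if head < len(tape):
            tape[head] = write
        else:
            tape.append(write)
        head += 1 if move == "R" else -1
    return "".join(tape).rstrip(blank)

# Rules for a machine that flips 0 <-> 1 moving right, halting on a blank.
flip_rules = {
    ("start", "0"): ("start", "1", "R"),
    ("start", "1"): ("start", "0", "R"),
    ("start", "_"): ("halt", "_", "R"),
}

print(run_turing_machine("0110", flip_rules))  # 1001
```

The machine never decides anything clever here; the point is only that a fixed, finite rule table fully determines the output for any input tape.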

Problems

A problem specification is made up of a set of legal inputs to a problem and a set of desired outputs relating to the inputs. There are different kinds of problems in computability theory:

  • Decision Problem: This problem will return true or false.
  • Search Problem: Find something that meets a certain requirement.
  • Enumeration Problem: Find many things that meet a certain requirement.
  • Counting Problem: Count the number of things that satisfy a certain requirement.
  • Optimisation Problem: Find something that best satisfies a certain requirement.
  • Structuring Problems: Make something satisfy a certain requirement.
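The same underlying requirement can be posed as several of these kinds of problem. A small sketch, using a made-up requirement ("divisible by 7") and made-up data:

```python
data = [3, 14, 21, 8, 35]

# Decision problem: is there any element divisible by 7? (true/false)
decision = any(x % 7 == 0 for x in data)

# Search problem: find one element divisible by 7.
search = next(x for x in data if x % 7 == 0)

# Enumeration problem: find all such elements.
enumeration = [x for x in data if x % 7 == 0]

# Counting problem: how many are there?
counting = len(enumeration)

# Optimisation problem: which one is largest?
optimisation = max(enumeration)

print(decision, search, enumeration, counting, optimisation)
```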

Reminder on Algorithms

Algorithms are designed to solve problems. An algorithm is a finite sequence of instructions to solve a problem, where each instruction is chosen from a set of well-defined instructions.

Formal Language

A formal language consists of words whose letters are taken from an alphabet and are well formed. Being well formed means they obey the rules of the grammar, and so a formal language is defined by a set of rules. This set of rules is called a formal grammar.

A language is said to be a regular language if it can be defined with a Finite State Machine. A Finite State Machine is a computational model used to simulate sequential logic. It transitions between states and can only be in one state at any given time. Finite State Machines have no memory beyond their current state, so they cannot store arbitrary strings. Therefore, any non-regular language cannot be defined by a Finite State Machine, because recognising it may require unbounded memory.
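A Finite State Machine really is just a transition table plus a current state. A minimal sketch, using a language of my own choosing (binary strings with an even number of 1s, which is regular):

```python
# States record only one fact: whether we have seen an even or odd number of 1s.
transitions = {
    ("even", "0"): "even",
    ("even", "1"): "odd",
    ("odd", "0"): "odd",
    ("odd", "1"): "even",
}

def accepts(string, start="even", accepting=("even",)):
    state = start
    for symbol in string:
        state = transitions[(state, symbol)]
    return state in accepting

print(accepts("1010"))  # True: two 1s
print(accepts("111"))   # False: three 1s
```

Note there is no counter anywhere: the machine cannot count how many 1s it has seen, only whether the count so far is even or odd. That is exactly the "limited memory" the text describes.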

A natural language like French or English is made up of words, which in turn are made up of characters from an alphabet. All words are strings made up of a sequence of characters, but not all sequences of characters make up a valid string of a language. For example, ‘ball’ is a string in the English language but ‘lbal’ is not. We can call the strings that are in the language ‘grammatical’ and the ones that are not ‘ungrammatical’.

Grammatical and non-grammatical strings

Definition: Automaton

An automaton is a formal device that can function as a grammar: it can distinguish the grammatical strings of a language from the non-grammatical ones.

Strings, Strings, Strings

Let A be a finite set of symbols, say the alphabet {a,b,c}; then ‘cab’ is a string on A. A is known as a vocabulary. Since A contains a finite set of characters, it is a finite set. Strings on A are assumed to be of finite length. The length of a string is the number of symbols it is made up of; for example, the length of the string ‘cab’ is 3. An important thing to remember is that a string of length one over A is different from the symbol that makes up A: the string ‘a’ is different from the symbol a. A string can be empty, and the empty string is denoted by E. Two strings are identical if they are made up of the same symbols in the same order. A* is the set of all strings on A.

Operations on Strings

We can perform operations on strings. Reversal is written x^R, where x is the string we want to reverse: 'abc'^R = 'cba'. The empty string reversed is just the empty string: E^R = E. Strings can also be concatenated, meaning joined end to end, which is denoted by ⌢: 'abc' ⌢ 'cba' = 'abccba'.
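Both operations map directly onto built-in Python string features, which makes for a quick sanity check of the identities above:

```python
x = "abc"
print(x[::-1])         # 'cba'    — x^R via slicing
print("abc" + "cba")   # 'abccba' — 'abc' ⌢ 'cba' via +
E = ""                 # the empty string
print(E[::-1] == E)    # True: E^R = E
```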

Infinite sets cannot be represented by listing all their elements. Instead, we can state the rules that define the set. For example, for the set B = {5, 7, 9, 11, ...}, b is an element of B if 2x + 1 = b for some natural number x ≥ 2.
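Such a rule gives a membership test rather than a list, which is exactly how we can handle an infinite set in code. A sketch of the rule for B above:

```python
def in_B(b):
    # b is in B if b = 2x + 1 for some natural number x >= 2,
    # i.e. b - 1 must be even and (b - 1) / 2 must be at least 2.
    return (b - 1) % 2 == 0 and (b - 1) // 2 >= 2

print([n for n in range(1, 14) if in_B(n)])  # [5, 7, 9, 11, 13]
```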

Back to Formal Languages

Given an alphabet A, a language L is any subset of A*. There are infinitely many languages that cannot be defined by a grammar, meaning there is no finite set of rules defining them. The study of formal languages is the study of languages that can be defined by a grammar.

Formal Grammar

So a formal grammar is a deductive system of axioms and rules of inference that generates the sentences of a language as its theorems.

A grammar will usually contain an initial axiom S and a finite set of rules v -> w, where v and w are strings. v -> w means that v can be rewritten as w. So if AB -> ED is a rule, then CAB can be rewritten as CED.

Grammars have two disjoint (non-overlapping) alphabets: terminals and non-terminals. Terminals are the elementary symbols of the language defined by the grammar; they cannot be changed using the rules of the grammar. Non-terminals are replaced by groups of symbols, so a non-terminal can be changed using the rules of the grammar. If our only rule is i -> j, then i is a non-terminal, as it can be changed using this rule, while j is a terminal, since no rule changes it into anything else. The string on the left side of a rule must always contain at least one non-terminal. We usually use lowercase letters to represent terminals and uppercase letters to represent non-terminals.
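Rewriting with rules like these is mechanical, so it is easy to sketch. The grammar below (S -> aSb, S -> ab) is my own toy example, following the uppercase/lowercase convention just described:

```python
rules = [("S", "aSb"), ("S", "ab")]

def rewrite(string, rule):
    left, right = rule
    # apply the rule once, to the leftmost occurrence of the left side
    return string.replace(left, right, 1)

derivation = "S"
derivation = rewrite(derivation, rules[0])  # S     -> aSb
derivation = rewrite(derivation, rules[0])  # aSb   -> aaSbb
derivation = rewrite(derivation, rules[1])  # aaSbb -> aaabbb

print(derivation)  # aaabbb
```

Once the string contains no non-terminals (no uppercase letters), no rule applies and the derivation is finished: 'aaabbb' is a sentence of this grammar's language.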

Chomsky Hierarchy

The rules that make up a grammar determine its generative power: the set of strings it can generate. Grammars with the same generative power are made up of rules of the same kind. The Chomsky Hierarchy classifies grammars, and the languages they generate, according to the kinds of ‘rewrite’ rules they contain. A production is a ‘rewrite’ rule that specifies a symbol substitution that can be performed recursively. At the top of the hierarchy lie the most general grammars, which have the fewest restrictions on the rules they use.


Type 0 grammars have only one restriction: at least one non-terminal must appear on the left-hand side of each rule. They are called unrestricted rewriting systems, and they generate the recursively enumerable languages. Beyond that single requirement, their productions can take any form.

Type 1 grammars are called context-sensitive grammars. Their rules are of the form αAβ → αψβ, where ψ is not the empty string E. Another way to describe this is that you can only rewrite a string with one that is at least as long.

Type 2 grammars have rules of the form A → ψ: the left-hand side of each rule is a single non-terminal. Type 2 grammars generate the context-free languages.

Type 3 grammars have rules of the form A → xB or A → x: a single non-terminal on the left-hand side, and on the right a string of terminals, optionally ending in a single non-terminal. This is the most restricted form of grammar and generates the regular languages, which are exactly the languages accepted by Finite State Machines.
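A quick sketch of a Type 3 grammar in action. The grammar here (S -> aS, S -> b) is my own example; it generates the regular language of any number of a's followed by a single b:

```python
def generate(n_as):
    # Expand S -> aS a fixed number of times, then finish with S -> b.
    # Every intermediate string has at most one non-terminal, at the end —
    # the defining shape of a Type 3 (right-linear) derivation.
    s = "S"
    for _ in range(n_as):
        s = s.replace("S", "aS")
    return s.replace("S", "b")

print(generate(0))  # b
print(generate(3))  # aaab
```

Because the non-terminal only ever sits at the right-hand end, a Finite State Machine reading left to right can track the derivation with its state alone, which is why Type 3 grammars and Finite State Machines define the same languages.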

In this hierarchy, each type contains the next: every Type 3 grammar is also of Type 2, 1 and 0; every Type 2 grammar is of Type 1 and 0; and so on.

Type    Language
0       Recursively Enumerable
1       Context Sensitive
2       Context Free
3       Regular

It is important to note that this hierarchy is not usually applied to human spoken languages. The Pirahã language, spoken by an Amazonian tribe in Brazil, has been reported to lack recursion, which complicates attempts to fit natural languages neatly into the hierarchy.

To end…

So we have looked at an overview of computability, learned what a formal language is and studied the Chomsky Hierarchy. In the next article we will dive into Finite State Machines which we have slightly touched on here.
