Translators - Compilers and Interpreters

In the late 1950s, the first and second generations of computers were becoming quite popular, but further expansion was hampered by the difficulty of writing machine language and assembly language programs.  Thus, High Level Languages (HLLs) were invented.

High Level Languages

An HLL allows the programmer to write commands that are similar to natural language (usually English).  It would be very expensive to create CPUs that understand English (or Java), so the source code must be translated into (native) machine code.  The result is stored as an executable file, which can be loaded directly into memory and run.  (Note: Java cheats a bit, producing Java Byte Code, which is not actually executed directly by the CPU - more about this later.)  By making programming easier, the creation of HLLs allowed computers to be programmed by a lot more people - especially engineers and scientists - without much training.  This fuelled widespread commercial use of computers and we entered the third generation of the computer revolution.  Hence HLLs are sometimes called 3rd Generation Languages (3GLs).
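
For example, here is a complete Java source file (the file name Hello.java and the class name Hello are just for illustration).  The command javac Hello.java translates it into a file of Java Byte Code called Hello.class, and the command java Hello then asks the Java Virtual Machine to execute that bytecode.

public class Hello {
    public static void main(String[] args) {
        // This one HLL statement stands in for many machine instructions.
        System.out.println("Hello");
    }
}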

Syntax and Semantics

As these new languages were being created, there was a large range of choices available for how to write commands.  

Here are examples of counting to 10 in various languages:

Fortran:

do i = 1, 10
  print *, i
enddo
print *, 'Finished'

Basic:

10 X = 1
20 PRINT X
30 X = X + 1
40 IF X <= 10 GOTO 20
50 PRINT "Finished"

Pascal:

for num := 1 to 10 do
begin
    writeln(num);
end;
writeln('Finished');

C++:

for (int c = 1; c <= 10; c++)
{
    cout << c << endl;
}
cout << "Finished";

These languages are all from the same "family" - they are quite similar to each other.  All of them were intended as scientific and/or general-purpose languages.

FORTRAN stands for Formula Translation.  It was the first HLL to become popular (in the late 1950s).

BASIC stands for Beginner's All-purpose Symbolic Instruction Code.  It was intended for ... BEGINNERS.  It wasn't very efficient (i.e. fast) and didn't allow good programmers to control memory usage very well, so it never became a "serious" language ... UNTIL Microsoft invented Visual Basic and integrated it into their Office software (Word, Excel, etc.).  Then Visual Basic became a very popular tool for business programmers, and is probably the most popular language world-wide if you count the number of programmers using it.

Pascal was invented by Niklaus Wirth and named after the famous mathematician Blaise Pascal.  Pascal was purposely created as a structured language (using begin..end to mark blocks of code).  Wirth intended it to be a good learning language for college Computer Science students.  Its structural requirements were supposed to force students to "do it right", rather than writing spaghetti code like lots of Basic programmers did.  Pascal eventually developed into Delphi, which is a very popular RAD (Rapid Application Development) tool.  It makes faster programs than Visual Basic, but is more difficult to learn.

C++ has a rather strange name.  It was based on C, the programming language used to develop Unix (the much older operating system that Linux is modelled on).  C was developed by programmers to be used by programmers, so they built in lots of short-cuts to make the code more compact.  The ++ was added when C was turned into an Object Oriented Language.  (More about OOP later.)  C++ was very popular during the past 20 years, and still is.  It gives programmers low-level, direct control of hardware and memory management.  Although programmers see that as a huge advantage, it can lead to buggy programs with lots of memory leaks.

Java was invented around 1995.  Its syntax was based on C++, so that programmers could easily learn the new language (see the short example after the list below).  But Java's keywords and classes are quite different from C++'s, so its semantic content is quite different.  Java was created to solve three large problems in the software industry:

  1. Guarantee ROBUST (reliable) programs
    It was just too common to write buggy software in C++, and lots of it got sold.  Java implements a number of programming restrictions that force programmers to write programs "correctly" (whatever that is).
  2. Improve portability
    C++ programs had to be changed slightly and recompiled to get them running on Unix or Windows or MacOS or a different operating system.  After compiling, Java programs are supposed to "run anywhere", as long as there is a Java Virtual Machine available.  "Run anywhere" works pretty well, but native applications (C++, Visual Basic, Delphi) still run faster.
  3. Enable applications to run in the World-Wide-Web
    All earlier programming languages were intended for stand-alone computers (either PCs or mainframes).  They executed only in the local environment - meaning the memory and storage available in the computer itself.  Java enables distributed processing - that means a program can run in pieces on a bunch of different computers, connected by a network.
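
To see the C++ ancestry, here is the count-to-10 loop from the examples above written in Java.  This is just a minimal sketch (the class name CountToTen is only for illustration): the loop itself is nearly identical to the C++ version, but Java insists that all code live inside a class and a method.

public class CountToTen {
    public static void main(String[] args) {
        // The loop is almost character-for-character the C++ version.
        for (int c = 1; c <= 10; c++) {
            System.out.println(c);
        }
        System.out.println("Finished");
    }
}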

Although C++ and Java are quite large and sophisticated, they are still 3rd generation languages, requiring unnatural syntax and peculiar words.  Programmers still require a lengthy, specialized education - programming is not a "turn-key" operation.  The world needs something better.

Coming "Real Soon Now" - 4GL

Fourth Generation Languages should be more natural than 3GLs.  They should be very similar to "normal" English (or some other language).  They should be fault tolerant - if you makes a grammar or speling miztake, it should still be possible for the computer to run the program.  You should not be required to break down your thoughts into tiny little pieces - you should have BIG POWERFUL COMMANDS that do an entire job in one instruction.

Maybe you shouldn't need to type commands at all - why can't the computer just listen to what you say and "get on with it" - like the computers in Star Trek?

Some of these sound more achievable than others.  We were supposed to have 4GLs around 1990, then it was 1995, then 2000, and there are still none in sight.  Japan made a huge investment in developing 4GLs, but nothing tangible came out of it - except Aibo and Furby and PS3 and Nintendo and .... well, well, maybe we did get something out of it.  Most professional software development is still done in Visual Basic, C++, or Java.


Compilers and Interpreters

A compiler reads a source code file and searches for syntax errors.  If it finds any, it prints out a list so the programmer can fix them.  If there are no errors, it translates the entire program into executable machine code and stores the resulting bytes in an output file.  Then the user (or programmer) can load the program and execute it.
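
The following toy sketch illustrates the idea in Java, using a made-up one-command language in which every valid line must start with "PRINT ".  The class name and the pretend language are assumptions for illustration only; the point is that the compiler checks the whole program and lists every error before producing any output at all.

import java.util.List;

public class ToyCompiler {
    public static void main(String[] args) {
        // A pretend source program; the third line has a syntax error.
        List<String> program = List.of(
                "PRINT hello",
                "PRINT world",
                "PRNT oops");

        boolean ok = true;
        for (int i = 0; i < program.size(); i++) {
            if (!program.get(i).startsWith("PRINT ")) {
                // Report the error, but keep checking - list them all.
                System.out.println("Syntax error on line " + (i + 1));
                ok = false;
            }
        }
        if (ok) {
            // A real compiler would write machine code to a file here.
            System.out.println("No errors - executable produced.");
        }
    }
}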

An interpreter works differently.  It starts running the program before it has even looked at all of it.  It fetches the first command, figures out what it means, and executes it.  Then it moves on to the next command.  If it ever finds a syntax error, it stops executing and prints an error message.  This is very convenient for the programmer, but it creates a risk: an end-user might be running the program, just about to save a bunch of data, when the interpreter encounters a silly syntax error, crashes, and loses all the data.
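
Here is the same made-up "PRINT" language handled by a toy interpreter (again, only a sketch).  Notice that the first two commands execute successfully before the bad third line is ever discovered - exactly the mid-run crash described above.

import java.util.List;

public class ToyInterpreter {
    public static void main(String[] args) {
        List<String> program = List.of(
                "PRINT hello",
                "PRINT world",
                "PRNT oops");

        for (String line : program) {
            if (!line.startsWith("PRINT ")) {
                // The error only surfaces now, after real work was done.
                System.out.println("Runtime syntax error - stopping.");
                return;
            }
            System.out.println(line.substring(6));  // execute the command
        }
    }
}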

The most common example of an interpreter is the JavaScript interpreter contained in your browser, which executes JavaScript found in web pages.  You may have seen a JavaScript error message while surfing the web.  This isn't so bad, because you don't use web pages to do your work.  But interpreted languages in a mission-critical environment (businesses, airports, etc.) are definitely a bad idea.