How to learn Pascal easily
This document will explain the basics about compilers as well as provide links to well-known Pascal compilers and explain how to set up Free Pascal.
About Computer Languages and Compilers
When talking about computer languages, there are basically three major terms that will be used.
- Machine language -- actual binary code that gives basic instructions to the computer's CPU. These are usually very simple commands like adding two numbers or moving data from one memory location to another.
- Assembly language -- a way for humans to program computers directly without memorizing strings of binary numbers. There is a one-to-one correspondence with machine code. For example, in Intel x86 assembly language, ADD and MOV are mnemonics for the addition and move operations.
- High-level language -- permits humans to write complex programs without going step by step. High-level languages include Pascal, C, C++, FORTRAN, Java, BASIC, and many more. One command in a high-level language, like writing a string to a file, may translate to dozens or even hundreds of machine language instructions.
Microprocessors can only run machine language programs directly. Assembly language programs are assembled, or translated into machine language. Likewise, programs written in high-level languages, like Pascal, must also be translated into machine language before they can be run. To do this translation is to compile a program.
The program that accomplishes the translation is called a compiler. This program is rather complex since it not only creates machine language instructions from lines of code, but often also optimizes the code to run faster, adds error-correction code, and links the code with subroutines stored elsewhere. For example, when you tell the computer to print something to the screen, the compiler translates this as a call to a pre-written module. Your code must then be linked to the code that the compiler manufacturer provides before an executable program results.
With high-level languages, there are again three basic terms to remember:
- Source code -- the code that you write. This typically has an extension that indicates the language used. For example, Pascal source code usually ends in ".pas" and C++ code usually ends in ".cpp".
- Object code -- the result of compiling. Object code usually includes only one module of a program, and cannot be run yet since it is incomplete. On DOS/Windows systems, this usually has an extension of ".obj".
- Executable code -- the end result. All the object code modules necessary for a program to function are linked together. On DOS/Windows systems, this usually has an extension of ".exe".
More About Compilers
The de facto standard in DOS and Windows-based compilers is Borland Pascal. Before it came out, most Pascal compilers were clumsy and slow, strayed from the Pascal standard, and cost several hundred dollars. In 1984, Borland introduced Turbo Pascal, which sold for less than $100, compiled an order of magnitude faster than existing compilers, and came with an abundance of source code and utility programs.
This product was a great success and was prominent for almost a decade. But in the 1990s, the world was moving to Windows. In 1993, the last version of Turbo Pascal, version 7 for DOS, came out. After that, the demand for DOS programs plummeted and Borland (renamed Inprise, then back to Borland) focused on producing Windows compilers.
This tutorial will only deal with console-based programming, where the computer prints lines of data to the screen and the user interacts with the program using a keyboard. The goal of the tutorial is to teach how to program in Pascal. Once you've learned that, you can easily look at a reference book or another web page and pick up graphics and windowing systems on your own.
Although old commercial Pascal compilers are often available for download, such as Turbo Pascal 5.5 from the Borland Museum and Symantec Think Pascal (Macintosh), linked from The Free Country's Free Pascal Compiler List, computers have progressed considerably since the 1980s and early 1990s. We are no longer stuck with 8.3 filenames on DOS or non-preemptive multitasking on Mac OS. Using an old compiler is fun in the same sense as playing an old game on an emulator, but the open source movement has produced good compilers for modern operating systems, and a beginner will find it much easier to use those.
Open Source Compilers
The two main open-source compiler projects are:
- Free Pascal
- GNU Pascal
Free Pascal is generally considered friendlier for novices, and strives to emulate Borland Pascal in many ways, though both will serve fine for learning Pascal.
As most users of this tutorial will be running Windows, here's how to set up Free Pascal and get to the point where you're compiling a program on a modern Windows operating system:
- Download the Win32 version of Free Pascal from the Free Pascal download page. I recommend the most complete version, with a name of w32####full.zip where #### is the version number.
- When the download is done, open up Windows Explorer and locate w32####full.zip. Right-click on the file and select "Extract All ..." Go through the wizard to extract all the files out of this ZIP archive. Checking "Show extracted files" on the last page of the wizard will give you a head start on the next step.
- Now go to where you've extracted the files, and double-click on install.exe. If you're on Windows XP SP2, allow the program to Run if you get a warning.
- The installer will launch in a console window. You can accept the default install location for the compiler, or type in something different, say c:\Program Files\Free Pascal Compiler.
- The last screen of the install explains how to set the PATH and start the program
Often, novice programmers have grown up in the world of windowing operating systems and will not know how to deal with PATHs and command prompts. In this case, you probably don't want to use the command-line compiler, but rather the IDE (Integrated Development Environment). So, to get to the point of compiling a program:
- Create a shortcut. The IDE is located in the folder where you installed Free Pascal, under the bin and then the win32 subdirectories, and is called fp.exe. For added convenience, right-click the shortcut when done, select Properties, and set the startup directory to the place where you want to save your programs.
- Open Free Pascal using the shortcut.
- Type in a program (flip to the next lesson to get a "Hello, world." program).
- Save the file with File-Save As ...
- Run the program from the Run menu. This will automatically compile the program if you've made any changes, then run the program. It will also run the program without compiling if you've not made any changes since the last time you compiled.
With programs that don't expect user input, you'll see the program flash on a black screen. But the program completes in the blink of an eye and you are returned to the IDE without seeing the results of your work. There are two ways around this:
- Select User screen from the Debug menu to see the results of the program.
- Add a readln statement at the end of every program. This will make the program wait for the user to press the Enter key before the program ends and returns to the IDE.
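As a minimal sketch of the second approach, the program below ends with a readln so that the output stays on screen until Enter is pressed. The program name WaitForEnter and the prompt text are made up for this illustration.

```pascal
program WaitForEnter;
begin
  writeln('Hello, world.');
  writeln('Press Enter to return to the IDE.');
  readln   { pause here until the user presses Enter }
end.
```

Without the final readln, the same program would finish immediately and the window would close before you could read it.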
Note that a .exe file was created in the directory where you saved your program. This is the executable. You can go to the Command Prompt, change to that directory, and run the executable directly. You can also double-click on it in Windows Explorer (and it will still flash by quickly if it ends without requiring user input).
In the short history of computer programming, one enduring tradition is that the first program in a new language is a "Hello, world" to the screen. So let's do that. Copy and paste the program below into your IDE or text editor, then compile and run it.
If you have no idea how to do this, return to the Table of Contents. Earlier lessons explain what a compiler is, give links to downloadable compilers, and walk you through the installation of an open-source Pascal compiler on Windows.
program Hello;
begin  (* Main *)
  writeln ('Hello, world.')
end.  (* Main *)
The output on your screen should look like:
Hello, world.
The basic structure of a Pascal program is:
PROGRAM ProgramName (FileList);

CONST
  (* Constant declarations *)

TYPE
  (* Type declarations *)

VAR
  (* Variable declarations *)

(* Subprogram definitions *)

BEGIN
  (* Executable statements *)
END.
The elements of a program must be in the correct order, though some may be omitted if not needed. Here's a program that does nothing, but has all the required elements:
program DoNothing;
begin
end.
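To see how the sections fit together, here is a small sketch that fills in each part of the structure in the required order. The names StructureDemo, Greeting, TCount, Times, and SayHello are invented for this illustration. Type declarations and subprograms are topics for later lessons, so treat this as a preview rather than something to memorize.

```pascal
program StructureDemo (output);

const
  Greeting = 'Hello';        { constant declarations }

type
  TCount = integer;          { type declarations }

var
  Times : TCount;            { variable declarations }

procedure SayHello;          { subprogram definitions }
begin
  writeln(Greeting)
end;

begin                        { executable statements }
  Times := 1;
  SayHello;
  writeln('Called ', Times, ' time(s).')
end.
```

Moving, say, the var section above the const section is accepted by some modern compilers, but keeping the order shown is the portable choice.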
Comments are portions of the code which do not compile or execute. Pascal comments start with a (* and end with a *). You cannot nest comments:
(* (* *) *)
will yield an error because the compiler matches the first (* with the first *), ignoring the second (* which is between the first set of comment markers. The second *) is left without its matching (*. This problem with begin-end comment markers is one reason why many languages use line-based commenting systems.
Turbo Pascal and most other modern compilers support brace comments, such as {Comment}. The opening brace signifies the beginning of a block of comments, and the ending brace signifies the end of a block of comments. Brace comments are also used for compiler directives.
Commenting makes your code easier to understand. If you write your code without comments, you may come back to it weeks, months, or years later without a guide to why you coded the program that way. In particular, you may want to document the major design of your program and insert comments in your code when you deviate from that design for a good reason.
In addition, comments are often used to take problematic code out of action without deleting it. Remember the earlier restriction on nesting comments? It just so happens that braces {} take precedence over parentheses-stars (* *). You will not get an error if you do this:
{ (* Comment *) }
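The sketch below shows how this can be used to switch off a statement that already carries its own comment. The program name and messages are hypothetical.

```pascal
program CommentDemo;
begin
  writeln('This line runs.');
  { Everything between the braces is ignored, including the inner comment:
    writeln('This line is disabled.');  (* old debugging output *)
  }
  writeln('Done.')
end.
```

Only the first and last messages are printed. Removing the two braces brings the disabled statement back without touching its parenthesis-star comment.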
Whitespace (spaces, tabs, and end-of-lines) is ignored by the Pascal compiler unless it is inside a literal string. However, to make your program readable by human beings, you should indent your statements and put separate statements on separate lines. Indentation is often an expression of individuality by programmers, but collaborative projects usually select one common style to allow everyone to work from the same page.
Identifiers are names that allow you to reference stored values, such as variables and constants. Also, every program and unit must be named by an identifier.
Rules for identifiers:
- Must begin with a letter from the English alphabet.
- Can be followed by alphanumeric characters (alphabetic characters and numerals) and possibly the underscore (_).
- May not contain certain special characters, many of which have special meanings in Pascal.
~ ! @ # $ % ^ & * ( ) + ` - = { } [ ] : " ; ' < > ? , . / |
Different implementations of Pascal differ in their rules on special characters. Note that the underscore character (_) is usually allowed.
Several identifiers are reserved in Pascal as syntactical elements. You are not allowed to use these for your identifiers. These include but are not limited to:
and array begin case const div do downto else end file for forward function goto if in label mod nil not of or packed procedure program record repeat set then to type until var while with
Modern Pascal compilers ship with much functionality in the API (Application Programming Interfaces). For example, there may be one unit for handling graphics (e.g. drawing lines) and another for mathematics. Unlike newer languages such as C# and Java, Pascal does not provide a classification system for identifiers in the form of namespaces. So each unit that you use may define some identifiers (say DrawLine) which you can no longer use. Pascal includes a system unit which is automatically used by all programs. This provides baseline functionality such as rounding to integer and calculating logarithms. The system unit varies among compilers, so check your documentation. Here is the system unit documentation for Free Pascal Compiler.
Pascal is not case sensitive! (It was created in the days when all-uppercase computers were common.) MyProgram, MYPROGRAM, and mYpRoGrAm are equivalent. But for readability purposes, it is a good idea to use meaningful capitalization. Most programmers will be on the safe side by never using two capitalizations of the same identifiers for different purposes, regardless of whether or not the language they're using is case-sensitive. This reduces confusion and increases productivity.
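A short sketch makes the point concrete. The variable name MyValue is made up for this example, and the three spellings below all refer to that one variable.

```pascal
program CaseDemo;
var
  MyValue : integer;
begin
  MyValue := 10;
  MYVALUE := myvalue + 5;   { same variable, different capitalization }
  writeln(mYvAlUe)          { prints 15 }
end.
```

The compiler accepts this, but a human reader will not thank you for it, which is why consistent capitalization is recommended.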
Identifiers can be any length, but some Pascal compilers will only look at the first several characters. One usually does not push the rules with extremely long identifiers or loads of special characters, since it makes the program harder to type for the programmer. Also, since most programmers work with many different languages, each with different rules about special characters and case-sensitivity, it is usually best to stick with alphanumeric characters and the underscore character.
Constants are referenced by identifiers, and can be assigned one value at the beginning of the program. The value stored in a constant cannot be changed.
Constants are defined in the constant section of the program:
const
  Identifier1 = value;
  Identifier2 = value;
  Identifier3 = value;
For example, let's define some constants of various data types: strings, characters, integers, reals, and Booleans. These data types will be further explained in the next section.
const
  Name = 'Tao Yue';
  FirstLetter = 'a';
  Year = 1997;
  pi = 3.1415926535897932;
  UsingNCSAMosaic = TRUE;
Note that in Pascal, characters are enclosed in single quotes, or apostrophes (')! This contrasts with newer languages which often use or allow double quotes or Heredoc notation. Standard Pascal does not use or allow double quotes to mark characters or strings.
Constants are useful for defining a value which is used throughout your program but may change in the future. Instead of changing every instance of the value, you can change just the constant definition.
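As a small sketch of that idea, the program below uses one constant in two places. The name ClassSize and its value are hypothetical.

```pascal
program ConstDemo;
const
  ClassSize = 30;   { change this one line if the class grows }
begin
  writeln('Seats needed: ', ClassSize);          { Seats needed: 30 }
  writeln('Handouts needed: ', ClassSize * 2)    { Handouts needed: 60 }
end.
```

If the number 30 had been typed directly into both statements, each occurrence would need to be found and edited separately.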
Typed constants force a constant to be of a particular data type. For example,
const a : real = 12;
would yield an identifier a which contains a real value 12.0 instead of the integer value 12.
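The following sketch prints such a typed constant. It assumes a compiler that supports typed constants, such as Free Pascal or Turbo Pascal. The :0:1 after the value asks for fixed-point output with one decimal place, a formatting feature explained in a later chapter.

```pascal
program TypedConstDemo;
const
  a : real = 12;    { stored as the real value 12.0 }
begin
  writeln(a:0:1)    { prints 12.0 }
end.
```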
Variables are similar to constants, but their values can be changed as the program runs. Variables must first be declared in Pascal before they can be used:
var
  IdentifierList1 : DataType1;
  IdentifierList2 : DataType2;
  IdentifierList3 : DataType3;
  ...
IdentifierList is a series of identifiers, separated by commas (,). All identifiers in the list are declared as being of the same data type.
The basic data types in Pascal include:
- integer
- real
- char
- Boolean
Standard Pascal does not make provision for the string data type, but most modern compilers do. Experienced Pascal programmers also use pointers for dynamic memory allocation, objects for object-oriented programming, and many others, but this gets you started.
More information on Pascal data types:
- The integer data type can contain integers from -32768 to 32767. This is the signed range that can be stored in a 16-bit word, and is a legacy of the era when 16-bit CPUs were common. For backward compatibility purposes, a 32-bit signed integer is a longint and can hold a much greater range of values.
- The real data type has a range from 3.4x10^-38 to 3.4x10^38, in addition to the same range on the negative side. Real values are stored inside the computer similarly to scientific notation, with a mantissa and exponent, with some complications. In Pascal, you can express real values in your code in either fixed-point notation or in scientific notation, with the character E separating the mantissa from the exponent. Thus,
452.13 is the same as 4.5213e2
- The char data type holds characters. Be sure to enclose them in single quotes, like so: 'a' 'B' '+'. Standard Pascal uses 8-bit characters, not 16-bit ones, so Unicode, which is used to represent all the world's language sets in one UNIfied CODE system, is not supported.
- The Boolean data type can have only two values: TRUE and FALSE
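Here is a brief sketch that declares one variable of each basic type and prints it. The variable names are invented for illustration, and the := assignment operator used here is explained in the next part of the lesson.

```pascal
program TypesDemo;
var
  Count    : integer;
  Ratio    : real;
  Initial  : char;
  Finished : Boolean;
begin
  Count := 32767;        { largest value of a 16-bit integer }
  Ratio := 4.5213e2;     { scientific notation for 452.13 }
  Initial := 'B';
  Finished := TRUE;
  writeln(Count);        { 32767 }
  writeln(Ratio:0:2);    { 452.13 }
  writeln(Initial);      { B }
  writeln(Finished)      { TRUE }
end.
```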
Once you have declared a variable, you can store values in it. This is called assignment.
To assign a value to a variable, follow this syntax:
variable_name := expression;
Note that unlike other languages, whose assignment operator is just an equals sign, Pascal uses a colon followed by an equals sign, similarly to how it's done in most computer algebra systems.
The expression can either be a single value:
some_real := 385.385837;
or it can be an arithmetic expression:
some_real := 37573.5 * 37593 + 385.8 / 367.1;
The arithmetic operators in Pascal are:
Operator | Operation | Operands | Result
+ | Addition or unary positive | real or integer | real or integer
- | Subtraction or unary negative | real or integer | real or integer
* | Multiplication | real or integer | real or integer
/ | Real division | real or integer | real
div | Integer division | integer | integer
mod | Modulus (remainder division) | integer | integer
div and mod only work on integers. / works on both reals and integers but will always yield a real answer. The other operations work on both reals and integers. When mixing integers and reals, the result will always be a real since data loss would result otherwise. This is why Pascal uses two different operations for division and integer division. 7 / 2 = 3.5 (real), but 7 div 2 = 3 (and 7 mod 2 = 1 since that's the remainder).
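The three results quoted above can be checked with a short sketch. The :0:1 on the first line requests fixed-point output with one decimal place, since real values print in scientific notation by default.

```pascal
program DivisionDemo;
begin
  writeln(7 / 2 : 0 : 1);   { real division: 3.5 }
  writeln(7 div 2);         { integer division: 3 }
  writeln(7 mod 2)          { remainder: 1 }
end.
```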
Each variable can only be assigned a value that is of the same data type. Thus, you cannot assign a real value to an integer variable. However, certain data types will convert to a higher data type. This is most often done when assigning integer values to real variables. Suppose you had this variable declaration section:
var
  some_int : integer;
  some_real : real;
When the following block of statements executes,
some_int := 375;
some_real := some_int;
some_real will have a value of 375.0.
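The sketch below runs that assignment and also shows one way to go in the other direction. Using round or trunc to turn a real into an integer is a suggestion added here for illustration, not something the lesson itself requires.

```pascal
program ConvertDemo;
var
  some_int  : integer;
  some_real : real;
begin
  some_int := 375;
  some_real := some_int;          { integer converts to real automatically }
  writeln(some_real:0:1);         { 375.0 }

  some_real := 375.7;
  { some_int := some_real;   would be rejected: real does not fit in integer }
  some_int := round(some_real);   { nearest integer }
  writeln(some_int);              { 376 }
  some_int := trunc(some_real);   { fractional part discarded }
  writeln(some_int)               { 375 }
end.
```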
Changing one data type to another is referred to as typecasting. Modern Pascal compilers support explicit typecasting in the manner of C, with a slightly different syntax. However, typecasting is usually used in low-level situations and in connection with object-oriented programming, and a beginning programming student will not need to use it. Here is information on typecasting from the GNU Pascal manual.
In Pascal, the minus sign can be used to make a value negative. The plus sign can also be used to make a value positive, but is typically left out since values default to positive.
Do not attempt to use two operators side by side, like in:
some_real := 37.5 * -2;
This may make perfect sense to you, since you're trying to multiply by negative 2. However, Standard Pascal does not allow a sign to follow another operator directly, so a strict compiler will reject the statement (some modern compilers accept it anyway). You can avoid the problem by using parentheses:
some_real := 37.5 * (-2);
The computer follows an order of operations similar to the one that you follow when you do arithmetic. Multiplication and division (* / div mod) come before addition and subtraction (+ -), and parentheses always take precedence. So, for example, the value of: 3.5*(2+3) will be 17.5.
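A few lines are enough to watch the order of operations at work. This is only a sketch, and the expressions other than 3.5*(2+3) are extra examples chosen for illustration.

```pascal
program PrecedenceDemo;
begin
  writeln(3.5 * (2 + 3) : 0 : 1);   { parentheses first: 17.5 }
  writeln(3.5 * 2 + 3 : 0 : 1);     { multiplication first: 10.0 }
  writeln(2 + 3 * 4);               { 14 }
  writeln((2 + 3) * 4)              { 20 }
end.
```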
Pascal cannot perform standard arithmetic operations on Booleans. There is a special set of Boolean operations. Also, you should not perform arithmetic operations on characters.
Pascal has several standard mathematical functions that you can utilize. For example, to find the value of sin of pi radians:
value := sin (3.1415926535897932);
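Note that the sin function operates on angular measure stated in radians, as do all the trigonometric functions. If everything goes well, value should become 0, or an extremely small number very close to 0, because the computer can only store an approximation of pi.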
Functions are called by using the function name followed by the argument(s) in parentheses. Standard Pascal functions include:
Function Name | Description | Argument type | Return type
abs | absolute value | real or integer | same as argument type
arctan | arctangent, result in radians | real or integer | real
cos | cosine of a radian measure | real or integer | real
exp | e to the given power | real or integer | real
ln | natural logarithm | real or integer | real
round | round to nearest integer | real | integer
sin | sine of a radian measure | real or integer | real
sqr | square (power 2) | real or integer | same as argument type
sqrt | square root (power 1/2) | real or integer | real
trunc | truncate (discard the fractional part) | real or integer | integer
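As a quick sketch of how these functions are called, the program below prints the results of several of them. The argument values are arbitrary, and :0:1 is used on the real results to get fixed-point output.

```pascal
program MathDemo;
begin
  writeln(abs(-7));          { 7 }
  writeln(sqr(5));           { 25 }
  writeln(sqrt(16):0:1);     { 4.0 }
  writeln(round(3.7));       { 4 }
  writeln(trunc(3.7));       { 3 }
  writeln(exp(0):0:1);       { 1.0 }
  writeln(ln(1):0:1)         { 0.0 }
end.
```

Note that abs and sqr return an integer here because they were given an integer, while sqrt always returns a real.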
For ordinal data types (integer or char), where the allowable values have a distinct predecessor and successor, you can use these functions:
Function | Description | Argument type | Return type
chr | character with given ASCII value | integer | char
ord | ordinal value | integer or char | integer
pred | predecessor | integer or char | same as argument type
succ | successor | integer or char | same as argument type
|
Real is not an ordinal data type! That's because it has no distinct successor or predecessor. What is the successor of 56.0? Is it 56.1, 56.01, 56.001, 56.0001?
However, for an integer 56, there is a distinct predecessor — 55 — and a distinct successor — 57.
The same is true of characters:
For 'b', the successor is 'c' and the predecessor is 'a'.
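The ordinal functions can be tried out with a sketch like the one below. The values in the comments assume the ASCII character set, in which 'b' has the code 98.

```pascal
program OrdinalDemo;
begin
  writeln(ord('b'));     { 98 }
  writeln(chr(99));      { c }
  writeln(succ('b'));    { c }
  writeln(pred('b'));    { a }
  writeln(succ(56));     { 57 }
  writeln(pred(56))      { 55 }
end.
```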
The above is not an exhaustive list, as modern Pascal compilers include thousands of functions for all sorts of purposes. Check your compiler documentation for more.
Since Pascal ignores end-of-lines and spaces, punctuation is needed to tell the compiler when a statement ends.
You must have a semicolon following:
- the program heading
- each constant definition
- each variable declaration
- each type definition (to be discussed later)
- almost all statements
The last statement in a BEGIN-END block, the one immediately preceding the END, does not require a semicolon. However, it's harmless to add one, and it saves you from having to add a semicolon if suddenly you had to move the statement higher up.
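The short sketch below marks where each semicolon goes. The program and variable names are made up for this example.

```pascal
program SemicolonDemo;      { semicolon after the program heading }
const
  Limit = 5;                { after each constant definition }
var
  total : integer;          { after each variable declaration }
begin
  total := Limit + 3;       { after each statement ... }
  writeln(total);
  writeln('Done')           { ... except, optionally, the last one before end }
end.
```

The final period after end is not a statement separator; it marks the end of the whole program.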
Indenting is not required. However, it is of great use for the programmer, since it helps to make the program clearer. If you wanted to, you could have a program look like this:
program Stupid; const a=5; b=385.3; var alpha,beta:real; begin alpha := a + b; beta:= b / a end.
But it's much better for it to look like this:
program NotAsStupid;

const
  a = 5;
  b = 385.3;

var
  alpha,
  beta : real;

begin (* main *)
  alpha := a + b;
  beta := b / a
end. (* main *)
In general, indent each block. Skip a line between blocks (such as between the const and var blocks). Modern programming environments (IDE, or Integrated Development Environment) understand Pascal syntax and will often indent for you as you type. You can customize the indentation to your liking (display a tab as three spaces or four?).
Proper indentation makes it much easier to determine how code works, but is vastly aided by judicious commenting.
Now you know how to use variables and change their value. Ready for your first programming assignment?
But there's one small problem: you haven't yet learned how to display data to the screen! How are you going to know whether or not the program works if all that information is still stored in memory and not displayed on the screen?
So, to get you started, here's a snippet from the next few lessons. To display data, use:
writeln (argument_list);
The argument list is composed of either strings or variable names separated by commas. An example is:
writeln ('Sum = ', sum);
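Placed in a complete program, that statement might look like the sketch below. The two numbers being added are arbitrary and deliberately differ from the ones in the assignment.

```pascal
program WritelnDemo;
var
  sum : integer;
begin
  sum := 45 + 7;
  writeln('Sum = ', sum)   { prints: Sum = 52 }
end.
```

The string is printed exactly as written, followed by the current value of the variable.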
Here's the programming assignment for Chapter 1:
Find the sum and average of five integers. The sum should be an integer, and the average should be real. The five numbers are: 45, 7, 68, 2, and 34.
Use a constant to signify the number of integers handled by the program, i.e. define a constant as having the value 5.
Then print it all out! The output should look something like this:
Number of integers = 5
Number1 = 45
Number2 = 7
Number3 = 68
Number4 = 2
Number5 = 34
Sum = 156
Average = 3.1200000000E+01
As you can see, the default output method for real numbers is scientific notation. Chapter 2 will explain how to format it to fixed-point decimal.
Input is what comes into the program. It can be from the keyboard, the mouse, a file on disk, a scanner, a joystick, etc.
We will not get into mouse input in detail, because that syntax differs from machine to machine. In addition, today's event-driven windowing operating systems usually handle mouse input for you.
The basic format for reading in data is:
read (Variable_List);
Variable_List is a series of variable identifiers separated by commas.
read treats input as a stream of characters, with lines separated by a special end-of-line character. readln, on the other hand, will skip to the next line after reading a value, by automatically moving past the next end-of-line character:
readln (Variable_List);
Suppose you had this input from the user, and a, b, c, and d were all integers.
45 97 3
1 2 3
If we executed one of the following statements, this is what would be stored in the appropriate variables.
Statement(s) | a | b | c | d
read (a); read (b); | 45 | 97 | |
readln (a); read (b); | 45 | 1 | |
read (a, b, c, d); | 45 | 97 | 3 | 1
readln (a, b); readln (c, d); | 45 | 97 | 1 | 2
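The last case can be tried with the sketch below. It assumes you type the two input lines shown earlier (45 97 3, then 1 2 3) when the program waits for input.

```pascal
program ReadlnDemo;
var
  a, b, c, d : integer;
begin
  readln(a, b);   { reads 45 and 97, then skips the rest of the first line }
  readln(c, d);   { reads 1 and 2 from the second line }
  writeln(a, ' ', b, ' ', c, ' ', d)   { prints: 45 97 1 2 }
end.
```

Replacing both readln calls with a single read (a, b, c, d) would instead pick up the 3 at the end of the first line as the value of c.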
When reading in integers, all spaces are skipped until a numeral is found. Then all subsequent numerals are read, until a non-numeric character is reached (including, but not limited to, a space).
8352.38
When an integer is read from the above input, its value becomes 8352. If, immediately afterwards, you read in a character, the value would be '.' since the read head stopped at the first non-numeric character.
Suppose you tried to read in two integers. That would not work, because when the computer looks for data to fill the second variable, it sees the '.' and stops since it couldn't find any data to read.
With real values, the computer also skips spaces and then reads as much as can be read. However, there is one restriction: a real that has no whole part must begin with 0. So .678 is invalid, and the computer can't read in a real, but 0.678 is fine.
Make sure that all identifiers in the argument list refer to variables! Constants cannot be assigned a value, and neither can literal values.
For writing data to the screen, there are also two statements, one of which you've seen already in last chapter's programming assignment:
write (Argument_List);
writeln (Argument_List);
The writeln statement skips to the next line when done.
You can use strings in the argument list, either constants or literal values. If you want to display an apostrophe within a string, use two consecutive apostrophes. Displaying two consecutive apostrophes would then require you to use four. This use of a special sequence to refer to a special character is called escaping, and allows you to refer to any character even if there is no key for it on the keyboard.
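A short sketch shows both the difference between write and writeln and the doubling of apostrophes. The messages are made up for illustration.

```pascal
program QuoteDemo;
begin
  write('Hello, ');                    { stays on the same line }
  writeln('world.');                   { completes the line: Hello, world. }
  writeln('It''s a nice day.');        { It's a nice day. }
  writeln('Two apostrophes: ''''')     { Two apostrophes: '' }
end.
```

In the last statement, the four apostrophes before the closing one produce the two that appear in the output.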