|
In this section the following notions are discussed: |
||||||||||||||||||
Purpose of a File Management |
||||||||||||||||||
The file manager handles all files on secondary storage media. To perform these
tasks, file management must:
|
||||||||||||||||||
File names, naming conventions |
||||||||||||||||||
| In order to make users, programs and the file manager itself able to identify the different files they must be given a unique file name. | ||||||||||||||||||
The relative file name is what a user normally recognises as file name;
it consists of a name and an extension, for instance problem.txt
or forloop.cpp. Apart from some exceptions, relative file names
look the same in all operating systems.
|
||||||||||||||||||
| The name is normally given by the user, whereas the extension (which is separated from the name by a dot) generally indicates what kind of file it is. | ||||||||||||||||||
|
common file extensions
|
||||||||||||||||||
|
The absolute file name is normally much longer than the user thinks it is.
Here, the relative file name is preceeded by the place on disk it is stored, that
is: the drive name and the directory names in which to find that file. So the absolute file name consists of:
|
||||||||||||||||||
For instance, a file with the relative name syllabus.doc,
saved by the user Peter in the directory
data would look like that
|
||||||||||||||||||
Note that the absolute file name changes when the location is different. The
relative file name, however, stays the same. So, after saving that file
on a floppy disk, the absolute file name of the backup would be
|
||||||||||||||||||
| A relative file name is restricted in length. How this restriction exactly looks like again depends on the OS. DOS has the hardest restictions, allowing the file name and also all directory names only to be 8 characters long, and the extension 3. This is properly known as "8.3"-restiction (speak: eight-dot-three). All other OS's allow the relative file name to be at least 14, but most often up to 255 characters long. | ||||||||||||||||||
File allocation on storage media |
||||||||||||||||||
| On the storage medium a file is saven in blocks (sectors) of equal size. To access these files, device manager and file manager work together: The device manager "knows" where to find each sector on disk, but only the file manager has a list telling in what sectors either file is stored. This list is the File Allocation Table (FAT) | ||||||||||||||||||
There are different ways of allocating files. The main concern is to provide a
strategy that lets the FAT not grow too large, that makes it possible to
retrieve a special sector of a file, and that wastes not too much storage space.
|
||||||||||||||||||
| Contiguous file allocation | ||||||||||||||||||
| With contiguous file allocation a single set of blocks is allocated to a file at the time of file creation. Each file is stored contiguously, one sector after another. | ||||||||||||||||||
The advantage is that the FAT only has to have a single entry for each file,
indicating the name, the start sector, and the length. Moreover, it is easy
to get a single block because its address can simply be calculated: If a file
starts at sector c, and the nth
block is wanted, the location on secondary storage is simply
c+n.
|
||||||||||||||||||
| The disadvantage is that it may be difficult (if not impossible) to find a sufficiently large set of contiguous blocks. From time to time it will be neccessary to perform compaction. | ||||||||||||||||||
|
Contiguous file allocation is nowadays only used for tapes and recordable CDs.
One does not make use of compaction algorithms, though, because data there
is not supposed to be changed. It is rather overwritten/thrown away if
no longer needed.
|
||||||||||||||||||
| Non-contiguous file allocation (FAT) | ||||||||||||||||||
| With non-contiguous file allocation all blocks of a file can be distributed all over the storage medium. The File Allocation Table (FAT) lists not only all files, but has an entry for each sector the file occupies. Because all information is stored in the FAT, and no assumption on the distribution of the file is taken, this method of allocation is sometimes simply called FAT. | ||||||||||||||||||
| The advantage is that it is very easy to get a single block, because each block has its entry in the FAT. Additionally, it is a very simple allocation method where not much overhead is produced and no sophisticated search method for free blocks is needed. | ||||||||||||||||||
| The disadvantage is that the FAT can and will grow to an enormous size (imagine that a file of 1MB size must have 2000 entries in the FAT if each sector stores 512 Bytes of data). That slows the system down. Compaction will be needed from time to time. | ||||||||||||||||||
|
FAT has been in use under DOS for a long time, and some alternations of it are
still used by Win95 and Win98.
|
||||||||||||||||||
| Chained allocation | ||||||||||||||||||
| With chained file allocation only the first blocks of either file gets an entry in the FAT, and this first sector has got a pointer at its end that points to the next sector of it (or indicates that it was the last). | ||||||||||||||||||
| The advantage is again that the FAT only has to have a single entry for each file, indicating file name and position of the first sector. The files do not have to be stored contiguously. | ||||||||||||||||||
|
The disadvantage is it takes very long to retrieve a single block because that
information is neither stored nor can it be calculated. If a special sector is
needed, all preceeding sectors have to be read, all the time in order to get
information about where the next block is located.
|
||||||||||||||||||
| Indexed allocation | ||||||||||||||||||
| With indexed file allocation also only the first blocks of either file get an entry in the FAT. In this first sector, however, no data is stored but only pointers to where the file is on storage medium. That is why the first block is called the index block. | ||||||||||||||||||
| Here as well the FAT only has to have a single entry for each file, indicating file name and position of the first sector. Additionally, it is easy to retrieve a single block because the information about where it is stored is saved in the first block. | ||||||||||||||||||
| The disadvantage is that for each file an additional sector is needed. Even a very small file always occupies at least two blocks, where the data would easy fit in one. So some of the secondary storage space is wasted. | ||||||||||||||||||
| Indexed allocation is (in minor variations) implemented in all UNIXes. It is fast and reliable, and nowadays the waste of storage space does not matter so much anymore. | ||||||||||||||||||
Compaction |
||||||||||||||||||
| After many I/O operations files on a hard disk are usually distributed over many tracks. That slows the hdd's speed own, as it must reposition the head more often than normally. | ||||||||||||||||||
|
During compaction the file manager tells the device manager which sector's data
logically belong together (as files), and the device manager then
exchanges the sector's contents in a way that logically connected data are
stored in neighbouring sectors. The way to do that is called compaction
algorithm
|
||||||||||||||||||
| Example: file distribution on a hard disk before compaction | ||||||||||||||||||
|
||||||||||||||||||
| The same hard disk after compaction | ||||||||||||||||||
|