One or many filesystems?

The question of whether to use a single filesystem or many filesystems comes up often in the newsgroup comp.unix.solaris. My own preference is for several filesystems, both on servers and on stand-alone systems. Here are my reasons.

Stand-alone (home) machine

If you need to reinstall Solaris, you will lose everything in the filesystems containing /, /usr, and /var. For this reason, I strongly suggest you keep all modifications in these filesystems to a minimum.

Local additions to systems often go in /usr/local, but I suggest you put them in /opt/local instead, with a symbolic link from /usr/local if you so wish.

I'm a great believer in read-only filesystems, and /usr and /opt can be made read-only, so I do so. This very significantly reduces the chances of corruption, and they won't need fscking if the system goes down unexpectedly.

Keep your own working data on a separate filesystem. You will probably want to back this up more frequently than the operating system areas. You will be able to preserve it when upgrading, and for this reason I suggest you put it at the far end of the disk.

Server machine

All the points above are relevant to a server, but there are several additional points to bear in mind. These are the ones I consider for the server I manage.

Different archiving strategies are usually appropriate for different data, depending on the frequency of change, commercial value (consequence of loss), ease of re-creation by other means, etc. Archiving frequency for my filesystems ranges from daily for development and clerical work, through weekly or monthly for other areas, to never for areas which are copies of CD-ROMs or only hold binaries which can simply be rebuilt. This is all done with ufsdump. With one filesystem, I would have to archive all data with the strategy appropriate to the most critical data, which in my case would mean over twice as much archiving every day (twice as many tapes, half the life of the tape drives, etc.).
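The arithmetic behind that last point can be sketched roughly as follows. The mount points, sizes and schedules here are made up purely for illustration (they are not my real layout), and the sketch assumes every archive is a full dump and only counts the load that recurs every day.

```python
# Hypothetical filesystems, purely for illustration:
# (mount point, size in MB, days between archives; None means never archived)
FILESYSTEMS = [
    ("/export/dev", 2000, 1),          # development work, archived daily
    ("/export/office", 1000, 1),       # clerical work, archived daily
    ("/export/tools", 2000, 7),        # rebuildable binaries, archived weekly
    ("/export/cdcopies", 2000, None),  # copies of CD-ROMs, never archived
]


def daily_mb_separate(filesystems):
    """MB archived every day when each filesystem keeps its own schedule."""
    return sum(size for _, size, interval in filesystems if interval == 1)


def daily_mb_combined(filesystems):
    """MB archived every day if all the data shared one daily-archived filesystem."""
    return sum(size for _, size, _ in filesystems)


if __name__ == "__main__":
    separate = daily_mb_separate(FILESYSTEMS)
    combined = daily_mb_combined(FILESYSTEMS)
    print(f"separate filesystems: {separate} MB per day")
    print(f"one big filesystem:   {combined} MB per day")
    print(f"ratio: {combined / separate:.2f}")
```

With these invented figures the single filesystem needs a little over twice the daily archiving; your own ratio depends entirely on how much of your data really changes daily.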

Big slices take a long time to restore from archive. I try to keep critical data in slices no bigger than 2GB, since at 3 hours to restore, that's about the longest downtime I'm prepared to accept for a filesystem. I use larger slices where I could tolerate the data being unavailable for longer. 9GB would take pretty much a day to recover. (These timings date from before I found fastfs, which turns off synchronous metadata writes during restores.)
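As a rough sketch of how to size a slice against the downtime you can accept, the following scales the 3-hour figure for a 2GB slice linearly with size. Linear scaling is an assumption on my part, and the function name is just illustrative; measure your own tape drive before relying on it.

```python
# Reference point taken from the text: a 2 GB slice takes about 3 hours to restore.
REFERENCE_SIZE_GB = 2.0
REFERENCE_HOURS = 3.0


def estimated_restore_hours(slice_size_gb: float) -> float:
    """Estimate restore time, assuming it scales linearly with slice size."""
    if slice_size_gb < 0:
        raise ValueError("slice size must be non-negative")
    return slice_size_gb * REFERENCE_HOURS / REFERENCE_SIZE_GB


if __name__ == "__main__":
    for size_gb in (2, 9):
        hours = estimated_restore_hours(size_gb)
        print(f"{size_gb} GB: about {hours:.1f} hours")
```

For a 9GB slice this gives 13.5 hours of pure restore time, which in practice is a whole working day gone.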

A random hit on my disks is a lot less likely to make the whole machine unusable. Usually, only a few people are affected by the loss, with everyone else able to carry on working whilst I repair the affected filesystem, and the time taken to recover their data is significantly less than the time it would take to recover everyone's data. Hits in system critical areas are also faster to fix, because those areas are smaller. On one occasion, 64 inodes in my /var got zeroed somehow (the only time that's ever happened, I'm glad to say); downtime was about 10 minutes for me to newfs /var and restore the 100MB of it from tape. If I had one big 2GB filesystem, downtime would have been nearer 3 hours, and users would have lost up to a day's work (back to the last archive); no one lost any work with /var being a separate filesystem.

Each slice is tuned differently, depending on use. In particular, the bytes-per-inode setting across all our filesystems varies from 2048 to 200000 at the extremes.
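To give a feel for what those two extremes mean, here is a minimal sketch that estimates how many inodes a filesystem ends up with for a given bytes-per-inode value (the figure newfs takes through its -i option). The 2GB size is just an example, and the real count newfs produces will differ somewhat from this simple division.

```python
def approx_inode_count(fs_size_bytes: int, bytes_per_inode: int) -> int:
    """Rough inode count: one inode per bytes_per_inode bytes of filesystem."""
    if bytes_per_inode <= 0:
        raise ValueError("bytes_per_inode must be positive")
    return fs_size_bytes // bytes_per_inode


if __name__ == "__main__":
    two_gb = 2 * 1024**3  # example filesystem size, not taken from a real slice
    for nbpi in (2048, 200000):
        count = approx_inode_count(two_gb, nbpi)
        print(f"{nbpi} bytes per inode: about {count} inodes")
```

The small value suits a filesystem holding huge numbers of tiny files; the large one suits a filesystem holding a few very big files, where a big inode table would just waste space.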

One project group suddenly filling one of their filesystems does not prevent all the other project groups from working.

Starting out

When I started out, I did some dummy installs to see how much space was required. Here is the df -k output from my home machine, which might give you some helpful hints:
     Filesystem            kbytes    used   avail capacity  Mounted on
     /dev/dsk/c1t2d0s0      19190   10740    6540    63%    /
     /dev/dsk/c1t2d0s3     282104  218971   34923    87%    /usr
     /proc                      0       0       0     0%    /proc
     fd                         0       0       0     0%    /dev/fd
     /dev/dsk/c1t2d0s4      57593   36073   15770    70%    /var
     /dev/dsk/c1t2d0s6     252545  191040   36255    85%    /opt
     /dev/dsk/c1t2d0s5     115789   46813   68976    41%    /home
     /dev/dsk/c1t2d0s7      86918   60068   26850    70%    /var/spool/news
     /dev/dsk/c1t2d0s10    115480  112304    3176    98%    /src
     swap                   37104     248   36856     1%    /tmp
This system is not used for any large applications (or swap would need to be bigger). /opt holds Sun's C and Java software development tools, the odd AnswerBook, and my /opt/local odds and ends; adjust the sizes to suit your own requirements.
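When judging sizes after a dummy install, it helps to see at a glance which filesystems are already tight. The sketch below parses the df -k listing above and reports the mount points at or above a chosen capacity; the 85% threshold and the function names are arbitrary choices for illustration.

```python
DF_OUTPUT = """\
Filesystem            kbytes    used   avail capacity  Mounted on
/dev/dsk/c1t2d0s0      19190   10740    6540    63%    /
/dev/dsk/c1t2d0s3     282104  218971   34923    87%    /usr
/proc                      0       0       0     0%    /proc
fd                         0       0       0     0%    /dev/fd
/dev/dsk/c1t2d0s4      57593   36073   15770    70%    /var
/dev/dsk/c1t2d0s6     252545  191040   36255    85%    /opt
/dev/dsk/c1t2d0s5     115789   46813   68976    41%    /home
/dev/dsk/c1t2d0s7      86918   60068   26850    70%    /var/spool/news
/dev/dsk/c1t2d0s10    115480  112304    3176    98%    /src
swap                   37104     248   36856     1%    /tmp
"""


def parse_df(text):
    """Turn df -k output into a list of dicts, skipping the header line."""
    rows = []
    for line in text.strip().splitlines()[1:]:
        device, kbytes, used, avail, capacity, mount = line.split()
        rows.append({
            "device": device,
            "kbytes": int(kbytes),
            "used": int(used),
            "avail": int(avail),
            "capacity": int(capacity.rstrip("%")),
            "mount": mount,
        })
    return rows


def nearly_full(rows, threshold_percent=85):
    """Return mount points whose capacity is at or above the threshold."""
    return [r["mount"] for r in rows if r["capacity"] >= threshold_percent]


if __name__ == "__main__":
    for mount in nearly_full(parse_df(DF_OUTPUT)):
        print(f"{mount} is getting full")
```

On the listing above this flags /usr, /opt and /src, which are the slices to make larger if you copy this layout.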

I also used the opportunity to practice archiving, by making archives of all the bits of the system, and then deliberately destroying various filesystems like / and /usr, and recovering from the archives. Another contingency I practiced was completely losing the disk with / and /usr; a ufsdump archive is not useful by itself, and a reinstall takes too long. My solution was to write a pair of scripts to archive and restore the essential filesystems, and this is now available as sysbkup in Alpha form from my Solaris page.

It is well worth playing with a system in this way before it goes live.


© 1997 Andrew Gabriel. All Rights Reserved. / Last revision 28 March 1997