All 2 entries tagged Zfs

September 17, 2010

automated zfs snapshot of zone test

While I was working on a Perl script for automating ZFS snapshots, I happened to discover the automatic ZFS snapshot service for OpenSolaris by Tim Foster.

This is, of course, exactly what I need. The problem is that I need it to run on Solaris 10, for which it is not currently available. However, a thread on Tim’s blog (especially the comments) suggests that it is possible to get the service working on Solaris 10, without the Time Slider element.

So yesterday I attempted to build this into an SVR4 package with pkgmk and pkgtrans and install it on Solaris 10. It was remarkably straightforward, and early testing shows it to work very well.

First, get the ‘bits’ and make the pkg as described on Tim’s blog, using hg to clone and make to create the pkg tree.

$ hg clone ssh://

Before actually creating the package, there are 3 changes to make, as detailed by Eli Kleinamm, again in the blog comments:

These changes are made in

# vi zfs-auto-snapshot


1) Change the shell the script runs in to /usr/dt/bin/dtksh

2) On lines 511 and 517, remove the -o com.sun:auto-snapshot-desc="$EVENT" option


3) Remove the space/tab on line 970 between $SWAPVOLS and $(echo ...), so that the line reads:
 SWAPVOLS="$SWAPVOLS$(echo $swap | sed -e 's#/dev/zvol/dsk/##')"

Once done, run make. It should create a pkg-format file tree for you in the current working directory (read the Makefile, it’s not long).

Next, create the pkg datastream for portability:

$ pkgtrans -s . /var/spool/pkg/SUNWzfs-auto-snapshot.pkg SUNWzfs-auto-snapshot
Transferring <SUNWzfs-auto-snapshot> package instance

This can now simply be installed with pkgadd on a Solaris 10 system. I can’t help but think this is particularly awesome, but as a colleague pointed out, it now really has to go through some very extensive testing before any rollout. I still think it’s a lot better than my dangerous Perl scripting attempts.

This will now (if I can get agreement internally) be added as a package to the standard build for all Solaris 10 hosts using ZFS within our control.

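The package provides 5 SMF service instances for managing automatic snapshots. These are:

svc:/system/filesystem/zfs/auto-snapshot:frequent
svc:/system/filesystem/zfs/auto-snapshot:hourly
svc:/system/filesystem/zfs/auto-snapshot:daily
svc:/system/filesystem/zfs/auto-snapshot:weekly
svc:/system/filesystem/zfs/auto-snapshot:monthly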

Enable these to have snapshots taken at the interval suggested by the name. Each has tunable properties, the most notable being how many snapshots to retain:

-bash-3.00$ svccfg -s auto-snapshot:daily listprop zfs/keep
zfs/keep  astring  31

-bash-3.00$ svccfg -s auto-snapshot:frequent listprop zfs/keep
zfs/keep  astring  4

-bash-3.00$ svccfg -s auto-snapshot:frequent listprop zfs/period
zfs/period  astring  15

The defaults for daily and frequent are therefore 1 month’s worth of dailies (31) and 4 frequents. Frequent is initially defined as once every 15 minutes, as shown by the zfs/period property.
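As a rough worked example of what those defaults mean, the retention window is simply the number of snapshots kept multiplied by the snapshot period. The sketch below assumes the daily instance takes one snapshot per day (its zfs/period is not shown above), and the function name is purely illustrative.

```python
from datetime import timedelta


def retention_window(keep: int, period: timedelta) -> timedelta:
    """Time span covered when `keep` snapshots are retained, one per `period`."""
    return keep * period


# Values taken from the svccfg output above: frequent keeps 4, every 15 minutes.
frequent = retention_window(keep=4, period=timedelta(minutes=15))

# Assumption: the daily instance snapshots once per day; zfs/keep is 31.
daily = retention_window(keep=31, period=timedelta(days=1))

print(frequent)  # 1:00:00
print(daily)     # 31 days, 0:00:00
```

So with the defaults, the frequent snapshots only ever cover about the last hour, which is worth bearing in mind when choosing zfs/keep.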

Whether a ZFS dataset is snapshotted or not is controlled by a user-level ZFS property, com.sun:auto-snapshot, which can be true or false for a dataset, together with com.sun:auto-snapshot:frequent, com.sun:auto-snapshot:daily, etc. properties for each periodicity as required. These are inherited, so it is probably a good idea to set com.sun:auto-snapshot to false at the root of a pool and then change the required sub-datasets to true as needed. Perhaps an example would help:

-bash-3.00$ pfexec zfs set com.sun:auto-snapshot=true datapool/zfstestz1_ds
-bash-3.00$ pfexec zfs set com.sun:auto-snapshot=false rpool              
-bash-3.00$ pfexec zfs set com.sun:auto-snapshot=true rpool/export
-bash-3.00$ pfexec zfs set com.sun:auto-snapshot:frequent=true rpool/export
-bash-3.00$ pfexec zfs set com.sun:auto-snapshot:frequent=true datapool/zfstestz1_ds

<getting bored with pfexec for each cmd>
-bash-3.00$ pfexec bash
bash-3.00# zfs set com.sun:auto-snapshot=false rpool/dump
bash-3.00# zfs set com.sun:auto-snapshot=false rpool/swap

bash-3.00# cat /etc/release
                       Solaris 10 10/09 s10x_u8wos_08a X86
           Copyright 2009 Sun Microsystems, Inc.  All Rights Reserved.
                        Use is subject to license terms.
                           Assembled 16 September 2009
bash-3.00# svcs -a |grep snapshot
disabled       17:59:54 svc:/system/filesystem/zfs/auto-snapshot:monthly
disabled       17:59:54 svc:/system/filesystem/zfs/auto-snapshot:weekly
disabled       17:59:54 svc:/system/filesystem/zfs/auto-snapshot:daily
disabled       17:59:54 svc:/system/filesystem/zfs/auto-snapshot:hourly
online         18:00:44 svc:/system/filesystem/zfs/auto-snapshot:event
online         18:09:18 svc:/system/filesystem/zfs/auto-snapshot:frequent

bash-3.00# zfs list -t snapshot | grep frequent
datapool@zfs-auto-snap_frequent-2010-09-23-1809                   0      -  28.0K  -
datapool/zfstestz1_ds@zfs-auto-snap_frequent-2010-09-23-1809      0      -  28.0K  -
rpool/export@zfs-auto-snap_frequent-2010-09-23-1809               0      -    23K  -
rpool/export/home@zfs-auto-snap_frequent-2010-09-23-1809          0      -   954K  -

bash-3.00# pkginfo SUNWzfs-auto-snapshot
application SUNWzfs-auto-snapshot ZFS Automatic Snapshot Service

bash-3.00# date
Thursday, 23 September 2010 18:15:48 BST

bash-3.00# zfs list -t snapshot | grep frequent
datapool@zfs-auto-snap_frequent-2010-09-23-1809                   0      -  28.0K  -
datapool@zfs-auto-snap_frequent-2010-09-23-1815                   0      -  28.0K  -
datapool/zfstestz1_ds@zfs-auto-snap_frequent-2010-09-23-1809      0      -  28.0K  -
datapool/zfstestz1_ds@zfs-auto-snap_frequent-2010-09-23-1815      0      -  28.0K  -
rpool/export@zfs-auto-snap_frequent-2010-09-23-1809               0      -    23K  -
rpool/export@zfs-auto-snap_frequent-2010-09-23-1815               0      -    23K  -
rpool/export/home@zfs-auto-snap_frequent-2010-09-23-1809          0      -   954K  -
rpool/export/home@zfs-auto-snap_frequent-2010-09-23-1815          0      -   954K  -
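The inheritance behaviour described earlier explains why rpool/export/home is snapshotted even though the property was only set on rpool/export. A minimal sketch of that lookup rule follows, as a simplified model rather than the service's actual code: it only considers the single com.sun:auto-snapshot property, ignores the per-period variants, and the function name is hypothetical.

```python
def effective_value(dataset, local_settings, prop="com.sun:auto-snapshot"):
    """Return the value a dataset sees for `prop`: its own local setting if
    present, otherwise the nearest ancestor's, otherwise None (never set)."""
    name = dataset
    while True:
        value = local_settings.get(name, {}).get(prop)
        if value is not None:
            return value
        if "/" not in name:
            return None  # reached the pool root without finding a setting
        name = name.rsplit("/", 1)[0]  # step up to the parent dataset


# Local settings mirroring the zfs set commands in the transcript above.
local_settings = {
    "rpool": {"com.sun:auto-snapshot": "false"},
    "rpool/export": {"com.sun:auto-snapshot": "true"},
    "rpool/dump": {"com.sun:auto-snapshot": "false"},
    "rpool/swap": {"com.sun:auto-snapshot": "false"},
}

for ds in ("rpool", "rpool/export", "rpool/export/home", "rpool/dump"):
    print(ds, effective_value(ds, local_settings))
```

This prints false for rpool and rpool/dump, and true for rpool/export and rpool/export/home, which matches the rpool snapshots listed above.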

I think this should prove to be very useful for us. As usual, I would be very interested in hearing your comments.


September 15, 2010

Analysis of zone capacity on an x4170

This blog entry wasn’t intended to be published. It was simply a place for me to collect my thoughts on the state of one of our servers and its remaining capacity for extra zones.

The server is an X4170 with 2 Quad-Core AMD Opterons and 64GB of RAM. It is currently running 11 zones, each running an instance of Apache and an instance of Tomcat. The load currently looks reasonably light, certainly from a CPU viewpoint:

     0       61 1314M 1308M   2.0% 804:24:29 2.9% global
    13       58 1381M  936M   1.4%  35:16:01 0.5% zoneA
    16       45 1238M  496M   0.8%  11:20:16 0.2% zoneB
    29       55 1378M  576M   0.9%   4:30:13 0.1% zoneC
     9       47 1286M  496M   0.8%  21:08:49 0.1% zoneD
     2       60 1290M  464M   0.7%   7:35:19 0.0% zoneE
     7       51 1298M  460M   0.7%   6:50:11 0.0% zoneF
    24       57  835M  896M   1.4%  17:14:18 0.0% zoneG
    27       45  138M  191M   0.3%   7:52:47 0.0% zoneH
     4       47  345M  408M   0.6%  11:26:22 0.0% zoneI
Total: 573 processes, 2649 lwps, load averages: 0.38, 0.29, 0.27

Note that each zone is consuming several hundred MB of RAM (see the RSS column). This consumption will almost certainly be due to the Java heap settings/sizings of the various Tomcat instances.
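To put a number on "several hundred MB", the per-zone figures can be totalled with a few lines of Python. This is only a sketch over the rows pasted above: it assumes the fourth field is the RSS column referred to in the text, and it only handles values with an M suffix.

```python
PRSTAT_ROWS = """\
     0       61 1314M 1308M   2.0% 804:24:29 2.9% global
    13       58 1381M  936M   1.4%  35:16:01 0.5% zoneA
    16       45 1238M  496M   0.8%  11:20:16 0.2% zoneB
    29       55 1378M  576M   0.9%   4:30:13 0.1% zoneC
     9       47 1286M  496M   0.8%  21:08:49 0.1% zoneD
     2       60 1290M  464M   0.7%   7:35:19 0.0% zoneE
     7       51 1298M  460M   0.7%   6:50:11 0.0% zoneF
    24       57  835M  896M   1.4%  17:14:18 0.0% zoneG
    27       45  138M  191M   0.3%   7:52:47 0.0% zoneH
     4       47  345M  408M   0.6%  11:26:22 0.0% zoneI
"""


def rss_mb_by_zone(text):
    """Map zone name -> RSS in MB, assuming 8 fields per row with RSS fourth."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 8 or not fields[3].endswith("M"):
            continue  # skip anything that is not a plain per-zone row
        result[fields[7]] = int(fields[3][:-1])
    return result


rss = rss_mb_by_zone(PRSTAT_ROWS)
non_global = {zone: mb for zone, mb in rss.items() if zone != "global"}
total_mb = sum(non_global.values())

print(total_mb, "MB across", len(non_global), "zones")  # 4923 MB across 9 zones
print(f"average {total_mb / len(non_global):.0f} MB per zone")  # average 547 MB per zone
```

Only 9 of the 11 zones appear in the listing, so the real total will be somewhat higher.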

Let’s take a look at one of the zones:

>  zlogin zoneA ps -futomcat
     UID   PID  PPID   C    STIME TTY         TIME CMD
  tomcat 29407 29406   0   Jun 07 ?        1294:57 /usr/java/bin/java -Djava.util.logging.config.file=/usr/local/tomcat/conf/loggi
  tomcat 29406 20781   0   Jun 07 ?           0:47 /usr/local/sbin/cronolog /usr/local/tomcat/logs/catalina.out.%Y-%m-%d

How to interrogate this process to find the resident set size? Well, there are a few options: hunt down the config entries that launch this Tomcat server (usually set as a variable, CATALINA_OPTS), or take the easy route and use prstat or the ‘process tools’ such as pmap. Ever used pmap? pmap -x will show more detail about the address space mapping of a process than you ever wanted to know; as shown below, the tomcat process in zoneA is using >700MB of RAM.

bash-3.00# pmap -x 29407
29407:  /usr/java/bin/java -Djava.util.logging.config.file=/usr/local/tomcat/c
 Address  Kbytes     RSS    Anon  Locked Mode   Mapped File
08008000      12       -       -       - -----    [ anon ]
0803D000      44      44      44       - rwx--    [ stack ]
<... truncated ...>
FEFF0000       4       4       4       - rwx--    [ anon ]
FEFFB000       8       8       8       - rwx--
FEFFD000       4       4       4       - rwx--
-------- ------- ------- ------- -------
total Kb 1349932  780360  744800       -

Thought 1: there is only 6GB of free memory (defined as actually being on the freelist). Where is it all? I would expect the ZFS ARC to be consuming the lion’s share, and this blog entry will show you how to check. As applications use more RAM, ZFS should behave and relinquish the previously “unused” RAM; but remember that reducing ZFS’s available allocation (by default 7/8ths of RAM) will in theory impact read performance, as we will likely see more cache misses. ZFS should be pretty good at reducing the ARC size as the system requests more and more RAM. Let’s check the numbers, both the RAM used for the ARC and the cache efficiency statistics.

Here is the mdb memstat that shows the memory usage detail:

bash-3.00# echo "::memstat" | mdb -k
Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                   13015620             50842   78%
Anon                      1416662              5533    8%
Exec and libs               42400               165    0%
Page cache                 695383              2716    4%
Free (cachelist)            27757               108    0%
Free (freelist)           1577146              6160    9%

Total                    16774968             65527
Physical                 16327309             63778

Rather a lot is used by the kernel (indicative of ZFS ARC usage) rather than by anon, exec and libs. Note also the currently rather small freelist as a percentage of total physical RAM; as mentioned, I suspect ZFS is consuming a lot (perhaps 40GB plus?) of the otherwise unused RAM.
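For anyone wanting to sanity-check the ::memstat table, the MB and %Tot columns follow directly from the page counts. The sketch below assumes a 4 KB base page size (usual on x86, but an assumption here) and that MB values are truncated while percentages are rounded, which happens to reproduce the figures shown.

```python
PAGE_SIZE_BYTES = 4096  # assumption: 4 KB base pages on this x86 host

# Page counts copied from the ::memstat output above.
memstat_pages = {
    "Kernel": 13015620,
    "Anon": 1416662,
    "Exec and libs": 42400,
    "Page cache": 695383,
    "Free (cachelist)": 27757,
    "Free (freelist)": 1577146,
}


def pages_to_mb(pages, page_size=PAGE_SIZE_BYTES):
    """Convert a page count to whole megabytes (truncating)."""
    return pages * page_size // (1024 * 1024)


total_pages = sum(memstat_pages.values())

for name, pages in memstat_pages.items():
    share = round(100 * pages / total_pages)
    print(f"{name:<18} {pages_to_mb(pages):>6} MB {share:>3}%")

print(f"{'Total':<18} {pages_to_mb(total_pages):>6} MB")
```

The kernel line works out to 50842 MB, or 78% of the total, so roughly three quarters of the machine is kernel memory before any further zones are added.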

The free memory values and ZFS ARC stats can conveniently be confirmed with a quick look at the kstats via arc_summary, an incredibly useful tool written by Ben Rockwood (thanks Ben), which saves us all an immense amount of time remembering how, where and which kstats to interrogate for memory and ZFS statistics.

Get it here:

System Memory:
         Physical RAM:  65527 MB
         Free Memory :  4369 MB
         LotsFree:      996 MB

The ARC size, as suspected, is large, at about 45GB:

ARC Size:
         Current Size:             46054 MB (arcsize)

The ARC is, however, doing a good job with that 45GB; a hit ratio of 96% isn’t bad.

ARC Efficency:
         Cache Access Total:             4384785832
         Cache Hit Ratio:      96%       4242292557     [Defined State for buffer]
         Cache Miss Ratio:      3%       142493275      [Undefined State for Buffer]
         REAL Hit Ratio:       92%       4060875546     [MRU/MFU Hits Only]

Notice also that the ARC is doing a fairly reasonable job of correctly predicting our prefetch requirements; 36% isn’t too bad compared to a directly demanded cache hit rate of 72%.

         Data Demand   Efficiency:    72%
         Data Prefetch Efficiency:    36%
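The headline ratios in the ARC efficiency block can be reproduced from the raw counters it prints. This is a small sketch of that arithmetic, assuming the percentages are truncated to whole numbers (which matches the 96% / 3% / 92% shown); the variable names are illustrative and not kstat names.

```python
# Counters copied from the arc_summary output above.
hits = 4242292557
misses = 142493275
real_hits = 4060875546  # MRU/MFU hits only

total = hits + misses  # should equal "Cache Access Total"


def pct(part, whole):
    """Whole-number percentage, truncated rather than rounded."""
    return part * 100 // whole


print("total accesses:", total)            # total accesses: 4384785832
print("hit ratio:", pct(hits, total))      # hit ratio: 96
print("miss ratio:", pct(misses, total))   # miss ratio: 3
print("real hit ratio:", pct(real_hits, total))  # real hit ratio: 92
```

Because of the truncation, hit and miss ratios add up to 99% rather than 100%; the underlying figures are closer to 96.7% and 3.2%.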

OK, so in theory if we keep adding zones we have to be very careful about the available RAM, given that these zones are running some reasonably memory-hungry (Tomcat) Java instances. As more and more memory is allocated to Apache/Tomcat and Java within new zones, less and less RAM will be available for the ZFS ARC. This will need some careful monitoring.

Currently the CPU usage on this server is also fairly light, but I have no view of the future levels of usage or traffic intended to run through these services. I’ll park this one for now.

Thought 2, disk space. Currently the zones have a zfs dataset allocated in the root pool for each zone root:

rpool/ROOT/s10x10-08/zones/zoneB                      6.23G  5.77G  6.23G  /zones/zoneB

Each of these has a 12GB quota. The output above (from zfs list) is typical of the usage for the zones: around 50% of the 12GB quota. There are 11 zones, each with this 12GB quota, giving a potential total usage of 132GB. The entire root pool is only 136GB and already has just 32.5GB available, as shown by zpool list:

root > zpool list
datapool   816G  31.2G   785G     3%  ONLINE  -
rpool      136G   104G  32.5G    76%  ONLINE  -

It seems, then, that rather than concerning myself with memory or CPU usage on this server, I should be worrying about storage: unless we find somewhere else for the zone root filesystems, we are likely to run out of local storage space well before we stretch the capabilities of this server.
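The storage argument can be made concrete with some back-of-the-envelope arithmetic. The sketch below uses only the figures quoted above and assumes every zone sits at zoneB's "typical" 6.23GB; the variable names are illustrative.

```python
zones = 11
quota_gb = 12.0
typical_used_gb = 6.23   # zoneB's usage, described above as typical
pool_avail_gb = 32.5     # available space in rpool

# Worst case if every zone fills its quota.
total_quota_gb = zones * quota_gb

# Extra space the existing zones could still claim within their quotas.
possible_growth_gb = zones * (quota_gb - typical_used_gb)

# How far the pool falls short of honouring all existing quotas.
shortfall_gb = possible_growth_gb - pool_avail_gb

# New zone roots that would fit at typical usage, ignoring any growth.
extra_zones_at_typical = int(pool_avail_gb // typical_used_gb)

print(f"total quota:      {total_quota_gb:.2f} GB")      # 132.00 GB
print(f"possible growth:  {possible_growth_gb:.2f} GB")  # 63.47 GB
print(f"shortfall:        {shortfall_gb:.2f} GB")        # 30.97 GB
print("extra zones at typical usage:", extra_zones_at_typical)  # 5
```

In other words, the pool is already overcommitted by about 31GB against the existing quotas, and on these numbers only around five more zone roots of typical size would fit even if the existing zones never grew.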

Coming soon: a blog entry on increasing the size of your root pool with some mirror juggling. I’ve done this successfully on a VM; I just need a test server to perfect the process.

Comments welcome.

