
October 27, 2010

Finding which processes are swapped / MDB tutorial

Today I encountered a server that had previously been under serious memory pressure: there was evidence of the page scanner running, heavy anonymous paging, and lightweight processes (LWPs) being swapped out of memory. I’m sure some of you have seen this before and wondered exactly which processes are the ones sitting in the vmstat ‘w’ column.

The vmstat man page simply says:

                     w        the number of  swapped  out  light-
                              weight  processes  (LWPs)  that are
                              waiting for processing resources to
                              finish.

Which is vague, at best. Perhaps it’s just me. Anyway, this is what I saw when I ran vmstat:

bash-3.00# vmstat 1
 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr vc vc -- --   in   sy   cs us sy id
 0 0 4 12857400 5644584 13 15 166 144 148 0 285 14 0 0 0 1619 1762 1574 2 1 98
 1 0 94 14134296 7000240 7 14 0 0  0  0  0  0  0  0  0  308  779   83  0  0 100
 0 0 94 14134040 6999984 4 5 0  0  0  0  0  0  0  0  0  306  766   79  0  0 100
 1 0 94 14134040 6999984 4 4 0  0  0  0  0  0  0  0  0  342  755  107  0  0 100
 1 0 94 14134040 6999984 4 4 0  0  0  0  0  0  0  0  0  310  776   83  0  0 100
 1 0 94 14134040 6999984 4 4 0  0  0  0  0  0  0  0  0  310  763   89  0  0 100
 1 0 94 14134040 6999984 4 4 0  0  0  0  0  0  0  0  0  301  757   75  0  0 100
 0 0 94 14134040 6999984 4 4 0  0  0  0  0  0  0  0  0  296  751   69  0  0 100
 1 0 94 14134040 6999984 4 4 0  0  0  0  0  0  0  0  0  352  766  111  0  0 100
 0 0 94 14134040 6999984 4 4 0  0  0  0  0  0  0  0  0  310  787   93  0  0 100
 1 0 94 14134040 6999984 4 4 0  0  0  0  0  0  0  0  0  316  772   87  0  0 100

Wow, 94 LWPs swapped out; that’s quite a lot. This generally won’t cause any issues on your system, because as soon as an LWP is asked to do work it will be swapped back in on demand. In fact, you could force them back in by tracing the process (with p-commands such as pfiles) or by sending the process a signal, but to do that you would need to know which threads are swapped out.
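To keep an eye on that figure without reading the whole table, the ‘w’ column can be pulled out with awk. This is only a sketch: the function name last_w is made up for illustration, and it assumes the Solaris column layout shown above, where ‘w’ is the third field of each data row.

```shell
#!/usr/bin/env bash
# last_w (hypothetical helper): read vmstat output on stdin and print the
# 'w' column (third field) of the last data row.
# Assumes the layout shown above: two header lines, then numeric rows.
last_w() {
    # Header rows start with "kthr" or "r", so only rows whose first
    # field is a number are treated as samples.
    awk '$1 ~ /^[0-9]+$/ { w = $3 } END { if (w != "") print w }'
}
```

Something like `vmstat 1 2 | last_w` would then print the most recent sample. Taking the last row matters because the first row vmstat prints is a summary since boot rather than a current reading.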

There are a number of ways to do this, and one of them is with mdb. So, I set myself the task of finding which processes are swapped out.

First, access to the kernel threads in the live system is possible through the ‘thread’ walker in mdb. To access the live running kernel, you need to fire up mdb with the ‘-k’ option.

The ::help dcmd in mdb gives a brief description of any dcmd. Below, the help for ::walk is shown as an example.

# mdb -k
Loading modules: [ unix genunix specfs dtrace zfs sd pcisch ip hook neti sctp arp usba fcp fctl nca lofs md cpc random crypto wrsmd fcip logindmux ptm ufs sppp nfs ipc ]
> ::help walk       

NAME
  walk - walk data structure

SYNOPSIS
  [ addr ] ::walk name [variable]

ATTRIBUTES

  Target: kvm
  Module: mdb
  Interface Stability: Evolving

> 

So, let’s start with the kthreads:

> ::walk thread thr
180e000
2a10001fca0
2a100017ca0
2a10000fca0
2a100007ca0
2a10003fca0

What we get back is a list of memory addresses, one per thread. Pipes are powerful constructs in mdb: much like in a Unix shell, they let us feed the output of one dcmd into the input of another. As a first step, we can pipe the output of the walker above straight into ::print to print the t_schedflag member of each kthread_t.

I should mention that each address returned by the walker points to a data structure of type kthread_t, which you can look up in /usr/include/sys/thread.h. This is important because the header tells you how to read the structure and which members we may be interested in. Take a look and read up on t_schedflag.

> ::walk thread thr |::print kthread_t t_schedflag                              
t_schedflag = 0x3e03
t_schedflag = 0x3
t_schedflag = 0x3
t_schedflag = 0x3
t_schedflag = 0x3
t_schedflag = 0x3
....

OK, so I’ve truncated the output, but we can see that this gives us a list of the schedflags for all the threads on the system. The list is long, so we need to work out which schedflag values we are interested in and ‘grep’ for them in the output. Fortunately, mdb provides just what we need in the ::grep dcmd. Note that ::grep uses the . (dot) notation: in a pipeline, dot is set in turn to each value passed in from the previous dcmd. The mdb documentation is suitably confusing on this one. I don’t know about you, but I had to just ‘play’ with the dcmd to work out what it was trying to tell me:

     ::grep command

         Evaluate the specified command string,  and  then  print
         the  old  value  of  dot if the new value of dot is non-
         zero. If the command contains whitespace or  metacharac-
         ters,  it must be quoted. The ::grep dcmd can be used in
         pipelines to filter a list of addresses.

Again, output below truncated.

> ::walk thread thr |::print -x kthread_t t_schedflag|::grep .==0x03
3
3
3

As for my quest, I initially expected to be able to match/search for the value 0x0008, due to the following in thread.h:

#define TS_ON_SWAPQ     0x0008  /* thread is on the swap queue */

I will jump ahead here and show you that you can run mdb non-interactively and combine it with standard unix commands.

To my dismay, there are no threads with the 0x0008 schedflag value on my system:

bash-3.00# echo "::walk thread thr |::print kthread_t t_schedflag|::grep .==0008" | mdb -k |wc -l
       0

Yet vmstat clearly still shows swapped-out LWPs in the ‘w’ column, 91 in fact:

bash-3.00# vmstat 1 3
 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr vc vc -- --   in   sy   cs us sy id
 0 0 6 12892280 5681600 13 15 162 140 144 0 278 14 0 0 0 1584 1718 1534 2 1 98
 1 0 91 14172696 7055272 10 17 0 0 0  0  0  0  0  0  0  318  645   87  0  0 100

I came across this by trial and error, but it can’t be a coincidence that there are 91 threads on my system whose schedflag is exactly TS_RUNQMATCH (0x4000). I need to read through the code to work out why, but it would appear that setbackdq() has set or changed the value since the threads were swapped out. If anyone cares to explain in the comments section, I’d be very grateful :)

bash-3.00# grep RUNQ /usr/include/sys/thread.h
#define TS_RUNQMATCH    0x4000  /* exact run queue balancing by setbackdq() */

bash-3.00# echo "::walk thread thr |::print kthread_t t_schedflag|::grep .==4000" | mdb -k |wc -l
      91
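One thing worth keeping in mind about both counts above: `.==` is an exact comparison, so it only matches threads whose whole schedflag equals the value, not threads that merely have that bit set alongside others. Whether that explains the zero count for 0x0008 on this system is not something verified here. The sketch below just illustrates the difference in plain shell arithmetic. The flag values are the ones quoted from thread.h in this post; the helper names and the sample value 0x4008 are hypothetical.

```shell
#!/usr/bin/env bash
# Flag values as quoted from /usr/include/sys/thread.h in this post.
TS_LOAD=0x0001
TS_ON_SWAPQ=0x0008
TS_RUNQMATCH=0x4000

# has_flag VALUE MASK (hypothetical helper):
# succeeds if every bit of MASK is set in VALUE.
has_flag() {
    [ $(( $1 & $2 )) -eq $(( $2 )) ]
}

# is_exactly VALUE MASK (hypothetical helper):
# succeeds only if VALUE equals MASK, like ::grep .==MASK.
is_exactly() {
    [ $(( $1 )) -eq $(( $2 )) ]
}

# 0x4008 is a made-up schedflag used purely for illustration.
if has_flag 0x4008 "$TS_ON_SWAPQ" && ! is_exactly 0x4008 "$TS_ON_SWAPQ"; then
    echo "0x4008 has TS_ON_SWAPQ set but is not equal to it"
fi
```

By the same arithmetic, the 0x3e03 value seen earlier has TS_LOAD set but neither TS_ON_SWAPQ nor TS_RUNQMATCH.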

So, going with a rather large assumption for now, let’s look at the threads with that value set. First, print out the addresses of the threads in question. Recall that after the ::grep we are left with the schedflag values that matched TS_RUNQMATCH, not the thread addresses. Because the walker stored each thread address in the variable thr, we can use ::eval to print that variable and get back to an address.

> ::walk thread thr |::print kthread_t t_schedflag|::grep .==4000 |::eval <thr=K
                30007f36760     
                30007f21a80     
                30007f36aa0     
                30007f514e0     
                3000803af20     
                30007f2f780     
                30007f1d3e0     
                300076366a0     

Then, finally, pipe each address back to ::print to follow t_procp to the proc structure, and from there into the embedded p_user user structure (see sys/proc.h and sys/user.h) to read the command name in p_user.u_comm. Output truncated.

> ::walk thread thr |::print kthread_t t_schedflag|::grep .==4000 |::eval <thr=K |::print -d kthread_t t_procp->p_user.u_comm
t_procp->p_user.u_comm = [ "svc.configd" ]
t_procp->p_user.u_comm = [ "svc.configd" ]
t_procp->p_user.u_comm = [ "svc.configd" ]
t_procp->p_user.u_comm = [ "svc.configd" ]
t_procp->p_user.u_comm = [ "svc.configd" ]
t_procp->p_user.u_comm = [ "svc.configd" ]
t_procp->p_user.u_comm = [ "svc.configd" ]
t_procp->p_user.u_comm = [ "svc.configd" ]
t_procp->p_user.u_comm = [ "svc.configd" ]
t_procp->p_user.u_comm = [ "svc.configd" ]
t_procp->p_user.u_comm = [ "nscd" ]
t_procp->p_user.u_comm = [ "svc.configd" ]
t_procp->p_user.u_comm = [ "svc.configd" ]
t_procp->p_user.u_comm = [ "svc.configd" ]
t_procp->p_user.u_comm = [ "svc.configd" ]
t_procp->p_user.u_comm = [ "svc.configd" ]
t_procp->p_user.u_comm = [ "svc.configd" ]
t_procp->p_user.u_comm = [ "svc.configd" ]
t_procp->p_user.u_comm = [ "svc.startd" ]
t_procp->p_user.u_comm = [ "svc.startd" ]
...

Alternatively, the proc structure contains a pointer to a pid structure, p_pidp, which gives us access to the process ID via pid_id:

> ::walk thread thr |::print kthread_t t_schedflag|::grep .==4000 |::eval <thr=K |::print -d kthread_t t_procp->p_pidp->pid_id
t_procp->p_pidp->pid_id = 0t10199
t_procp->p_pidp->pid_id = 0t10213
t_procp->p_pidp->pid_id = 0t10661
t_procp->p_pidp->pid_id = 0t10661
t_procp->p_pidp->pid_id = 0t10661
t_procp->p_pidp->pid_id = 0t10661
t_procp->p_pidp->pid_id = 0t10661
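Since the same PID shows up once per matching thread, it can help to collapse the list into a per-process count. The following is a rough sketch with a made-up function name, count_by_pid, which assumes the output format shown above; the 0t prefix is mdb’s notation for a decimal value and is stripped before counting.

```shell
#!/usr/bin/env bash
# count_by_pid (hypothetical helper): read ::print lines such as
#   t_procp->p_pidp->pid_id = 0t10661
# on stdin and print "count pid" pairs, highest thread count first.
count_by_pid() {
    sed -n 's/.*= 0t\([0-9][0-9]*\).*/\1/p' |   # keep only the decimal PID
        sort -n |
        uniq -c |                               # one line per PID with a count
        awk '{ print $1, $2 }' |                # normalise uniq's padding
        sort -k1,1nr -k2,2n                     # count descending, then PID
}
```

In non-interactive mode, the mdb pipeline above could be echoed into `mdb -k` and the result piped into count_by_pid.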

OK, so we got there: command names and process IDs for the swapped-out processes, if we go with the assumption around TS_RUNQMATCH. However, since drafting my notes for this blog entry I have found that a cleaner and possibly more reliable way of identifying swapped-out processes/threads is to look for threads that are NOT in memory. The schedflag bit TS_LOAD is most helpful here, so the best option is to search for threads whose schedflag does NOT have TS_LOAD set.

#define TS_LOAD         0x0001  /* thread is in memory */

How? By using the bitwise AND operator in mdb (&) to AND the schedflag with 1. If the result is zero, then the lowest bit of schedflag is clear, which means TS_LOAD is not set.

> ::walk thread thr |::print -d kthread_t t_schedflag | ::grep '(.&1)==0' |::eval <thr=K |::print -d kthread_t t_procp->p_user.u_comm
t_procp->p_user.u_comm = [ "svc.configd" ]
t_procp->p_user.u_comm = [ "svc.configd" ]
t_procp->p_user.u_comm = [ "svc.startd" ]
t_procp->p_user.u_comm = [ "drd" ]
t_procp->p_user.u_comm = [ "syseventd" ]
t_procp->p_user.u_comm = [ "syseventd" ]
t_procp->p_user.u_comm = [ "syseventd" ]

This appears to give the same answer as my previous attempt, and gives the same number of swapped-out threads (91):

bash-3.00# echo "::walk thread thr |::print -d kthread_t t_schedflag | ::grep '(.&1)==0' |::eval <thr=K |::print -d kthread_t t_procp->p_user.u_comm" | mdb -k |wc -l
      91

bash-3.00# echo "::walk thread thr |::print kthread_t t_schedflag|::grep .==4000" | mdb -k |wc -l
      91

I really should write a wrapper around this, but for now the blog entry is enough. Hopefully this has been useful, if somewhat verbose. It has been as much about my education as anything else. I need to apologise too: this is even more dull than my last post.
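For what it’s worth, such a wrapper might look something like the sketch below. It is untested against a live kernel and all three function names are invented for illustration. It reuses the TS_LOAD pipeline from above verbatim and assumes the u_comm output format shown earlier.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper functions; none of these names are part of mdb.

# Emit the dcmd pipeline used above: threads whose TS_LOAD bit (0x1) is clear.
swapped_pipeline() {
    printf '%s\n' "::walk thread thr |::print -d kthread_t t_schedflag | ::grep '(.&1)==0' |::eval <thr=K |::print -d kthread_t t_procp->p_user.u_comm"
}

# Reduce lines such as
#   t_procp->p_user.u_comm = [ "nscd" ]
# to "count name" pairs, highest count first.
# Assumes command names contain no spaces.
summarise_comms() {
    sed -n 's/.*\[ "\(.*\)" ].*/\1/p' |
        sort |
        uniq -c |
        awk '{ print $1, $2 }' |
        sort -k1,1nr -k2,2
}

# Run the pipeline against the live kernel and summarise the result.
# Needs root on a system with mdb, so it is only defined here, not called.
swapped_summary() {
    swapped_pipeline | mdb -k | summarise_comms
}
```

Calling swapped_summary as root would then print one line per command name with the number of its threads that are not in memory.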

Paul.

