Author Topic: netscaler dropping packets  (Read 6534 times)

Offline danix

  • Sr. Member
  • **
  • Posts: 13
  • Karma: 1
netscaler dropping packets
« on: March 17, 2009, 01:29:15 PM »
I've got an NS 7000 HA pair that has been working well for a while.
Currently we're doing around 45Mb each way through it, around 1500 reqs/sec total.

I've been noticing increasing site page timeouts, and started tracking them and testing.
I've eliminated the firewall by testing to the VIP directly, and I know that hitting the webservers directly doesn't produce an error, so clearly it's the NS.

Citrix said we're running out of memory and that's why the issue is occurring.  So I reduced IC from 250Mb to 100Mb, and memory utilization is below 60%.
But the problem persists, so they are now researching.  Meanwhile, my pages are timing out.

Any suggestions on other things to check and how I can confirm what's going on?
I have seen a major degradation in Citrix support quality over the last few months and I'm about ready to find a consultant or buy F5.

Thanks!

Offline jmelika

  • Administrator
  • Hero Member
  • *****
  • Posts: 341
  • Karma: 7
Re: netscaler dropping packets
« Reply #1 on: March 17, 2009, 02:29:28 PM »
Danix,

I'm sorry to hear you're having a bad experience with your netscaler support.  I heard they just did some layoffs a month or two ago, so I'm sure that has an effect on quality.  Anyway, I'm a little curious regarding the "type" of traffic you're delivering.

1) You're saying 45Mb[ps] each way.  Why is there as much download as there is upload?  You're housing web servers, so I can imagine your upstream is high, but if the servers are delivering objects, why are they receiving 45Mbps?

2) Do you have any type of policies other than your VIPs?  Do you do compression?  rewrites?  Anything that could add overhead to your NS?

3) Are you doing Load Balancing or Content Switching?  Please describe in more details how each is being used.

Thanks.  We'll try to help you as much as possible.

JM

Offline danix

  • Sr. Member
  • **
  • Posts: 13
  • Karma: 1
Re: netscaler dropping packets
« Reply #2 on: March 17, 2009, 02:47:48 PM »
There's only so much I can say about the type of traffic, but I will try to be as open as I can.

Quote
1) You're saying 45Mb[ps] each way.  Why is there as much download as there is upload?  You're housing web servers, so I can imagine your upstream is high, but if the servers are delivering objects, why are they receiving 45Mbps?
It's because of our business type.  We get a lot of traffic in, and send a lot of traffic right back out.  This is normal (for us).

Quote
2) Do you have any type of policies other than your VIPs?  Do you do compression?  rewrites?  Anything that could add overhead to your NS?
No compression.  We recently turned on keepalive, moving it from the webservers to the NS and this seems to have pushed us over the edge.
Lots of caching policies.

Quote
3) Are you doing Load Balancing or Content Switching?  Please describe in more details how each is being used.
Both.  Content switching and policy-based load balancing.

I have been working further with support and I don't see any evidence of memory issues after reducing cache.  But the timeouts persist.
They said that they see flapping, but I have no evidence of flapping on the webservers.  The NS seems to be dropping traffic or incorrectly seeing flapping.

I am trying to get details on the vserver to check for errors with: nsconmsg -s ConLb=2 -d oldconmsg but it's not quite what I am looking for.

Questions I have:
a) How do I actually check memory utilization?  The dashboard is unreliable.  How about nsconmsg -s ConMEM=2 -d oldconmsg?  That shows me:
Quote
TotalMEM:  582446499     Allocated:  313714883(53.86%)   ActualInUse: 247588467(42.51%)    Free:  268731616   
b) With the addition of keepalive, I am wondering about turning on TCP buffering.  I think it might help and we have pretty low CPU utilization.
c) zombie tcp is set to 120, zombie nontcp is 60.  What are these? Can't find references in the docs.
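
As a rough cross-check of the memory line quoted in (a), the percentages can be recomputed from the raw counters. This is only a sketch: it assumes the counters are byte counts and that the line always has the `Name: value` layout shown in the quote; `parse_mem_line` and `percent` are hypothetical helper names, not part of any NetScaler tooling.

```python
import re

# Sample line copied from the nsconmsg output quoted in (a).
SAMPLE = ("TotalMEM:  582446499     Allocated:  313714883(53.86%)   "
          "ActualInUse: 247588467(42.51%)    Free:  268731616")

def parse_mem_line(line):
    """Return the counters from one memory summary line as a dict of ints."""
    fields = {}
    # Each counter looks like "Name:  12345"; the "(53.86%)" suffix is ignored.
    for name, value in re.findall(r"(\w+):\s*(\d+)", line):
        fields[name] = int(value)
    return fields

def percent(part, total):
    """Share of total, as a percentage rounded to two decimals."""
    return round(100.0 * part / total, 2)

mem = parse_mem_line(SAMPLE)
allocated_pct = percent(mem["Allocated"], mem["TotalMEM"])
in_use_pct = percent(mem["ActualInUse"], mem["TotalMEM"])
print("allocated %:", allocated_pct)
print("actual in use %:", in_use_pct)
print("free == total - allocated:", mem["Free"] == mem["TotalMEM"] - mem["Allocated"])
```

On this sample the recomputed figures agree with the ones in the quote, and Free equals TotalMEM minus Allocated, so the line is at least internally consistent with memory use being well under the limit.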

Thanks again.



Offline evildani

  • Administrator
  • Hero Member
  • *****
  • Posts: 389
  • Karma: 22
Re: netscaler dropping packets
« Reply #3 on: March 17, 2009, 05:47:28 PM »
Ok,
Try doing a "stat interface" to look for errors. Also try dmesg on the shell to see any network problems and others. Does this happen with any of the HA nodes?

What technique are you using to look for the timeouts? Are you doing HTTP debugging?
What hit ratio do you have on the IC? Can it be turned off for testing purposes?
Is your traffic encrypted or plain HTTP?
Do you use rewrite on the pages that are timing out?
What are your server timeout and client timeout settings?

Offline danix

  • Sr. Member
  • **
  • Posts: 13
  • Karma: 1
Re: netscaler dropping packets
« Reply #4 on: March 17, 2009, 06:13:59 PM »
stat interface shows packets being dropped on both interfaces:

Quote
Packets dropped in Rx (sw)                       2                31590

and

Quote
Packets dropped in Rx (sw)                       1                18445

I am using a curl-based test that checks the time to curl 5-6 different pages.
The IC has a very low hit ratio (7%):
Quote
nsroot@mynetscaler> sh cache stats

Integrated Cache Statistics - Summary

Rate (/s)                                   Total 
Hits                                           104              1260994
Misses                                        1150             15602534
Requests                                      1254             16863528
Hit ratio(%)                                    --                    7
Origin bandwidth saved(%)                       --                   20
Cached objects                                  --                13680
Marker objects                                  --                    2

Hits being served                               48
Misses being handled                             1
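
For reference, the 7% figure can be reproduced from the totals in this output. The snippet below is just the arithmetic, assuming the hit ratio is hits divided by requests; the variable names are illustrative.

```python
# Totals copied from the "sh cache stats" output above.
hits = 1260994
misses = 15602534
requests = 16863528

# Per-second rates from the same output.
hit_rate = 104
request_rate = 1254

def ratio_percent(part, whole):
    """Share of whole, as a percentage."""
    return 100.0 * part / whole

lifetime_ratio = ratio_percent(hits, requests)
current_ratio = ratio_percent(hit_rate, request_rate)

print(f"hits + misses == requests: {hits + misses == requests}")
print(f"lifetime hit ratio: {lifetime_ratio:.2f}%")
print(f"current hit ratio:  {current_ratio:.2f}%")
```

Both the lifetime and the per-second ratios come out in the single digits, so more than nine out of ten requests still go to the origin servers.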

There's nothing special about the pages - some are php, some are gifs, some are dynamic, some are static.  All of them perform fine when hit directly on the servers, ie not through the netscaler.

Timeouts: server timeout is 360, client timeout 180.

Looking at traces from the netscaler as well as the client side, I see quite a few duplicate ACKs and some long RTOs as well.
Seems clear to me the netscaler is dropping traffic.  But Citrix is asking for more traces and newnslog...

Offline evildani

  • Administrator
  • Hero Member
  • *****
  • Posts: 389
  • Karma: 22
Re: netscaler dropping packets
« Reply #5 on: March 17, 2009, 06:51:10 PM »
A drop ratio of 2 per sec is pretty normal.
A 7% hit ratio is very low, so it is safe to assume that you can turn IC off anytime.

Could you post the curl options that you are using?
Are the pages that you are curling dynamic or static?
What web server are you using?

Offline danix

  • Sr. Member
  • **
  • Posts: 13
  • Karma: 1
Re: netscaler dropping packets
« Reply #6 on: March 18, 2009, 07:57:30 AM »
Quote
Could you post the curl options that you are using?
Are the pages that you are curling dynamic or static?
What web server are you using?

It's curl based (php using curl libs), not curl. 
As I said, the pages vary, some are static, some are dynamic.  Eventually they all time out.
Bypassing the load balancer, none time out.
The server is apache, but that's unimportant.
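
A minimal sketch of this kind of timing probe, for anyone who wants to reproduce the comparison. This is not the actual PHP test; it is a Python approximation using only the standard library. The names `fetch_with_urllib`, `probe` and `failure_rate` are made up for illustration, and the 10-second timeout is an arbitrary assumption.

```python
import time
import urllib.request

def fetch_with_urllib(url, timeout):
    """Fetch a URL and read the whole body; raises on timeouts or connection errors."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        resp.read()

def probe(urls, fetch=fetch_with_urllib, timeout=10.0):
    """Time each URL once; return {url: seconds}, with None for failed fetches."""
    results = {}
    for url in urls:
        start = time.monotonic()
        try:
            fetch(url, timeout)
            results[url] = time.monotonic() - start
        except OSError:
            # URLError and socket timeouts both derive from OSError.
            results[url] = None
    return results

def failure_rate(results):
    """Fraction of probed URLs that failed or timed out."""
    if not results:
        return 0.0
    failed = sum(1 for elapsed in results.values() if elapsed is None)
    return failed / len(results)
```

The idea is to call `probe()` once with the page URLs on the VIP and once with the same pages on a webserver address, then compare `failure_rate()` for the two runs. The `fetch` parameter exists only so the timing logic can be exercised without a network.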

Offline evildani

  • Administrator
  • Hero Member
  • *****
  • Posts: 389
  • Karma: 22
Re: netscaler dropping packets
« Reply #7 on: March 18, 2009, 09:11:16 AM »
Do the timeouts only happen on web pages? Do they also happen with objects such as JS, JPG, etc.?
What build are you running?

Offline danix

  • Sr. Member
  • **
  • Posts: 13
  • Karma: 1
Re: netscaler dropping packets
« Reply #8 on: March 18, 2009, 09:27:22 AM »
I said above:
Quote
There's nothing special about the pages - some are php, some are gifs, some are dynamic, some are static.  All of them perform fine when hit directly on the servers, ie not through the netscaler.

The problem has nothing to do with the servers or the content.
We're running 8.1, build 62.3.  I see there's a newer 8.1 version, I may try upgrading to that but still trying to get answers from Citrix.

Offline evildani

  • Administrator
  • Hero Member
  • *****
  • Posts: 389
  • Karma: 22
Re: netscaler dropping packets
« Reply #9 on: March 18, 2009, 09:41:09 AM »
Could you post the list of features that are enabled on the device?

Offline jmelika

  • Administrator
  • Hero Member
  • *****
  • Posts: 341
  • Karma: 7
Re: netscaler dropping packets
« Reply #10 on: March 18, 2009, 10:46:23 AM »
danix, I'm confident that now that evildani has jumped on this, you'll have your issue resolved in no time.  He is qualified to train Netscaler's support staff. :)

I'll assume your answer to ED's question about whether it happens on jpgs, etc as yes since you said you'd tried static pages in your tests.  I have a feeling the issue is at the physical or datalink layer.  There was a post here a while ago with someone having similar issues and it turned out that it was a bad switch port.  I couldn't locate that post though.

Try switching out the ethernet cable and plugging it into a different switch port, preferably on a different switch (if you're stacking switches).  If that doesn't work, try failing over to your secondary netscaler and see if the issue occurs there too.

Offline evildani

  • Administrator
  • Hero Member
  • *****
  • Posts: 389
  • Karma: 22
Re: netscaler dropping packets
« Reply #11 on: March 18, 2009, 10:51:04 AM »
I would start by doing a forced failover and checking that everything is running smoothly on the secondary node.

And thanks for the confidence. BTW I have not received the Instructor status... I hope it will be soon...

Daniel

Offline danix

  • Sr. Member
  • **
  • Posts: 13
  • Karma: 1
Re: netscaler dropping packets
« Reply #12 on: March 18, 2009, 12:05:31 PM »
I appreciate the help, but please read what I've typed.
When I monitor the servers directly, there is no issue.  So it's not a switch or server error.
The netscaler shows no interface errors, no network errors, other than dup ACKs in the trace.  There is no physical layer problem at the netscaler.
I've rebooted both nodes on the HA cluster and failed between them, the problem is always there.

Citrix has come back and now agrees with me that the box is dropping traffic.  But I still don't have a solution.
I've engaged my account team as well.
Thanks.

Offline evildani

  • Administrator
  • Hero Member
  • *****
  • Posts: 389
  • Karma: 22
Re: netscaler dropping packets
« Reply #13 on: March 18, 2009, 12:37:31 PM »
OK, back again on the netscaler side: could you explain how traffic is coming and going through your netscaler, and what features are involved?

Offline danix

  • Sr. Member
  • **
  • Posts: 13
  • Karma: 1
Re: netscaler dropping packets
« Reply #14 on: March 18, 2009, 03:26:32 PM »
Citrix has confirmed a bug in 62.3.  I am upgrading to 63.7 and will update you.