Mailing List Archive

6500 series, FIB exception @ dfc
Hi all,

We ran into a serious bug the other day on our Cat6504 with S2T.
This machine has 3 upstreams with full views. I wanted to get
feedback from the list on what it could be and how to mitigate it.

While running s2t54-adventerprisek9-mz.SPA.152-1.SY5.bin, with over
6 months of uptime, the switch threw the following:

%CFIB-DFC2-7-CFIB_EXCEPTION: FIB TCAM exception, Some entries will be
software switched

The routing issues are at the DFC; the S2T seems to be performing
correctly.

#sh platform hardware cef exception status
Current IPv4 FIB exception state = FALSE
Current IPv6 FIB exception state = FALSE
Current MPLS FIB exception state = FALSE
Current EoM/VPLS FIB TCAM exception state = FALSE

#remote command mod 2 sh platform hard cef exception status detail
Current IPv4 FIB exception state = TRUE
...
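
The supervisor reports FALSE while module 2 reports TRUE, so the state has to be checked per module. As a rough sketch (not a Cisco tool), a few lines of Python could flag which protocols report an exception in pasted output. The function name `exception_states` is made up for illustration, and the line format is assumed to match the output shown above.

```python
import re

# Matches lines such as "Current IPv4 FIB exception state = TRUE".
# The format is assumed from the output pasted above.
STATE_RE = re.compile(
    r"^Current (?P<proto>.+?) FIB(?: TCAM)? exception state = (?P<state>TRUE|FALSE)\s*$"
)

def exception_states(text):
    """Return {protocol: bool} for every exception-state line found in text."""
    states = {}
    for line in text.splitlines():
        m = STATE_RE.match(line.strip())
        if m:
            states[m.group("proto")] = (m.group("state") == "TRUE")
    return states

sup_output = """\
Current IPv4 FIB exception state = FALSE
Current IPv6 FIB exception state = FALSE
Current MPLS FIB exception state = FALSE
Current EoM/VPLS FIB TCAM exception state = FALSE
"""
dfc_output = "Current IPv4 FIB exception state = TRUE\n"

for name, out in (("supervisor", sup_output), ("module 2", dfc_output)):
    in_exception = [p for p, hit in exception_states(out).items() if hit]
    print(name, "in exception:", in_exception or "none")
```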

#remote command mod 2 sh platform hardware cef resource-level
Global watermarks: apply to Fib shared area only.
Protocol watermarks: apply to protocols with non-default max-routes

Fib-size: 1024k (1048576), shared-size: 1016k (1040384), shared-usage: 877k(898387)

Global watermarks:
Red_WM: 95%, Green_WM: 80%, Current usage: 86%

Protocol watermarks:

Protocol          Red_WM(%)   Green_WM(%)   Current(%)
--------          ---------   -----------   ----------
IPV4              --          --            73% (of shared)
IPV4-MCAST        --          --            0 % (of shared)
IPV6              --          --            12% (of shared)
IPV6-MCAST        --          --            0 % (of shared)
MPLS              --          --            0 % (of shared)
EoMPLS            --          --            0 % (of shared)
VPLS-IPV4-MCAST   --          --            0 % (of shared)
VPLS-IPV6-MCAST   --          --            0 % (of shared)

#remote command mod 2 show platform hardware cef maximum-routes usage


Fib-size: 1024k (1048576), shared-size: 1016k (1040384), shared-usage: 874k(895227)

Protocol          Max-routes   Usage            Usage-from-shared
--------          ----------   -----            -----------------
IPV4              1017k        762763 (744 k)   761739 (743 k)
IPV4-MCAST        1017k        6 (0 k)          0 (0 k)
IPV6              1017k        134512 (131 k)   133488 (130 k)
IPV6-MCAST        1017k        4 (0 k)          0 (0 k)
MPLS              1017k        1 (0 k)          0 (0 k)
EoMPLS            1017k        1 (0 k)          0 (0 k)
VPLS-IPV4-MCAST   1017k        0 (0 k)          0 (0 k)
VPLS-IPV6-MCAST   1017k        0 (0 k)          0 (0 k)

Maximum Tcam Routes : 901021
Current Tcam Routes : 897288


The box did not hit any TCAM limits; usage is below the red watermark.
The message comes from a DFC card, similar to this bug:
https://quickview.cloudapps.cisco.com/quickview/bug/CSCun81101
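
To put numbers on the headroom, the counters from the module 2 output can be compared directly. This is only arithmetic on the pasted figures. What exactly "Maximum Tcam Routes" counts is not stated in the output, so treating the last ratio as a hard ceiling is an assumption.

```python
def percent_used(used, total):
    """Integer percentage, truncated (the CLI figures appear to be truncated too)."""
    return used * 100 // total

# Counters copied from the module 2 output above.
shared_size = 1040384
shared_usage = 898387        # from "cef resource-level"
ipv4_from_shared = 761739    # from "cef maximum-routes usage"
ipv6_from_shared = 133488
max_tcam_routes = 901021
cur_tcam_routes = 897288
red_watermark = 95           # percent, from "Global watermarks"

shared_pct = percent_used(shared_usage, shared_size)
print("shared area used: %d%% (red watermark %d%%)" % (shared_pct, red_watermark))
print("IPv4 share of shared area: %d%%" % percent_used(ipv4_from_shared, shared_size))
print("IPv6 share of shared area: %d%%" % percent_used(ipv6_from_shared, shared_size))

tcam_pct = percent_used(cur_tcam_routes, max_tcam_routes)
tcam_free = max_tcam_routes - cur_tcam_routes
print("TCAM routes used: %d%%, free: %d" % (tcam_pct, tcam_free))
```

The watermark figure sits at 86%, but "Current Tcam Routes" is within 3733 entries of "Maximum Tcam Routes". If that second counter is a real limit, the DFC may have been much closer to full than the watermark suggests.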


Right after this error we performed a card reseat and an IOS upgrade. It is now running:

Cisco IOS Software, s2t54 Software (s2t54-ADVENTERPRISEK9-M), Version 15.5(1)SY,
RELEASE SOFTWARE (fc6)
System image file is "bootdisk:s2t54-adventerprisek9-mz.SPA.155-1.SY.bin"

After a short while, we get an identical error:

%CFIB-DFC2-7-CFIB_EXCEPTION: FIB TCAM exception, Some entries will be
software switched

Previously, the box was still able to switch (process) packets, but
this time it froze completely and all traffic was dropped.

#sh inventory

NAME: "WS-C6504-E", DESCR: "Cisco Systems Cisco 6500 4-slot Chassis System"
PID: WS-C6504-E , VID: V01, SN: xxx

NAME: "1", DESCR: "VS-SUP2T-10G 5 ports Supervisor Engine 2T 10GE w/
CTS Rev. 1.8"
PID: VS-SUP2T-10G , VID: V05, SN: xxx

NAME: "msfc sub-module of 1", DESCR: "VS-F6K-MSFC5 CPU Daughterboard Rev. 1.6"
PID: VS-F6K-MSFC5 , VID: , SN: xxx

NAME: "VS-F6K-PFC4XL Policy Feature Card 4 EARL 1 sub-module of 1", DESCR:
"VS-F6K-PFC4XL Policy Feature Card 4 Rev. 1.0"
PID: VS-F6K-PFC4XL , VID: V01, SN: xxx

NAME: "2", DESCR: "WS-X6848-SFP CEF720 48 port 1000mb SFP Rev. 3.1"
PID: WS-X6848-SFP , VID: V02, SN: xxx

NAME: "WS-F6K-DFC4-AXL Distributed Forwarding Card 4 EARL 1 sub-module of 2",
DESCR: "WS-F6K-DFC4-AXL Distributed Forwarding Card 4 Rev. 2.0"
PID: WS-F6K-DFC4-AXL , VID: V04, SN: xxx

Qs:

1. What version of IOS has a fix for this bug?

2. If you encountered this bug, which cards/models were you using?

3. If these BGP peers are connected directly to the VS-SUP2T card, is it correct
to assume this bug is not an issue? Would these TCAM entries then not affect the
DFC card?

4. Since we are utilizing over 70% of TCAM, what is the recommended hardware
platform to move to? 1M IPv4 routes are just around the corner... SUP6T-XL has the
same 1024K limitation. Anything capable of 10Gb+, over 1M routes, 1 to 5U?

Cheers,
Igor Smolov
_______________________________________________
cisco-nsp mailing list cisco-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/cisco-nsp
archive at http://puck.nether.net/pipermail/cisco-nsp/
Re: 6500 series, FIB exception @ dfc
Hi,

On Thu, May 02, 2019 at 11:54:59AM -0500, Igor Smolov wrote:
> 4. Since we are utilizing over 70% of TCAM, what is the recommended hardware
> platform to move? 1M IPv4 routes are just around the corner... SUP6T-XL has the
> same 1024K limitation. Anything capable of 10gb+, over 1M routes, 1 to 5U?

ASR9001, ASR1000-something (many models, 1RU/2RU, depending on ports), MX204

That the 6500BU decided to build an "XL" version of the 6T with only
1M FIB entries is, indeed, very annoying. Either declare the platform
dead and be done with it, or build new and interesting functionality
with proper FIB size. Or declare it "for DC only!" and do not even
offer an "XL" sup...

gert
--
"If was one thing all people took for granted, was conviction that if you
feed honest figures into a computer, honest figures come out. Never doubted
it myself till I met a computer with a sense of humor."
Robert A. Heinlein, The Moon is a Harsh Mistress

Gert Doering - Munich, Germany gert@greenie.muc.de
Re: 6500 series, FIB exception @ dfc
It appears you were hit by the "768k day":

https://www.youtube.com/watch?v=eTtriDf_2GU

https://motherboard.vice.com/en_us/article/vb9ez9/768k-day-is-as-overhyped-as-y2k-isp-says


--
Best regards,
Adrian Minta


Re: 6500 series, FIB exception @ dfc
Hi,

On Thu, May 02, 2019 at 08:32:56PM +0300, Adrian Minta wrote:
> It appears you were hit by the "768k day":

"according to documentation", 2T/6T should be able to do 1M, and no
TCAM carving needed...

With IPv6 at ~70k today, there *should* be sufficient headroom... (or,
in other words, even a Sup720-XL would be fine if carved at 800k/100k)

gert
--
Gert Doering - Munich, Germany gert@greenie.muc.de
Re: 6500 series, FIB exception @ dfc
Quick related question.

Can you not reduce the number of MPLS routes on this platform?

User configured :-
---------------
IPv4 - 768k
MPLS - 1k
IPv6 + IP multicast - 120k (default)

Upon reboot :-
-----------
IPv4 - 768k
MPLS - 16k (default)
IPv6 + IP multicast - 120k (default)

Notice that it appears to want to still allocate 16K for MPLS even though it's configured to 1k.



Re: 6500 series, FIB exception @ dfc
On Tue, 14 May 2019 at 14:28, Drew Weaver <drew.weaver@thenap.com> wrote:
>
> Quick related question.
>
> Can you not reduce the number of MPLS routes on this platform?
>
> User configured :-
> ---------------
> IPv4 - 768k
> MPLS - 1k
> IPv6 + IP multicast - 120k (default)
>
> Upon reboot :-
> -----------
> IPv4 - 768k
> MPLS - 16k (default)
> IPv6 + IP multicast - 120k (default)
>
> Notice that it appears to want to still allocate 16K for MPLS even though it's configured to 1k.

Hi Drew,

What config did you apply to generate that show command output?

You can't explicitly lower the MPLS allocation size (if memory serves
me), and instead you need to increase the size of something else to
implicitly remove allocated TCAM space from MPLS.

See this example output from a 7600:
https://null.53bits.co.uk/index.php?page=6500-7600-tcam-fib-allocation
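
One way to see why the 1k setting might not stick is to do the arithmetic. The sketch below assumes a 1024k-entry FIB TCAM in which IPv4 and MPLS routes take one entry each and IPv6/IP multicast routes take two, and that whatever is left unallocated falls back to MPLS. Those widths and the fallback rule are assumptions made for the sketch (they happen to reproduce the numbers Drew posted), not something confirmed in this thread.

```python
TCAM_K = 1024                                     # total FIB entries in k (assumed)
WIDTH = {"ipv4": 1, "mpls": 1, "ipv6_mcast": 2}   # TCAM entries per route (assumed)

def entries_used(carve):
    """Total TCAM entries (in k) consumed by a carve plan given in k routes."""
    return sum(WIDTH[proto] * k for proto, k in carve.items())

# What the user configured, in k routes.
configured = {"ipv4": 768, "mpls": 1, "ipv6_mcast": 120}
leftover = TCAM_K - entries_used(configured)      # space no protocol has claimed

# Assumed behaviour: unclaimed space ends up allocated to MPLS.
effective = dict(configured, mpls=configured["mpls"] + leftover)

print("configured plan uses %dk, leftover %dk" % (entries_used(configured), leftover))
print("effective MPLS allocation: %dk" % effective["mpls"])

# The 800k/100k carve mentioned earlier in the thread, checked the same way.
carve_800_100 = {"ipv4": 800, "mpls": 0, "ipv6_mcast": 100}
print("800k/100k carve uses %dk of %dk" % (entries_used(carve_800_100), TCAM_K))
```

Under these assumptions, shrinking MPLS means raising another protocol until the leftover is gone; for example, IPv4 at 783k with the same IPv6 setting would leave exactly 1k for MPLS.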

Cheers,
James.