FS#8387 — FS#12258 — VAC1
Attached to Project: Network
Incident
Whole Network
CLOSED
We are noticing instabilities at the level
of the VAC infrastructure.
We are investigating.
We had two interruptions on the 20th of December
and one on the 21st of December.
Date: Monday, 05 January 2015, 10:32AM
Reason for closing: Done

VAC1 is unstable. One of its elements does not
work correctly: its OSPF sessions flap UP/DOWN.
Sometimes, not always.
When it happens, the traffic in VAC1 is cleaned but
it is not reinjected correctly into the internal network.
We have just cut VAC1 off in order to find the origin
of the problem. The traffic is being cleaned by VAC2 and VAC3.
We are going to restart SUP1, which is in stand-by,
then wait for synchronization, and then restart
the active sup. This will let us hot-swap
onto a new sup and get the logs back
in all the VDCs. Maybe.
admin# reload module 1
This command will reboot standby supervisor module. (y/n)? [n] y
about to reset standby sup
admin# sh module
Mod Ports Module-Type Model Status
--- ----- ----------------------------------- ------------------ ----------
1 0 Supervisor Module-2 powered-up
2 0 Supervisor Module-2 N7K-SUP2E active *
admin# sh system redundancy status
Redundancy mode
---------------
administrative: HA
operational: HA
This supervisor (sup-2)
-----------------------
Redundancy state: Active
Supervisor state: Active
Internal state: Active with HA standby
Other supervisor (sup-1)
------------------------
Redundancy state: Standby
Supervisor state: HA standby
Internal state: HA standby
I love mishaps :)
admin# reload module 2 ?
<CR>
force-dnld Reboot a specific module to force NetBoot and image download
admin# reload module 2
Active sup reload is not supported.
We restarted the vac1-3-n7 VDC, which has the problems.
admin# reload vdc vac1-3-n7
Are you sure you want to reload this vdc (y/n)? [no] y
2014 Dec 21 19:11:11 admin %$ VDC-1 %$ %VDC_MGR-2-VDC_OFFLINE: vdc 3 is now offline
2014 Dec 21 19:11:11 admin %$ VDC-1 %$ %SYSMGR-STANDBY-2-SHUTDOWN_SYSTEM_LOG: vdc 3 will shut down soon.
admin# 2014 Dec 21 19:11:53 admin %$ VDC-1 %$ %VDC_MGR-2-VDC_ONLINE: vdc 3 has come online
It is UP and we have the logs again.
We are going to do a switchover of the SUP.
admin# system switchover
admin#
All the VDCs restarted their L3.
They are all UP. My gut feeling is that
vac1-3-n7 is good again. We are looking deeper.
Well, everything seems okay now. This is probably a bug.
An upgrade to the latest version of NX-OS is required
in the coming days.
We are going to reactivate VAC1. The DDoS attacks will be cleaned on the 3 VACs:
VAC1 (RBX/GRA) VAC2 (SBG) and VAC3 (BHS).
OSPF is down on vac1-3-n7.
2014 Dec 22 21:42:27 vac1-3-n7 %OSPFV3-5-ADJCHANGE: ospfv3-16276 [18598] on port-channel1 went DOWN
2014 Dec 22 21:42:29 vac1-3-n7 %OSPF-5-ADJCHANGE: ospf-16276 [18599] on port-channel1 went DOWN
2014 Dec 22 21:43:13 vac1-3-n7 %OSPFV3-4-SYSLOG_SL_MSG_WARNING: OSPF-4-NEIGH_ERR: message repeated 84 times in last 14 sec
2014 Dec 22 21:43:19 vac1-3-n7 %OSPF-4-SYSLOG_SL_MSG_WARNING: OSPF-4-NEIGH_ERR: message repeated 16 times in last 11 sec
Once again, we are disabling VAC1.
We have added more resources to the VDC and rebooted it.
admin# reload vdc vac1-2-n7
Are you sure you want to reload this vdc (y/n)? [no] yes
2014 Dec 22 22:37:23 admin %$ VDC-1 %$ %VDC_MGR-2-VDC_OFFLINE: vdc 5 is now offline
2014 Dec 22 22:37:23 admin %$ VDC-1 %$ %SYSMGR-STANDBY-2-SHUTDOWN_SYSTEM_LOG: vdc 5 will shut down soon.
admin# 2014 Dec 22 22:38:11 admin %$ VDC-1 %$ %VDC_MGR-2-VDC_ONLINE: vdc 5 has come online
vac1-2-n7 configuration failed.. I like surprises.
Well, it is being pushed again from scratch.
vac1-2-n7 henceforth has enough RAM to take the full configuration. It is possible that this was the original cause of the problem, even though it is rather odd, since the problem is located on vac1-3-n7.
We re-enabled VAC1.
If VAC1 still causes issues, we will disable it again to run deeper maintenance.
VAC1 was disabled. On vac1-3-n7 the BGP configuration no longer appears in "sh run", but "sh ip bgp sum" shows all sessions still running.
We will update the chassis with a new version of NX-OS.
It feels like a series of bugs.
It's still crazy:
vac1-3-n7 # sh ip bgp sum
BGP table version is 240527, IPv4 Unicast config peers 4 capable peers 4
vac1-3-n7 # sh run | i bgp
vac1-3-n7 #
This is the first time I have seen commands
disappear while everything remains in place. I think
the problem comes from this: the configuration appears to be in place,
but it is probably no longer fully programmed in hardware.
The missing pieces are probably the origin of the
malfunctions.
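One way to confirm that suspicion is to compare the software FIB against what is actually programmed in the forwarding hardware. The Nexus 7000 ships a FIB consistency checker for this; the exact syntax varies by release, and the module number below is illustrative:

```text
vac1-3-n7# test forwarding ipv4 inconsistency module 6
vac1-3-n7# show forwarding ipv4 inconsistency module 6
```

Routes present in the RIB/FIB but missing from the hardware tables would show up as inconsistencies here.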
We are looking at the configuration backups and the updates from the same time frame.
All the files needed for debugging were collected. We are trying to reapply the settings and reload the configuration to return to a stable state.
The moment of truth: we will upgrade VAC1 via ISSU (6.2.2 to 6.2.8a, then 6.2.10).
Go!
Compatibility check is done:
Module bootable Impact Install-type Reason
------ -------- -------------- ------------ ------
1 yes non-disruptive reset
2 yes non-disruptive reset
3 yes non-disruptive rolling
4 yes non-disruptive rolling
6 yes non-disruptive rolling
8 yes non-disruptive rolling
Images will be upgraded according to following table:
Module Image Running-Version(pri:alt) New-Version Upg-Required
------ ---------- ---------------------------------------- -------------------- ------------
1 system 6.2(2) 6.2(8a) yes
1 kickstart 6.2(2) 6.2(8a) yes
1 bios v2.12.0(05/29/2013):v2.12.0(05/29/2013) v2.12.0(05/29/2013) no
2 system 6.2(2) 6.2(8a) yes
2 kickstart 6.2(2) 6.2(8a) yes
2 bios v2.12.0(05/29/2013):v2.12.0(05/29/2013) v2.12.0(05/29/2013) no
3 lc1n7k 6.2(2) 6.2(8a) yes
3 bios v2.0.22(06/03/13):v2.0.22(06/03/13) v2.0.32(12/16/13) yes
4 lc1n7k 6.2(2) 6.2(8a) yes
4 bios v2.0.22(06/03/13):v2.0.22(06/03/13) v2.0.32(12/16/13) yes
6 lc1n7k 6.2(2) 6.2(8a) yes
6 bios v2.0.22(06/03/13):v2.0.22(06/03/13) v2.0.32(12/16/13) yes
8 lc1n7k 6.2(2) 6.2(8a) yes
8 bios v2.0.22(06/03/13):v2.0.22(06/03/13) v2.0.32(12/16/13) yes
Do you want to continue with the installation (y/n)? [n] y
Module 8 crashed:
Module 8: Non-disruptive upgrading.
[# ] 0%2014 Dec 23 18:50:51 vac1-1-n7 %$ VDC-4 %$ %VMM-2-VMM_TIMEOUT: VDC4: Service SAP RPM Ctrl MTS queue for slot 8 timed out in UPGRADE_READY_SEQ sequence
2014 Dec 23 18:50:51 admin %$ VDC-1 %$ %MODULE-2-LCM_UPGRADE_READY_FAIL: Upgrade ready message returned 1 0x41850006 for SAP Im SAP
2014 Dec 23 18:50:51 admin %$ VDC-1 %$ %MODULE-2-LCM_UPGRADE_READY_GENERAL_FAIL: Upgrade ready message fails SAP Im SAP [# ] 0% -- FAIL.
Return code 0x41850006 (Sequence timeout).
Install has failed. Return code 0x40930020 (Non-disruptive upgrade of a module failed).
Please identify the cause of the failure, and try 'install all' again.
User Access Verification
admin login: 2014 Dec 23 18:51:07 vac1-2-n7 %$ VDC-5 %$ %PLATFORM-2-MOD_DETECT: Module 8 detected (Serial number JAF1702AJHC) Module-Type 1/10 Gbps Ethernet Module Model N7K-F248XP-25E
2014 Dec 23 18:51:08 vac1-2-n7 %$ VDC-5 %$ %PLATFORM-2-MOD_PWRUP: Module 8 powered up (Serial number JAF1702AJHC)
2014 Dec 23 18:51:07 vac1-1-n7 %$ VDC-4 %$ %PLATFORM-2-MOD_DETECT: Module 8 detected (Serial number JAF1702AJHC) Module-Type 1/10 Gbps Ethernet Module Model N7K-F248XP-25E
2014 Dec 23 18:51:08 vac1-1-n7 %$ VDC-4 %$ %PLATFORM-2-MOD_PWRUP: Module 8 powered up (Serial number JAF1702AJHC)
2014 Dec 23 18:51:07 vac1-3-n7 %$ VDC-3 %$ %PLATFORM-2-MOD_DETECT: Module 8 detected (Serial number JAF1702AJHC) Module-Type 1/10 Gbps Ethernet Module Model N7K-F248XP-25E
2014 Dec 23 18:51:08 vac1-3-n7 %$ VDC-3 %$ %PLATFORM-2-MOD_PWRUP: Module 8 powered up (Serial number JAF1702AJHC)
2014 Dec 23 18:51:07 vac1-ext-n7 %$ VDC-7 %$ %PLATFORM-2-MOD_DETECT: Module 8 detected (Serial number JAF1702AJHC) Module-Type 1/10 Gbps Ethernet Module Model N7K-F248XP-25E
2014 Dec 23 18:51:08 vac1-ext-n7 %$ VDC-7 %$ %PLATFORM-2-MOD_PWRUP: Module 8 powered up (Serial number JAF1702AJHC)
2014 Dec 23 18:51:07 vac1-fw-n7 %$ VDC-6 %$ %PLATFORM-2-MOD_DETECT: Module 8 detected (Serial number JAF1702AJHC) Module-Type 1/10 Gbps Ethernet Module Model N7K-F248XP-25E
2014 Dec 23 18:51:07 vac1-bgp-n7 %$ VDC-2 %$ %PLATFORM-2-MOD_DETECT: Module 8 detected (Serial number JAF1702AJHC) Module-Type 1/10 Gbps Ethernet Module Model N7K-F248XP-25E
2014 Dec 23 18:51:08 vac1-fw-n7 %$ VDC-6 %$ %PLATFORM-2-MOD_PWRUP: Module 8 powered up (Serial number JAF1702AJHC)
2014 Dec 23 18:51:08 vac1-bgp-n7 %$ VDC-2 %$ %PLATFORM-2-MOD_PWRUP: Module 8 powered up (Serial number JAF1702AJHC)
2014 Dec 23 18:51:07 vac1-email-n7 %$ VDC-8 %$ %PLATFORM-2-MOD_DETECT: Module 8 detected (Serial number JAF1702AJHC) Module-Type 1/10 Gbps Ethernet Module Model N7K-F248XP-25E
2014 Dec 23 18:51:08 vac1-email-n7 %$ VDC-8 %$ %PLATFORM-2-MOD_PWRUP: Module 8 powered up (Serial number JAF1702AJHC)
2014 Dec 23 18:51:07 admin %$ VDC-1 %$ %PLATFORM-2-MOD_DETECT: Module 8 detected (Serial number JAF1702AJHC) Module-Type 1/10 Gbps Ethernet Module Model N7K-F248XP-25E
2014 Dec 23 18:51:08 admin %$ VDC-1 %$ %PLATFORM-2-MOD_PWRUP: Module 8 powered up (Serial number JAF1702AJHC)
A VDC is not sending its LACP packets due to the ISSU failure.
We will launch a second ISSU in order to properly update module 8.
All the modules are now on version 6.2.8a. However, we still have LACP problems.
We are launching an ISSU in order to move from version 6.2.8a to version 6.2.10.
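For reference, each ISSU pass is kicked off with the same "install all" invocation pointing at the kickstart and system images; the filenames below are illustrative:

```text
admin# install all kickstart bootflash:n7000-s2-kickstart.6.2.10.bin system bootflash:n7000-s2-dk9.6.2.10.bin
```

NX-OS then runs the compatibility check shown above and upgrades the sups and linecards in a rolling, non-disruptive fashion when possible.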
Card 8 seems to have problems after the crash; the LACP problems may be a consequence.
We will unplug and replug the card in the chassis.
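Before (or instead of) a physical reseat, a linecard can also be power-cycled in software from the default VDC; a sketch:

```text
admin# configure terminal
admin(config)# poweroff module 8
admin(config)# no poweroff module 8
```

This drops power to slot 8 and brings it back up, forcing the card to reboot and reprogram from scratch.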
vac1-1-n7 is UP again. We had to make multiple modifications to the configuration and then reload all of it .. such fun ..
The upgrade to 6.2.10 has started and we will see the task through.
The upgrade to 6.2.10 was done without problems (Module 8 took pity on us and has not crashed this time: it's Christmas in advance!).
We are moving to EPLD, first on the standby sup
admin# install module 1 epld bootflash:n7000-s2-epld.6.2.10.img
Copy complete, now saving to disk (please wait)...
EPLD image signature verification passed
Compatibility check:
Module Type Upgradable Impact Reason
------ ---- ---------- ---------- ------
1 SUP Yes disruptive Module Upgradable
Retrieving EPLD versions... Please wait.
Images will be upgraded according to following table:
Module Type EPLD Running-Version New-Version Upg-Required
------ ---- ------------- --------------- ----------- ------------
1 SUP Power Manager SPI 34.000 37.000 Yes
1 SUP IO SPI 1.012 1.013 Yes
The above modules require upgrade.
Do you want to continue (y/n) ? [n] y
Starting Module 1 EPLD Upgrade
Module 1 : Power Manager SPI [Upgrade Started ]
Module 1 : Power Manager SPI [Erasing ] : 100.00%
Module 1 : Power Manager SPI [Programming ] : 100.00% (1464788 of 1464788 total bytes)
Module 1 : IO SPI [Upgrade Started ]
Good :
Module 1 EPLD upgrade is successful.
We launch the switchover:
admin# system switchover
admin#
All EPLDs are now updated!
We did one last switchover, one for the road.
done:
admin# sh redundancy status
Redundancy mode
---------------
administrative: HA
operational: HA
This supervisor (sup-2)
-----------------------
Redundancy state: Active
Supervisor state: Active
Internal state: Active with HA standby
Other supervisor (sup-1)
------------------------
Redundancy state: Standby
Supervisor state: HA standby
Internal state: HA standby
We will monitor tonight before switching VAC1 back to production.
VAC1 is okay. We are verifying that the configuration is complete.
It is complete.
We will go back into production for a minute to validate that the configuration
is complete. It is going well.
We deactivate.
We are going to reactivate tomorrow, on Dec.25 and see if VAC1 still poses problems.
VAC1 is UP again. We will be monitoring for 48H before confirming that it was "fixed".
No more instabilities. All is OK.
We will close the thread. We are also going to update VAC2 and VAC3 with the same settings (VAC3 is already partly on the VAC1 configuration), but there is still the RAM used by the vacX-2-n7 VDCs and the NX-OS version to deal with. We will do this at the beginning of January.
There has been another incident on VAC1. The ports went down, and OSPF/BGP with them, on the 3 VDCs vac1-1/2/3-n7.
We now have the logs, and it seems to be a hardware issue on 2 cards of the VAC1 chassis.
From 15:53 to 15:57 on 1 January (GMT+1):
2014 Dec 24 13:30:17 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:6 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:27,48 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 24 13:30:42 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:8 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:24,27 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 24 16:50:17 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:6 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:4 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 24 16:50:17 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:8 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:4,48 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 24 20:10:17 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:6 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:28 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 24 20:10:17 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:8 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:28 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 24 23:30:16 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:6 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:5 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 24 23:30:24 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:8 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:5 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 25 02:50:17 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:8 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:6 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 25 02:50:18 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:6 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:6 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 25 06:10:16 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:6 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:7 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 25 06:10:17 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:8 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:7 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 25 09:30:17 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:6 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:8 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 25 09:30:21 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:8 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:8 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 25 12:50:16 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:6 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:9 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 25 12:50:17 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:8 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:9 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 25 16:10:16 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:6 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:33 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 25 16:10:16 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:8 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:33 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 25 19:30:15 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:6 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:10 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 25 19:30:18 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:8 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:10 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 25 22:50:16 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:6 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:34 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 25 22:50:16 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:8 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:34 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 26 02:10:15 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:6 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:11 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 26 02:10:16 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:8 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:11 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 26 05:30:15 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:6 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:35 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 26 05:30:18 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:8 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:35 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 26 08:50:15 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:6 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:12 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 26 08:50:15 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:8 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:12 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 26 12:10:15 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:6 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:36 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 26 12:10:15 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:8 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:36 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 26 15:30:15 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:6 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:13 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 26 15:30:16 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:8 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:13 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 26 18:50:14 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:6 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:14 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 26 18:50:15 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:8 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:14 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 26 22:10:15 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:8 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:15 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 26 22:10:15 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:6 Test:SnakeLoopback failed 10 consecutive times. Faulty module:Module 6 affected ports:15 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 27 01:30:14 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:6 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:39 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 27 01:30:14 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:8 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:39 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 27 04:50:15 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:8 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:16 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 27 04:50:16 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:6 Test:SnakeLoopback failed 10 consecutive times. Faulty module:Module 6 affected ports:16 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 27 08:10:14 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:6 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:40 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 27 08:10:14 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:8 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:40 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 27 11:30:13 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:6 Test:SnakeLoopback failed 10 consecutive times. Faulty module:Module 6 affected ports:17 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 27 11:30:14 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:8 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:17 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 27 14:50:13 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:6 Test:SnakeLoopback failed 10 consecutive times. Faulty module:Module 6 affected ports:18 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 27 14:50:14 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:8 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:41 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 27 18:10:13 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:6 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:19 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 27 18:10:14 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:8 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:18 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 27 21:30:13 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:6 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:20 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 27 21:30:14 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:8 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:42 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 28 00:50:13 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:6 Test:SnakeLoopback failed 10 consecutive times. Faulty module:Module 6 affected ports:45 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 28 00:50:13 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:8 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:19 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 28 04:09:12 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:6 Test:SnakeLoopback failed 10 consecutive times. Faulty module:Module 6 affected ports:46 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 28 04:10:13 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:8 Test:SnakeLoopback failed 10 consecutive times. Faulty module:Module 8 affected ports:43 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 28 07:30:22 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:8 Test:SnakeLoopback failed 10 consecutive times. Faulty module: affected ports:20 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 28 10:50:13 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:8 Test:SnakeLoopback failed 10 consecutive times. Faulty module:Module 8 affected ports:44 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 28 14:10:13 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:8 Test:SnakeLoopback failed 10 consecutive times. Faulty module:Module 8 affected ports:45 Error:Error in Forwarding ASIC between DE and GD block
2014 Dec 28 17:29:13 admin %$ VDC-1 %$ %DIAG_PORT_LB-2-SNAKE_TEST_LOOPBACK_TEST_FAIL: Module:8 Test:SnakeLoopback failed 10 consecutive times. Faulty module:Module 8 affected ports:46 Error:Error in Forwarding ASIC between DE and GD block
2015 Jan 1 15:53:00 vac1-1-n7 %ETH_PORT_CHANNEL-5-PORT_DOWN:
port-channel4: Ethernet8/19 is down
2015 Jan 1 15:53:00 vac1-1-n7 %ETH_PORT_CHANNEL-5-FOP_CHANGED:
port-channel4: first operational port changed from Ethernet8/19 to
Ethernet6/19
2015 Jan 1 15:53:00 vac1-1-n7 %ETHPORT-5-IF_BANDWIDTH_CHANGE: Interface
port-channel4,bandwidth changed to 30000000 Kbit
2015 Jan 1 15:53:00 vac1-1-n7 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface
Ethernet8/19 is down (Initializing)
2015 Jan 1 15:53:01 vac1-1-n7 %ETH_PORT_CHANNEL-5-PORT_DOWN:
port-channel4: Ethernet8/20 is down
2015 Jan 1 15:53:01 vac1-1-n7 %ETHPORT-5-IF_BANDWIDTH_CHANGE: Interface
port-channel4,bandwidth changed to 20000000 Kbit
2015 Jan 1 15:53:01 vac1-1-n7 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface
Ethernet8/20 is down (Initializing)
2015 Jan 1 15:53:04 vac1-1-n7 %ETH_PORT_CHANNEL-5-PORT_UP: port-channel4:
Ethernet8/19 is up
2015 Jan 1 15:53:04 vac1-1-n7 %ETH_PORT_CHANNEL-5-PORT_UP: port-channel4:
Ethernet8/20 is up
2015 Jan 1 15:53:05 vac1-1-n7 %ETHPORT-5-IF_BANDWIDTH_CHANGE: Interface
port-channel4,bandwidth changed to 30000000 Kbit
2015 Jan 1 15:53:05 vac1-1-n7 %ETHPORT-5-IF_UP: Interface Ethernet8/19 is
up in mode access
2015 Jan 1 15:53:07 vac1-1-n7 %ETHPORT-5-IF_BANDWIDTH_CHANGE: Interface
port-channel4,bandwidth changed to 40000000 Kbit
2015 Jan 1 15:53:07 vac1-1-n7 %ETHPORT-5-IF_UP: Interface Ethernet8/20 is
up in mode access
2015 Jan 1 15:53:00 vac1-2-n7 %ETH_PORT_CHANNEL-5-PORT_DOWN:
port-channel5: Ethernet8/36 is down
2015 Jan 1 15:53:00 vac1-2-n7 %ETHPORT-5-IF_BANDWIDTH_CHANGE: Interface
port-channel5,bandwidth changed to 30000000 Kbit
2015 Jan 1 15:53:00 vac1-2-n7 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface
Ethernet8/36 is down (Initializing)
2015 Jan 1 15:53:00 vac1-2-n7 %ETH_PORT_CHANNEL-5-PORT_DOWN:
port-channel4: Ethernet8/33 is down
2015 Jan 1 15:53:00 vac1-2-n7 %ETH_PORT_CHANNEL-5-FOP_CHANGED:
port-channel4: first operational port changed from Ethernet8/34 to
Ethernet6/33
2015 Jan 1 15:53:00 vac1-2-n7 %ETHPORT-5-IF_BANDWIDTH_CHANGE: Interface
port-channel4,bandwidth changed to 30000000 Kbit
2015 Jan 1 15:53:00 vac1-2-n7 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface
Ethernet8/33 is down (Initializing)
2015 Jan 1 15:53:00 vac1-2-n7 %ETH_PORT_CHANNEL-5-PORT_DOWN:
port-channel4: Ethernet8/34 is down
2015 Jan 1 15:53:01 vac1-2-n7 %ETHPORT-5-IF_BANDWIDTH_CHANGE: Interface
port-channel4,bandwidth changed to 20000000 Kbit
2015 Jan 1 15:53:01 vac1-2-n7 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface
Ethernet8/34 is down (Initializing)
2015 Jan 1 15:53:03 vac1-2-n7 %ETH_PORT_CHANNEL-5-PORT_UP: port-channel5:
Ethernet8/36 is up
2015 Jan 1 15:53:05 vac1-2-n7 %ETH_PORT_CHANNEL-5-PORT_UP: port-channel4:
Ethernet8/33 is up
2015 Jan 1 15:53:05 vac1-2-n7 %ETHPORT-5-IF_BANDWIDTH_CHANGE: Interface
port-channel4,bandwidth changed to 30000000 Kbit
2015 Jan 1 15:53:05 vac1-2-n7 %ETHPORT-5-IF_UP: Interface Ethernet8/33 is
up in Layer3
2015 Jan 1 15:53:06 vac1-2-n7 %ETH_PORT_CHANNEL-5-PORT_DOWN:
port-channel5: Ethernet8/36 is down
2015 Jan 1 15:53:07 vac1-2-n7 %ETH_PORT_CHANNEL-5-PORT_UP: port-channel4:
Ethernet8/34 is up
2015 Jan 1 15:53:07 vac1-2-n7 %ETHPORT-5-IF_BANDWIDTH_CHANGE: Interface
port-channel4,bandwidth changed to 40000000 Kbit
2015 Jan 1 15:53:07 vac1-2-n7 %ETHPORT-5-IF_UP: Interface Ethernet8/34 is
up in Layer3
2015 Jan 1 15:53:10 vac1-2-n7 %ETH_PORT_CHANNEL-5-PORT_UP: port-channel5:
Ethernet8/36 is up
2015 Jan 1 15:53:10 vac1-2-n7 %ETHPORT-5-IF_BANDWIDTH_CHANGE: Interface
port-channel5,bandwidth changed to 40000000 Kbit
2015 Jan 1 15:53:10 vac1-2-n7 %ETHPORT-5-IF_UP: Interface Ethernet8/36 is
up in Layer3
2015 Jan 1 15:53:00 vac1-3-n7 %ETH_PORT_CHANNEL-5-PORT_DOWN:
port-channel5: Ethernet4/20 is down
2015 Jan 1 15:53:00 vac1-3-n7 %ETHPORT-5-IF_BANDWIDTH_CHANGE: Interface
port-channel5,bandwidth changed to 30000000 Kbit
2015 Jan 1 15:53:00 vac1-3-n7 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface
Ethernet4/20 (description:vac1-2-n7) is down (Initializing)
2015 Jan 1 15:53:03 vac1-3-n7 %ETH_PORT_CHANNEL-5-PORT_UP: port-channel5:
Ethernet4/20 is up
2015 Jan 1 15:53:03 vac1-3-n7 %ETHPORT-5-IF_BANDWIDTH_CHANGE: Interface
port-channel5,bandwidth changed to 40000000 Kbit
2015 Jan 1 15:53:03 vac1-3-n7 %ETHPORT-5-IF_UP: Interface Ethernet4/20
(description:vac1-2-n7) is up in Layer3
2015 Jan 1 15:53:06 vac1-3-n7 %ETH_PORT_CHANNEL-5-PORT_DOWN:
port-channel5: Ethernet4/20 is down
2015 Jan 1 15:53:06 vac1-3-n7 %ETHPORT-5-IF_BANDWIDTH_CHANGE: Interface
port-channel5,bandwidth changed to 30000000 Kbit
2015 Jan 1 15:53:06 vac1-3-n7 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface
Ethernet4/20 (description:vac1-2-n7) is down (Initializing)
2015 Jan 1 15:53:10 vac1-3-n7 %ETH_PORT_CHANNEL-5-PORT_UP: port-channel5:
Ethernet4/20 is up
2015 Jan 1 15:53:10 vac1-3-n7 %ETHPORT-5-IF_BANDWIDTH_CHANGE: Interface
port-channel5,bandwidth changed to 40000000 Kbit
2015 Jan 1 15:53:10 vac1-3-n7 %ETHPORT-5-IF_UP: Interface Ethernet4/20
(description:vac1-2-n7) is up in Layer3
VAC1 has been disabled.
We've opened a Cisco TAC case to get spare cards.
In the meantime, we have added static routes to VAC1 so as to debug it more effectively.
If OSPF goes down again, customer traffic will therefore not be cut off. We will then be able to debug at leisure and find the source of the issue without impacting customers.
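The idea is a floating static route: a copy of the OSPF-learned path with a higher administrative distance, which takes over only if the OSPF route disappears. A sketch with illustrative prefix and next hop (the real ones are not in this ticket):

```text
vac1-3-n7(config)# ip route 192.0.2.0/24 198.51.100.1 250
```

The distance of 250 is higher than OSPF's default of 110, so the static entry stays dormant while OSPF is healthy and carries the traffic during a flap.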
VAC1 has been disabled. Packet forwarding stops occasionally, and that is why OSPF goes down, not the other way around. We are waiting for the spare Cisco cards.
We're reloading the VAC1 chassis
admin# reload
!!!WARNING! there is unsaved configuration in VDC!!!
This command will reboot the system. (y/n)? [n] y
VAC1 has been rebooted, so all the VDCs have been rebooted. No old code is left on VAC1 that could have remained after the ISSU hot swap. Everything has been reprogrammed from the beginning, so everything has been explored.
VAC1 is now operating again.
At the same time, we should receive the cards shortly.
Scheduled delivery date: 02-JAN-2015 15:36 (GMT +1)
Line: 1.1 Product: N7K-F248XP-25E= Quantity: 1
Line: 2.1 Product: N7K-F248XP-25E= Quantity: 1
The cards have been delivered and we're configuring the software.
After having fully rebooted the chassis, we're not sure that it's a hardware issue. It could be that some configurations remained on the cards and we just needed to reboot them to clear them fully. This sort of thing is quite common with hot swaps.
We've therefore inserted the new F2 cards into slots 7 and 9. If there's still an issue, we will move and reconfigure all the ports onto the new cards, then send back cards 6 and 8.
We think, no, we feel (a gut feeling) that the problem is a TCAM
programming bug in the case where a VDC uses a mix of
M2 and F2 cards. The affected VDC does not report anything.
Only OSPF drops and then comes back.
We removed all the programmed routes on vac1-2-n7,
we removed BGP soft-reconfiguration to avoid consuming RAM,
and we interrupted the BGP sessions with the mixed VDC.
"sh resources" nevertheless looks correct.
We are rebooting all of VAC1.
VAC1 has rebooted. We are letting it stabilize.
The VAC is back online.
We deconfigured IPv6, which can also cause problems
via OSPFv3.
The problem is still there.
We interrupt the VAC1.
We insert the spare card and we will reconfigure the 2 F2 cards.
Jan 4 09:23:30 %PLATFORM-2-MOD_REMOVE: Module 6 removed
Jan 4 09:25:23 %PLATFORM-2-MOD_REMOVE: Module 8 removed
VAC1 is back online. We wait.
Deactivated. We are asking the TAC to deliver spare
M2 cards and FABs for the Cisco Nexus 7009 in order
to replace everything.
Watching the traffic that goes to the CPU on MAV1 and MAV2,
it became apparent that there is traffic that should not
be there. Some traffic should be dropped at the ingress of the backbone
because it is destined for the backbone. Basically, it is DDoS
aimed at the routers.
Digging into the backbone, we found that the
LAG with TPIX in Warsaw was not reconfigured properly:
the ACL that protects the backbone was missing.
So the attacks from Poland destined for the
network equipment passed through without a problem.
I am going to add it (5:14 p.m.)
and reconfigure the LAG.
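The missing protection is essentially an ingress ACL on the peering LAG that drops traffic destined to the backbone's own infrastructure. An IOS-style sketch with illustrative names and prefixes:

```text
ip access-list extended protect-backbone
 deny ip any 192.0.2.0 0.0.0.255
 permit ip any any
!
interface Port-channel10
 ip access-group protect-backbone in
```

Here 192.0.2.0/24 stands in for the backbone infrastructure range and Port-channel10 for the TPIX LAG; both are assumptions, not the real values.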
How the ACL was forgotten:
on 30-31 October 2013, we had a problem with
the router var-1-6k in Warsaw and we
spent 2 days stabilizing it.
http://status.ovh.net/?do=details&id=5678
The reconfiguration of the LAGs was
done at that time .. and 14 months later it has
been exploited: since December 20 the
MAV1 and MAV2 infrastructure has itself been the target of
DDoS and has experienced 10-20 instabilities of
about 25-30 seconds over 3 weeks.
Normally, the ACLs block these DDoS.
We have reconfigured the permanent mitigation.
Mitigationperm-ipv4 # ls | wc -l
17049
IPv4 has default protection again.
The issue is fixed.
We are sorry that it took as long as it did to resolve the issue.