OVHcloud Network Status

FS#7477 — NX-OS 5.2.1
Scheduled Maintenance Report for Network & Infrastructure
Completed
Following the upgrade of the Nexus 5000 switches to the latest version, 5.2.1, we encountered many random problems that caused several failures.

We worked with the Cisco teams and found the origin of the problem. In version 5.2.X, Cisco added new features to VLANs, so a VLAN can now have 2 different types.
When upgrading to 5.2.1, the upgrade must go through a special version that sets the VLAN type.
Otherwise, the VLAN type is not set and is reported as unknown.

# sh vlan internal info | i UNKNOWN
sdb_vlan_type UNKNOWN(16777216, err 0x0), oper =up
sdb_vlan_type UNKNOWN(67108864, err 0x0), oper =up
sdb_vlan_type UNKNOWN(50331648, err 0x0), oper =up
sdb_vlan_type UNKNOWN(50331648, err 0x0), oper =up
sdb_vlan_type UNKNOWN(67108864, err 0x0), oper =up
sdb_vlan_type UNKNOWN(16777216, err 0x0), oper =up
sdb_vlan_type UNKNOWN(16777216, err 0x0), oper =up
sdb_vlan_type UNKNOWN(16777216, err 0x0), oper =up
sdb_vlan_type UNKNOWN(16777216, err 0x0), oper =up
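
To check a switch for this condition in bulk, the output above can be counted with a short script. This is a minimal sketch in Python, assuming the CLI output has already been captured as text; the function name `count_unknown_vlan_types` is hypothetical and the regular expression is based only on the line format shown above.

```python
import re
from collections import Counter

# Matches lines such as: sdb_vlan_type UNKNOWN(16777216, err 0x0), oper =up
# (pattern derived only from the output shown in this report)
UNKNOWN_RE = re.compile(r"sdb_vlan_type\s+UNKNOWN\((\d+),")


def count_unknown_vlan_types(cli_output: str) -> Counter:
    """Count how often each UNKNOWN type value appears in the captured output."""
    return Counter(int(m.group(1)) for m in UNKNOWN_RE.finditer(cli_output))


sample = """\
sdb_vlan_type UNKNOWN(16777216, err 0x0), oper =up
sdb_vlan_type UNKNOWN(67108864, err 0x0), oper =up
sdb_vlan_type UNKNOWN(50331648, err 0x0), oper =up
"""

counts = count_unknown_vlan_types(sample)
print(sum(counts.values()), "VLAN(s) with an unknown type")
for value, occurrences in sorted(counts.items()):
    print(f"  type value {value} (0x{value:08x}): {occurrences}")
```

An empty result would suggest that no VLAN on the switch is in the unknown state, under the assumption that the output format matches the one above.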


Result: when an automation robot connects to the switches to configure a new customer, everything crashes. This is what happened overnight from Friday to Saturday on an HG network and today on a pCC network:
http://status.ovh.co.uk/?do=details&id=3520
http://status.ovh.co.uk/?do=details&id=3527
http://status.ovh.co.uk/?do=details&id=3529
http://status.ovh.co.uk/?do=details&id=3528

To fix it, we must first remove the affected VLANs:
no vlan XX, XX, XX
and then reapply the VLAN configuration:
vlan XX, XX, XX
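
The two commands for a given set of VLANs can be generated rather than typed by hand. The following is a rough sketch in Python; the helper `vlan_reset_commands` is hypothetical, and the comma-plus-space list format simply mirrors the commands as written in this report. It only builds the command strings and does not connect to a switch.

```python
def vlan_reset_commands(vlan_ids):
    """Build the 'no vlan' / 'vlan' pair for the given VLAN IDs.

    The list format mirrors the commands quoted in the report;
    any other per-VLAN settings would have to be reapplied separately.
    """
    ids = sorted(set(vlan_ids))
    if not ids:
        raise ValueError("at least one VLAN ID is required")
    for vlan_id in ids:
        # 1..4094 is the valid 802.1Q VLAN ID range
        if not 1 <= vlan_id <= 4094:
            raise ValueError(f"invalid VLAN ID: {vlan_id}")
    vlan_list = ", ".join(str(v) for v in ids)
    return [f"no vlan {vlan_list}", f"vlan {vlan_list}"]


for command in vlan_reset_commands([30, 10, 20, 10]):
    print(command)
```

Duplicates are dropped and the IDs are sorted so that both commands are guaranteed to name the same set of VLANs.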


Update(s):

Date: 2012-10-16 00:21:03 UTC
64 bytes from 172.16.XX.1: icmp_seq=59. time=0.334 ms
64 bytes from 172.16.XX.1: icmp_seq=60. time=0.433 ms

64 bytes from 172.16.XX.1: icmp_seq=66. time=0.371 ms
64 bytes from 172.16.XX.1: icmp_seq=67. time=0.344 ms
64 bytes from 172.16.XX.1: icmp_seq=68. time=0.484 ms
64 bytes from 172.16.XX.1: icmp_seq=69. time=0.337 ms
64 bytes from 172.16.XX.1: icmp_seq=70. time=0.721 ms
^C
----172.16.XX.1 PING Statistics----
95 packets transmitted, 90 packets received, 5% packet loss
round-trip (ms) min/avg/max/stddev = 0.228/0.6535/6.17/0.866
pcc-19-n5 done.

We didn't notice any impact on the performance of PCC.

The bug has been fixed on all performing VLANs of the pCC storage switches.
We will correct the maintenance VLANs tomorrow during the day.
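
The loss percentages in the ping summaries of these updates can be recomputed from the packet counts. The sketch below, in Python, parses one summary line and recomputes the figure with integer (floor) division, which reproduces every percentage shown in this log; the function name `check_ping_summary` is hypothetical, and the rounding behaviour is inferred from the log rather than from the ping tool's documentation.

```python
import re

# Matches: "95 packets transmitted, 90 packets received, 5% packet loss"
SUMMARY_RE = re.compile(
    r"(\d+) packets transmitted, (\d+) packets received, (\d+)% packet loss"
)


def check_ping_summary(line: str) -> dict:
    """Parse a ping summary line and recompute the loss percentage."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError(f"not a ping summary line: {line!r}")
    sent, received, reported = (int(g) for g in match.groups())
    if sent == 0:
        raise ValueError("no packets transmitted")
    lost = sent - received
    return {
        "sent": sent,
        "received": received,
        "lost": lost,
        "reported_pct": reported,
        # Floor division: 5 lost out of 95 is 5.26%, shown as 5%
        "computed_pct": lost * 100 // sent,
    }


result = check_ping_summary(
    "95 packets transmitted, 90 packets received, 5% packet loss"
)
print(result)
```

The `lost` count is the more useful number here: each switch was pinged for a different length of time, so the percentages are not directly comparable between updates.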

Date: 2012-10-16 00:20:33 UTC
64 bytes from 172.16.XX.1: icmp_seq=11. time=0.307 ms
64 bytes from 172.16.XX.1: icmp_seq=12. time=0.361 ms

64 bytes from 172.16.XX.1: icmp_seq=18. time=0.299 ms
64 bytes from 172.16.XX.1: icmp_seq=19. time=0.345 ms
64 bytes from 172.16.XX.1: icmp_seq=20. time=0.960 ms
^C
----172.16.XX.1 PING Statistics----
21 packets transmitted, 16 packets received, 23% packet loss
round-trip (ms) min/avg/max/stddev = 0.299/0.4337/0.960/0.182
pcc-118-n5 done.

Last pCC switch: pcc-119-n5

Date: 2012-10-16 00:20:17 UTC
64 bytes from 172.20.XX.1: icmp_seq=13. time=0.341 ms
64 bytes from 172.20.XX.1: icmp_seq=14. time=0.616 ms

64 bytes from 172.20.XX.1: icmp_seq=31. time=0.312 ms
64 bytes from 172.20.XX.1: icmp_seq=32. time=0.299 ms
^C
----172.20.XX.1 PING Statistics----
33 packets transmitted, 17 packets received, 48% packet loss
round-trip (ms) min/avg/max/stddev = 0.299/0.4072/0.616/0.109
pcc-113-n5 done.

Next: pcc-118-n5
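
The length of the interruption on a switch can be estimated from the jump in icmp_seq values. The following Python sketch lists the missing sequence ranges in a ping excerpt, assuming the blank line in the excerpt marks unanswered probes rather than lines trimmed from the paste; the function name `missing_sequences` is hypothetical.

```python
import re

SEQ_RE = re.compile(r"icmp_seq=(\d+)")


def missing_sequences(ping_output: str):
    """Return (first, last) ranges of icmp_seq values absent from the output."""
    seqs = [int(s) for s in SEQ_RE.findall(ping_output)]
    gaps = []
    for previous, current in zip(seqs, seqs[1:]):
        if current - previous > 1:
            gaps.append((previous + 1, current - 1))
    return gaps


excerpt = """\
64 bytes from 172.20.XX.1: icmp_seq=13. time=0.341 ms
64 bytes from 172.20.XX.1: icmp_seq=14. time=0.616 ms

64 bytes from 172.20.XX.1: icmp_seq=31. time=0.312 ms
64 bytes from 172.20.XX.1: icmp_seq=32. time=0.299 ms
"""

for first, last in missing_sequences(excerpt):
    print(f"icmp_seq {first}..{last} missing ({last - first + 1} probes)")
```

For the pcc-113-n5 excerpt this gives one gap of 16 probes (icmp_seq 15 to 30), which agrees with the 16 packets lost in the summary (33 transmitted, 17 received). If the default interval of one probe per second was used, which is an assumption since the ping command itself is not shown, that corresponds to an interruption of roughly 16 seconds.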

Date: 2012-10-16 00:19:53 UTC
64 bytes from 172.19.XXX.1: icmp_seq=54. time=0.312 ms

64 bytes from 172.19.XXX.1: icmp_seq=66. time=0.414 ms
64 bytes from 172.19.XXX.1: icmp_seq=67. time=0.304 ms
64 bytes from 172.19.XXX.1: icmp_seq=68. time=0.273 ms
64 bytes from 172.19.XXX.1: icmp_seq=69. time=0.310 ms
64 bytes from 172.19.XXX.1: icmp_seq=70. time=0.337 ms
64 bytes from 172.19.XXX.1: icmp_seq=71. time=0.321 ms
^C
----172.19.XXX.1 PING Statistics----
72 packets transmitted, 61 packets received, 15% packet loss
round-trip (ms) min/avg/max/stddev = 0.273/0.3487/0.879/0.0768
pcc-112-n5 done.

Next: pcc-113-n5

Date: 2012-10-16 00:19:38 UTC
64 bytes from 172.19.XXX.1: icmp_seq=97. time=0.320 ms
64 bytes from 172.19.XXX.1: icmp_seq=98. time=0.339 ms

64 bytes from 172.19.XXX.1: icmp_seq=115. time=0.321 ms
64 bytes from 172.19.XXX.1: icmp_seq=116. time=0.310 ms
^C
----172.19.XXX.1 PING Statistics----
117 packets transmitted, 101 packets received, 13% packet loss
round-trip (ms) min/avg/max/stddev = 0.278/0.33181/0.538/0.04253
pcc-107-n5 done.

Next: pcc-112-n5

Date: 2012-10-16 00:19:20 UTC
64 bytes from 172.19.XX.1: icmp_seq=215. time=0.330 ms
64 bytes from 172.19.XX.1: icmp_seq=216. time=0.341 ms
64 bytes from 172.19.XX.1: icmp_seq=217. time=0.320 ms

64 bytes from 172.19.XX.1: icmp_seq=226. time=0.709 ms
64 bytes from 172.19.XX.1: icmp_seq=227. time=0.347 ms
64 bytes from 172.19.XX.1: icmp_seq=228. time=0.335 ms
64 bytes from 172.19.XX.1: icmp_seq=229. time=0.365 ms
^C
----172.19.XX.1 PING Statistics----
230 packets transmitted, 222 packets received, 3% packet loss
round-trip (ms) min/avg/max/stddev = 0.256/0.37774/1.37/0.1168
pcc-105-n5 done.

Next: pcc-107-n5

Date: 2012-10-16 00:19:00 UTC
64 bytes from 172.18.XX.1: icmp_seq=277. time=0.515 ms
64 bytes from 172.18.XX.1: icmp_seq=278. time=0.416 ms
64 bytes from 172.18.XX.1: icmp_seq=279. time=0.377 ms
64 bytes from 172.18.XX.1: icmp_seq=280. time=0.404 ms

64 bytes from 172.18.XX.1: icmp_seq=292. time=0.360 ms
64 bytes from 172.18.XX.1: icmp_seq=293. time=0.356 ms
64 bytes from 172.18.XX.1: icmp_seq=294. time=0.291 ms
64 bytes from 172.18.XX.1: icmp_seq=295. time=0.326 ms
^C
----172.18.XX.1 PING Statistics----
296 packets transmitted, 285 packets received, 3% packet loss
round-trip (ms) min/avg/max/stddev = 0.272/0.45126/2.74/0.2082
pcc-104-n5 done.

Next: pcc-105-n5

Date: 2012-10-16 00:18:42 UTC
We are starting with pcc-104-n5.

Date: 2012-10-16 00:18:23 UTC
The pCC storage N5 switches remain; we will do them tonight.
Then: rps, backup, 3 pairs of N5-15, 27.31 storage, to be done tomorrow morning with the whole storage team.

Date: 2012-10-15 23:34:42 UTC
178.33.122.248/24 done

Date: 2012-10-15 23:34:25 UTC
188.165.14.0/24 done

Date: 2012-10-15 23:34:12 UTC
188.165.15.0/24 done
Posted Oct 15, 2012 - 23:21 UTC