OVHcloud Web Hosting Status

FS#10289 — Collect IP Orange National
Incident Report for Web Cloud
Resolved
We have detected packet loss on some customer access lines on the Orange National IP collect port on the Roubaix LNS. A ticket has been opened with the operator.

Update(s):

Date: 2014-09-30 11:03:40 UTC
The workaround implemented at the beginning of the summer has proven effective. It lets impacted customers use their line normally.

We will therefore close the task.

However, we will actively follow up our work with Orange to resolve the incident at its source on the 15 impacted access ports. A team from one of Orange's providers will be dedicated to identifying the root cause and resolving the incident.

Date: 2014-08-12 11:36:49 UTC
Orange's technical teams have refused the option of creating a second technical line connected to a DSLAM-GE. They are unable to force the connection to the DSLAM-GEs.

Today, we are re-requesting the quote, which we have not yet managed to obtain.

Date: 2014-07-09 17:06:02 UTC
Following Orange's internal meeting on Tuesday 08/07, the conclusions are:

The DN that was tested was resolved after the IP/ATM conversion card was changed.
This did not help us identify the cause of the issue.

Orange will ask national experts to intervene to identify the root cause, in order to find a global rather than an individual solution.

An estimate will be sent to OVH at the end of July for work that will start at the end of August/beginning of September.

In the meantime, OVH will ask for manual intervention on the DNs experiencing packet loss.

We will weigh up the possibility of creating a second line on the DNs that can potentially be connected to a DSLAM-GE, and of forcing the connection on the DSLAM-GEs.
This option will affect nearly 70% of the DNs involved, and customers will keep their current line.

Caution - the result is not guaranteed.

Orange will come back to us on this option, at the end of next week.

Meanwhile, each DN involved can manage its rate limitation via the v6 Manager. This will improve connection stability.

Date: 2014-06-30 11:28:24 UTC
We chased up Orange last week and tried to organise a telephone meeting to obtain conclusions on the first resolution attempt.

However, Orange wants to have an internal meeting on progress that will only take place on July 8th.

We are therefore waiting for their internal meeting before we can continue our joint investigations.


Date: 2014-06-19 11:52:11 UTC
We have carried out new tests in collaboration with Orange over the last two weeks on a customer line with packet loss.
Our customer has installed diagnostic tools requested by Orange on their machine and set up the requested test conditions: graphs show packet loss.
On Tuesday afternoon, an Orange business engineer carried out four hours of tests and has:
- Ruled out the local loop.
- Determined that the problem is in fact located further along, in the ATM part between the DSLAM and the BAS.
Orange had to move an entire ATM device card onto another card.
The problem disappeared following the Orange and OVH tests.
Customer tests are being finalised and initial results are encouraging.

Date: 2014-06-06 08:50:04 UTC
Summary of the conference call with Orange on Thursday.
An Orange DSLAM expert and an ATM expert were present.

Tests were performed on a line undergoing packet loss.
OVH is connected to the customer PC and only detects packet loss when the line receives traffic, which leads to download failure.
Orange does not see ANY abnormalities at the DSLAM and ATM level.

In addition, the Orange expert indicates that when packet loss occurs, congestion management at the DSLAM level is disabled, as the traffic is beyond its limits.
This point has therefore been ruled out.

Orange's conclusion:
No loss detected. Investigations have been suspended.
Nonetheless, Orange has offered to dedicate a national expert to carry out new tests, subject to a quotation.
OVH is awaiting this quotation, which is expected within a few weeks' time.

OVH has also asked Orange to investigate further the packet loss that Orange detected at the BAS counter level.

We will maintain the workaround solution on the lines affected by the problem.


Date: 2014-05-28 03:40:45 UTC
Applying the workaround once again gave positive results.
Indeed, the results show virtually no more packet loss after applying QoS from OVH's side.
We will deploy it on the lines impacted by the packet loss reported in this incident ticket.

Regarding Orange, an "access" expert has been appointed to join the working group.
A conference call has been scheduled for Thursday, June 5th, to continue looking for an optimal fix to the problem.

Date: 2014-05-26 06:29:13 UTC
Here is the report of the meeting with Orange about the packet loss.

1 - Lowering the synchronisation on the Orange side had no impact.
2 - OVH has identified a customer meeting all the conditions, namely an ND with packet loss and one without on the same DSLAM.
3 - OVH and Orange have agreed on a test to check whether a line is affected by the problem:
generate 30 Mbps of UDP traffic towards the subscriber and check whether the connection remains stable
(there may be packet loss, but not a complete cut-off); see the sketch after this list.
4 - The list of compatible modems is no longer valid for the moment.
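
For illustration, here is a minimal sketch in Python of the agreed load test, assuming a placeholder subscriber address and UDP port (both hypothetical); the real tests were carried out with iperf, whose output appears in the trace further below. While it runs, the line's connectivity would be checked separately (for example with a ping) to see whether it stays up.

import socket
import time

# Hypothetical test parameters: the subscriber address and port are placeholders.
TARGET = ("192.0.2.10", 5001)   # TEST-NET-1 address standing in for the subscriber line
RATE_BPS = 30_000_000           # 30 Mbps, the rate agreed with Orange
PAYLOAD = b"\x00" * 1400        # datagram payload size in bytes
DURATION_S = 60                 # length of one test run, in seconds

def run_udp_load_test() -> None:
    """Push roughly 30 Mbps of UDP towards the subscriber for DURATION_S seconds."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    bits_per_packet = len(PAYLOAD) * 8
    start = time.monotonic()
    sent_bits = 0
    while (elapsed := time.monotonic() - start) < DURATION_S:
        # Only send if the average rate so far is below the target; otherwise yield briefly.
        if sent_bits < RATE_BPS * elapsed:
            sock.sendto(PAYLOAD, TARGET)
            sent_bits += bits_per_packet
        else:
            time.sleep(0.001)
    print(f"sent ~{sent_bits / DURATION_S / 1e6:.1f} Mbps on average")

if __name__ == "__main__":
    run_udp_load_test()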

After this day of testing in collaboration with Orange on an idle line subject to packet loss, here is the result:

The first step was to precisely isolate the conditions that lead to the loss of connection.
We noticed that when the line is heavily congested (e.g. launching one or several simultaneous TCP downloads), this systematically leads to a complete loss of connection at the IP level, lasting a few seconds each time, at irregular intervals.
As can be seen in the following trace:

[...]
[ 3] 267.0-268.0 sec 1.80 MBytes 15.1 Mbits/sec 0.204 ms 1270/ 2553 (50%)
[ 3] 268.0-269.0 sec 1.85 MBytes 15.5 Mbits/sec 0.249 ms 1236/ 2553 (48%)
[ 3] 269.0-270.0 sec 1.83 MBytes 15.3 Mbits/sec 0.251 ms 1229/ 2532 (49%)
[ 3] 270.0-271.0 sec 1.78 MBytes 14.9 Mbits/sec 0.288 ms 1279/ 2548 (50%)
[ 3] 271.0-272.0 sec 1.86 MBytes 15.6 Mbits/sec 0.376 ms 1243/ 2569 (48%)
[ 3] 272.0-273.0 sec 1.84 MBytes 15.4 Mbits/sec 0.214 ms 1243/ 2555 (49%)
[ 3] 273.0-274.0 sec 1.71 MBytes 14.4 Mbits/sec 0.184 ms 1145/ 2368 (48%)
[ 3] 274.0-275.0 sec 0.00 Bytes 0.00 bits/sec 0.184 ms 0/ 0 (-nan%)
[ 3] 275.0-276.0 sec 0.00 Bytes 0.00 bits/sec 0.184 ms 0/ 0 (-nan%)
[ 3] 276.0-277.0 sec 0.00 Bytes 0.00 bits/sec 0.184 ms 0/ 0 (-nan%)
[ 3] 277.0-278.0 sec 0.00 Bytes 0.00 bits/sec 0.184 ms 0/ 0 (-nan%)
[ 3] 278.0-279.0 sec 0.00 Bytes 0.00 bits/sec 0.184 ms 0/ 0 (-nan%)
[ 3] 279.0-280.0 sec 0.00 Bytes 0.00 bits/sec 0.184 ms 0/ 0 (-nan%)
[ 3] 280.0-281.0 sec 0.00 Bytes 0.00 bits/sec 0.184 ms 0/ 0 (-nan%)
[ 3] 281.0-282.0 sec 0.00 Bytes 0.00 bits/sec 0.184 ms 0/ 0 (-nan%)
[ 3] 282.0-283.0 sec 0.00 Bytes 0.00 bits/sec 0.184 ms 0/ 0 (-nan%)
[ 3] 283.0-284.0 sec 1.34 MBytes 11.2 Mbits/sec 0.217 ms 24736/25692 (96%)
[ 3] 284.0-285.0 sec 1.70 MBytes 14.2 Mbits/sec 0.176 ms 163/ 2373 (49%)
[ 3] 285.0-286.0 sec 1.98 MBytes 16.6 Mbits/sec 0.213 ms 1322/ 2731 (48%)
[ 3] 286.0-287.0 sec 1.82 MBytes 15.3 Mbits/sec 0.220 ms 1250/ 2549 (49%)
[ 3] 287.0-288.0 sec 1.84 MBytes 15.4 Mbits/sec 0.178 ms 1228/ 2541 (48%)
[ 3] 288.0-289.0 sec 1.85 MBytes 15.5 Mbits/sec 0.185 ms 1242/ 2563 (48%)
[ 3] 289.0-290.0 sec 1.85 MBytes 15.5 Mbits/sec 0.216 ms 1232/ 2549 (48%)
[...]


The consequence for the user can be the failure of an ongoing download.

The second step is to find a temporary solution that we can apply on our own equipment.
We are applying QoS on the PPP session of the test line at the LNS to limit the traffic sent towards the subscriber (a rough sketch of the principle follows below).
Possible congestion is thus managed at the LNS instead of in the Orange backbone.
The test will continue over the weekend, and on Monday we will assess the impact on the packet loss.
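
As a rough illustration of the principle only (not the actual LNS configuration, which is applied per PPP session on the LNS itself), the toy token-bucket limiter below shows how traffic towards a subscriber can be capped so that any queuing happens at the LNS rather than in the Orange backbone; the rate and burst values are hypothetical.

import time

class TokenBucket:
    """Allow at most rate_bps bits per second, with a burst allowance of burst_bits."""
    def __init__(self, rate_bps: float, burst_bits: float):
        self.rate = rate_bps
        self.capacity = burst_bits
        self.tokens = burst_bits
        self.last = time.monotonic()

    def allow(self, packet_bits: float) -> bool:
        now = time.monotonic()
        # Refill tokens for the time elapsed since the last packet, capped at the burst size.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= packet_bits:
            self.tokens -= packet_bits
            return True
        return False   # over the limit: the packet is queued (or dropped) at the LNS

# Example: shape one session to 15 Mbps with a 1 Mbit burst allowance (hypothetical values).
bucket = TokenBucket(rate_bps=15_000_000, burst_bits=1_000_000)
print(bucket.allow(1400 * 8))   # True: the first packet fits within the initial burst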

In addition, on Monday an Orange access (DSLAM) contact will be appointed to join the working group.
The next meeting will be scheduled for Monday.


Date: 2014-05-22 07:57:36 UTC
Following the telephone review this evening, the next actions will be as follows:

1- Orange will temporarily lower the synchronisation of 0x xx xx xx xx which
is undergoing packet loss on the Oleane FTP server.

2- OVH will identify a customer with an ND undergoing packet loss, and one
without (other than FT if possible) and seek their permission to
carry out new tests.

3- OVH will suggest a set of tests to Orange in order to obtain a version
*approved* by both parties, by the next conf call.

4- OVH and Orange will define a list of modems for carrying out these tests.
This list will be ready by the next conf call.

5- Orange will suspend the investigations on the list of NDs provided by
OVH in order to focus on these new actions.

The objective is to agree on which tests are indisputable and to ensure that
future on-site visits are managed by a representative who is following the
case at Orange. We cannot involve customers and have them travel for
cancelled meetings or incomplete tests.

The next review will take place on Friday morning at 10:30.

Date: 2014-05-21 05:42:12 UTC
Here is the outcome of today's telephone review with Orange regarding the packet loss fix.

Orange reminded us of the test results, which show packet loss on our FTP server and on an independent FTP server, but not on the Oleane FTP server. Orange therefore has doubts about the FTP servers used.

Each party must take the following actions, to be discussed on Wednesday evening:

- Orange will report to us the results of its analysis of the list of NDs provided.

- Orange must run tests from the client connection.

- OVH will run tests on the Oleane FTP server to confirm or refute their diagnosis.

In parallel with the BAS work, a new lead has been identified: faulty routers. Network contacts from Orange and OVH will join the investigation tomorrow.

Date: 2014-05-07 11:49:31 UTC
Orange has intervened at our customer's site in order to carry out more tests.
At each intervention, the tests carried out complement the previous ones and new representatives are added to the loop.

At the same time, Orange has carried out ATM and GE (Gigabit Ethernet) FTP download tests in the lab. In both cases, freezing during download (packet loss) was detected.
Orange has indicated that the problem may be located on their BAS or on the OVH download servers (which OVH contests).
Orange has also performed ping tests and reports that a response is sometimes absent.
However, Orange states that answering pings is not a priority for the machine, and therefore this result does not apply to this test.

Future actions listed by Orange are as follows:
- OVH must provide an exhaustive list of customers with packet loss (already done).
- Orange is looking for a common point among the NDs on this list.
- Orange will set up a download test server so as to avoid using the OVH server.
- Orange is putting its QoS teams in the loop.
- Orange will build a new, more realistic model in the lab and will relaunch the tests.

The malfunction is complex to analyse, which explains the many interventions.

Date: 2014-05-06 11:11:02 UTC
We have been contacted by the Orange collect network service about working together.

After various tests, the Orange engineer was unable to reproduce the bug on the Orange network.

We have arranged another meeting on site with a client so as to use their line undergoing packet loss, to analyse and study the access.

Date: 2014-05-06 02:58:38 UTC
Last week, Orange managed to change the BAS, but no effect on the packet loss was observed.

In agreement with Orange, we opened two incident tickets for two different customers with packet loss. Orange has planned interventions.

In addition, following Orange's request, we have established a list of access lines with and without packet loss. This list contains information such as the BAS each line connects to.
Our Orange contact is analysing and testing these access lines.

A reminder was sent that day.

Finally, last Monday the Paris National port was brought into service. It is functional and now used for backup, but it will have no impact on the packet loss issue, as the LNS has been cleared of blame.

Date: 2014-04-25 10:34:31 UTC
Here is the outcome of the meeting that took place yesterday between the Orange manager and their technician, the OVH XDSL incident manager and our technician, our customer's technical manager and their technician, and Orange engineers from the Rennes site.

The intervention took place at the NRA (local exchange).
The following actions have been carried out:
- 2 plot changes on the line tested.
- Connection of a LiveBox with Orange ID directly on the strip.
- Change of the FT LAC.
- Change of the FT LNS.

After various tests were carried out, we found that the packet loss issue was still present after each action.
- The local loop and the OVH device are not causing the issue.
- The issue lies with the Orange connection from start to finish.

The BAS was replaced by a BAS of the same model.
A replacement with a different module is scheduled.

We are calling Orange today to confirm an intervention date for changing the BAS and to plan follow-up operations.


Date: 2014-04-24 10:53:05 UTC
Following our meeting with Orange last week, which enabled them to assess the packet loss and eliminate the OVH LNS from the equation,
another meeting is scheduled for today at the customer's Orange collect site to take the Orange network engineers' tests further.
Orange will use the intervention to carry out investigations on its network, analysing the established session in more detail in order to debug it.
We won't be involved after this meeting.


Date: 2014-04-24 07:22:48 UTC
All regional customers are up on lns-1-rbx, migration completed.

Date: 2014-04-24 07:22:42 UTC
We will shut down the regional gateway on the lab LNS and switch over to lns-1-rbx.

Date: 2014-04-24 07:22:26 UTC
We will start preparing the migration.


Date: 2014-04-23 10:44:03 UTC
We are postponing the work until tonight.

Date: 2014-04-22 15:06:57 UTC
Tonight at around 0:00 we will restore the Roubaix-Regionale gateway on the live LNS.

Date: 2014-04-16 13:50:17 UTC
We have some feedback on the joint investigation with Orange. When using a Livebox and an Orange PPP account, the problem persists, so the OVH LNS and the collect gateway have been ruled out. The problem lies on the Orange network. Orange is continuing its investigations with the supervision engineers.

As for the new collect gateway, we have a commissioning date for the beginning of May. This will require a joint meeting.

Date: 2014-04-14 11:27:45 UTC
Three actions are in progress:

- Orange continues its investigations on its side and together with OVH. A meeting between FT and OVH is planned for tomorrow with a customer experiencing packet loss. The tests will be carried out by both operators simultaneously, using our respective devices. Comparing the results will give a precise diagnosis.

- We continue to analyse the customers that have been migrated from the Roubaix port to the lab LNS. The initial results were not conclusive. (Please note: not all Roubaix customers have been migrated.)

- The new collect port order is in progress. We will chase it up today.

Date: 2014-04-09 02:43:33 UTC
FT regional sessions are up on the LNS of the lab.

Date: 2014-04-09 02:43:02 UTC
We will start the operation.

Date: 2014-04-08 10:10:33 UTC
We will move the Roubaix port onto the lab LNS tonight, directly (without passing via the Roubaix LNS). The aim is to rule out the Roubaix LNS.

Date: 2014-04-04 03:36:35 UTC
Yesterday we held a review with Orange about the packet loss.
On one access line, they managed to observe and capture a loss of 20% of IP packets in their BAS.

However, this observation is uncertain, since they are unable to reproduce the problem. To get more details, Orange and OVH are monitoring the access lines on the DSLAM where the problem was seen, in order to identify the cause.

We are continuing our investigation together. No conclusive evidence has been found so far.

Date: 2014-03-31 14:53:00 UTC
We have received some feedback from Orange: they have not been able to identify the problem on their network. On our side, we have changed the LNS card and the optics. There are no errors on the transport links between Orange and OVH. We have also opened a ticket with the manufacturer, but the tests have not been conclusive.

At Orange, the packet loss issue has been escalated. They are due to get back to us this week.
Regarding the new port, we have no feedback on the slot booking. We are waiting for Orange. We hope to open it within the next two weeks.

Date: 2014-03-20 00:10:27 UTC
Migration done successfully, clients up on the new card.



Date: 2014-03-19 23:42:37 UTC
We will start the migration.

Date: 2014-03-19 16:05:57 UTC
As part of the search for a packet loss solution, we will switch the FT collect tonight at around 0:00 to another card on the Roubaix LNS.
A few minutes of connection loss is expected.

Date: 2014-03-18 13:04:53 UTC
We are chasing it up with Orange.

Date: 2014-03-12 13:59:16 UTC
We have received some feedback from Orange.

The Orange infrastructure and network team is currently starting thorough investigations in order to analyse this problem.

We will wait until the end of the week for them to get back to us.

Date: 2014-03-10 16:35:24 UTC
We have chased Orange regarding the progress of their investigation. We are awaiting their response.

Date: 2014-03-07 11:52:43 UTC
Details have been passed to Orange. They're investigating. They should get back to us by mid-week.

Date: 2014-03-06 14:40:27 UTC
Following our discussion with Orange, we will work together on a list of the reported DNs in order to investigate the issues further.

They are currently analysing the PPP sessions of these customers on their network, and we will give them feedback on the synchronisation data.

Date: 2014-03-05 12:08:22 UTC
We have a meeting with Orange on Thursday 6th so that we can work together on the issues.

Date: 2014-03-04 15:16:51 UTC
We haven't received any feedback from Orange; we will chase it up.

Date: 2014-02-28 14:01:32 UTC
We have carried out various tests from the LNS over the last few days.

We are opening a ticket with Orange for the packet loss issues.

Date: 2014-02-24 23:51:25 UTC
We will continue monitoring the access, tomorrow during the day.



Date: 2014-02-24 23:50:58 UTC
The LNS update is done. Check the intervention http://status.ovh.net/?do=details&id=6412. The customers are back on the LNS.



Date: 2014-02-24 23:40:47 UTC
http://status.ovh.net/?do=details&id=6412

Date: 2014-02-24 23:37:28 UTC
We will take advantage of rebooting the LNS control cards in Roubaix to update the chassis. The intervention remains scheduled for 25th February 2014.



Date: 2014-02-24 13:07:33 UTC
We are still having some packet loss issues. We will reboot the LNS control cards tonight, which will shut down all Orange and SFR PPP sessions. The intervention is planned for February 25th 2014 from 01:00.

Date: 2014-02-22 07:34:31 UTC
The sessions are back up. Today, we are monitoring the connections; everything seems OK now.



Date: 2014-02-22 06:15:24 UTC
We are going to reboot the LNS.



Date: 2014-02-21 21:26:22 UTC
Orange isn't detecting the issue on its side. We will reboot the Orange National collect card on the Roubaix LNS on 22 February at 05:00 am.


Date: 2014-02-21 15:08:37 UTC
Orange is processing the ticket.
Posted Feb 21, 2014 - 15:07 UTC