Myreader.co.uk  
uk news, chat and community
date: Thu, 22 Oct 2009 09:00:00 +0100,    group: uk.net.providers.aaisp
[Status] [Update #6] [closed] 22:58 LNS restart   
Posted at 2009-10-21 22:58 BST by RevK
Update #6: 2009-10-22 09:00 BST

  This time it was us and all graphs are lost. We're investigating the
  cause now.
  
  Lines coming back. 20CN and 21CN affected. Be lines were only off a few
  seconds.
  
  Seeing as this major issue has happened we are updating LNS code
  tonight anyway. Lines will go on to "B" LNS now as they reconnect and
  will be moved back to "A" over night.
  
  Update: Again, 20CN lines very slow to reconnect. We are clearing stuck
  sessions again.
  Update: We have identified some extra slowness this time, which is some
  unexplained RADIUS issue our end.
  Update: We are also seeing IPv6 issues all of a sudden
  
  This is crazy!!
  
  Update: Our RADIUS being slow is sorted, but BT are still being slow
  sending sessions to us.
  
  Update: It is 00:20, and we have most people on, and have almost
  finished moving people to "B". We have an issue with native IPv6 on "B"
  which means we will move everyone back to "A" shortly (over night).
  
  Update: Nearly 1am, and we are moving lines back to the "A" LNS. IPv6
  is working properly, and lines are moving quickly and cleanly over. We
  are also picking up any remaining stuck sessions.
  
  We have lots of post-mortem to do tomorrow.
  
  Most people for most of this incident have been on-line. 21CN customers
  had an outage of a few minutes, and 20CN customers much longer as it takes
  a while to get stuck sessions cleared on BT.
  
  Update: 07:46 Two of us have had a look at the code over night to try
  and find the cause of the original crash - we think we have it now, so
  a further update over the weekend is expected.

URL: http://aaisp.blogspot.com/2009/10/open-2258-major-blip.html

-- 
AAISP Status Blog
URL:http://aaisp.blogspot.com/
date: Thu, 22 Oct 2009 09:00:00 +0100   author:   RevK

Re: [Status] [Update #6] [closed] 22:58 LNS restart   
On Thu, 22 Oct 2009 12:10:50 +0100, John Devereux wrote:

> We have had this for months now, with these problems for many or most 
> weekends and many evenings. I am getting a lot of pressure to change 
> providers, I think one more "weekend" blip should do it. (For us the 
> evenings and weekends are the *worst* times for you to schedule your 
> maintenance, since no one is there to reboot the router and of course 
> there is no technical support either).

Very good points. Maybe a weekly "at risk" period from 1000 to 1100
on a Tuesday morning would be better?

As a short term quick 'n dirty solution how about putting the router
on a time switch that turns it off for a minute in the wee small hours
every night? Or for a longer term solution feed it through a power
switch that you can dial into over the phone line. I think there are
quite cheap single socket ones available that respond to DTMF tones,
so you could do it from your mobile whilst down the pub...

-- 
Cheers
Dave.
date: Thu, 22 Oct 2009 14:27:27 +0100 (BST)   author:   Dave Liquorice

Re: [Status] [Update #6] [closed] 22:58 LNS restart   
My Zyxel router always used to manage to re-connect eventually without a power 
cycle, and for the overnight interruptions recently it has still done so after 
an hour or two. So, what is different about the daytime interruptions over the 
last few days? Would the Zyxel have re-connected eventually if I had not power 
cycled it?

To those people who find their office routers locked out: have you checked what 
state it was in before the power cycle? Was it connected to one of BT's dreaded 
"parking" WAN addresses? If so, get a router that allows you to specify a 
particular WAN address and refuse all others (and check that it implements this 
-- I've had at least one that allowed me to set a specific WAN address but 
would still connect to whatever it was given).
date: Thu, 22 Oct 2009 14:56:59 +0100   author:   Alfred E Neuman

Re: [Status] [Update #6] [closed] 22:58 LNS restart   
John Devereux wrote:
> "Dave Liquorice"  writes:
> 
>> On Thu, 22 Oct 2009 12:10:50 +0100, John Devereux wrote:
>>
>>> We have had this for months now, with these problems for many or most 
>>> weekends and many evenings. I am getting a lot of pressure to change 
>>> providers, I think one more "weekend" blip should do it. (For us the 
>>> evenings and weekends are the *worst* times for you to schedule your 
>>> maintenance, since no one is there to reboot the router and of course 
>>> there is no technical support either).
>> Very good points. Maybe a weekly "at risk" period from 1000 to 1100
>> on a Tuesday morning would be better?
>>
>> As a short term quick 'n dirty solution how about putting the router
>> on a time switch that turns it off for a minute in the wee small hours
>> every night?
> 
> Historically the problem has been that the router does not always
> reconnect - it stays stuck on 0.0.0.0. Recently I manually set the WAN
> address, and that seemed to cure that problem for a week. But last night
> it again did not reconnect until power-cycled this morning.
> 
>> Or for a longer term solution feed it through a power
>> switch that you can dial into over the phone line. I think there are
>> quite cheap single socket ones available that respond to DTMF tones,
>> so you could do it from your mobile whilst down the pub...
> 
> Hmmm, good idea. Did not know this was available, will look into it.

I've been using a manual adsl stop then restart without power
cycling, but was away from the system at the start of both outages,
so I need scripts to automate this for the AAISP-supplied Conexant
r-adsl-c1 and/or Billion 5200S. If someone has already got
this working it would save me some hacking.

David
date: Thu, 22 Oct 2009 15:49:29 +0000   author:   David Lord
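
The kind of automation asked for above could be sketched roughly as follows. This is only an illustration under stated assumptions, not a tested script for the Conexant or Billion routers: the ping target, the Linux-style ping flags, the failure threshold, and the `restart_adsl` hook are all hypothetical, and the actual stop/restart command depends on the router and is not given anywhere in this thread.

```python
import subprocess
import time
from typing import Callable, Optional


def link_is_up(host: str = "192.0.2.1", timeout_s: int = 5) -> bool:
    """Return True if one ping to `host` succeeds.

    `host` is a placeholder documentation address; substitute something
    that is only reachable over the ADSL link. The flags assume a
    Linux-style ping (-c count, -W timeout in seconds).
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def restart_adsl() -> None:
    """Hypothetical hook: replace with whatever your router accepts."""
    raise NotImplementedError("router-specific; not known from this thread")


def watchdog(
    probe: Callable[[], bool],
    restart: Callable[[], None],
    max_failures: int = 3,
    interval_s: float = 60.0,
    cycles: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call `restart` after `max_failures` consecutive failed probes.

    Runs forever when `cycles` is None, otherwise for that many probes.
    Returns the number of restarts triggered.
    """
    failures = 0
    restarts = 0
    done = 0
    while cycles is None or done < cycles:
        done += 1
        if probe():
            failures = 0  # link is fine, forget earlier failures
        else:
            failures += 1
            if failures >= max_failures:
                restart()
                restarts += 1
                failures = 0  # give the link time to come back
        sleep(interval_s)
    return restarts
```

Once `restart_adsl` has been filled in, it would be started with something like `watchdog(link_is_up, restart_adsl)`. Requiring several consecutive failures is a deliberate choice so that a single lost ping does not bounce a working line.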

Re: [Status] [Update #6] [closed] 22:58 LNS restart   
On Thu, 22 Oct 2009 14:56:59 +0100, Alfred E Neuman wrote:

> My Zyxel router always used to manage to re-connect eventually without a 
> power cycle, 

So did mine, normally within a few seconds. But recently it has been
getting "stuck" with a WAN IP of 0.0.0.0. It has sync and connection
to the exchange but can't do anything useful with it.

I have noticed that the stability of this P660R-61C has increased
though. Before it would struggle to reach 100hrs before a
spontaneous reboot. I now see greater than several hundred hours
fairly often.

-- 
Cheers
Dave.
date: Thu, 22 Oct 2009 17:10:08 +0100 (BST)   author:   Dave Liquorice
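
The two failure states described so far (a WAN IP stuck at 0.0.0.0, and a router that accepts an address other than the one configured) can be told apart with a small check. This is a minimal sketch: the function name, the labels, and the example addresses are assumptions for illustration, and how the reported WAN address is read from a given router is not covered.

```python
import ipaddress


def classify_wan(reported: str, expected: str) -> str:
    """Classify the WAN address a router reports.

    Returns "stuck" for 0.0.0.0, "ok" if it matches the expected
    static address, and "unexpected" for anything else (for example
    a "parking" address handed out instead of the real one).
    """
    addr = ipaddress.ip_address(reported)
    if addr.is_unspecified:  # 0.0.0.0: sync but no usable session
        return "stuck"
    if addr == ipaddress.ip_address(expected):
        return "ok"
    return "unexpected"
```

The addresses to pass in are whatever the router reports and whatever static address the ISP has assigned; any result other than "ok" would be a reason to look at the router before power cycling it.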

Re: [Status] [Update #6] [closed] 22:58 LNS restart   
On 23 Oct 2009, John Devereux stated:
> Well, right on schedule, it has gone off at 20:15 (Friday). And stayed
> off. 

20:15? That's interesting. It went off for a hunk around 19:30 for me,
but came back *on* at 20:11.

> But since the usual suggestion is "turn the router off for 20 minutes"
> probably not :(

Get another router? Mine has reconnected flawlessly every time, even
after the big LNS double-messup recently. Nothing special about it, just
an AAISP standard one.

If it happens with several routers, perhaps it's something BT-related?
date: Sat, 24 Oct 2009 10:59:38 +0100   author:   Nix

Re: [Status] [Update #6] [closed] 22:58 LNS restart   
In article , "John Devereux" wrote:
> 
> Nix  writes:
> 
> > On 23 Oct 2009, John Devereux stated:
> >> Well, right on schedule, it has gone off at 20:15 (Friday). And stayed
> >> off. 
> >
> > 20:15? That's interesting. It went off for a hunk around 19:30 for me,
> > but came back *on* at 20:11.
> >
> >> But since the usual suggestion is "turn the router off for 20 minutes"
> >> probably not :(
> >
> > Get another router? Mine has reconnected flawlessly every time, even
> > after the big LNS double-messup recently. Nothing special about it, just
> > an AAISP standard one.
> 
> Hi,
> 
> It's already another router, changed it a couple of weeks ago, for ~3rd
> time. New cable, connected directly to test socket.
> 
> > If it happens with several routers, perhaps it's something BT-related?
> 
> I think so, at least this time. It has been hard to prove up until
> recently. It has been the combination of the AAISP disconnects plus the
> failure by BT(?) to reconnect properly (0.0.0.0 IP).
> 


During the 20CN IPSC trial, I had a similar problem with failure to 
login again after PPP drop.  Shaun investigated this with me and it was 
determined that the encrypted password sent to AAISP from BT was 
corrupt. This resulted in a "password incorrect" failure to login.  My 
routers (Billion or ZyXEL) then went through what looked like a time-out 
interval before trying again.  Sometimes this cycle was repeated for 
several minutes before a good password was received.  Perhaps some 
routers drop out of this retry cycle and fall back to needing a manual 
restart?  We tried several types of password (short, long, complex, 
simple) but the only way of ensuring a quick login was for Shaun to set 
up for no password required at his end.

I understood this problem was in the BT network and had been identified 
and fixed before the end of the trial.  I've seen one or two long login 
periods since, but assumed these were due to "genuine" problems, like 
stuck sessions. :-) Most PPP drops and logins are now just a few 
seconds.  I've also found setting my router's WAN address rather than 
waiting for DHCP shaves a few seconds off the cycle.

Does your Status/Log page at AAISP tell you anything significant?

-- 
John W
If you want to mail me, replace the obvious with co.uk twice
date: Sat, 24 Oct 2009 15:43:53 +0100   author:   John Weston

Re: [Status] [Update #6] [closed] 22:58 LNS restart   
On Sat, 24 Oct 2009 16:29:58 +0100, John Devereux wrote:

>Oct 23 20:14:56 clueless bgpfeed: xxxxxx@a Route 81.187.19.108/32 now down
>Oct 23 20:14:56 clueless bgpfeed: xxxxxx@a Route 81.187.19.0/26 now down
>Oct 23 20:14:56 clueless radius-acct: BBIP17484856 Stopped xxxxxx@a AdminReset 

There are no connection tries in here at all. Have you got it set to
reconnect after a disconnect or whatever it's called in your modem?

>Oct 24 11:54:00 clueless radius-auth: BBIP17484856 Platform b.gormless 217.41.221.142 xxxxxx@a 
>Oct 24 11:54:00 clueless radius-auth: BBIP17484856 Accept 217.41.221.142 213.120.187.234 xxxxxx@a b.gormless txrate=7000000bps*95% linerate=7915000/7915000 MTU=1500 
>Oct 24 11:54:00 clueless bgpfeed: xxxxxx@a Route 81.187.19.0/26 now 90.155.53.12
>Oct 24 11:54:00 clueless bgpfeed: xxxxxx@a Route 81.187.19.108/32 now 90.155.53.12
>Oct 24 11:54:03 clueless radius-acct: BBIP17484856 Start xxxxxx@a 213.120.187.234 MRU=1500 linerate=6650000/7915000

-- 
Regards - Rodney Pont
The from address exists but is mostly dumped,
please send any emails to the address below
e-mail	ngpsm4 (at) infohitsystems (dot) ltd (dot) uk
date: Sat, 24 Oct 2009 17:13:58 +0100 (BST)   author:   Rodney Pont
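
The observation above, that nothing appears between the "Stopped" record and the next authentication, can be checked mechanically. The following is a rough sketch that assumes the line layout seen in the quoted log (timestamp, host, service, message) and assumes the year 2009, since the log lines carry no year; the function names are made up for illustration and nothing here is an AAISP tool.

```python
import re
from datetime import datetime, timedelta
from typing import Iterable, Optional, Tuple

# Matches e.g. "Oct 23 20:14:56 clueless radius-acct: BBIP... Stopped ..."
# and tolerates leading ">" quote markers.
LINE_RE = re.compile(
    r"^>*\s*(?P<stamp>[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) (?P<service>[\w-]+): (?P<message>.*)$"
)


def parse_line(line: str, year: int = 2009) -> Optional[Tuple[datetime, str, str]]:
    """Split one log line into (timestamp, service, message), or None."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    stamp = datetime.strptime(f"{year} {m['stamp']}", "%Y %b %d %H:%M:%S")
    return stamp, m["service"], m["message"]


def offline_gap(lines: Iterable[str], year: int = 2009) -> Optional[timedelta]:
    """Time from the last 'Stopped' accounting record to the first
    radius-auth record that follows it, or None if either is missing."""
    stopped = None
    for line in lines:
        parsed = parse_line(line, year)
        if parsed is None:
            continue
        stamp, service, message = parsed
        if service == "radius-acct" and "Stopped" in message.split():
            stopped = stamp
        elif service == "radius-auth" and stopped is not None:
            return stamp - stopped
    return None


SAMPLE = """\
Oct 23 20:14:56 clueless bgpfeed: xxxxxx@a Route 81.187.19.108/32 now down
Oct 23 20:14:56 clueless bgpfeed: xxxxxx@a Route 81.187.19.0/26 now down
Oct 23 20:14:56 clueless radius-acct: BBIP17484856 Stopped xxxxxx@a AdminReset
Oct 24 11:54:00 clueless radius-auth: BBIP17484856 Platform b.gormless 217.41.221.142 xxxxxx@a
""".splitlines()

print(offline_gap(SAMPLE))  # 15:39:04
```

On the lines quoted in this thread the gap is 15 hours 39 minutes, with no authentication attempt logged in between, which matches the point that the ISP side saw no reconnect tries during that period.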

Re: [Status] [Update #6] [closed] 22:58 LNS restart   
John Devereux wrote:
> John Weston <invalid@earlsway.invalid> writes:
> 
>> In article , "John Devereux" wrote:
>>> Nix  writes:
>>>
>>>> On 23 Oct 2009, John Devereux stated:
>>>>> Well, right on schedule, it has gone off at 20:15 (Friday). And stayed
>>>>> off. 
>>>> 20:15? That's interesting. It went off for a hunk around 19:30 for me,
>>>> but came back *on* at 20:11.
>>>>
>>>>> But since the usual suggestion is "turn the router off for 20 minutes"
>>>>> probably not :(
>>>> Get another router? Mine has reconnected flawlessly every time, even
>>>> after the big LNS double-messup recently. Nothing special about it, just
>>>> an AAISP standard one.
>>> Hi,
>>>
>>> It's already another router, changed it a couple of weeks ago, for ~3rd
>>> time. New cable, connected directly to test socket.
>>>
>>>> If it happens with several routers, perhaps it's something BT-related?
>>> I think so, at least this time. It has been hard to prove up until
>>> recently. It has been the combination of the AAISP disconnects plus the
>>> failure by BT(?) to reconnect properly (0.0.0.0 IP).
>>>
>>
>> During the 20CN IPSC trial, I had a similar problem with failure to 
>> login again after PPP drop.  Shaun investigated this with me and it was 
>> determined that the encrypted password sent to AAISP from BT was 
>> corrupt. This resulted in a "password incorrect" failure to login.  My 
>> routers (Billion or ZyXEL) then went through what looked like a time-out 
>> interval before trying again.  Sometimes this cycle was repeated for 
>> several minutes before a good password was received.  Perhaps some 
>> routers drop out of this retry cycle and fall back to needing a manual 
>> restart?  We tried several types of password (short, long, complex, 
>> simple) but the only way of ensuring a quick login was for Shaun to set 
>> up for no password required at his end.
> 
> Thanks, perhaps we can try that too.
> 
>> I understood this problem was in the BT network and had been identified 
>> and fixed before the end of the trial.  I've seen one or two long login 
>> periods since, but assumed these were due to "genuine" problems, like 
>> stuck sessions. :-) Most PPP drops and logins are now just a few 
>> seconds.  I've also found setting my router's WAN address rather than 
>> waiting for DHCP shaves a few seconds off the cycle.
>>
>> Does your Status/Log page at AAISP tell you anything significant?
> 
> Hi John,
> 
> Nothing significant to me...
> 
> Oct 23 20:14:56 clueless bgpfeed: xxxxxx@a Route 81.187.19.108/32 now down
> Oct 23 20:14:56 clueless bgpfeed: xxxxxx@a Route 81.187.19.0/26 now down
> Oct 23 20:14:56 clueless radius-acct: BBIP17484856 Stopped xxxxxx@a AdminReset 
> Oct 24 11:54:00 clueless radius-auth: BBIP17484856 Platform b.gormless 217.41.221.142 xxxxxx@a 
> Oct 24 11:54:00 clueless radius-auth: BBIP17484856 Accept 217.41.221.142 213.120.187.234 xxxxxx@a b.gormless txrate=7000000bps*95% linerate=7915000/7915000 MTU=1500 
> Oct 24 11:54:00 clueless bgpfeed: xxxxxx@a Route 81.187.19.0/26 now 90.155.53.12
> Oct 24 11:54:00 clueless bgpfeed: xxxxxx@a Route 81.187.19.108/32 now 90.155.53.12
> Oct 24 11:54:03 clueless radius-acct: BBIP17484856 Start xxxxxx@a 213.120.187.234 MRU=1500 linerate=6650000/7915000
> 
> (xxxxxx represents our login)
> 
> We are working again for now, had to send someone in to flick the switch
> since MD was jumping up and down.
> 

Up until the past few weeks I've seen very many ppp reconnects and
they were never a big problem, sometimes a short delay while
browsing. Logs show reconnects within a few seconds. Otherwise
I've twice had adsl uptimes > 130 days and a few others > 100 days
but recently I've struggled to stay online for a full week. The
instant reconnects are replaced with a manual adsl restart being
required, as attempting ppp reconnects has always failed.

David
date: Sat, 24 Oct 2009 17:27:43 +0000   author:   David Lord

Re: [Status] [Update #6] [closed] 22:58 LNS restart   
On Sat, 24 Oct 2009 17:27:40 +0100, John Devereux wrote:

>"Rodney Pont"  writes:
>
>> On Sat, 24 Oct 2009 16:29:58 +0100, John Devereux wrote:
>>
>>>Oct 23 20:14:56 clueless bgpfeed: xxxxxx@a Route 81.187.19.108/32 now down
>>>Oct 23 20:14:56 clueless bgpfeed: xxxxxx@a Route 81.187.19.0/26 now down
>>>Oct 23 20:14:56 clueless radius-acct: BBIP17484856 Stopped xxxxxx@a AdminReset 
>>
>> There are no connection tries in here at all. Have you got it set to
>> reconnect after a disconnect or whatever it's called in your modem?
>
>Think so, but will check. 
>
>
>>>Oct 24 11:54:00 clueless radius-auth: BBIP17484856 Platform b.gormless 217.41.221.142 xxxxxx@a 

The logs from the modem for this period might shed some light on it as
well. Do you have those still?

-- 
Regards - Rodney Pont
The from address exists but is mostly dumped,
please send any emails to the address below
e-mail	ngpsm4 (at) infohitsystems (dot) ltd (dot) uk
date: Sat, 24 Oct 2009 18:40:01 +0100 (BST)   author:   Rodney Pont

Re: [Status] [Update #6] [closed] 22:58 LNS restart   
On Sat, 24 Oct 2009 19:41:04 +0100, John Devereux wrote:

>>>>>Oct 23 20:14:56 clueless bgpfeed: xxxxxx@a Route 81.187.19.108/32 now down
>>>>>Oct 23 20:14:56 clueless bgpfeed: xxxxxx@a Route 81.187.19.0/26 now down
>>>>>Oct 23 20:14:56 clueless radius-acct: BBIP17484856 Stopped xxxxxx@a AdminReset 
>>>>
>>>> There are no connection tries in here at all. Have you got it set to
>>>> reconnect after a disconnect or whatever it's called in your modem?
>>>
>>>Think so, but will check. 
>
>There appears to be no such setting. (It's a USR "SureConnect", I am not
>that familiar with it since only just swapped it in to try to resolve
>issues).
>
>>>
>>>
>>>>>Oct 24 11:54:00 clueless radius-auth: BBIP17484856 Platform b.gormless 217.41.221.142 xxxxxx@a 
>>
>> The logs from the modem for this period might shed some light on it as
>> well. Do you have those still?
>
>None configured unfortunately.

All I can really say at the moment then is that AAISP are not seeing
any attempt to reconnect. Your next step has to be to assure yourself
that the modem is trying to and take it from there.

-- 
Regards - Rodney Pont
The from address exists but is mostly dumped,
please send any emails to the address below
e-mail	ngpsm4 (at) infohitsystems (dot) ltd (dot) uk
date: Sat, 24 Oct 2009 22:15:25 +0100 (BST)   author:   Rodney Pont
