[Issues] Some AW_UTIL functions cause lost RPC connection errors

Hugo Buddelmeijer astrowise at hugo.doemaarwat.nl
Wed Mar 5 17:22:00 CET 2008


Hi all,

D.R. Boxhoorn wrote:
 > Could you please try again - with a 35 minute delay - with a new
 > connection and report whether the problem still occurs?

The same error occurred, both in AWBASE and in current.

I found a workaround, though. Many functions in the astro.main classes 
call database.connect() before running a database query, but most of 
them never call disconnect(). I updated my test script to disconnect from 
the database after every query, and the problem went away. See 
rpcTimeOutTest.3.withdisconnect.py.
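
The change to the test script amounts to something like the sketch below. 
This is not the attached file itself, only a minimal illustration. It 
assumes the database object offers connect(), cursor() and disconnect() 
as used above, and it takes that object as a parameter (my own choice, 
so the snippet stands alone).

```python
def do_query(database, q):
    """Run one query and always disconnect afterwards.

    `database` is assumed to behave like the object imported from
    common.database.Database: connect(), cursor(), disconnect().
    """
    database.connect()
    try:
        c = database.cursor()
        try:
            c.execute(q)
            return c.fetchall()
        finally:
            # Close the cursor even if the query raised an error.
            c.close()
    finally:
        # Drop the session so the next query starts on a fresh one.
        database.disconnect()
```

In the test script this would be called as do_query(database, query) 
before and after the 35 minute sleep.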

Perhaps some other programs assume that the database connection is 
always there. I suppose it will suffice to call disconnect() and 
connect() directly after each 'hazardous' operation (e.g. making an 
association), but I haven't tested this.
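
To make the idea concrete, here is an untested sketch of such a wrapper. 
The name run_then_reconnect is hypothetical, and the only assumption 
about the database object is that it has disconnect() and connect().

```python
def run_then_reconnect(database, operation):
    """Call operation(), then cycle the database connection.

    Hypothetical helper: the connection is dropped and reopened whether
    or not the operation succeeded, so that code which assumes a
    connection is always there still finds one afterwards.
    """
    try:
        return operation()
    finally:
        database.disconnect()
        database.connect()
```

For an association, the wrapped operation would be a small function 
that calls both make() and commit(), since I do not know whether it is 
safe to disconnect between the two.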

Should we keep this discussion on the mailing list?

Hugo

> On Tue, Mar 04, 2008 at 11:41:32AM +0100, Hugo Buddelmeijer wrote:
>> Hi all,
>>
>> It appears that some of the functions in AW_UTIL cause the ORA-28576
>> error: lost RPC connection to external procedure agent. This means that
>> scripts (or sessions) using these functions cannot last longer than 30
>> minutes.
>>
>> E.g. running this query twice with a 35 minute delay raises the above
>> error: "SELECT * FROM TABLE(AWOPER.AW_UTIL.RADIUSTEST(15, 243.0, 27.0,
>> 5.0/3600.0))". Other queries, such as simple SourceList queries, do not
>> raise this error.
>>
>> The AssociateList class uses the above function to find associations. As
>> a consequence, it is not possible to create several AssociateLists in a
>> script/session if the time between their creation is more than 30
>> minutes.
>>
>> Is there a way around this problem? Am I doing something wrong?
>>
>> Attached scripts show the timeout in question. rpcTimeOutTest.py
>> compares several queries which are specifically crafted for this test
>> and would not be used in a regular session. testAL8.py tries two
>> associations with an (artificial) delay; such a situation can be quite
>> common.
>>
>> Greetings,
>> Hugo
>>
>>
>>
> 
>> #!/usr/bin/env awe
>> from astro.main.AssociateList import AssociateList
>> from astro.main.SourceList import *
>> import time
>>
>> slid1 = 135751 # 2df_R_13
>> slid2 = 136161 # 2df_V_13
>> slid3 = 136121 # 2df_I_13
>>
>> sl1 = (SourceList.SLID == slid1)[0]
>> sl2 = (SourceList.SLID == slid2)[0]
>> sl3 = (SourceList.SLID == slid3)[0]
>>
>> # commenting out either the first association
>> # or the sleep results in no error
>> al1 = AssociateList()
>> al1.input_lists = [sl1, sl2]
>> al1.make()
>> al1.commit()
>>
>> time.sleep(35*60)
>>
>> al2 = AssociateList()
>> al2.input_lists = [sl1, sl3]
>> al2.make()
>> al2.commit()
>>
>>
>>
>> # output:
>> """
>> virgo15:~/phd/awe>awe testAL8.py
>> [virgo15] 13:38:31 - Preparing for the matching
>> [virgo15] 13:40:01 - Found 2893 sources in SourceList with SLID = 135751
>> [virgo15] 13:44:54 - Found 2901 sources in SourceList with SLID = 136161
>> [virgo15] 13:44:54 - Looking for pairs
>> [virgo15] 13:45:34 - Looking for closest pairs
>> [virgo15] 13:45:35 - Filtered out 79 pairs
>> [virgo15] 13:45:35 - Found 2456 pairs
>> [virgo15] 13:45:35 - Inserting first half of pairs
>> [virgo15] 13:45:35 - Inserting second half of pairs
>> [virgo15] 13:45:36 - Inserting null associations from last input list
>> [virgo15] 13:45:36 - Inserting null associations from first input list
>> [virgo15] 13:45:36 - Created Chain AssociateList with ALID = 62441, name =  and 3528 associates!
>> [virgo15] 14:20:37 - Preparing for the matching
>> [virgo15] 14:21:44 - Found 2893 sources in SourceList with SLID = 135751
>> [virgo15] 14:22:46 - Found 6590 sources in SourceList with SLID = 136121
>> [virgo15] 14:22:46 - Looking for pairs
>> Traceback (most recent call last):
>>   File "testAL8.py", line 23, in ?
>>     al2.make()
>>   File "/Users/users/buddel/phd/awe/cvs/opipe/astro/main/AssociateList.py", line 146, in make
>>     self.associate_sourcelists()
>>   File "/Users/users/buddel/phd/awe/cvs/opipe/astro/main/AssociateList.py", line 207, in associate_sourcelists
>>     self.associate_lists(list1=self.input_lists[0], list2=self.input_lists[1])
>>   File "/Users/users/buddel/phd/awe/cvs/opipe/astro/main/AssociateList.py", line 1131, in associate_lists
>>     c.execute(Tquery)
>> cx_Oracle.DatabaseError: ORA-28576: lost RPC connection to external procedure agent
>> """
>>
>>
> 
>> from common.database.Database import database
>> import sys,time
>>
>> # Simple function to do queries
>> def do_query(q):
>>     database.connect()
>>     c = database.cursor()
>>     c.execute(q)
>>     results = c.fetchall()
>>     c.close()
>>     return results
>>
>> # Determine what query we want to test
>> if len(sys.argv) != 2:
>>     print """Usage: "awe %s <number>" where number is
>>   1 for simple query of a SourceList
>>   2 for NeighBoursTest query
>>   3 for RadiusTest query""" % (sys.argv[0])
>>     sys.exit()
>>
>> if sys.argv[1] == '2':
>>     # This query will fail the second time
>>     query = 'SELECT * FROM TABLE(AWOPER.AW_UTIL.NEIGHBOURSTEST(4067390, 9, 9))'
>>     
>> elif sys.argv[1] == '3':
>>     # This query will also fail
>>     query = 'SELECT * FROM TABLE(AWOPER.AW_UTIL.RADIUSTEST(15, 243.000000, 27.000000, 5.000000/3600.0))'
>>     
>> else:
>>     # This query will succeed
>>     query = 'SELECT "SLID","SID","HTM" FROM AWOPER."SOURCELIST*SOURCES" T WHERE T.SLID = 136111 AND T.SID = 10'
>>
>>
>> # Tell the user about the query
>> print "query:",query
>>
>> # Do the query for the first time, will always work
>> data1 = do_query(query)
>> print "data1: %i rows" % (len(data1))
>>
>> # The RPC Timeout will occur after about half an hour
>> print "sleeping for 35 minutes"
>> minutes = 35
>> for i in range(minutes):
>>     print "sleeping minute %i/%i" % (i,minutes)
>>     time.sleep(60)
>>
>>
>> # Try again, it will fail in query 2 and 3
>> data2 = do_query(query)
>> print "data2: %i rows" % (len(data2))
>>
>> # cx_Oracle.DatabaseError: ORA-28576: lost RPC connection to external procedure agent
>>
>>
>>
>>
> 
>> _______________________________________________
>> Issues mailing list
>> Issues at astro-wise.org
>> http://listman.astro-wise.org/mailman/listinfo/issues

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: rpcTimeOutTest.3.withdisconnect.py
Url: http://listman.astro-wise.org/pipermail/issues/attachments/20080305/1ea39112/attachment.pl 

