[Issues] Some AW_UTIL functions cause lost RPC connection errors

John P. McFarland mcfarland at astro.rug.nl
Wed Mar 5 13:09:03 CET 2008


Hi Danny,

As you know, I too have been experiencing ORA-28576 errors during the 
association stage of GAstrometric.  This is an ongoing issue that you said was 
related to the HTM shared library and was under some form of investigation.

I am still getting the ORA-28576 errors during the association of a 
GAstrometric task preceded by another long GAstrometric task, but now I am 
more often getting "ORA-03113: end-of-file on communication channel" errors 
during the association and even at other times.  If I retry in the same 
session, I get "ORA-03114: not connected to ORACLE" errors at the first 
database access, indicating that the database connection has been lost.
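
For what it is worth, the three codes can be told apart in a script before 
deciding how to retry.  A minimal sketch, assuming the error text contains 
the ORA code as in the messages quoted in this thread; the helper names 
ora_code and session_is_dead are hypothetical and not part of awe or 
cx_Oracle:

```python
import re

# Error codes quoted in this thread, with the message text as reported.
ORA_MESSAGES = {
    28576: "lost RPC connection to external procedure agent",
    3113: "end-of-file on communication channel",
    3114: "not connected to ORACLE",
}


def ora_code(message):
    """Return the numeric ORA code found in an error message, or None."""
    match = re.search(r"ORA-(\d{5})", str(message))
    return int(match.group(1)) if match else None


def session_is_dead(message):
    """Guess whether the database session itself has been lost.

    Assumption: ORA-03113 and ORA-03114 mean the session is gone, while
    ORA-28576 only reports the loss of the external procedure agent.
    """
    return ora_code(message) in (3113, 3114)
```

Passing str(exc) of a caught cx_Oracle.DatabaseError to session_is_dead 
would then tell a script whether retrying in the same session is pointless.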

Are all these error messages related?
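
The retry with a new connection that you asked Hugo to try could be scripted 
roughly as follows.  This is only a sketch: query_with_fresh_connection is a 
hypothetical helper, connect is any callable that returns a DB-API 
connection, and sqlite3 stands in for the real database so that the example 
runs anywhere.  Whether a new connection actually avoids ORA-28576 is what 
still has to be tested.

```python
import sqlite3


def query_with_fresh_connection(connect, query, retries=1):
    """Run query on a brand-new connection, retrying on failure.

    Every attempt opens its own connection and closes it afterwards, so a
    retry never reuses a session that may already be disconnected.
    """
    last_error = None
    for _attempt in range(retries + 1):
        connection = None
        try:
            connection = connect()
            cursor = connection.cursor()
            cursor.execute(query)
            return cursor.fetchall()
        except Exception as exc:
            # Deliberately broad for the sketch: any failure triggers a retry.
            last_error = exc
        finally:
            if connection is not None:
                try:
                    connection.close()
                except Exception:
                    pass  # closing a dead connection may itself fail
    raise last_error


# Stand-in usage with an in-memory SQLite database instead of Oracle.
rows = query_with_fresh_connection(lambda: sqlite3.connect(":memory:"), "SELECT 1")
```

In a real script the except clause would be narrowed to the driver's 
DatabaseError so that programming errors are not retried.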

Cheers,


-=John


On Wed, 5 Mar 2008, D.R. Boxhoorn wrote:

>
> Hoi Hugo,
>
> Could you please try again - with a 35 minute delay - with a new connection
> and report whether the problem still occurs?
>
> Thanks,
>
>                                                   Danny
>
> On Tue, Mar 04, 2008 at 11:41:32AM +0100, Hugo Buddelmeijer wrote:
>> Hi all,
>>
>> It appears that some of the functions in AW_UTIL cause the ORA-28576
>> error: lost RPC connection to external procedure agent. This means that
>> scripts (or sessions) using these functions cannot last longer than 30
>> minutes.
>>
>> E.g. running this query twice with a 35-minute delay will raise the above
>> error: "SELECT * FROM TABLE(AWOPER.AW_UTIL.RADIUSTEST(15, 243.0, 27.0,
>> 5.0/3600.0))". Other queries, such as simple SourceList queries, do not
>> raise this error.
>>
>> The AssociateList class uses the above function to find associations. As a
>> consequence, it is not possible to create several AssociateLists in a
>> script/session if the time between their creation is more than 30
>> minutes.
>>
>> Is there a way around this problem? Am I doing something wrong?
>>
>> The attached scripts show the timeout in question. rpcTimeOutTest.py
>> compares several queries that are specifically crafted for this test
>> and would not be used in a regular session. testAL8.py tries two
>> associations with an (artificial) delay; such a situation can be quite
>> common.
>>
>> Greetings,
>> Hugo
>>
>>
>>
>
>> #!/usr/bin/env awe
>> from astro.main.AssociateList import AssociateList
>> from astro.main.SourceList import *
>> import time
>>
>> slid1 = 135751 # 2df_R_13
>> slid2 = 136161 # 2df_V_13
>> slid3 = 136121 # 2df_I_13
>>
>> sl1 = (SourceList.SLID == slid1)[0]
>> sl2 = (SourceList.SLID == slid2)[0]
>> sl3 = (SourceList.SLID == slid3)[0]
>>
>> # commenting out either the first association
>> # or the sleep results in no error
>> al1 = AssociateList()
>> al1.input_lists = [sl1, sl2]
>> al1.make()
>> al1.commit()
>>
>> time.sleep(35*60)
>>
>> al2 = AssociateList()
>> al2.input_lists = [sl1, sl3]
>> al2.make()
>> al2.commit()
>>
>>
>>
>> # output:
>> """
>> virgo15:~/phd/awe>awe testAL8.py
>> [virgo15] 13:38:31 - Preparing for the matching
>> [virgo15] 13:40:01 - Found 2893 sources in SourceList with SLID = 135751
>> [virgo15] 13:44:54 - Found 2901 sources in SourceList with SLID = 136161
>> [virgo15] 13:44:54 - Looking for pairs
>> [virgo15] 13:45:34 - Looking for closest pairs
>> [virgo15] 13:45:35 - Filtered out 79 pairs
>> [virgo15] 13:45:35 - Found 2456 pairs
>> [virgo15] 13:45:35 - Inserting first half of pairs
>> [virgo15] 13:45:35 - Inserting second half of pairs
>> [virgo15] 13:45:36 - Inserting null associations from last input list
>> [virgo15] 13:45:36 - Inserting null associations from first input list
>> [virgo15] 13:45:36 - Created Chain AssociateList with ALID = 62441, name =  and 3528 associates!
>> [virgo15] 14:20:37 - Preparing for the matching
>> [virgo15] 14:21:44 - Found 2893 sources in SourceList with SLID = 135751
>> [virgo15] 14:22:46 - Found 6590 sources in SourceList with SLID = 136121
>> [virgo15] 14:22:46 - Looking for pairs
>> Traceback (most recent call last):
>>   File "testAL8.py", line 23, in ?
>>     al2.make()
>>   File "/Users/users/buddel/phd/awe/cvs/opipe/astro/main/AssociateList.py", line 146, in make
>>     self.associate_sourcelists()
>>   File "/Users/users/buddel/phd/awe/cvs/opipe/astro/main/AssociateList.py", line 207, in associate_sourcelists
>>     self.associate_lists(list1=self.input_lists[0], list2=self.input_lists[1])
>>   File "/Users/users/buddel/phd/awe/cvs/opipe/astro/main/AssociateList.py", line 1131, in associate_lists
>>     c.execute(Tquery)
>> cx_Oracle.DatabaseError: ORA-28576: lost RPC connection to external procedure agent
>> """
>>
>>
>
>>
>> from common.database.Database import database
>> import sys,time
>>
>> # Simple function to do queries
>> def do_query(q):
>>     database.connect()
>>     c = database.cursor()
>>     c.execute(q)
>>     results = c.fetchall()
>>     c.close()
>>     return results
>>
>> # Determine what query we want to test
>> if len(sys.argv) != 2:
>>     print """Usage: "awe %s <number>" where number is
>>   1 for simple query of a SourceList
>>   2 for NeighBoursTest query
>>   3 for RadiusTest query""" % (sys.argv[0])
>>     sys.exit()
>>
>> if sys.argv[1] == '2':
>>     # This query will fail the second time
>>     query = 'SELECT * FROM TABLE(AWOPER.AW_UTIL.NEIGHBOURSTEST(4067390, 9, 9))'
>>
>> elif sys.argv[1] == '3':
>>     # This query will also fail
>>     query = 'SELECT * FROM TABLE(AWOPER.AW_UTIL.RADIUSTEST(15, 243.000000, 27.000000, 5.000000/3600.0))'
>>
>> else:
>>     # This query will succeed
>>     query = 'SELECT "SLID","SID","HTM" FROM AWOPER."SOURCELIST*SOURCES" T WHERE T.SLID = 136111 AND T.SID = 10'
>>
>>
>> # Tell the user about the query
>> print "query:",query
>>
>> # Do the query for the first time, will always work
>> data1 = do_query(query)
>> print "data1: %i rows" % (len(data1))
>>
>> # The RPC Timeout will occur after about half an hour
>> print "sleeping for 35 minutes"
>> minutes = 35
>> for i in range(minutes):
>>     print "sleeping minute %i/%i" % (i, minutes)
>>     time.sleep(60)
>>
>>
>> # Try again, it will fail in query 2 and 3
>> data2 = do_query(query)
>> print "data2: %i rows" % (len(data2))
>>
>> # cx_Oracle.DatabaseError: ORA-28576: lost RPC connection to external procedure agent
>>
>>
>>
>>
>
>> _______________________________________________
>> Issues mailing list
>> Issues at astro-wise.org
>> http://listman.astro-wise.org/mailman/listinfo/issues

