[opencms-dev] Repeated (daily) crashes of OpenCms 7.0.5

Tim Howland th at wdogsystems.com
Fri Feb 20 13:42:26 CET 2009


MySQL has an unhealthy habit of killing connections that have been idle for
too long (the server's wait_timeout, 8 hours by default). If the pool doesn't
test them, it thinks they're still good, and the pool can become exhausted.

Try uncommenting the following statement in opencms.properties (and
commenting out the empty one) - I believe this will cause the evictor thread
to test each connection in the pool periodically.

db.pool.default.testQuery=SELECT STRUCTURE_ID FROM CMS_OFFLINE_STRUCTURE WHERE RESOURCE_PATH = '/'

You can also set test on borrow (testOnBorrow) to true; this may solve the
issue too, but it adds a performance penalty, because every connection is
then tested each time it is handed out.
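
For reference, here is a minimal sketch of the relevant pool settings in
opencms.properties. The key names are as I remember them from a stock 7.x
file and the values are illustrative assumptions, so check both against your
own copy. The idea is to have the evictor test idle connections well inside
MySQL's wait_timeout, so dead ones are dropped before a request borrows them.

```properties
# Validation query (must stay on a single line in the file)
db.pool.default.testQuery=SELECT STRUCTURE_ID FROM CMS_OFFLINE_STRUCTURE WHERE RESOURCE_PATH = '/'

# Let the evictor thread test idle connections periodically
db.pool.default.testWhileIdle=true
# Assumed, illustrative value: run the evictor every 10 minutes (milliseconds)
db.pool.default.timeBetweenEvictionRuns=600000

# Optional: test every connection before it is handed out (slower, but safest)
db.pool.default.testOnBorrow=false
```

The file is read at startup, so restart the webapp after editing it.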

HTH,

Tim

To: "opencms-dev at opencms.org" <opencms-dev at opencms.org>
Date: Thu, 19 Feb 2009 12:07:06 -0500
Subject: Re: [opencms-dev] Repeated (daily) crashes of OpenCms 7.0.5
All:  Thank you for the replies to my message yesterday.  Let me provide a
little more information about the problem that we discovered this morning,
which might be of assistance.  After that, I'll answer all the questions I
can that you've asked so far (see below), in one email for convenience.


NEW INFORMATION

- The issue with the missing module name in the URI
("/system/modules/resources/images/featurephoto.jpg") is probably not
relevant to the problem.  That URI was being calculated dynamically from a
custom folder property, which contains the module name to use.  It seems
that OpenCms was unable (due to the timeout) to read that property, so it
defaulted to the empty string.  In other words, this is a symptom of the
problem, not a cause.


- There's some evidence that the threads are waiting for something to happen
with MySQL.  We observe this behavior:  our sites remain up and running, but
no one can log into the Workplace.  The stacktrace in opencms.log is shown
below (much clipped to save space.)  Our IT people say there is nothing
unusual in the MySQL logs that correspond to these exceptions.

  org.opencms.db.CmsDbSqlException: An SQL error occurred when executing the following query: .
       at org.opencms.db.generic.CmsUserDriver.readUserInfos(CmsUserDriver.java:1473)
       [snip]
       at java.lang.Thread.run(Thread.java:619)
Caused by: org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error: Timeout waiting for idle object
       at org.apache.commons.dbcp.PoolingDriver.connect(PoolingDriver.java:184)
       at java.sql.DriverManager.getConnection(DriverManager.java:582)
       at java.sql.DriverManager.getConnection(DriverManager.java:207)
       at org.opencms.db.CmsSqlManager.getConnectionByUrl(CmsSqlManager.java:104)
       at org.opencms.db.generic.CmsSqlManager.getConnection(CmsSqlManager.java:231)
       at org.opencms.db.generic.CmsUserDriver.readUserInfos(CmsUserDriver.java:1445)
       ... 47 more
Caused by: java.util.NoSuchElementException: Timeout waiting for idle object
       at org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:825)
       at org.apache.commons.dbcp.PoolingDriver.connect(PoolingDriver.java:176)
       ... 52 more


ANSWERS TO YOUR QUESTIONS

Manfred Schenk wrote:
>
>Hi, could you describe your installation a bit further?
>Do you use a "Standard" installation or use some extra-stuff like "stripping
>away the opencms prefixes of the URLs" or a combination of apache and tomcat?
>
>I also had a stack overflow some months ago when I misconfigured the
>404-handler so that it had been called recursively until the stack overflowed.
>

It's a "strip" implementation, but nothing too out of the ordinary.  We
installed OpenCms as the ROOT to eliminate the application /opencms prefix.
But, instead of getting rid of the second /opencms prefix, for the main
servlet, we just renamed it to /scee.  Our customers are perfectly happy
with having a URL of, say, http://www.foo.com/scee/mypage.htm so there was
no urgency in getting rid of that second prefix.

If one of our domains is served by opencms, then Apache just passes it to
Tomcat.  Our rewrite rules are pretty simple:

  www.foo.com/ -> transform to www.foo.com/scee/en/
  */scee/* -> forward to Tomcat
  */export/* -> forward to Tomcat

Hmm...the 404 handler seems like a good place to check out the stack
overflow issue.  We have not configured that at all.  Certainly I would not
want any handler to be used on a JPG, and in particular not on a JPG that is
part of the site template (which is the case with that featurephoto.jpg file
above).  Otherwise, if I try to show a nice 404 message in the user's site
template, then that would seem to cause a recursive 404 and a stack
overflow.  Can anyone tell me how to fix that?


Christian Steinert wrote:
>
>Sadly, I have no idea about your actual problem, but you have your
>memory parameters the wrong way around: "-Xms1024m -Xmx512m " would
>mean: start with 1024M but don't use more than 512M.
>
>Do you have any idea, whether the crashes always happen around the
>same time of the night? Do you maybe have any jobs running around then
> - inside or outside of OpenCms?
>

My fault on the memory parameters; I mistranscribed them in the email. :-)
They are in fact -Xmx1024m and -Xms512m, just as you'd expect.

The crashes don't seem to happen at any particular time - it is more like
18-24 hours after each reboot.  The only job that we do have running
overnight is the system backup service, which does both the Tomcat directory
and the MySQL database.  The crashes don't seem to coincide with that.


Farnaz Fotrousi (and similarly Georgi Naplatanov) wrote:
>
>I had a similar problem and increasing max_connections in mysql worked
>for me.  You can edit "max_connections=100" in "my.ini or my.cnf" (mysql
>configuration file).
>

Thank you Farnaz and Georgi, we'll try that.  I see that there is no active
max_connections parameter in our my.cnf file (max_connections=200 is
commented out.)


OTHER THOUGHTS:

- Like Manfred, I am now starting to wonder about the possibility of a 404
recursion.  A few of those over time would certainly tie up threads to the
point where we'd see the behavior we're seeing.  Certainly I do not want any
404 processing on an image file of any sort; just send a 404 response and be
done with it.  How can I check to see if that's the issue?

- Also, is there any way to configure OpenCms for the maximum number of
MySQL database threads it supports?  Perhaps that's a Tomcat setting?  Keep
in mind that we have a separate Tomcat also hitting MySQL, and it has no
problems at all - it continues to run just fine while OpenCms repeatedly
crashes.  Thus, I think that if we're running out of threads/connections,
it's inside our Tomcat and not in MySQL (else, the other application would
be crashing too, correct?)

Thanks again,

Nick

-- 
Tim Howland
http://wdogsystems.com/
978-225-8494