HOAB

History of a bug

Tomcat, NIO, Hanging et CLOSE_WAIT

Rédigé par gorki Aucun commentaire

Problem :

We are testing a springboot application in AWS with ELB in front.

After a while of load-testing, the application was hanging :

  • HTTP 504 error code from Jmeter client
  • HTTP 502 if we raise ELB timeout
  • Once logged on the server :
    • telnet localhost 8080 was OK
    • sending GET / on this socket was not responding
    • plenty of CLOSE_WAIT socket
    • wget was also hanging (normal)
    • connection was established during wget hang
    • nothing in the log

 

Solution :

 

I initially think about the keepAlive timeout and pool of tomcat but

  1. SpringBoot copy the connectionTimeout parameter to keepAliveTimeout
  2. new socket is accepted and established
  3. CLOSE_WAIT wasn't shutdown after hour

Doing the test many times, I finally so a classical "Too many open files" in the log. That's why I could not see more log during the hang.

So we change the nproc and nofile in /etc/security/limits.conf

And taadaaaa ! Nothing change in :

cat /proc/<$PID>/limits

Thanks to blogs over the world like this one :

  • the service is start with systemd
  • to override ressources limits with systemd :
[Service]
...
LimitNOFILE=500000
LimitNPROC=500000

At last but not least, the value of Tomcat NIO socket queue is around 10000 + other files + other process... choose wisely your limit

@EJB @Singleton et ConcurrencyManagement

Rédigé par gorki Aucun commentaire

Le problème :

Un EJB singleton qui sert de cache remonte une erreur avec JBoss 7.1.1 :

EJB Invocation failed on component XXXX for method public java.lang.Object:
javax.ejb.EJBTransactionRolledbackException: JBAS014373: EJB 3.1 PFD2 4.8.5.5.1
concurrent access timeout on org.jboss.invocation.InterceptorContext$Invocation@398e7b17 
- could not obtain lock within 5000MILLISECONDS

Solution :

Tout est déjà écrit bien sur, encore aurait-il fallu le lire :)
Par défaut les EJB @Singleton sont

  1. @ConcurrencyManagement(ConcurrencyManagementType.CONTAINER)

  2. protégés par un @Lock(LockType.WRITE)

donc les méthodes sont synchronisés par le container. On peut

  • soit passer la méthode ou la classe en LockType.READ,
  • soit passer en Bean-managed (@ConcurrencyManagement(ConcurrencyManagementType.BEAN)). Là c'est vous qui gérer le tout.

Pour reproduire systématiquement, il suffit de mettre un sleep de 6s dans la méthode problématique.

Note:

  • Si le pool des ejb n'est pas assez grand on a un permit timeout et non un concurrent ascess timeout.
  • Le timeout sous JBoss est configuré dans cette section très intéressante :
<subsystem xmlns="urn:jboss:domain:ejb3:1.2">
            <session-bean>
                <stateless>
                    <bean-instance-pool-ref pool-name="slsb-strict-max-pool"/>
                </stateless>
                <stateful default-access-timeout="5000" cache-ref="simple"/>
                <singleton default-access-timeout="5000"/>
            </session-bean>
            <pools>
                <bean-instance-pools>
                    <strict-max-pool name="slsb-strict-max-pool" max-pool-size="20" instance-acquisition-timeout="5" instance-acquisition-timeout-unit="MINUTES"/>
                    <strict-max-pool name="mdb-strict-max-pool" max-pool-size="20" instance-acquisition-timeout="5" instance-acquisition-timeout-unit="MINUTES"/>
                </bean-instance-pools>
            </pools>
            <caches>
                <cache name="simple" aliases="NoPassivationCache"/>
                <cache name="passivating" passivation-store-ref="file" aliases="SimpleStatefulCache"/>
            </caches>
            <passivation-stores>
                <file-passivation-store name="file"/>
            </passivation-stores>
            <async thread-pool-name="default"/>
            <timer-service thread-pool-name="default">
                <data-store path="timer-service-data" relative-to="jboss.server.data.dir"/>
            </timer-service>
            <remote connector-ref="remoting-connector" thread-pool-name="default"/>
            <thread-pools>
                <thread-pool name="default">
                    <max-threads count="10"/>
                    <keepalive-time time="100" unit="milliseconds"/>
                </thread-pool>
            </thread-pools>
        </subsystem>

 

 

Fil RSS des articles de ce mot clé