Since upgrading the version of the maven-surefire-plugin we often see projects containing Pax Exam based tests taking a long time to execute. In these cases we can attribute the extra time to a (default) timeout of 30 seconds used by Surefire to eventually continue when the forked process did not terminate.
After analysis we have concluded that the regular shutdown protocol between Surefire plugin and forked process is halted due to an invalid command being read. The Thread dies and the 'BYE_ACK' from Surefire plugin towards forked process is never handled in the forked process, falling back to the timeout that forces the fork to stop.
I believe the source of the corruption of the communication between Surefire maven plugin and forked process can be attributed to Pax Exam's own method of forking a test container. Specifically the way the process input/output streams are handled using 'Pipes'. I see both the Surefire fork (running the test) and the Forked Test Container reading from the same System.in input stream. This means that some of the data on this InputStream intended for the Surefire forked logic is 'stolen' by the Pipe logic used by ops4j-base-exec's DefaultJavaRunner. This results in the input meant for Surefire logic running in the Surefire's forked process being incorrect.
I see that the current status of ops4j-base-exec's DefaultJavaRunner no longer contains this Pipe logic and when re-managing the version of this library to a locally build SNAPSHOT does seem to solve this issue, however with some loss of functionality: the output of the Forked Test Container is no longer reported in the stdout of surefire plugin execution and the following warning being reported:
[WARNING] Corrupted STDOUT by directly writing to native stream in forked JVM 1. See FAQ web page and the dump file /Users/arnoud/fork-test/target/surefire-reports/2018-09-26T22-43-26_347-jvmRun1.dumpstream
I have created a project that can be used to reproduce the issue: https://github.com/glimmerveen/fork-test . Note that the issue can be seen as the Maven build being blocked for about 30 seconds after the following line in the build output:
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.872 s - in org.glimmerveen.forktest.OsgiTest
Pax Exam 4.12.0