Consider Generating Parsers for Performance

Description

We have been evaluating DADL for a few days. We already have in-house code that generates POJOs and parsers. DADL's support for bit-level resolution could be an important factor for us to ditch our own code.

But our tests with DADL show significantly poorer performance.

Our serialization code:

Writing 10.000.000 messages took 2812 msecs
Reading 10.000.000 messages took 4662 msecs

DADL generated code:

Writing 1.000.000 messages took 32547 msecs
Reading 1.000.000 messages took 95888 msecs

(Note that there is a factor of 10 difference in message count.)
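
To put the two runs on the same scale, the timings can be normalized to a per-message cost. The following is only a sketch that does arithmetic on the figures quoted above; the class and method names are made up for illustration. With these numbers, DADL comes out roughly 116 times slower for writing and roughly 206 times slower for reading.

```java
public class BenchmarkRatio {

    // Average cost of one message in microseconds.
    static double microsPerMessage(long messages, long millis) {
        return millis * 1000.0 / messages;
    }

    // How many times slower the second run is per message than the baseline run.
    static double slowdown(long baseMessages, long baseMillis,
                           long otherMessages, long otherMillis) {
        return microsPerMessage(otherMessages, otherMillis)
                / microsPerMessage(baseMessages, baseMillis);
    }

    public static void main(String[] args) {
        // Figures copied from the benchmark output in this report.
        double write = slowdown(10_000_000L, 2812, 1_000_000L, 32547);
        double read = slowdown(10_000_000L, 4662, 1_000_000L, 95888);
        System.out.printf("write slowdown: %.1fx%n", write);
        System.out.printf("read slowdown:  %.1fx%n", read);
    }
}
```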

Our in-house code performs better since it also generates a serializer/parser for each message type. DADL accesses field-type information at runtime, and with little caching.
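
The difference between the two styles can be sketched roughly as follows. This is an illustration only: `Position`, `parseGenerated` and `parseReflective` are hypothetical names, the message layout is not taken from our model file, and neither method is the actual in-house or DADL implementation. The generated variant has field order and types fixed at generation time, while the dynamic variant looks up each field and its type on every call.

```java
import java.lang.reflect.Field;
import java.nio.ByteBuffer;

public class ParserSketch {

    // Hypothetical message type, used only for this illustration.
    public static class Position {
        public int latitude;
        public int longitude;
        public short altitude;
    }

    // Style 1: the kind of code a generator could emit for Position.
    // No type information is inspected at runtime.
    static Position parseGenerated(ByteBuffer in) {
        Position p = new Position();
        p.latitude = in.getInt();
        p.longitude = in.getInt();
        p.altitude = in.getShort();
        return p;
    }

    // Style 2: a generic parser that resolves every field and its type
    // through reflection each time a message is read.
    static <T> T parseReflective(Class<T> type, String[] fieldOrder, ByteBuffer in) {
        try {
            T obj = type.getDeclaredConstructor().newInstance();
            for (String name : fieldOrder) {
                Field f = type.getField(name);   // lookup repeated per message
                if (f.getType() == int.class) {
                    f.setInt(obj, in.getInt());
                } else if (f.getType() == short.class) {
                    f.setShort(obj, in.getShort());
                } else {
                    throw new IllegalArgumentException("unsupported type: " + f.getType());
                }
            }
            return obj;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(10);
        buf.putInt(48).putInt(11).putShort((short) 520).flip();
        Position a = parseGenerated(buf.duplicate());
        Position b = parseReflective(Position.class,
                new String[] {"latitude", "longitude", "altitude"}, buf.duplicate());
        System.out.println(a.latitude == b.latitude
                && a.longitude == b.longitude
                && a.altitude == b.altitude);
    }
}
```

Both methods produce the same object; the second one simply pays for the field and type lookups on every message.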

I think generating parsers (maybe as an option) wouldn't bother anyone who is already generating POJOs.

The model file used during the test is something like this:

Environment

None

Activity

Harald Wellmann
December 18, 2015, 9:33 PM

Thanks for this benchmark! I have been fully aware that performance is not optimal, and it's very interesting to see some specific figures.

So far, the focus for DADL has been on creating a working solution for some specific use cases with minimum development effort. In our context, the deserialization time is dominated by the latency of an external data source.

Not using generated parsers was a deliberate decision (e.g. compared to DataScript) - I expect the source code for DADL itself would be at least twice as large if we tried to generate parsers.

Much of DADL is modelled on JAXB, which doesn't generate parsers either. However, JAXB is doing a lot to optimize performance by caching metadata and avoiding reflection during (un)marshalling.

It would be interesting to see how much performance can be gained by optimizing the current dynamic approach and whether there would still be a significant gap compared to generated parsers.
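
One possible optimization of the dynamic approach is to resolve the reflective metadata for each type once and reuse it for every message. The sketch below shows the general idea only; it is not how JAXB or DADL is implemented, and `Header`, `fieldsOf` and the `lookups` counter are hypothetical names introduced for illustration.

```java
import java.lang.reflect.Field;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class MetadataCacheSketch {

    // Hypothetical message type, used only for this illustration.
    public static class Header {
        public int id;
        public short length;
    }

    // Counts how often the reflective lookup actually runs.
    static final AtomicInteger lookups = new AtomicInteger();

    // Resolved field handles per message class.
    private static final Map<Class<?>, Field[]> CACHE = new ConcurrentHashMap<>();

    // Returns the fields of the given type in the given order.
    // The reflective lookup happens only on the first call per type;
    // later calls return the cached array and ignore the names argument.
    static Field[] fieldsOf(Class<?> type, String... names) {
        return CACHE.computeIfAbsent(type, t -> {
            lookups.incrementAndGet();
            Field[] fields = new Field[names.length];
            for (int i = 0; i < names.length; i++) {
                try {
                    fields[i] = t.getField(names[i]);
                } catch (NoSuchFieldException e) {
                    throw new IllegalArgumentException(e);
                }
            }
            return fields;
        });
    }

    public static void main(String[] args) {
        Field[] first = fieldsOf(Header.class, "id", "length");
        Field[] second = fieldsOf(Header.class, "id", "length");
        System.out.println(first == second);   // same cached array
        System.out.println(lookups.get());     // number of real lookups
    }
}
```

A parser built on such a cache would still go through `Field` accessors for each value, so it narrows the gap to generated parsers without necessarily closing it.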

OPS4J is an open community and we love contributions and new committers. If you'd like to start optimizing or have a go at an alternative generator with per-type generated parsers, then just go ahead...

Assignee

Harald Wellmann

Reporter

Bahri Gencsoy

Labels

None

Affects versions

Priority

Major