JDBC Fetch Size
- biwhy
- May 30, 2024
- 2 min read
Impact on Data Transfer Performance
Problem:
Data transfer takes considerable time, for example because the frontend and backend are located in different data centers.
Solution:
High network bandwidth is not the only factor influencing fast data transfer. How communicating programs send information is also crucial.
If data is sent in small chunks, network-layer optimizations can only act on each small chunk in isolation, and time is spent waiting for each chunk’s transfer to complete before the next one is requested.
The rough time taken can be summarized as follows:
- Worst case (small chunks): round-trip time (an approximation of one chunk's transfer) * number of chunks
- Best case (one big chunk): (data to transfer) / (network bandwidth)
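To illustrate with made-up numbers (not the actual parameters of our test below): fetching 5.5 million rows in chunks of 100 means 55,000 round trips; at 10 ms per round trip, waiting alone costs 550 seconds, about 9 minutes, whereas the same data streamed as one large transfer over a fast link can arrive in tens of seconds.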
In the case of JDBC, chunks can be regulated by the Array Fetch Size parameter.
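For reference, here is a minimal plain-JDBC sketch of setting it. The connection URL, credentials, and table name are placeholders; also note that some drivers (PostgreSQL's, for example) only honor the fetch size hint when autocommit is off.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class FetchSizeDemo {
    public static void main(String[] args) throws Exception {
        // Placeholder URL and credentials -- substitute your own.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://db-host:5432/demo", "user", "secret")) {
            conn.setAutoCommit(false); // some drivers ignore fetchSize with autocommit on
            try (PreparedStatement stmt = conn.prepareStatement(
                    "SELECT id, payload FROM big_table")) {
                // Hint: pull rows from the server in chunks of 10,000 per round trip.
                stmt.setFetchSize(10_000);
                try (ResultSet rs = stmt.executeQuery()) {
                    while (rs.next()) {
                        // process the row
                    }
                }
            }
        }
    }
}
```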
Note that different JDBC driver implementations might treat this parameter differently. Additionally, some engineers might be hesitant to use large numbers.
Fortunately, it is easy to test with BiWhy (it took only a couple of hours to add this functionality).
Please see the screenshots with the results below:
The default BOBJ limit of 1k rows is three times slower than the optimal setting!
SAP Note: 1464707 - How To Increase The Array Fetch Size Above The Maximum
A frequently set value of 100 rows takes 9 minutes and 10 seconds instead of 20 seconds, which is about 30 times worse than the optimal setting!
We did not even test the frequently set, if not default, value of 10 rows, for obvious reasons.
For our test, the optimal range was:
- 0 (the value is not set; the JDBC driver decides)
- more than 10k rows
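For readers who want to reproduce such a test with plain JDBC instead of BiWhy, here is a hedged sketch of a micro-benchmark over several fetch sizes. The URL, credentials, and query are placeholders, and the timings will of course depend on your network, database, and driver.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class FetchSizeBenchmark {
    public static void main(String[] args) throws Exception {
        int[] fetchSizes = {0, 100, 1_000, 10_000, 100_000}; // 0 = let the driver decide
        // Placeholder URL and credentials -- substitute your own.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://db-host:5432/demo", "user", "secret")) {
            conn.setAutoCommit(false); // some drivers ignore fetchSize with autocommit on
            for (int size : fetchSizes) {
                long start = System.nanoTime();
                long rows = 0;
                try (PreparedStatement stmt = conn.prepareStatement(
                        "SELECT id, payload FROM big_table")) {
                    stmt.setFetchSize(size);
                    try (ResultSet rs = stmt.executeQuery()) {
                        while (rs.next()) {
                            rows++; // drain the result set end to end
                        }
                    }
                }
                System.out.printf("fetchSize=%d: %d rows in %.1f s%n",
                        size, rows, (System.nanoTime() - start) / 1e9);
            }
        }
    }
}
```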
From a performance perspective, it is unclear why one would set the fetch size to a low number, unless it is for a real-time service or there is a requirement to abort after fetching a certain limit (in that case, the suggested minimum fetch size would be limit + 1 rows; see the sketch below).
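A minimal sketch of that limit + 1 case, assuming a hypothetical query, table, and limit: the extra row is never processed, it only signals that more data exists beyond the cap.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class LimitedFetch {
    public static void main(String[] args) throws Exception {
        int limit = 500; // illustrative abort threshold
        // Placeholder URL and credentials -- substitute your own.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://db-host:5432/demo", "user", "secret")) {
            conn.setAutoCommit(false); // same driver caveat as above
            try (PreparedStatement stmt = conn.prepareStatement(
                    "SELECT id, payload FROM big_table")) {
                // limit + 1: the whole capped result fits in one round trip,
                // and the extra row (if present) reveals that data was cut off.
                stmt.setFetchSize(limit + 1);
                try (ResultSet rs = stmt.executeQuery()) {
                    int seen = 0;
                    boolean truncated = false;
                    while (rs.next()) {
                        if (++seen > limit) { truncated = true; break; }
                        // process the row
                    }
                    System.out.println("rows=" + Math.min(seen, limit)
                            + ", more available=" + truncated);
                }
            }
        }
    }
}
```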
PS: We once encountered a performance problem with a third-party ETL tool (not SAP). Despite the vendor being a large company by market cap and boasting about leveraging AI on its website, network traces showed that data was transferred in very small chunks (~10 rows), and an array fetch size setting was NOT available!


