As a software guy, I was always curious to know how things work at the hardware level and how to apply that knowledge to more advanced optimizations in applications. Take the Java Memory Model, for instance. The model grounds its memory consistency and visibility properties in keywords such as volatile or synchronized. But these are just language keywords, and you start wondering how JVM engineers could bring the model to life. At some point, you will breathe out, discovering that at the very bottom of the software pie running on physical machines the model relies on a low-level instruction set for mutexes and memory barriers. Nice, these are the instructions a CPU understands, but curiosity drives you further, because it is still vague how all the memory consistency guarantees can be satisfied on multi-CPU machines with several CPU registers and caches. Well, the hardware guys took care of this by supporting a cache coherence protocol. And finally you, as a software guy, can develop highly performant applications that halt CPUs and invalidate their caches only on purpose, with all these volatile, synchronized and final keywords.
Apache Ignite veterans tapped into the knowledge above and, undoubtedly, that is how they could deliver one of the fastest in-memory databases and computational platforms. Presently, the same people are optimizing Ignite Native Persistence - Ignite's distributed and transactional persistence layer. Being part of that community, let me share some tips about solid-state drives (SSDs) that you, as a software guy, can exploit in Ignite or other disk-based database deployments.
SSD Level Garbage Collection
The garbage collection (GC) term is used not only by Java developers to describe the process of purging dead objects from the Java heap residing in RAM. Hardware guys use the same term, for the same purpose, in relation to SSDs.

In simple words, an SSD stores data in pages. Pages are grouped into blocks (usually 128/256 pages per block). The SSD controller can write data directly into an empty page but can erase whole blocks only. Thus, to reclaim the space occupied by invalid data, all the valid data from one block first has to be copied into empty pages of another block. Once this happens, the controller will purge all the data from the first block, giving more space for new data arriving from your applications.
This process happens in the background and goes by a familiar term - garbage collection (GC).
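The erase-whole-blocks-only constraint is the crux, so here is a toy sketch of it in Java. Everything in it is a simplification for illustration: real controllers track page mappings in a flash translation layer, and the class and method names below are made up for this post.

```java
/**
 * A toy model of SSD garbage collection. An SSD can program empty pages
 * individually but can erase only whole blocks, so reclaiming space means
 * relocating the still-valid pages of a victim block first.
 */
public class SsdGcSketch {

    static final int PAGES_PER_BLOCK = 128;  // usually 128 or 256

    /** A block tracks which of its pages still hold valid data. */
    static class Block {
        boolean[] valid = new boolean[PAGES_PER_BLOCK];
    }

    /**
     * Reclaim the victim block: copy its valid pages into empty pages of
     * the target block, then erase the victim entirely.
     * Returns how many pages had to be relocated.
     */
    static int collect(Block victim, Block target) {
        int moved = 0;
        int free = 0;
        for (int i = 0; i < PAGES_PER_BLOCK; i++) {
            if (victim.valid[i]) {
                while (target.valid[free]) free++;   // find an empty page
                target.valid[free] = true;           // relocate the valid page
                moved++;
            }
        }
        victim.valid = new boolean[PAGES_PER_BLOCK]; // erase the whole block
        return moved;
    }

    public static void main(String[] args) {
        Block victim = new Block();
        Block target = new Block();
        // Only 3 valid pages survive among otherwise invalid data.
        victim.valid[0] = victim.valid[5] = victim.valid[100] = true;
        System.out.println("relocated pages: " + collect(victim, target));
    }
}
```

Note that the three valid pages are rewritten even though the application never touched them again - this write amplification is exactly the hidden work that competes with your application's I/O.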
So, if you suddenly observe a performance drop under a steady load, like the one shown in Figure 1. below, do not rush to blame your application or Apache Ignite. The drop might be caused by the SSD GC routines.
Figure 1.
Let me give you several hints on how to decrease the impact of the SSD GC on the performance of applications.
Separate Disk Devices for WAL and Data/Index Files
Apache Ignite arranges data and indexes in special partition files on disk. This type of architecture does not require you to have all the data in RAM: if something is missing there, Apache Ignite will find the data in these files on disk.
Figure 2.
However, referring to Figure 2., every update (1) that is received by an Apache Ignite cluster node will be stored in RAM and persisted (2) in a write-ahead log (WAL) first. This is done for performance reasons: once the update is in the WAL, your application will get the acknowledgment and be able to execute its logic. Then, in the background, the checkpointing process will update the partition files by copying dirty pages from RAM to disk (4). Older WAL files will be archived over time and can be safely removed because all their data will already be in the partition files.
So, what's the performance hint here? Consider using separate SSDs for the partition files and the WAL. Apache Ignite actively writes to both locations, so by giving each its own physical disk device you may double the overall write throughput. See how to tweak the configuration for that.
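As a sketch of what such a configuration might look like (the mount points /ssd1 and /ssd2 are placeholders for your two physical devices, and the API shown is the programmatic DataStorageConfiguration available in Ignite 2.x):

```java
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class SeparateDisksConfig {
    public static void main(String[] args) {
        DataStorageConfiguration storageCfg = new DataStorageConfiguration();

        // Partition and index files go to the first SSD...
        storageCfg.setStoragePath("/ssd1/ignite/storage");

        // ...while the WAL and its archive live on a second physical SSD.
        storageCfg.setWalPath("/ssd2/ignite/wal");
        storageCfg.setWalArchivePath("/ssd2/ignite/wal-archive");

        // Enable Ignite Native Persistence for the default data region.
        storageCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setDataStorageConfiguration(storageCfg);

        Ignition.start(cfg);
    }
}
```

The key point is that the two paths resolve to different physical devices; putting them on two partitions of the same SSD buys you nothing.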
SSD Over-provisioning
Like the Java heap, an SSD requires free space to perform efficiently and to avoid significant performance drops due to GC. All SSD manufacturers reserve some amount of space for that purpose. This is called over-provisioning.

Here you, as a software guy, should keep in mind that the performance of random writes on a 50% filled disk is much better than on a 90% filled disk because of the SSD's over-provisioning and GC. Consider buying SSDs with a higher over-provisioning rate, and make sure the manufacturer supports tools to adjust it.
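To make the over-provisioning rate concrete, here is the common back-of-the-envelope formula. A caveat: vendors differ in how they count capacities (GB vs. GiB), so the drive sizes below are hypothetical round numbers, not a quote from any spec sheet.

```java
/**
 * Over-provisioning (OP) is the raw NAND capacity the controller keeps
 * back from the user, commonly expressed as:
 *
 *   OP% = (rawCapacity - userCapacity) / userCapacity * 100
 */
public class OverProvisioning {

    static double overProvisioningPercent(double rawGb, double userGb) {
        return (rawGb - userGb) / userGb * 100.0;
    }

    public static void main(String[] args) {
        // A hypothetical drive with 256 GB of raw flash sold as 240 GB:
        System.out.printf("OP = %.1f%%%n", overProvisioningPercent(256, 240));
        // The same flash sold as 200 GB reserves far more room for GC:
        System.out.printf("OP = %.1f%%%n", overProvisioningPercent(256, 200));
    }
}
```

The more spare area the controller has, the less often GC has to relocate valid pages in the foreground of your writes - which is exactly why a half-full disk sustains random writes so much better than a nearly full one.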