java堆 数据结构 堆_Java中的紧凑堆外结构/组合
java堆 數據結構 堆
在上一篇文章中,我詳細介紹了代碼對主內存的訪問方式的含義。 從那時起,我就在Java中可以做什么以實現更可預測的內存布局提出了很多疑問。 有些模式可以使用數組支持的結構來應用,我將在另一篇文章中討論。 這篇文章將探討如何模擬Java中非常缺少的功能-類似于C所提供的結構數組。結構在堆棧和堆上都非常有用。 據我所知,不可能在Java堆棧上模擬此功能。 無法在堆棧上執(zhí)行此操作真是令人遺憾,因為它極大地限制了某些并行算法的性能,但這又是另一回事。
在Java中,所有用戶定義的類型都必須存在于堆中。 在一般情況下,Java堆由垃圾收集器管理,但是Java進程中的堆更大。 通過引入直接ByteBuffer ,可以分配內存,該內存不會被垃圾回收器跟蹤,因為本機代碼可以將其用于任務,例如避免為IO向內核復制數據或從內核復制數據。 因此,一種管理結構的方法是在ByteBuffer中偽造它們,這是一種合理的方法。 這可以允許緊湊的數據表示,但是具有性能和大小限制。 例如,不可能有大于2GB的ByteBuffer,并且所有訪問都經過邊界檢查,這會影響性能。 存在使用Unsafe的替代方法,它不僅速度更快而且不受ByteBuffer的大小限制。
我要詳細介紹的方法不是傳統(tǒng)的Java。 如果您的問題空間正在處理大數據或極高的性能,那么就會有好處。 如果您的數據集很小,并且性能不是問題,那么請立即逃避以避免陷入本機內存管理的黑暗技巧。
我將詳細介紹的方法的好處是:
所有選擇都有其后果。 通過采用下面詳述的方法,您自己負責一些內存管理。 弄錯了會導致內存泄漏,或者更糟的是,您可能使JVM崩潰! 謹慎行事...
合適的例子– 貿易數據
財務應用程序面臨的一個共同挑戰(zhàn)是捕獲和處理大量的訂單和交易數據。 對于該示例,我將創(chuàng)建一個很大的內存貿易數據表,可以對它進行分析查詢。 該表將使用兩種對比方法構建。 首先,我將采用傳統(tǒng)的Java方法來創(chuàng)建大型數組并引用單個Trade對象。 其次,我保持用法代碼相同,但是用可通過Flyweight模式操作的堆外結構數組替換大數組和Trade對象。
如果對于傳統(tǒng)的Java方法,我使用其他一些數據結構(例如Map或Tree),則內存占用量將更大,而性能會更低。
傳統(tǒng)的Java方法
public class TestJavaMemoryLayout {private static final int NUM_RECORDS = 50 * 1000 * 1000;private static JavaMemoryTrade[] trades;public static void main(final String[] args){for (int i = 0; i < 5; i++){System.gc();perfRun(i);}}private static void perfRun(final int runNum){long start = System.currentTimeMillis();init();System.out.format('Memory %,d total, %,d free\n',Runtime.getRuntime().totalMemory(),Runtime.getRuntime().freeMemory());long buyCost = 0;long sellCost = 0;for (int i = 0; i < NUM_RECORDS; i++){final JavaMemoryTrade trade = get(i);if (trade.getSide() == 'B'){buyCost += (trade.getPrice() * trade.getQuantity());}else{sellCost += (trade.getPrice() * trade.getQuantity());}}long duration = System.currentTimeMillis() - start;System.out.println(runNum + ' - duration ' + duration + 'ms');System.out.println('buyCost = ' + buyCost + ' sellCost = ' + sellCost);}private static JavaMemoryTrade get(final int index){return trades[index];}public static void init(){trades = new JavaMemoryTrade[NUM_RECORDS];final byte[] londonStockExchange = {'X', 'L', 'O', 'N'};final int venueCode = pack(londonStockExchange);final byte[] billiton = {'B', 'H', 'P'};final int instrumentCode = pack( billiton);for (int i = 0; i < NUM_RECORDS; i++){JavaMemoryTrade trade = new JavaMemoryTrade();trades[i] = trade;trade.setTradeId(i);trade.setClientId(1);trade.setVenueCode(venueCode);trade.setInstrumentCode(instrumentCode);trade.setPrice(i);trade.setQuantity(i);trade.setSide((i & 1) == 0 ? 'B' : 'S');}}private static int pack(final byte[] value){int result = 0;switch (value.length){case 4:result = (value[3]);case 3:result |= ((int)value[2] << 8);case 2:result |= ((int)value[1] << 16);case 1:result |= ((int)value[0] << 24);break;default:throw new IllegalArgumentException('Invalid array size');}return result;}private static class JavaMemoryTrade{private long tradeId;private long clientId;private int venueCode;private int instrumentCode;private long price;private long quantity;private char side;public long getTradeId(){return tradeId;}public void setTradeId(final long tradeId){this.tradeId = tradeId;}public long getClientId(){return clientId;}public void setClientId(final long clientId){this.clientId = clientId;}public int getVenueCode(){return venueCode;}public void setVenueCode(final int venueCode){this.venueCode = venueCode;}public int getInstrumentCode(){return instrumentCode;}public void setInstrumentCode(final int instrumentCode){this.instrumentCode = instrumentCode;}public long getPrice(){return price;}public void setPrice(final long price){this.price = price;}public long getQuantity(){return quantity;}public void setQuantity(final long quantity){this.quantity = quantity;}public char getSide(){return side;}public void setSide(final char side){this.side = side;}} }緊湊型堆外結構
import sun.misc.Unsafe;import java.lang.reflect.Field;public class TestDirectMemoryLayout {private static final Unsafe unsafe;static{try{Field field = Unsafe.class.getDeclaredField('theUnsafe');field.setAccessible(true);unsafe = (Unsafe)field.get(null);}catch (Exception e){throw new RuntimeException(e);}}private static final int NUM_RECORDS = 50 * 1000 * 1000;private static long address;private static final DirectMemoryTrade flyweight = new DirectMemoryTrade();public static void main(final String[] args){for (int i = 0; i < 5; i++){System.gc();perfRun(i);}}private static void perfRun(final int runNum){long start = System.currentTimeMillis();init();System.out.format('Memory %,d total, %,d free\n',Runtime.getRuntime().totalMemory(),Runtime.getRuntime().freeMemory());long buyCost = 0;long sellCost = 0;for (int i = 0; i < NUM_RECORDS; i++){final DirectMemoryTrade trade = get(i);if (trade.getSide() == 'B'){buyCost += (trade.getPrice() * trade.getQuantity());}else{sellCost += (trade.getPrice() * trade.getQuantity());}}long duration = System.currentTimeMillis() - start;System.out.println(runNum + ' - duration ' + duration + 'ms');System.out.println('buyCost = ' + buyCost + ' sellCost = ' + sellCost);destroy();}private static DirectMemoryTrade get(final int index){final long offset = address + (index * DirectMemoryTrade.getObjectSize());flyweight.setObjectOffset(offset);return flyweight;}public static void init(){final long requiredHeap = NUM_RECORDS * DirectMemoryTrade.getObjectSize();address = unsafe.allocateMemory(requiredHeap);final byte[] londonStockExchange = {'X', 'L', 'O', 'N'};final int venueCode = pack(londonStockExchange);final byte[] billiton = {'B', 'H', 'P'};final int instrumentCode = pack( billiton);for (int i = 0; i < NUM_RECORDS; i++){DirectMemoryTrade trade = get(i);trade.setTradeId(i);trade.setClientId(1);trade.setVenueCode(venueCode);trade.setInstrumentCode(instrumentCode);trade.setPrice(i);trade.setQuantity(i);trade.setSide((i & 1) == 0 ? 'B' : 'S');}}private static void destroy(){unsafe.freeMemory(address);}private static int pack(final byte[] value){int result = 0;switch (value.length){case 4:result |= (value[3]);case 3:result |= ((int)value[2] << 8);case 2:result |= ((int)value[1] << 16);case 1:result |= ((int)value[0] << 24);break;default:throw new IllegalArgumentException('Invalid array size');}return result;}private static class DirectMemoryTrade{private static long offset = 0;private static final long tradeIdOffset = offset += 0;private static final long clientIdOffset = offset += 8;private static final long venueCodeOffset = offset += 8;private static final long instrumentCodeOffset = offset += 4;private static final long priceOffset = offset += 4;private static final long quantityOffset = offset += 8;private static final long sideOffset = offset += 8;private static final long objectSize = offset += 2;private long objectOffset;public static long getObjectSize(){return objectSize;}void setObjectOffset(final long objectOffset){this.objectOffset = objectOffset;}public long getTradeId(){return unsafe.getLong(objectOffset + tradeIdOffset);}public void setTradeId(final long tradeId){unsafe.putLong(objectOffset + tradeIdOffset, tradeId);}public long getClientId(){return unsafe.getLong(objectOffset + clientIdOffset);}public void setClientId(final long clientId){unsafe.putLong(objectOffset + clientIdOffset, clientId);}public int getVenueCode(){return unsafe.getInt(objectOffset + venueCodeOffset);}public void setVenueCode(final int venueCode){unsafe.putInt(objectOffset + venueCodeOffset, venueCode);}public int getInstrumentCode(){return unsafe.getInt(objectOffset + instrumentCodeOffset);}public void setInstrumentCode(final int instrumentCode){unsafe.putInt(objectOffset + instrumentCodeOffset, instrumentCode);}public long getPrice(){return unsafe.getLong(objectOffset + priceOffset);}public void setPrice(final long price){unsafe.putLong(objectOffset + priceOffset, price);}public long getQuantity(){return unsafe.getLong(objectOffset + quantityOffset);}public void setQuantity(final long quantity){unsafe.putLong(objectOffset + quantityOffset, quantity);}public char getSide(){return unsafe.getChar(objectOffset + sideOffset);}public void setSide(final char side){unsafe.putChar(objectOffset + sideOffset, side);}} }結果
Intel i7-860 @ 2.8GHz, 8GB RAM DDR3 1333MHz, Windows 7 64-bit, Java 1.7.0_07 ============================================= java -server -Xms4g -Xmx4g TestJavaMemoryLayout Memory 4,116,054,016 total, 1,108,901,104 free 0 - duration 19334ms Memory 4,116,054,016 total, 1,109,964,752 free 1 - duration 14295ms Memory 4,116,054,016 total, 1,108,455,504 free 2 - duration 14272ms Memory 3,817,799,680 total, 815,308,600 free 3 - duration 28358ms Memory 3,817,799,680 total, 810,552,816 free 4 - duration 32487msjava -server TestDirectMemoryLayout Memory 128,647,168 total, 126,391,384 free 0 - duration 983ms Memory 128,647,168 total, 126,992,160 free 1 - duration 958ms Memory 128,647,168 total, 127,663,408 free 2 - duration 873ms Memory 128,647,168 total, 127,663,408 free 3 - duration 886ms Memory 128,647,168 total, 127,663,408 free 4 - duration 884msIntel i7-2760QM @ 2.40GHz, 8GB RAM DDR3 1600MHz, Linux 3.4.11 kernel 64-bit, Java 1.7.0_07 ================================================= java -server -Xms4g -Xmx4g TestJavaMemoryLayout Memory 4,116,054,016 total, 1,108,912,960 free 0 - duration 12262ms Memory 4,116,054,016 total, 1,109,962,832 free 1 - duration 9822ms Memory 4,116,054,016 total, 1,108,458,720 free 2 - duration 10239ms Memory 3,817,799,680 total, 815,307,640 free 3 - duration 21558ms Memory 3,817,799,680 total, 810,551,856 free 4 - duration 23074msjava -server TestDirectMemoryLayout Memory 123,994,112 total, 121,818,528 free 0 - duration 634ms Memory 123,994,112 total, 122,455,944 free 1 - duration 619ms Memory 123,994,112 total, 123,103,320 free 2 - duration 546ms Memory 123,994,112 total, 123,103,320 free 3 - duration 547ms Memory 123,994,112 total, 123,103,320 free 4 - duration 534ms
分析
讓我們將結果與上面承諾的3個好處進行比較。
1.顯著改善性能
這里的證據很明確。 使用堆外結構方法要快一個數量級以上。 最極端的情況是,在Sandy Bridge處理器上進行第五次運行,我們在完成任務上的持續(xù)時間相差43.2 倍 。 這也很好地說明了Sandy Bridge在可預測的數據訪問模式方面的表現。 性能不僅明顯更好,而且更加一致。 隨著堆變得碎片化,從而訪問模式變得更加隨機,性能會下降,這在以后使用標準Java方法運行時可以看到。
2.更緊湊的數據表示
對于我們的堆外表示,每個對象需要42個字節(jié)。 如示例所示,要存儲5000萬個字節(jié),我們需要2100,000,000字節(jié)。 JVM堆所需的內存是:
所需內存=總內存–可用內存–基本JVM需求
2,883,248,712 = 3,817,799,680 – 810,551,856 – 123,999,112
這意味著JVM需要多40%的內存來表示相同的數據。 產生這種開銷的原因是對Java對象的引用加上對象標頭的數組。 在上一篇文章中,我討論了Java中的對象布局。
當處理非常大的數據集時,此開銷可能成為重要的限制因素。
3.能夠處理非常大的數據集,同時避免令人討厭的GC暫停
上面的示例代碼在每次運行之前強制執(zhí)行GC循環(huán),并且在某些情況下可以提高結果的一致性。 隨時刪除對System.gc()的調用,并親自觀察其中的含義。 如果運行添加以下命令行參數的測試,則垃圾收集器將詳細輸出發(fā)生的情況。
-XX:+ PrintGC -XX:+ PrintGCDetails -XX:+ PrintGCDateStamps -XX:+ PrintTenuringDistribution -XX:+ PrintHeapAtGC -XX:+ PrintGCApplicationConcurrentTime -XX:+ PrintGCApplicationStoppedTime -XX:+ PrintSafepointStatistics
通過分析輸出,我可以看到該應用程序總共進行了29個GC循環(huán)。 通過從輸出中提取指示應用程序線程何時停止的行,下面列出了暫停時間。
With System.gc() before each run ================================ Total time for which application threads were stopped: 0.0085280 seconds Total time for which application threads were stopped: 0.7280530 seconds Total time for which application threads were stopped: 8.1703460 seconds Total time for which application threads were stopped: 5.6112210 seconds Total time for which application threads were stopped: 1.2531370 seconds Total time for which application threads were stopped: 7.6392250 seconds Total time for which application threads were stopped: 5.7847050 seconds Total time for which application threads were stopped: 1.3070470 seconds Total time for which application threads were stopped: 8.2520880 seconds Total time for which application threads were stopped: 6.0949910 seconds Total time for which application threads were stopped: 1.3988480 seconds Total time for which application threads were stopped: 8.1793240 seconds Total time for which application threads were stopped: 6.4138720 seconds Total time for which application threads were stopped: 4.4991670 seconds Total time for which application threads were stopped: 4.5612290 seconds Total time for which application threads were stopped: 0.3598490 seconds Total time for which application threads were stopped: 0.7111000 seconds Total time for which application threads were stopped: 1.4426750 seconds Total time for which application threads were stopped: 1.5931500 seconds Total time for which application threads were stopped: 10.9484920 seconds Total time for which application threads were stopped: 7.0707230 secondsWithout System.gc() before each run =================================== Test run times 0 - duration 12120ms 1 - duration 9439ms 2 - duration 9844ms 3 - duration 20933ms 4 - duration 23041msTotal time for which application threads were stopped: 0.0170860 seconds Total time for which application threads were stopped: 0.7915350 seconds Total time for which application threads were stopped: 10.7153320 seconds Total time for which application threads were stopped: 5.6234650 seconds Total time for which application threads were stopped: 1.2689950 seconds Total time for which application threads were stopped: 7.6238170 seconds Total time for which application threads were stopped: 6.0114540 seconds Total time for which application threads were stopped: 1.2990070 seconds Total time for which application threads were stopped: 7.9918480 seconds Total time for which application threads were stopped: 5.9997920 seconds Total time for which application threads were stopped: 1.3430040 seconds Total time for which application threads were stopped: 8.0759940 seconds Total time for which application threads were stopped: 6.3980610 seconds Total time for which application threads were stopped: 4.5572100 seconds Total time for which application threads were stopped: 4.6193830 seconds Total time for which application threads were stopped: 0.3877930 seconds Total time for which application threads were stopped: 0.7429270 seconds Total time for which application threads were stopped: 1.5248070 seconds Total time for which application threads were stopped: 1.5312130 seconds Total time for which application threads were stopped: 10.9120250 seconds Total time for which application threads were stopped: 7.3528590 seconds從輸出中可以看出,垃圾回收器花費了大量時間。 當線程停止時,您的應用程序無響應。 這些測試已使用默認GC設置完成。 可以對GC進行調整以獲得更好的結果,但這是一項非常熟練的工作。 我知道,即使在高吞吐量條件下,即使不在高吞吐量條件下也不施加較長的暫停時間,這可以很好地應對JVM。
在對該應用程序進行性能分析時,我可以看到大部分時間都花在分配對象并將它們提升到老一代,因為它們不適合年輕一代。 可以從計時中除去初始化成本,但這是不現實的。 如果采用傳統(tǒng)的Java方法,則需要先建立狀態(tài),然后才能進行查詢。 應用程序的最終用戶必須等待狀態(tài)建立和查詢執(zhí)行。
這個測試確實非常簡單。 想象使用100 GB規(guī)模的相似數據集。
注意:當垃圾收集器壓縮區(qū)域時,可以將彼此相鄰的對象移開很遠。 這可能導致TLB和其他緩存未命中。
關于序列化的旁注
以這種方式使用堆外結構的一個巨大好處是,如何通過簡單的內存副本將它們很容易地序列化到網絡或存儲中,就像我在上一篇文章中所展示的那樣。 這樣,我們可以完全繞過中間緩沖區(qū)和對象分配。
結論
如果您愿意對大型數據集進行一些C風格的編程,則可以通過脫離堆控制Java中的內存布局。 如果這樣做,那么在性能,緊湊性和避免GC問題方面的好處就非常重要。 但是,這種方法不應該用于所有應用程序。 僅對于非常大的數據集或吞吐量和/或延遲的極端性能,才注意到它的優(yōu)勢。
我希望Java社區(qū)可以共同認識到支持在堆和堆棧上的結構的重要性。 John Rose在此領域做了出色的工作 ,定義了如何將元組添加到JVM。 他今年在JVM語言峰會上發(fā)表的有關Arrays 2.0的演講確實值得關注。 約翰在演講中討論了結構數組的選擇和數組結構。 如果有John提出的元組可用,則此處描述的測試可以具有可比的性能,并且是更令人愉快的編程風格。 整個結構數組可以在一個動作中分配,因此可以繞開不同代的對象副本,并以緊湊的連續(xù)方式進行存儲。 這將消除此類嚴重的GC問題。
最近,我正在比較Java和.Net之間的標準數據結構。 在某些情況下,當.Net使用本機結構支持時,對于諸如地圖和字典之類的東西,我發(fā)現.Net的性能優(yōu)勢是6-10倍。 讓我們盡快將其納入Java!
從結果中還可以很明顯地看出,如果我們要使用Java對大數據進行實時分析,那么我們的標準垃圾收集器就需要顯著改善并支持真正的并發(fā)操作。
[1] –據我所知,唯一能夠很好地處理非常大的堆的JVM是Azul Zing祝您編程愉快,別忘了分享!
參考:來自Java的JCG合作伙伴 Martin Thompson在Mechanical Sympathy博客上的緊湊型堆外結構/堆棧在Java中 。
翻譯自: https://www.javacodegeeks.com/2012/10/compact-off-heap-structurestuples-in.html
java堆 數據結構 堆
創(chuàng)作挑戰(zhàn)賽新人創(chuàng)作獎勵來咯,堅持創(chuàng)作打卡瓜分現金大獎總結
以上是生活随笔為你收集整理的java堆 数据结构 堆_Java中的紧凑堆外结构/组合的全部內容,希望文章能夠幫你解決所遇到的問題。
- 上一篇: Win10复制文件出错提示0x80070
- 下一篇: win7电脑麦不能说话(win7笔记本电