When an optimization becomes a burden: a "memory leak" caused by String.split

I have always admired Sun's rigorous and elegant approach to technology (bless Sun). The Java library source code in the Sun JDK is clear and consistent down to its comments, and the javadoc annotations are used meticulously, so it reads comfortably. In my daily work and study I therefore often read the Java library source, and I enjoy it. When a strange problem comes up, the source code is an even greater help.

Enough chit-chat; back to the topic. For the past few days I have been wrestling with a Java "memory leak". The memory used by a Java application rose steadily and regularly until it exceeded the monitoring threshold. Time for Holmes to step in!

When it comes to Java, the definition of a memory leak is actually not that clear. First, if the JVM has no bugs, then in theory there is no heap space that "cannot be reclaimed"; that is, the kind of memory leak found in C/C++ does not exist in Java. Second, if a Java program keeps holding a reference to an object that, from the point of view of the program logic, will never be used again, then we can consider that object leaked. If there are many such objects, then clearly a lot of memory has been leaked ("wasted" would be more accurate).

However, the memory leak discussed in this article has neither of these causes, which is why it is in quotation marks. The actual cause was quite unexpected; the details follow.
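To make the second kind of leak concrete, here is a minimal sketch. The class, the field HANDLED_REQUESTS and the method handleRequest are made-up names for illustration; they do not come from the application discussed in this article.

```java
import java.util.ArrayList;
import java.util.List;

class LogicalLeakDemo {
    // Hypothetical long-lived registry: everything added here stays reachable,
    // so the garbage collector can never reclaim it.
    static final List<byte[]> HANDLED_REQUESTS = new ArrayList<>();

    static void handleRequest(byte[] payload) {
        // ... the payload would be processed here ...
        HANDLED_REQUESTS.add(payload); // never read again and never removed
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            handleRequest(new byte[1024]);
        }
        System.out.println("Payloads still reachable: " + HANDLED_REQUESTS.size());
    }
}
```

Nothing here is a JVM bug: every payload is still referenced, so from the collector's point of view it is live, even though the program will never use it again.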

General steps for analyzing a memory leak

If the memory consumed by a Java application shows signs of a leak, we generally analyze it with the following steps:

  1. Dump the heap used by the Java application.
  2. Use a Java heap analysis tool to identify suspect objects whose memory footprint is larger than expected (usually because there are too many of them).
  3. When necessary, analyze the reference relationships between the suspect objects and other objects.
  4. Read the program's source code to find out why there are so many suspect objects.

Dumping the heap

If a Java application leaks memory, do not rush to kill it; preserve the scene instead. If it is an Internet application, you can switch its traffic to other servers. The point of preserving the scene is to be able to dump the heap of the running JVM.

The jmap tool that ships with the JDK can do this. It is invoked as follows:

jmap -dump:format=b,file=heap.bin <pid>

format=b means that the dump is written in binary format.

file=heap.bin means that the dump file is named heap.bin.

<pid> is the process ID of the JVM.

(On Linux) first run ps aux | grep java to find the JVM's pid, then run jmap -dump:format=b,file=heap.bin <pid> to obtain the heap dump file.
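When attaching jmap from outside is not convenient, a HotSpot JVM can also write the same kind of dump from inside the application. The following is only a sketch of that alternative, not what was done in this investigation; it assumes a HotSpot-based JVM where com.sun.management.HotSpotDiagnosticMXBean is available, and the class name HeapDumper and the default file name are invented for the example.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Path;
import java.nio.file.Paths;

class HeapDumper {
    /**
     * Writes a heap dump of the current JVM to the given file.
     * The file must not exist yet; newer JDKs also expect a ".hprof" suffix.
     */
    static void dumpHeap(Path file, boolean liveObjectsOnly) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        try {
            bean.dumpHeap(file.toString(), liveObjectsOnly);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed default name; pass another path as the first argument.
        Path file = Paths.get(args.length > 0 ? args[0] : "heap.hprof");
        dumpHeap(file, true);
        System.out.println("Heap dump written to " + file.toAbsolutePath());
    }
}
```

The resulting file can be opened in a heap analysis tool in the same way as a dump produced by jmap.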

Analyzing the heap

Turning the binary heap dump into human-readable information naturally requires a specialized tool. Here I recommend Memory Analyzer.

Memory Analyzer, MAT for short, is an open source project of the Eclipse Foundation, contributed by SAP and IBM. Software produced by such giants is well worth using: MAT can analyze heaps containing hundreds of millions of objects, quickly calculate the memory occupied by each object and the reference relationships between objects, and automatically detect memory leak suspects. It is powerful, and its interface is friendly and easy to use.

MAT's interface is built on Eclipse, and it is released in two forms: an Eclipse plug-in and an Eclipse RCP application. It presents its analysis results as charts and reports that are clear at a glance. In short, I personally like this tool a lot. Two official screenshots follow:

[Two official MAT screenshots; images not available]

Back to the matter at hand. When I opened heap.bin in MAT, it was easy to see that char[] instances were far more numerous than expected and occupied more than 90% of memory. In general, char[] does take up a lot of memory in a JVM, and the instances are numerous, because String objects use a char[] as their internal storage. But these char[] were far too greedy: on closer inspection there were tens of thousands of them, each occupying several hundred KB of memory. This indicates that the Java program was holding tens of thousands of large String objects. Given the program's logic, that should not happen, so something was certainly wrong somewhere.

Following the trail

I picked one of the suspicious char[] at random and used the Path To GC Root feature to find its reference path. It turned out that the String object was referenced by a HashMap. That was to be expected, since most Java memory leaks are caused by objects left behind in a global HashMap and never released. However, this HashMap is used as a cache with a threshold on the number of entries; once the threshold is reached, entries are evicted automatically. By this logic there should be no memory leak. Although the cache held tens of thousands of String objects, it had still not reached the preset threshold (which had been set fairly high, because the String objects were estimated to be relatively small).

However, another question caught my attention: why were the cached String objects so huge? Their internal char[] were hundreds of KB long. Although the number of String objects in the cache had not yet reached the threshold, the size of each String far exceeded our expectations. This eventually consumed a large amount of memory and produced the signs of a memory leak (more accurately, excessive memory consumption).

I dug deeper into this question to see how the large String objects had ended up in the HashMap. Reading the program's source code, I found that large String objects did exist, but they were not put into the HashMap themselves. Instead, each large String was split (by calling the String.split method), and the small String objects produced by the split were put into the HashMap.

This is strange. What goes into the HashMap is clearly the small String objects produced by the split, so how can they take up so much space? Could there be a problem with the split method of the String class?
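The cache implementation itself is not shown in this article. As an illustration only, an entry-count-bounded cache of the kind described could be sketched on top of LinkedHashMap; the class name BoundedCache and its least-recently-used eviction policy are assumptions for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical bounded cache: once more than maxEntries are stored,
// the least recently used entry is evicted automatically.
class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    BoundedCache(int maxEntries) {
        super(16, 0.75f, true); // true = access order, which gives LRU behaviour
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by put(); returning true removes the eldest entry.
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        BoundedCache<String, String> cache = new BoundedCache<>(2);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.put("c", "3");
        System.out.println(cache.keySet());
    }
}
```

Note that such a cache limits only the number of entries. It knows nothing about how many bytes each entry keeps alive, which is exactly the gap this investigation runs into.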

Reading the source code

With these questions in mind, I looked up the String class in the Sun JDK 6 source, chiefly the implementation of the split method.

As it turns out, String.split calls Pattern.split, so I read the code of Pattern.split next.

The line to note there is: String match = input.subSequence(index, m.start()).toString();

The match split off here, a small String, is in fact the result of calling subSequence on the large String. So I read the code of String.subSequence.

String.subSequence in turn calls String.substring, so I read on.

At the end of substring we finally see the light: if the requested substring is the entire original string, the original String object is returned; otherwise a new String object is created, but that new object appears to use the char[] of the original String. A look at the String constructor it calls confirms this.

To avoid a memory copy and gain speed, the Sun JDK directly reuses the char[] of the original String and uses offset and length to identify the contents of each distinct string. In other words, a small String produced by substring still points to the char[] of the original large String, and the same holds for split. This explains why the char[] behind the String objects in the HashMap were so large.
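The loop around that line can be sketched roughly as follows. This is a simplified illustration written against the public java.util.regex API, not the JDK source itself: it ignores the limit parameter and the removal of trailing empty strings that the real Pattern.split performs, and the class name SimpleSplit is invented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SimpleSplit {
    // Simplified sketch of a regex split: every piece is cut out of the
    // ORIGINAL input via subSequence(...).toString().
    static List<String> split(CharSequence input, String regex) {
        List<String> pieces = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        int index = 0;
        while (m.find()) {
            String match = input.subSequence(index, m.start()).toString();
            pieces.add(match);
            index = m.end();
        }
        // Remainder after the last delimiter.
        pieces.add(input.subSequence(index, input.length()).toString());
        return pieces;
    }

    public static void main(String[] args) {
        System.out.println(split("a,b,c", ","));
    }
}
```

The point to take away is that each piece is obtained by asking the original input for a subsequence, so whatever subSequence does with memory applies to every piece.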
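The sharing scheme can be modelled in a few lines. The class below is a toy model and not the JDK's String: the name SharedString and the helper retainedChars are invented, and only the value/offset/count idea is taken from the description above.

```java
import java.util.Arrays;

// Toy model of a string that shares its backing array with its substrings.
class SharedString {
    final char[] value; // backing array, possibly shared with other instances
    final int offset;   // index of the first character inside value
    final int count;    // number of characters that belong to this string

    SharedString(String s) {
        this(s.toCharArray(), 0, s.length());
    }

    private SharedString(char[] value, int offset, int count) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }

    // Shares the backing array: no characters are copied.
    SharedString substring(int begin, int end) {
        if (begin < 0 || end > count || begin > end) {
            throw new StringIndexOutOfBoundsException();
        }
        return (begin == 0 && end == count)
                ? this
                : new SharedString(value, offset + begin, end - begin);
    }

    // Copies only the characters this string needs into a fresh array.
    SharedString copy() {
        return new SharedString(Arrays.copyOfRange(value, offset, offset + count), 0, count);
    }

    // Length of the array this instance keeps alive.
    int retainedChars() {
        return value.length;
    }

    @Override
    public String toString() {
        return new String(value, offset, count);
    }

    public static void main(String[] args) {
        SharedString big = new SharedString(new String(new char[100_000]).replace('\0', 'x'));
        SharedString small = big.substring(0, 5);
        System.out.println("substring keeps " + small.retainedChars() + " chars alive");
        System.out.println("copy keeps " + small.copy().retainedChars() + " chars alive");
    }
}
```

In this model a five-character substring of a 100,000-character string still keeps the whole array reachable, while its copy keeps only five characters.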

Explaining the cause

The previous section has in fact already analyzed the cause; this section tidies it up:

  1. On each request the program obtains a large String object whose internal char[] is hundreds of KB long.
  2. The program splits the large String and puts the resulting small String objects into the HashMap that serves as a cache.
  3. Sun JDK 6 optimizes String.split: each String it produces directly uses the char[] of the original String.
  4. Each String object in the HashMap therefore actually points to a huge char[].
  5. The HashMap is capped at 10000 entries, so the total size of the cached String objects is 10000 * 100 KB, which is on the order of 1 GB.
  6. The cache occupies memory on the order of gigabytes, most of it wasted, which produces the signs of a memory leak.
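The arithmetic in step 5 can be checked with a short calculation. It is a worst-case sketch that assumes every cached String pins a different backing array of 100 KB; the class and method names are invented for the example.

```java
class CacheFootprint {
    // Worst case: each cached entry keeps its own backing array alive.
    static long worstCaseBytes(int entries, long bytesPerBackingArray) {
        return entries * bytesPerBackingArray;
    }

    public static void main(String[] args) {
        long bytes = worstCaseBytes(10_000, 100L * 1024); // 10000 entries, 100 KB each
        double gib = bytes / (1024.0 * 1024.0 * 1024.0);
        System.out.println(bytes + " bytes, about " + gib + " GiB");
    }
}
```

With these inputs the product is 1,024,000,000 bytes, a little under 1 GiB, which matches the estimate above.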


Once the cause is found, the solution follows. split still has to be used, but the String objects it produces should not be put into the HashMap directly. Instead, pass each one through the String copy constructor String(String original) first. This constructor is safe: in Sun JDK 6, when the char[] of the original is longer than the string itself, it copies only the characters it needs.

The only catch is that code like new String(string) looks very odd. Perhaps substring and split should offer an option that lets the programmer control whether the char[] of the original String is reused.

Is it a bug?

Although the problem here is caused by the way substring and split are implemented, does it count as a bug in the String class? Personally I find that hard to say. The optimization is reasonable, since the result of substring and split is always a contiguous subsequence of the original string. All one can say is that String is not merely a core class; to the JVM it is as important as the primitive types.

It is understandable that the JDK implementation applies every possible optimization to String. But optimizations also bring pitfalls, and we programmers need to understand them well enough to put them to good use.
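A minimal sketch of this fix is shown below. The class name, the method cachePieces and the choice of using each piece as both key and value are assumptions for illustration; the real cache code is not part of this article.

```java
import java.util.HashMap;
import java.util.Map;

class SplitCacheFix {
    // Splits the large string and caches compact copies of the pieces.
    static void cachePieces(String large, String regex, Map<String, String> cache) {
        for (String piece : large.split(regex)) {
            // On Sun JDK 6 the copy constructor trims the backing array,
            // so the cached string no longer pins the large char[].
            String compact = new String(piece);
            cache.put(compact, compact);
        }
    }

    public static void main(String[] args) {
        Map<String, String> cache = new HashMap<>();
        cachePieces("alpha,beta,gamma", ",", cache);
        System.out.println(cache.keySet());
    }
}
```

As far as I know, from JDK 7u6 onward substring copies its characters itself, so on those versions the extra copy is harmless but no longer necessary.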

Category: Java  Date: 2010-03-29  Views: 370