StringBuilder

Latest revision as of 15:10, 30 June 2024

Some fragmented programming-related notes, not meant as introduction or tutorial



This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.


tl;dr:

  • joining lots of strings can sometimes be done better than by just doing lots of appends
      particularly in OO languages (because they do not allow easy optimization)
      particularly in garbage-collected OO languages (because appending may create a lot of immediately-collectible objects)
  • language-specific
  • frequently overstated
      not often benchmarked well. Yes, appending hundreds of thousands of strings will bring this out. But for a handful of strings, a StringBuilder may be slower. Which do you do more often?


Programming languages that have immutable strings (which includes most OO languages) may have the concept of a StringBuilder - a utility (class) that constructs a single string from many fragments of strings.


...because consider that the naive way of doing that is a concatenation like "a"+"b"+"c"+"d"+"e"+"f"

Since such string operations are binary operations (each + takes two operands), every concatenation of two strings creates a new string object.
So you end up creating a bunch of temporary objects in memory, almost all of which could be removed immediately (four in that example: "ab", "abc", "abcd", and "abcde").
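
To make that count concrete, here is a small Python sketch that evaluates the same chain left to right and records every string that a + produces. The helper name concat_with_trace is made up for this note. It only models the chain of two-operand concatenations and does not inspect the interpreter, and a compiler may well fold a chain of pure literals into a single constant before it ever runs.

```python
def concat_with_trace(fragments):
    """Concatenate left to right, recording each string a '+' produces.

    Hypothetical helper for illustration. Expects a non-empty list.
    """
    produced = []
    result = fragments[0]
    for fragment in fragments[1:]:
        result = result + fragment   # each '+' yields a new string object
        produced.append(result)
    return result, produced


final, produced = concat_with_trace(["a", "b", "c", "d", "e", "f"])
temporaries = produced[:-1]          # everything except the final result

print(final)         # abcdef
print(temporaries)   # ['ab', 'abc', 'abcd', 'abcde']
```

Five concatenations give five new strings, and only the last one is the string you actually wanted.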


This allocates more memory than necessary (summed over all the temporaries, potentially on the order of eventual_length**2), and costs more CPU, both in creating those temporaries and in cleaning them up.
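
A rough way to see where the square comes from is to count characters copied rather than to time anything. The Python sketch below is a cost model under a stated assumption: every + copies both operands into a fresh string. Real runtimes sometimes avoid part of that copying in special cases, so treat the numbers as an upper-bound illustration. Both function names are invented for this note.

```python
def chars_copied_naive(fragment_lengths):
    """Characters written if each '+' builds a brand-new string (assumed model)."""
    total = 0
    current = 0
    for index, length in enumerate(fragment_lengths):
        current += length
        if index > 0:            # the first fragment is not the result of a '+'
            total += current     # the new string holds everything so far
    return total


def chars_copied_join_once(fragment_lengths):
    """Characters written if all fragments are copied once into the final string."""
    return sum(fragment_lengths)


for count in (10, 100, 1000):
    lengths = [8] * count        # `count` fragments of 8 characters each
    print(count, chars_copied_naive(lengths), chars_copied_join_once(lengths))
```

In this model, ten times as many fragments means roughly a hundred times as many characters copied for the naive chain, but only ten times as many for the join-once approach.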

If your plan was always 'join list into single thing', then StringBuilder is the thing that lets you collect the parts, and then join them once.
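
As a sketch of that collect-then-join idea: Python has no standard class called StringBuilder, so the class below is a hypothetical minimal one written for this note (the method names append and build are assumptions, loosely mirroring what such classes tend to offer). It keeps the fragments in a list and only creates the final string when asked.

```python
class StringBuilder:
    """Minimal illustrative builder: collect fragments, join once.

    Hypothetical class for this note, not a standard-library API.
    """

    def __init__(self):
        self._parts = []

    def append(self, fragment):
        self._parts.append(str(fragment))   # no new big string is built here
        return self                         # returning self allows chaining

    def build(self):
        return "".join(self._parts)         # the single join at the end


builder = StringBuilder()
for word in ["a", "b", "c", "d", "e", "f"]:
    builder.append(word)
print(builder.build())   # abcdef
```

In Python itself, io.StringIO (write() and getvalue()) fills a similar role, and a plain list plus "".join() is often all you need.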


That said, the benefit of StringBuilder is also sometimes overstated. When concatenating just a handful of literals or variables, and not in an inner loop, there is often no discernible difference. There are even cases where StringBuilder is more wasteful: it often comes with a bunch of bookkeeping, and a (JIT) compiler is sometimes smart enough to optimize plain concatenation, especially the few-literal case, in ways that can't apply to StringBuilders. So this may be a micro-optimization that you don't really need to worry about.
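
If you want to check the handful-of-strings case yourself rather than take anyone's word for it, a small timing harness is enough. The Python sketch below uses io.StringIO as a stand-in for a StringBuilder (an assumption, since Python has no class by that name), and the function names are made up. No results are shown on purpose: they depend on interpreter, version, and machine.

```python
import io
import timeit


def with_plus(a, b, c):
    """Join three values with plain concatenation."""
    return a + ", " + b + ", " + c


def with_builder(a, b, c):
    """Join the same three values through a builder-like buffer."""
    buffer = io.StringIO()
    for piece in (a, ", ", b, ", ", c):
        buffer.write(piece)
    return buffer.getvalue()


def seconds_per_call(func, repeats=5, number=20_000):
    """Best-of-`repeats` timing, returned as seconds per call."""
    timings = timeit.repeat(lambda: func("x", "y", "z"),
                            repeat=repeats, number=number)
    return min(timings) / number


for func in (with_plus, with_builder):
    print(func.__name__, seconds_per_call(func))
```

Taking the minimum over a few repeats reduces noise from whatever else the machine is doing. It is still a micro-benchmark, so it says little about a real program where this code is not the bottleneck.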


Factors include:

  • the number of strings/concatenations
  • the size of the eventual string
  • your instantiation/use
  • the amount of work the compiler/interpreter has to do with a StringBuilder that it doesn't for a basic string
  • StringBuilder's implementation
  • whether you can then use the StringBuilder itself, or have to create a new immutable String from it again
  • the cases that the code has to handle (joining two arguments? joining an arbitrarily-sized list of strings?)


Note that in your language there may be cases where a third option is better. For example, joining a list of strings into a new basic string may well be an optimized case, certainly more efficient than piece-wise joining, but possibly also better than StringBuilder.
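
Python is one example of a language with such a third option: the built-in str.join takes any iterable of strings and produces the result in one call, with no builder object involved. A minimal sketch (the variable names are just for illustration):

```python
words = ["alpha", "beta", "gamma"]

# The separator is the string that join is called on.
sentence = " ".join(words)

# Non-string items have to be converted first; join does not do it for you.
csv_line = ",".join(str(number) for number in range(5))

print(sentence)   # alpha beta gamma
print(csv_line)   # 0,1,2,3,4
```

Whether this beats a builder in your language is something to measure, not assume.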