Here we will make an attempt at describing the non-Standard extensions to
the library. Some of these are from SGI's STL, some of these are GNU's,
and some just seemed to appear on the doorstep.
Before you leap in and use these, be aware of two things:
Non-Standard means exactly that. The behavior, and the very
existence, of these extensions may change with little or no
warning. (Ideally, the really good ones will appear in the next
revision of C++.) Also, other platforms, other compilers, other
versions of g++ or libstdc++-v3 may not recognize these names, or
treat them differently, or...
are all here; <bvector> exposes the old bit_vector
class that was used before specialization of vector<bool> was
available (it's actually a typedef for the specialization now).
<hash_map> and <hash_set>
are discussed further below. <rope> is the SGI
specialization for large strings ("rope," "large
strings," get it? love those SGI folks).
<slist> is a singly-linked list, for when the
doubly-linked list<> is too much space overhead, and
<tree> exposes the red-black tree classes used in the
implementation of the standard maps and sets.
Okay, about those hashing classes... I'm going to foist most of the
work off onto SGI's own site.
Each of the associative containers map, multimap, set, and multiset
have a counterpart which uses a
hashing
function to do the arranging, instead of a strict weak ordering
function. The classes take as one of their template parameters a
function object that will return the hash value; by default, an
instantiation of
hash.
You should specialize this functor for your class, or define your own,
before trying to use one of the hashing classes.
The hashing classes support all the usual associative container
functions, as well as some extra constructors specifying the number
of buckets, etc.
Why would you want to use a hashing class instead of the
"normal" implementations? Matt Austern writes:
[W]ith a well chosen hash function, hash tables
generally provide much better average-case performance than binary
search trees, and much worse worst-case performance. So if your
implementation has hash_map, if you don't mind using nonstandard
components, and if you aren't scared about the possibility of
pathological cases, you'll probably get better performance from
hash_map.
(Side note: for those of you wondering, "Why wasn't a hash
table included in the Standard in the first #!$@ place?"
I'll give a quick answer: it was proposed, but too late and in too
unorganized a fashion. Some sort of hashing will undoubtedly be
included in a future Standard.
Some of the classes in the Standard Library have additional
publicly-available members, and some classes are themselves not in
the standard. Of those, some are intended purely for the implementors,
for example, additional typedefs. Those won't be described here
(or anywhere else).
The extensions added by SGI are so numerous that they have
their own page. Since the SGI STL is no
longer actively maintained, we will try and keep this code working
ourselves.
filebufs have another ctor with this signature: basic_filebuf(__c_file_type*, ios_base::openmode, int_type); This comes in very handy in a number of places, such as
attaching Unix sockets, pipes, and anything else which uses file
descriptors, into the IOStream buffering classes. The three
arguments are as follows:
__c_file_type* F
// the __c_file_type typedef usually boils down to stdio's FILE
ios_base::openmode M
// same as all the other uses of openmode
int_type B
// buffer size, defaults to BUFSIZ
For those wanting to use file descriptors instead of FILE*'s, I
invite you to contemplate the mysteries of C's fdopen().
Thread-safety, space efficiency, high speed, portability... this is a
mess. Where to begin?
The Rules
The C++ standard only gives a few directives in this area:
When you add elements to a container, and the container must allocate
more memory to hold them, the container makes the request via its
Allocator template parameter. This includes adding
char's to the string class, which acts as a regular STL container
in this respect.
The default Allocator of every container-of-T is
std::allocator<T>.
The interface of the allocator<T> class is
extremely simple. It has about 20 public declarations (nested
typedefs, member functions, etc), but the two which concern us most
are:
T* allocate (size_type n, const void* hint = 0);
void deallocate (T* p, size_type n);
(This is a simplicifcation; the real signatures use nested typedefs.)
The "n" arguments in both those functions is a
count of the number of T's to allocate space for,
not their total size.
"The storage is obtained by calling
::operator new(size_t), but it is unspecified when or
how often this function is called. The use of hint
is unspecified, but intended as an aid to locality if an
implementation so desires." [20.4.1.1]/6
Problems and Possibilities
The easiest way of fulfilling the requirements is to call operator new
each time a container needs memory, and to call operator delete each
time the container releases memory. BUTthis
method is horribly slow.
Or we can keep old memory around, and reuse it in a pool to save time.
The old libstdc++-v2 used a memory pool, and so do we. As of 3.0,
it's
on by default. The pool is shared among all the containers in the
program: when your program's std::vector<int> gets cut in half
and frees a bunch of its storage, that memory can be reused by the
private std::list<WonkyWidget> brought in from a KDE library
that you linked against. And we don't have to call operator's new and
delete to pass the memory on, ether, which is a speed bonus.
BUT...
What about threads? No problem: in a threadsafe environment, the
memory pool is manipulated atomically, so you can grow a container in
one thread and shrink it in another, etc. BUT what
if threads in libstdc++-v3 aren't set up properly?
That's been answered already.
BUT what if you want to use your own allocator? What
if you plan on using a runtime-loadable version of malloc() which uses
shared telepathic anonymous mmap'd sections serializable over a
network, so that memory requests should go through malloc?
And what if you need to debug it?
Well then:
Available allocators in namespace std
First I'll describe the situation as it exists for the code which will
be released in GCC 3.1. This situation is extremely fluid. Then I'll
describe the differences for 3.0.x, which will not change much in
this respect.
As a general rule of thumb, users are not allowed to use names which
begin with an underscore. This means that to be portable between
compilers, none of the following may be used in your program directly.
(If you decide to be unportable, then you're free do do what you want,
but it's not our fault if stuff breaks.) They are presented here for
information for maintainers and contributors in addition to users, but
we will probably make them available for users in 3.1 somehow.
These classes are always available:
__new_alloc simply wraps ::operator new
and ::operator delete.
__malloc_alloc_template<int inst> simply wraps
malloc and free. There is also a hook
for an out-of-memory handler (for new/delete this is taken care of
elsewhere). The inst parameter is described below.
This class was called malloc_alloc in earlier versions.
allocator<T> has already been described; it is
The Standard Allocator for instances of T. It uses the internal
__alloc typedef (see below) to satisy its requests.
__simple_alloc<T,A> is a wrapper around another
allocator, A, which itself is an allocator for instances of T.
This is primarily used in an internal "allocator traits"
class which helps encapsulate the different styles of allocators.
__debug_alloc<A> is also a wrapper around an
arbitrary allocator A. It passes on slightly increased size
requests to A, and uses the extra memory to store size information.
When a pointer is passed to deallocate(), the stored
size is checked, and assert() is used to guarantee they match.
__allocator<T,A> is an adaptor. Many of these
allocator classes have a consistent yet non-standard interface.
Such classes can be changed to a conforming interface with this
wrapper: __allocator<T, __alloc> is thus the
same as allocator<T>.
An internal typedef, __mem_interface , is defined to be
__new_alloc by default.
Normally,
__default_alloc_template<bool thr, int inst>
is also available. This is the high-speed pool, called the default
node allocator. The reusable memory is shared among identical
instantiations of
this type. It calls through __mem_interface to obtain
new memory when its lists run out. If a client container requests a
block larger than a certain threshold size, then the pool is bypassed,
and the allocate/deallocate request is passed to
__mem_interface directly.
Its inst parameter is described below. The
thr boolean determines whether the pool should be
manipulated atomically or not. Two typedefs are provided:
__alloc is defined as this node allocator with thr=true,
and therefore is threadsafe, while __single_client_alloc
defines thr=false, and is slightly faster but unsafe for multiple
threads.
(Note that the GCC thread abstraction layer allows us to provide safe
zero-overhead stubs for the threading routines, if threads were
disabled at configuration time. In this situation,
__alloc should not be noticably slower than
__single_client_alloc.)
[Another threadsafe allocator where each thread keeps its own free
list, so that no locking is needed, might be described here.]
A cannon to swat a fly: __USE_MALLOC
If you've already read this
advice and decided to define this macro, then the situation changes
thusly:
__mem_interface, and
__alloc, and
__single_client_alloc are all typedef'd to
__malloc_alloc_template.
__default_alloc_template is no longer available.
At all. Anywhere.
Writing your own allocators
Depending on your application (a specific program, a generic library,
etc), allocator classes tend to be one of two styles: "SGI"
or "standard". See the comments in stl_alloc.h for more
information on this crucial difference.
At the bottom of that header is a helper type,
_Alloc_traits, and various specializations of it. This
allows the container classes to make possible compile-time
optimizations based on features of the allocator. You should provide
a specialization of this type for your allocator (doing so takes only
two or three statements).
Using non-default allocators
You can specify different memory management schemes on a per-container
basis, by overriding the default Allocator template
parameter. For example, an easy
(but nonportable)
method of specifying that only malloc/free should be used instead of
the default node allocator is:
behave exactly the same way. However, the memory pool for each type
(and remember that different instantiations result in different types)
remains separate.
The library uses 0 in all its instantiations. If you
wish to keep separate free lists for a particular purpose, use a
different number.
3.0.x
For 3.0.x, many of the names were incorrectly not prefixed
with underscores. So symbols such as "std::single_client_alloc"
are present. Be very careful to not depend on these names any more
than you would depend on implementation-only names.
Certain macros like _NOTHREADS and __STL_THREADS
can affect the 3.0.x allocators. Do not use them. Those macros have
been completely removed for 3.1.
Currently libstdc++-v3 uses the concept checkers from the Boost
library to perform optional
compile-time checking of template instantiations of the standard
containers. They are described in the linked-to page.
Everybody's got issues. Even the C++ Standard Library.
The Library Working Group, or LWG, is the ISO subcommittee responsible
for making changes to the library. They periodically publish an
Issues List containing problems and possible solutions. As they reach
a consensus on proposed solutions, we often incorporate the solution
into libstdc++-v3.
Here are the issues which have resulted in code changes to the library.
The links are to the specific defect reports from a partial
copy of the Issues List. You can read the full version online
at the ISO C++
Committee homepage, linked to on the
GCC "Readings"
page. If
you spend a lot of time reading the issues, we recommend downloading
the ZIP file and reading them locally.
(NB: partial copy means that not all links within
the lwg-*.html pages will work.
Specifically, links to defect reports that have not been accorded full
DR status will probably break. Rather than trying to mirror the
entire issues list on our overworked web server, we recommend you go
to the LWG homepage instead.)
If a DR is not listed here, we may simply not have gotten to it yet;
feel free to submit a patch. Search the include/bits and src
directories for appearances of _GLIBCPP_RESOLVE_LIB_DEFECTS for
examples of style. Note that we usually do not make changes to the code
until an issue has reached DR status.
An instance of ios_base::failure is constructed instead.
49:
Underspecification of ios_base::sync_with_stdio
The return type is the previous state of synchronization.
50:
Copy constructor and assignment operator of ios_base
These members functions are declared private and are
thus inaccessible. Specifying the correct semantics of
"copying stream state" was deemed too complicated.
This was not a const member function. Note that the DR says to
replace the function with a const one; we have instead provided an
overloaded version with identical contents.
117:
basic_ostream uses nonexistent num_put member functions
num_put::put() was overloaded on the wrong types.
118:
basic_istream uses nonexistent num_get member functions
Same as 117, but for num_get::get().
129:
Need error indication from seekp() and seekg()