Oct 21 2006

C++ and Threading

Published by Brian at 12:49 pm under Uncategorized

Updated: Thanks to Captain Bedsock, link is restored.

I’m reading this post on the C++ committee’s dealings on threading, and, I am pleased to announce, there is absolutely no chance what so ever of anything sane, workable, or sensible accidentally arising from these precedings. That’s saracasm, but not by much. By the very nature of their responses, most of the people participating in this design meeting had no experience what so ever with multithreaded programs, and the problems that accompany them. Worse yet, they display a gross ignorance of compiler construction and optimization. Just who you want defining what is inarguably the most tricky aspect of an imperitive language (as the Java people found out the hard way).

Let’s start with Hans Boehm’s “surprising” example. You have two differnet variables, x and y, both starting out as 0. Thread 1 executes the code “x = 1; r1 = y;”, while thread 2 executes the code “y = 1; r2 = x;”. Is it possible for both r1 and r2 to get assigned 0? Yes. Duh! It’s called load lifting, a special case of strength reduction. It’s an important optimization these days because of the long times that loads can take- and the fact that the values can not be used until the load completes. Stores can take forever and no one cares, but waiting for a load can stop a program dead in it’s tracks. So the sooner the load can be started, the less cost the load takes.

This is the problem with multithreaded imperitive code, well, one of the problems anyways. This sort of optimization is impossible to detect in single threaded code- so programmers could conviently ignore it (except that their programs ran faster). But in multithreaded code, such optimizations become visible. The response to this “revelation” is as predictable as it is stupid:

The ordering rules may be logical, but in this case, the result came as a bit of a shock to everyone, especially Herb who felt this was far too dangerous. Won’t unsuspecting programmers expect such a straightforward use of atomics to be fully ordered? Most of the room agreed, and Hans has since altered his proposal so that the conversions between atomic and T are fully ordered. You may lose some performance, but you get to keep your sanity, and the lower-level load and store are still there for all the insane hacker-types out there.

“Don’t optimize my code!” in other words. “I want it to run slower than molasses in January!” This seems to be a standard response of many programmers when they discover the compiler actually optimizes their code. The correctness of the resulting code is totally irrelevent- that the compiler dares change their beautiful code in the least way is an unforgivable transgression.

Hans’ “low level” definition is, in fact, anything but. Actually, it is low level, but only in the bad ways. Consider for a moment what an implementation of this proposal would look like on, say, the x86 architecture. The committee members seem to think that having, say, atomic would optimize down to normal int loads and stores. The problem with that is that there is no way to prevent race conditions with the standard integer loads and stores. So the only implementation I can see is that the compiler would quietly add a mutex (or at least a spin lock) to every atomic value, which is locked when load() is called, and released when store() is called.
This prevents race conditions (at least with atomic data- programmers are still allowed to screw up and introduce race conditions with non-atomic data), but at the cost of introducing hidden deadlocks among the implicit mutexes used to wrap atomic types. On the PowerPC, you might be tempted to implement the atomic types with the load locked/store conditional instructions- but this would be wrong. Consider the code

t1 = x,load();

printf “Foo!\n”;

x.store(t+1);

How often can the above code print “Foo!”? No, even on the PPC, atomic data types would have to be implemented with mutexes. Oh, and no mention of what happens if you forget to release an atomic variable you’ve acquired.
Herb’s proposal sounds somewhat less foolish- in fact, it’s probably the one I’d pick were I required to design a memory model for C++. It’s basically the Posix threads model roled into the standard- probably not a bad idea. Unfortunately, the straw poll results show where the committee is heading- adopting both. This is the fundamental problem of design by comittee. Especially a committee that obviously has so little knowledge of the problem domain.

That ignorance was on display again with the debate of what “volatile” means. Obviously, knowledge of the relevent specifications, especially the 1989 ANSI C standard, is not a prerequisite to be on the C++ standards committee. I know this because the 1989 ANSI C standard explains what volatile means. I shouldn’t be surprised by this, since knowledge of the 1989 ANSI C Standard wasn’t even required to be on the 1999 ANSI C Standard committee (see long long).

Another standard (pun not intended) problem the C++ language design comittee seems to have is a fasincination of form over substance. How new code looks is more important than how it behaves. Lots of tricky implementation details can be swept under the rug if the new code looks cool. So Lawrence Crowl’s suggestion of implementing the join calculus (not a bad idea, IMHO) has met with opposition. It didn’t look cool. But that neat way to upgrade a reader lock to a writer lock using a move constructor- now that was cool. Never mind the niggling little implementation details. Like if you have two threads, both holding a reader lock on an object and both trying to upgrade the lock to a writer lock, what happens? But, but, but- it uses a move constructor! How can you not like it? Um, yeah. Reader-writer locks are one step up from stone knives and bear skins (aka spinlocks). Of course, the only nit they could find to pick was with his spelling.

My critique about the Crowl’s idea’s reception may sound somewhat bitter, until you realize that Kevlin Henney proposed effectively the exact same idea- the join calculus. Except that he replaced “int join pending” with “joiner wait_for_value”. And we all know how cool templates are- they define cool. And they look real neat . So rather than the brush off that Crowl received, Henney had a warm reception. Note that they did not tell Crowl “Neat idea- but can you make it a library, maybe using templates, instead of keywords in the language?” No, Crowl’s idea was bad, Hennley’s was good. Since they were effectively the same idea, it has to be the form the idea was presented in, not the substance, which was the basis for the decision. And no one even mentioned the real problem with the idea, in either form. The problem with the join calculus in an imperitive language is that it makes synchronization issues, especially race conditions, an even bigger problem then when threads are relatively rare. What happens if a thread tries to get it’s own return value? Etc.
The cherry on top of the whole proceedings comes at the end. I quote:

As Beman Dawes correctly observed to me over lunch, the group really would be in bad shape without Hans’ encyclopedic knowledge of threading issues.

In other words, only one person on the committee had signifigant knowledge of, or experience with, the problem domain. And that this was obvious to at least two observers. Although I wonder if Crowl and Henney might not have more experience and knowledge than they’re letting on to.

I would be the first to admit that this problem is exceptionally difficult and tricky, with large numbers of hidden pitfalls. I have my fare share of scars from falling into them often enough. But there’s nothing like adding a hat full of ignorance to make a hard problem impossible- and this committe has ignorance enough for three. As such, the odds of anything implementable and usable, let alone sane and usefull, comming out of this group has to be about zero.

Popularity: 24% [?]

25 Responses to “C++ and Threading”

  1. Candideon 21 Oct 2006 at 2:05 pm

    I’m increasingly convinced that all the smart people jumped ship from C++ years ago.

  2. bhurt-awon 21 Oct 2006 at 2:25 pm

    Unfortunately that just leaves the other 99% of programmers.

    I’d be a lot more tolerant, even amused, by the antics of the C++ committee if I wasn’t condemned to use to use programs written in that language.

  3. ridiculous_fishon 25 Dec 2006 at 6:34 am

    >Thread 1 executes the code “x = 1; r1 = y;”, while thread 2 executes the
    > code “y = 1; r2 = x;”. Is it possible for both r1 and r2 to get assigned 0?
    > Yes. Duh! It’s called load lifting, a special case of strength reduction.

    This can happen even WITHOUT this optimization, on a processor with weakly ordered memory semantics, which is nearly anything that’s not x86 (including PowerPC, IA64, Alpha…)

  4. Social Content Headline Newson 25 Dec 2006 at 9:12 am

    C++ and Threading…

    [link] [more]…

  5. Thoughts Made Wordson 25 Dec 2006 at 10:17 am

    The Solution to C Threading is Erlang…

  6. toddon 25 Dec 2006 at 11:00 am

    Thanks for a picture of how it’s going. very interesting in car crash kind of way. What do you think of going to a higher level solution modeled after Erlang instead of bolting on low level mechanisms? I talk about this possibility at The Solution to C++ Threading is Erlang.

  7. Anonymous Cowardon 25 Dec 2006 at 12:14 pm

    In the end, it’s all just noise signifying nothing. The vendors won’t implement it and it will fail to achieve widespread adoption. It’ll be like most of the standards produced by the W3C in recent years — whose committees display an equally shocking level of experience with the technical minutiae. (Not that the vendors have a good track-record here either; most implementations of, for example, the math functions are more than 100x slower than they need to be.)

    Crappy standards and crappy code are everywhere.

  8. tobon 25 Dec 2006 at 12:36 pm

    You could get on the mailing-list an start contributing with your knowledge?!

  9. Captain Bedsockon 25 Dec 2006 at 3:29 pm

    Here’s a long response to this post, espousing Erlang-like message-queue-only communication.

  10. Captain Bedsockon 25 Dec 2006 at 3:33 pm

    Also, here’s the link to the ad-hoc meeting that bhurt-aw is responding to.

  11. Bala Natarajanon 25 Dec 2006 at 9:47 pm

    Hi

    You say “this” post, when you talk about C++ committee’s decisions on threading. DO you have a link to “this” post? It would be nice to get see what is being discussed. Your post was interesting, but I feel a lack of context.

    Thanks
    Bala

  12. maxon 26 Dec 2006 at 1:12 am

    dead in its tracks

    it’s = it is

    its = belonging to it

  13. pongbaon 26 Dec 2006 at 3:22 am

    … but at the cost of introducing hidden deadlocks among the implicit mutexes used to wrap atomic types.

    I really don’t get your point, sorry for being so dumb. But isn’t the responsibility on the very one who incorrectly introduced deadlock? You can’t just magically prevent deadlock anyway, isn’t it?

    Oh, and no mention of what happens if you forget to release an atomic variable you’ve acquired.

    Yeah, that’s absolutely right. One can forget to release, but isn’t this their fault? Besides, might there be any tool that can help detecting such problems till then?

    How often can the above code print “Foo!”? No, even on the PPC, atomic data types would have to be implemented with mutexes.

    Again, absolutely right. But what’s your point? Using mutex to implement require/release doesn’t necessarily mean that there’s anything wrong about it.

  14. digon 26 Dec 2006 at 3:51 pm

    But please (sarcasm), they want “industry strength”, “real-world” (I like these nonsenses) language and libraries, which compiles for an hours. Maybe they should take a programming classes from knowledgeable people (for example linux kernel developers:)), to show them that world is not in templates and try{}catch blocks.

    > all the smart people jumped ship from C++ years ago.
    Very smart observation :)

  15. Howard Hinnanton 27 Dec 2006 at 8:36 am

    It is so much easier to stand up, shout, point and ridicule, than it is to actually help. People have been doing that since there were people.

    Thank goodness there are some people who have the courage to put their hide on the line and make concrete proposals for a better standard. You can too. On the library side you can send proposals directly to me.

    But check the attitude at the door. You may find out that you’re not the only smart person working towards a better C++ standard.

    Howard Hinnant
    Library Working Group Chair

  16. pongbaon 27 Dec 2006 at 10:32 pm

    hello bhurt-aw, I’ve put the link to the news group:

    http://groups-beta.google.com/group/comp.lang.c++.moderated/browse_thread/thread/87d077574579f88b/015b819b5dbfdb8b#015b819b5dbfdb8b

    It’d be great if you join the discussion there. People seem to think that there’re sth. you’ve missed.

  17. dizzyon 30 Dec 2006 at 3:17 am

    Ok, I won’t comment on things I don’t fully understand but at least for the move constructor usage to upgrade a read lock to a write lock what is your exact problem with it ? I don’t see what “terrible” thing happens when 2 threads try to do it at the same time having already the read lock aquired.

    Basically, one will succeed, the other will block/sleep until the write lock is released at which point the other thread will aquire it.

    Initially I thought your problem is that using move constructors it destroys the “original” lock, but this is no problem, the “lock” object itself is on each specific thread’s stack (or not on stack but it’s thread specific) while the “mutex” object is the one which is shared between the threads and nobody said using a move constructor on that one (to alter it’s state or something like that that a concurrent thread may have issues with).

    So unless I miss something here I see the proposed move constructor to upgrade read locks to move locks quite SANE because it solves nicely the problems associated with such an “upgrade” operation:
    - once you upgraded a lock it’s not a read lock anymore (so using the move constructor that happens as the original lock is left in a “dummy” state)
    - you need a read lock before upgrading it and generally you can’t copy locks (and using the move constructor you just don’t need to copy it)

  18. bhurt-awon 31 Dec 2006 at 9:05 am

    General responses:

    First, thanks for the feed back!

    On Erlang as a solution: yes, very probably. There are two languages that have the potiential to allow us to write multithreaded code and remain sane- Erlang and Haskell + STM. Note that I didn’t include Ocaml in that list! Message passing allows for distributed apps, while STM in a purely functional setting allows for higher performance in a shared memory environment. I don’t know which paradigm will win out (or if the solution isn’t a hybrid of the two). An important point to note here: both languages are radically different from C++.

    So why don’t I dig in an help out the C++ community? A number of reasons. First, a deep seated suspicion that my help would not be appreciated. Second, and more importantly, I don’t think that C++ is even digging at the spot marked X- I don’t think that it’s fixable as a language.

    ridiculous_fish: dang good point on the memory semantics of various processors. Which just makes the problem that much worse- which compiler optimizations are allowed or not is arguably under the jurisdiction of the C++ standards committee. The memory semantics of existing processors much less so. Also: I’m not sure this isn’t also the case on the x86 as well (at least without memory barriers). And even if they don’t do now, there’s no gaureentee they won’t do it in the future.

    dizzy: when a thread is blocked waiting to upgrade a read lock to a write lock, does it release it’s read lock? If the answer to this question is no, then you have an instant deadlock situation- because you can’t grant a write lock while someone else has the read lock. So if you have two threads, neither of which are releasing their read locks, both waiting on the other thread to release their read locks so that they can acquire write locks. If the answer to this question is “yes”, then “promotion” is little different from releasing the read lock and acquiring the write lock. Including the possibility of the locked structure changing in the mean time, a behavior which will, no doubt, surprise a great number of programmers and introduce a great number of bugs.

  19. bhurt-awon 31 Dec 2006 at 9:07 am

    I could have sworn that I had hyperlinked the phrase “this post” to the right website, but it isn’t there now. In any case, Captain Bedsock (comment #8) has the right link.

  20. Candideon 31 Dec 2006 at 11:33 am

    For those new to the site that haven’t seen it, check out this post. Welcome aboard!

    ~~ Your Friendly Blog Admin.

  21. Enfranchised Mind » What the heck…?on 31 Dec 2006 at 3:18 pm

    [...] Edit: Also popular are Brian’s C++/Threading rant and OO/Relational conversations. I guess this blog is going to go technical one way or another! [...]

  22. Howard Hinnanton 30 Jan 2007 at 9:59 am

    If anyone is interested in reading the actual proposal for the read/write/upgradable locks it is here:

    http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2094.html

    Because I’m writing into a category titled “Why C++ Sucks” I have reservations that anyone will take the time. However I wanted to at least put to rest the notion that the C++ committee is unaware of the problems concerned with upgrading two read locks.

    The paper at the url above proposes 3 types of mutexes (and associated locks): sharable, upgradable and exclusive. Call these read, upgradable and write if you wish, I’m not married to the names. There is no blocking upgrade path from sharable to exclusive. One can only upgrade from an upgradable to an exclusive. And though an upgradable mutex can share ownership with multiple sharables, it can not share with any other upgradable. Therefore there can be only one mutex at a time that is trying to upgrade to exclusive.

    Avoiding the deadlock discussed in this blog is the entire reason the upgradable mutex exists in the proposal.

    Admittedly there is a 2-word type-o / think-o in the proposal under the definition of unlock_upgradable_and_lock(). My bad. It has been noted and will be corrected should these words approach a technical report or a standard.

    > Never mind the niggling little implementation details. …Of course, the > only nit they could find to pick was with his spelling.

    Or perhaps it was because the people I presented to had the benefit of the presentation, explaining many details. And this audience was also aware of the years of research, plus actual working implementation behind the proposal.

    To assume mass-stupidity from a group of internationally recognized experts of a field, any field, appears to me to be less than a bright thing to do. Yes, groups of experts do make mistakes. It happens more often than I’d like. But I’ll take those results over groups of non-experts any day.

    And fwiw, I’ve been monitoring the technical conversation regarding the C++ memory model (Hans Boehm among the active participants) over the past several months. Trust me, these guys know what they’re doing. Publicly ridicule them at the risk of your own reputation.

  23. bhurt-awon 31 Jan 2007 at 7:18 pm

    OK. They’ve managed to avoid the single-lock deadlock situation. But at this point I have to ask: what the hell good is an upgradeable lock, as opposed to a a normal write lock? If the access pattern isn’t a (relatively) long read-only phase followed by a relatively short modification phase, it’s not worth it. Especially considering that most modifications want to start their (destructive) modifications almost immediately.

    I can only think of one situation where it’d be usefull- with hash tables, when you have to grow or shrink the entire hashtable. While populating the new hashtable, you can still allow read-only access to the old hashtable. Only at the end, when you want to update the pointers to point to the new hashtable, do you need an exclusive write lock. But even here it’s wrong to use, as the common case is that you’re not resizing the table on an insert or remove. If fact, you almost never are- only 1 time in N or less (where N is the number of elements in the table) are you resizing the table, otherwise hashtables don’t have O(1) amoritized insert/remove cost. In the common case you just want to go straight for the write lock, as the read-only phase will be almost nil. So what use is upgradeable, again?

    The next thing is that my opinion of C++ is born of actually having programmed in it professionally. Also note that I’m not that much of an elitist when it comes to languages. While I prefer Ocaml and Haskell as languages, I still very much like C, and have been known to voluntarily program in PHP, Perl, and Visual Basic. Heck, I’ve even been known to argue that for certain jobs, Perl or Visual Basic are better tools than Ocaml! So this isn’t just lisp-style snobbery. The intense dislike I bear against C++ is earned.

    To assume mass-stupidity from a group of internationally recognized experts of a field, any field, appears to me to be less than a bright thing to do. Yes, groups of experts do make mistakes. It happens more often than I’d like. But I’ll take those results over groups of non-experts any day.

    And fwiw, I’ve been monitoring the technical conversation regarding the C++ memory model (Hans Boehm among the active participants) over the past several months. Trust me, these guys know what they’re doing. Publicly ridicule them at the risk of your own reputation.

    If that group of so-called experts consistently display poor judgement and lack of fundamental knowledge, then it’s a pretty safe bet that they will continue to do so. And the fact that there was a smattering of smart people involved doesn’t matter- as generally the smart people get shouted down by the idiots. Hans Boehm is a pretty smart guy. So is Colin Powell.

    Given the shock people had when Hans showed the reordering problem, I have to conclude that the C++ committee knows as much about the reality of multithreaded programming as the neocons do of the middle east. Now, I suppose it’s possible that this report of the goings on at the meeting are completely false to fact. I wasn’t there, but I haven’t heard of anyone claiming this. But if it is even roughly or approximately true, I’m pretty safe with this opinion.

    For example- looking at the spec you posted, I notice one glaring thing. They’re using templates to express an is-a relationship. This strikes me as being a classic, gold-standard example of what normal object inheritance. And yet, they use templates. One wonders why. The only reason I can come up with is that templates are “cool”, and that the same proposal, except using object inheritance instead of templates, would have a much reduced chance of passing muster with the standards committee for stylistic reasons. “This is C++, not Java!” Either that, or the people writting the standard don’t understand object orieted design (this isn’t the first time this has happened with C++, witness iostreams).

  24. [...] “I’m reading this post on the C++ committee’s dealings on threading, and, I am pleased to announce, there is absolutely no chance what so ever of anything sane, workable, or sensible accidentally arising from these precedings.” (C++ and Threading) [...]

  25. dschibuton 29 Sep 2008 at 6:31 pm

    I started this discussion to discuss public available web proxies:

    Which are really anonymous?

    Which can be used with facebook, myspace etc, in other words: are fresh ?

    Which can you recommend?

    Thanks for your help,
    Dschibut

    P.S.: In my land, the freedom of speech is somehow constrained, please give me a hint, if you are not sure about your recommendation.

Trackback URI | Comments RSS

Leave a Reply

Green Web Hosting! This site hosted by DreamHost.