http://tleyden.github.io/blog/2014/10/30/goroutines-vs-threads/
Here are some of the advantages of Goroutines over threads:
- You can run more goroutines on a typical system than you can threads.
- Goroutines have growable segmented stacks.
- Goroutines have a faster startup time than threads.
- Goroutines come with built-in primitives to communicate safely between themselves (channels).
- Goroutines allow you to avoid having to resort to mutex locking when sharing data structures.
- Goroutines are multiplexed onto a small number of OS threads, rather than a 1:1 mapping.
- You can write massively concurrent servers withouth having to resort to evented programming.
You can run more of them
On Java you can run 1000’s or tens of 1000’s threads. On Go you can run hundreds of thousands or millions of goroutines.
Java threads map directly to OS threads, and are relatively heavyweight. Part of the reason they are heavyweight is their rather large fixed stack size. This caps the number of them you can run in a single VM due to the increasing memory overhead.
Go OTOH has a segmented stack that grows as needed. They are “Green threads”, which means the Go runtime does the scheduling, not the OS. The runtime multiplexes the goroutines onto real OS threads, the number of which is controlled by GOMAXPROCS
. Typically you’ll want to set this to the number of cores on your system, to maximize potential parellelism.
They let you avoid locking hell
One of the biggest drawback of threaded programming is the complexity and brittleness of many codebases that use threads to achieve high concurrency. There can be latent deadlocks and race conditions, and it can become near impossible to reason about the code.
Go OTOH gives you primitives that allow you to avoid locking completely. The mantra is don’t communicate by sharing memory, share memory by communicating. In other words, if two goroutines need to share data, they can do so safely over a channel. Go handles all of the synchronization for you, and it’s much harder to run into things like deadlocks.
No callback spaghetti, either
There are other approaches to achieving high concurrency with a small number of threads. Python Twisted was one of the early ones that got a lot of attention. Node.js is currently the most prominent evented frameworks out there.
The problem with these evented frameworks is that the code complexity is also high, and difficult to reason about. Rather than “straightline” coding, the programmer is forced to chain callbacks, which gets interleaved with error handling. While refactoring can help tame some of the mental load, it’s still an issue.
Thread Pooling in Go Programming
https://www.ardanlabs.com/blog/2013/05/thread-pooling-in-go-programming.html
After working in Go for some time now, I learned how to use an unbuffered channel to build a pool of goroutines. I like this implementation better than what is implemented in this post. That being said, this post still has value in what it describes.
https://github.com/goinggo/work
Introduction
In my world of server development thread pooling has been the key to
building robust code on the Microsoft stack. Microsoft has failed in
.Net by giving each Process a single thread pool with thousands of
threads and thinking they could manage the concurrency at runtime. Early
on I realized this was never going to work. At least not for the
servers I was developing.
When I was building servers in C/C++ using the Win32 API, I created a
class that abstracted IOCP to give me thread pools I could post work
into. This has always worked very well because I could define the number
of threads in the pool and the concurrency level (the number of threads
allowed to be active at any given time). I ported this code for all of
my C# development. If you want to learn more about this, I wrote an
article years ago (http://www.theukwebdesigncompany.com/articles/iocp-thread-pooling.php). Using IOCP gave me the performance and flexibility I needed. BTW, the .NET thread pool uses IOCP underneath.
The idea of the thread pool is fairly simple. Work comes into the server
and needs to get processed. Most of this work is asynchronous in nature
but it doesn’t have to be. Many times the work is coming off a socket
or from an internal routine. The thread pool queues up the work and then
a thread from the pool is assigned to perform the work. The work is
processed in the order it was received. The pool provides a great
pattern for performing work efficiently. Spawning a new thread everytime
work needs to be processed can put heavy loads on the operating system
and cause major performance problems.
So how is the thread pool performance tuned? You need to identify the
number of threads each pool should contain to get the work done the
quickest. When all the routines are busy processing work, new work stays
queued. You want this because at some point having more routines
processing work slow things down. This can be for a myriad of reasons
such as, the number of cores you have in your machine to the ability of
your database to handle requests. During testing you can find that happy
number.
I always start with looking at how many cores I have and the type of
work being processed. Does this work get blocked and for how long on
average. On the Microsoft stack I found that three active threads per
core seemed to yield the best performance for most tasks. I have no idea
yet what the numbers will be in Go.
You can also create different thread pools for the different types of
work the server will need to process. Because each thread pool can be
configured, you can spend time performance tuning the server for maximum
throughput. Having this type of command and control
to maximize performance is crucial.
In Go we don’t create threads but routines. The routines function like
multi-threaded functions but Go manages the actual use of OS level
threading. To learn more about concurrency in Go check out this
document: http://golang.org/doc/effective_go.html#concurrency.
The packages I have created are called workpool and jobpool. These use
the channel and go routine constructs to implement pooling.
Workpool
This package creates a pool of go routines that are dedicated to
processing work posted into the pool. A single Go routine is used to
queue the work. The queue routine provides the safe queuing of
work, keeps track of the amount of work in the queue and reports an
error if the queue is full.
Posting work into the queue is a blocking call. This is so the caller
can verify that the work is queued. Counts for the number of active
worker routines are maintained.
Here is some sample code on how to use the workpool:
import (
"bufio"
"fmt"
"os"
"runtime"
"strconv"
"time"
"github.com/goinggo/workpool"
)
type MyWork struct {
Name string
BirthYear int
WP *workpool.WorkPool
}
func (mw *MyWork) DoWork(workRoutine int) {
fmt.Printf("%s : %d ", mw.Name, mw.BirthYear)
fmt.Printf("Q:%d R:%d ", mw.WP.QueuedWork(), mw.WP.ActiveRoutines())
// Simulate some delay
time.Sleep(100 * time.Millisecond)
}
func main() {
runtime.GOMAXPROCS(runtime.NumCPU())
workPool := workpool.New(runtime.NumCPU(), 800)
shutdown := false // Race Condition, Sorry
go func() {
for i := 0; i < 1000; i++ {
work := MyWork {
Name: "A" + strconv.Itoa(i),
BirthYear: i,
WP: workPool,
}
if err := workPool.PostWork("routine", &work); err != nil {
fmt.Printf("ERROR: %s ", err)
time.Sleep(100 * time.Millisecond)
}
if shutdown == true {
return
}
}
}()
fmt.Println("Hit any key to exit")
reader := bufio.NewReader(os.Stdin)
reader.ReadString(’ ’)
shutdown = true
fmt.Println("Shutting Down")
workPool.Shutdown("routine")
}
If we look at main, we create a thread pool where the number of routines
to use is based on the number of cores we have on the machine. This
means we have a routine for each core. You can’t do any more work if
each core is busy. Again, performance testing will determine what this
number should be. The second parameter is the size of the queue. In this
case I have made the queue large enough to handle all the requests
coming in.
The MyWork type defines the state I need to perform the work. The member
function DoWork is required because it implements an interface required
by the PostWork call. To pass any work into the thread pool this method
must be implement by the type.
The DoWork method is doing two things. First it is displaying the state
of the object. Second it is reporting the number of items in queue and
the active number of Go Routines. These numbers can be used to
determining the health of the thread pool and for performance testing.
Finally I have a Go routine posting work into the work pool inside of a
loop. At the same time this is happening, the work pool is executing
DoWork for each object queued. Eventually the Go routine is done and the
work pool keeps on doing its job. If we hit enter at anytime the
programming shuts down gracefully.
The PostWork method could return an error in this sample program. This
is because the PostWork method will guarantee work is placed in queue or
it will fail. The only reason for this to fail is if the queue is full.
Setting the queue length is an important consideration.
Jobpool
The jobpool package is similar to the workpool package except for one
implementation detail. This package maintains two queues, one for normal
processing and one for priority processing. Pending jobs in the
priority queue always get processed before pending jobs in the normal
queue.
The use of two queues makes jobpool a bit more complex than workpool.
If you don’t need priority processing, then using a workpool is going
to be faster and more efficient.
Here is some sample code on how to use the jobpool:
import (
"fmt"
"time"
"github.com/goinggo/jobpool"
)
type WorkProvider1 struct {
Name string
}
func (wp *WorkProvider1) RunJob(jobRoutine int) {
fmt.Printf("Perform Job : Provider 1 : Started: %s ", wp.Name)
time.Sleep(2 * time.Second)
fmt.Printf("Perform Job : Provider 1 : DONE: %s ", wp.Name)
}
type WorkProvider2 struct {
Name string
}
func (wp *WorkProvider2) RunJob(jobRoutine int) {
fmt.Printf("Perform Job : Provider 2 : Started: %s ", wp.Name)
time.Sleep(5 * time.Second)
fmt.Printf("Perform Job : Provider 2 : DONE: %s ", wp.Name)
}
func main() {
jobPool := jobpool.New(2, 1000)
jobPool.QueueJob("main", &WorkProvider1{"Normal Priority : 1"}, false)
fmt.Printf("*******> QW: %d AR: %d ",
jobPool.QueuedJobs(),
jobPool.ActiveRoutines())
time.Sleep(1 * time.Second)
jobPool.QueueJob("main", &WorkProvider1{"Normal Priority : 2"}, false)
jobPool.QueueJob("main", &WorkProvider1{"Normal Priority : 3"}, false)
jobPool.QueueJob("main", &WorkProvider2{"High Priority : 4"}, true)
fmt.Printf("*******> QW: %d AR: %d ",
jobPool.QueuedJobs(),
jobPool.ActiveRoutines())
time.Sleep(15 * time.Second)
jobPool.Shutdown("main")
}
In this sample code we create two worker type structs. It’s best to
think that each worker is some independent job in the system.
In main we create a job pool with 2 job routines and support for 1000
pending jobs. First we create 3 different WorkProvider1 objects and post
them into the queue, setting the priority flag to false. Next we create
a WorkProvider2 object and post that into the queue, setting the
priority flag to true.
The first two jobs that are queued will be processed first since the job
pool has 2 routines. As soon as one of those jobs are completed, the
next job is retrieved from the queue. The WorkProvider2 job will be
processed next because it was placed in the priority queue.
To get a copy of the workpool and jobpool packages, go to github.com/goinggo
As always I hope this code can help you in some small way.