Moving thread activation policies to userspace using kfutex
Helge Bahmann <hcb@chaoticmind.net>
Google Zürich
Pop quiz: Which class of operations do processes spend >99% of their time in?
Introduction
What are threads?
● Answer I: A "parallelism abstraction"
  A piece of a program running sequentially with respect to itself, and with unspecified parallelism relative to the remainder of the program. A means of expressing "conceptual parallelism".
● Answer II: An "operating system concept"
  A virtualized instance of a CPU, mapped dynamically to physical CPUs. A means of achieving "factual parallelism".
Introduction
Linux event waiting "primitives" (not exhaustive)
● select / poll / epoll_wait / epoll_pwait / ...
● sigsuspend / sigtimedwait / sigwaitinfo
● waitpid
● sleep / usleep / nanosleep
● ioctl(..., DRM_IOCTL_WAIT_VBLANK, ...)
● pthread_mutex_lock / pthread_cond_wait
● ...
Observation: all of these combine event notification with event delivery.
epoll-based notification
[Diagram (two slide builds): kernel-space interaction across the setup, steady state, and processing phases]
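For concreteness, a minimal sketch of the setup and steady-state phases depicted above, using the standard epoll API (sock_fd and handle_event are placeholders; error handling elided):

  #include <sys/epoll.h>

  void handle_event(const epoll_event& ev);  // processing step (placeholder)

  void epoll_loop(int sock_fd) {
    // Setup: build the interest set once.
    int epfd = epoll_create1(0);
    epoll_event ev{};
    ev.events = EPOLLIN;
    ev.data.fd = sock_fd;
    epoll_ctl(epfd, EPOLL_CTL_ADD, sock_fd, &ev);

    // Steady state: every iteration enters the kernel, which both
    // notifies (wakes the thread) and delivers (fills 'events').
    for (;;) {
      epoll_event events[16];
      int n = epoll_wait(epfd, events, 16, /*timeout=*/-1);
      for (int i = 0; i < n; ++i) {
        handle_event(events[i]);
      }
    }
  }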
Common event handling patterns
● "Edge client"
  – Many kinds of event sources (peripherals, user interaction, network, ...)
  – ~1 instance of each
  – Almost no "intended" parallelism
● Service
  – Single dominant kind of event source (usually network)
  – Many instances of it
  – Maximize throughput through parallelism
● Reality is usually somewhere between these extremes
Leader/followers (classical)
● Design constraints:
  – Single (logical) event source
  – Handling any event may take an arbitrary (varying) amount of time
  – Goal: Maximise throughput through parallelism
● Solution:
  – "Leader" dequeues event
  – Promotes new "leader" from pool of followers
  – Handles event
  – Joins pool of followers

Simplest possible implementation relies on the thread activation policy of "mutex" to select the new leader:

  for (;;) {
    std::unique_lock<std::mutex> lock(m);
    Event ev = get_event_from_queue();
    lock.unlock();
    handle_event(ev);
  }
Leader/followers (classical)
[Diagram: user space (thread 1, thread 2) and kernel space (epoll, device queue, driver, irq) interaction]
Leader/followers (classical)
● Literature: more fancy^Wsophisticated leader selection schemes
● This does not change two fundamental facts:
  – The promoted follower is temporarily woken, just to put itself back to sleep again
  – The last active thread cannot become leader again without another pointless wake-up of the current leader to displace it
● Due to thread/CPU affinity: one IPI per operation
● Particularly pathological for #threads = #CPUs
futex
Linux system call for suspending/waking up threads based on an address
● futex(addr, FUTEX_WAIT, value)
  Atomically verifies that *addr == value and puts the calling thread to sleep in "waiting at addr" state. Returns 0 if the thread was put to sleep (and woken later).
● futex(addr, FUTEX_WAKE, count)
  Wakes up at most count threads in "waiting at addr" state.
● futex(addr, FUTEX_REQUEUE, new_addr)
  Changes all threads currently in "waiting at addr" state into "waiting at new_addr" state.
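glibc exposes no futex() wrapper, so the code on the following slides assumes a thin wrapper over the raw system call, roughly like this (a sketch; note that the real FUTEX_REQUEUE operation also takes a wake count, and the requeue count travels in the timeout argument slot):

  #include <linux/futex.h>
  #include <sys/syscall.h>
  #include <unistd.h>
  #include <cstdint>
  #include <ctime>

  // For FUTEX_WAIT: returns 0 after sleeping and being woken, or -1 with
  // errno == EAGAIN if *addr != value. For FUTEX_WAKE: returns the number
  // of threads woken.
  inline long futex(void* addr, int op, uint32_t value,
                    const timespec* timeout = nullptr,
                    void* addr2 = nullptr, uint32_t val3 = 0) {
    return syscall(SYS_futex, addr, op, value, timeout, addr2, val3);
  }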
futex
Implementing a mutex

  class mutex {
   public:
    void lock();
    void unlock();

   private:
    enum state_type {
      unlocked = 0,
      locked = 1,
      locked_contention = 2
    };
    std::atomic<state_type> state_{unlocked};
    ...
  };

  void mutex::lock() {
    state_type current = state_.load();
    for (;;) {
      switch (current) {
        case unlocked: {
          if (state_.compare_exchange_weak(current, locked)) { return; }
          break;
        }
        case locked: {
          if (!state_.compare_exchange_weak(current, locked_contention)) {
            break;
          }
          // fallthrough
        }
        case locked_contention: {
          futex(&state_, FUTEX_WAIT, locked_contention);
          // Re-acquire pessimistically: a woken thread cannot know whether
          // further waiters remain, so it claims the lock in "contended"
          // state to make the eventual unlock issue a wake-up.
          current = unlocked;
          if (state_.compare_exchange_weak(current, locked_contention)) {
            return;
          }
          break;
        }
      }
    }
  }

  void mutex::unlock() {
    if (state_.exchange(unlocked) == locked_contention) {
      futex(&state_, FUTEX_WAKE, 1);
    }
  }
futex
FUTEX_REQUEUE comes into play to avoid a "thundering herd" problem with condition variables.

A "naive" wake-up causes all woken threads to race for the mutex, blocking all but one of them again at just this point. "Requeue" allows changing the woken threads from "waiting at condition variable" state to "waiting at mutex" state, and thus avoids the thundering herd.

  template <typename X>
  class synchronized_queue {
   public:
    template <typename Iter>
    void enqueue_many(Iter begin, Iter end) {
      std::unique_lock<std::mutex> lock(m_);
      queue_.insert(queue_.end(), begin, end);
      c_.notify_all();
      lock.unlock();
    }

    X dequeue() {
      std::unique_lock<std::mutex> lock(m_);
      while (queue_.empty()) { c_.wait(lock); }
      X result = std::move(queue_.front());
      queue_.pop_front();
      return result;
    }

   private:
    std::mutex m_;
    std::condition_variable c_;
    std::list<X> queue_;
  };
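At the system-call level, the requeue step is a single call of roughly the following shape. This sketches the mechanism only, not a complete condition variable (which additionally needs a sequence counter to avoid lost wake-ups); headers as in the wrapper sketch above:

  #include <climits>

  // Wake at most one waiter on 'cond_word' and move all remaining waiters
  // over to 'mutex_word', so they block on the mutex directly instead of
  // stampeding for it.
  void requeue_notify_all(void* cond_word, void* mutex_word) {
    syscall(SYS_futex, cond_word, FUTEX_REQUEUE,
            1,                                    // wake at most one thread
            reinterpret_cast<const timespec*>(
                static_cast<uintptr_t>(INT_MAX)), // requeue up to INT_MAX more
            mutex_word, 0);
  }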
kfutex
● Extension to allow futex signalling from kernel space
  – User space defines...
    ● Address of an atomic variable (doubles as futex location)
    ● Mutation protocol: single parameterized atomic operation
    ● Wake-up criterion: single parameterized test of pre/post value
  – Kernel acts on these directives when signalling a kfutex (see the sketch below)
● Extension to bind kfutex signalling to kernel events
  – e.g. I/O readiness
● Peripherally related: extension for an event ring buffer
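The slides do not show the proposed ABI; purely as an illustration, the user-space directives could be pictured as a descriptor along these lines (every name below is invented, loosely modelled on the encoding used by FUTEX_WAKE_OP):

  #include <cstdint>

  // Hypothetical sketch only: kfutex is a proposed extension and none of
  // these names are real kernel ABI.
  struct kfutex_spec {
    uint32_t* addr;    // atomic variable; doubles as the futex location
    uint32_t  op;      // mutation protocol: one parameterized atomic op,
    uint32_t  oparg;   //   e.g. "add oparg" or "or oparg"
    uint32_t  cmp;     // wake-up criterion: one parameterized test of the
    uint32_t  cmparg;  //   pre/post value, e.g. "wake if oldval == cmparg"
  };

  // On each kernel-side signal, the kernel atomically applies
  //   newval = op(oldval, oparg)
  // and wakes waiters iff cmp(oldval, newval, cmparg) holds.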
kfutex-based notification
[Diagram (two slide builds): user-space/kernel-space interaction across the setup, steady state, and processing phases]
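A sketch of what the steady state might look like from user space, assuming a hypothetical kfutex whose counter word the kernel advances whenever it appends to an event ring buffer (Event, ring_buffer_pop and handle_event are placeholders; futex() is the wrapper sketched earlier):

  #include <atomic>
  #include <cstdint>

  std::atomic<uint32_t> kfutex_word;  // advanced by the kernel per signal

  struct Event;
  Event* ring_buffer_pop();           // lock-free delivery path (placeholder)
  void handle_event(Event* ev);       // (placeholder)

  void event_loop() {
    uint32_t seen = kfutex_word.load();
    for (;;) {
      // Fast path: drain delivered events without entering the kernel.
      while (Event* ev = ring_buffer_pop()) {
        handle_event(ev);
      }
      // Slow path: sleep, but only if no new signal arrived since 'seen';
      // otherwise FUTEX_WAIT returns immediately and we rescan the buffer.
      futex(&kfutex_word, FUTEX_WAIT, seen);
      seen = kfutex_word.load();
    }
  }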
Leader/followers (futex)
● Bind event source to a kfutex
  – "Leader" FUTEX_WAITs on this event futex
  – "Followers" each FUTEX_WAIT on a private signalling futex
● When the leader receives an event
  – it FUTEX_REQUEUEs one of the followers to the event futex
  – it begins handling the event
● When a thread finishes handling an event
  – either: it waits on its private signalling futex
  – or: it FUTEX_REQUEUEs the current leader to its private signalling futex ("demotes" it) and becomes leader itself
● Leader selection policy lives in user space (sketch below)
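A sketch of this protocol under the same hypothetical kfutex binding (the "demote" variant, follower-pool management and promotion races are elided; fetch_event, pop_follower and push_follower are placeholders):

  #include <atomic>
  #include <cstdint>
  // plus the futex wrapper and syscall headers sketched earlier

  std::atomic<uint32_t> event_word;         // bound to the kernel event source

  struct Worker {
    std::atomic<uint32_t> private_word{0};  // private signalling futex
  };

  struct Event;
  Event* fetch_event();                     // delivery path, e.g. ring buffer (placeholder)
  void handle_event(Event* ev);             // (placeholder)
  Worker* pop_follower();                   // follower pool (placeholder)
  void push_follower(Worker* w);            // (placeholder)

  void worker_loop(Worker* self) {
    for (;;) {
      // Leader phase: wait for the kernel to signal the event kfutex.
      uint32_t seen = event_word.load();
      Event* ev;
      while ((ev = fetch_event()) == nullptr) {
        futex(&event_word, FUTEX_WAIT, seen);  // returns early if the word moved
        seen = event_word.load();
      }

      // Promote a new leader: requeue (not wake!) one follower from its
      // private futex onto the event futex; it starts running only when
      // the kernel actually signals the next event.
      if (Worker* next = pop_follower()) {
        syscall(SYS_futex, &next->private_word, FUTEX_REQUEUE,
                0,                                               // wake none
                reinterpret_cast<const timespec*>(uintptr_t{1}), // requeue one
                &event_word, 0);
      }

      handle_event(ev);

      // Follower phase: rejoin the pool, sleep on the private futex.
      push_follower(self);
      futex(&self->private_word, FUTEX_WAIT, self->private_word.load());
    }
  }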
Summary
● kfutex unifies inter-thread and kernel notification
● kfutex separates event notification from event delivery
  – delivery possible through e.g. lock-free ring buffers
● Allows moving activation policy decisions to user space; avoids "useless" task wake-ups
● Efficiency gain by avoiding kernel entry in fast paths
● Kernel implementation complexity to avoid "abuse" of kfutex side effects
  – futex key hash collisions, page pinning
● Synchronization implementation complexity
  – lock-free kernel/user-space synchronization protocol