CPL 2016, week 7 Performance considerations Oleg Batrashev Institute of Computer Science, Tartu, Estonia March 21, 2016
Overview
Studied so far:
1. Inter-thread visibility: JMM
2. Inter-thread synchronization: locks and monitors
3. Thread management: executors, tasks, cancellation
4. Inter-thread communication: confinements, queues, back pressure
5. Inter-thread collaboration: actors, inboxes, state diagrams
6. Asynchronous execution: callbacks, Pyramid of Doom, Java 8 promises
Today:
◮ Performance considerations: asynchronous IO, Java 8 streams
Performance considerations 140/160 Context switch - Outline
Performance considerations
  Context switch
  Green threads
  Asynchronous IO
    Java NIO
Declarative concurrency
  Java 8 streams
Context switch - Variants of context switch
Context switch may refer to different things:
◮ the application changes the CPU privilege level (kernel/user) of running code
◮ system calls: the set of basic operations supported by the OS that applications use to open a file/socket, write/read it, ...
◮ CPU registers, stack, ... are reloaded in the core
◮ the OS changes the thread that runs on a core
◮ the OS changes the process that runs on a core
Our main interest is in switching threads, e.g.:
◮ if a lock is taken, the thread must be suspended until the lock is released
◮ if a queue is empty, the consumer must be suspended
◮ if there is no more data in a socket, the reader must be suspended
Too many context switches may degrade performance!
Context switch - Context switch test
A thread/process context switch takes 1-10 microseconds (system dependent).
1. Two actors, each with its own thread: the producer writes 1 million integer values to the consumer actor, which sums them up.
2. Two actors, each with its own thread: ping-pong of 0.5 million values.
3. Two actors sharing one thread: ping-pong of 0.5 million values.

Case                            Total time
Two actors: producer-consumer   0.41 s
Two actors with own threads     6 s
Two actors with shared thread   0.18 s

◮ ping-pong between two threads causes the expected decline in efficiency: 0.5 million round trips are 1 million switches, so 6 s corresponds to about 6 µs per context switch, i.e. per ping or pong
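The ping-pong case can be approximated with a small sketch like the one below. It is not the actor code used for the measurements above: the class name `PingPongSketch`, the use of `SynchronousQueue`, and the round count are assumptions made for illustration. Absolute timings will differ between systems, and the queue implementation may spin briefly before parking a thread, so the measured cost per hand-off need not match the 6 µs figure.

```java
import java.util.concurrent.SynchronousQueue;

// Hypothetical micro-benchmark sketch; names are illustrative, not from the lecture code.
class PingPongSketch {

    /** Bounces `rounds` values between two threads; returns the last reply received. */
    static int pingPong(int rounds) {
        SynchronousQueue<Integer> ping = new SynchronousQueue<>();
        SynchronousQueue<Integer> pong = new SynchronousQueue<>();

        Thread echo = new Thread(() -> {
            try {
                for (int i = 0; i < rounds; i++) {
                    pong.put(ping.take() + 1); // receive a value, reply with value + 1
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        echo.start();

        int last = 0;
        try {
            for (int i = 0; i < rounds; i++) {
                ping.put(i);        // hand off to the echo thread
                last = pong.take(); // wait until the reply comes back
            }
            echo.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return last;
    }

    public static void main(String[] args) {
        int rounds = 100_000;
        long start = System.nanoTime();
        int last = pingPong(rounds);
        long elapsed = System.nanoTime() - start;
        System.out.printf("last=%d, %.2f us per round trip%n",
                last, elapsed / 1000.0 / rounds);
    }
}
```

Each round trip contains two hand-offs between threads, so the per-round-trip time printed by `main` covers roughly two switches.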
Context switch - Solutions to context switch
1. Let the same thread do most of the work
  ◮ from the queue/actor model back to wandering threads
2. Make sure a single thread does enough work before switching
  ◮ make message processing expensive (in terms of computation)
  ◮ keep queues full enough for consumers/transducers/actors, so that each handles several messages in a row before switching to another thread
  ◮ not always possible
3. Do not switch threads when switching actors, consumers, and/or transducers
  ◮ use green threads
This problem is only relevant when there are many actors and/or many messages that cannot be batched!
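Solution 2 (handle several messages in a row) might look like the following sketch. It assumes a plain `LinkedBlockingQueue` between one producer and one consumer; the class and method names are hypothetical. The consumer blocks only when the queue is empty and otherwise drains everything that is already queued with `drainTo`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative sketch; names are assumptions, not part of the lecture code.
class BatchConsumerSketch {

    /** Sums the integers 1..count sent by a producer thread, consuming them in batches. */
    static long sumInBatches(int count) {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        Thread producer = new Thread(() -> {
            for (int i = 1; i <= count; i++) {
                queue.add(i);
            }
        });
        producer.start();

        long sum = 0;
        int received = 0;
        List<Integer> batch = new ArrayList<>();
        try {
            while (received < count) {
                batch.add(queue.take()); // may suspend, but only if the queue is empty
                queue.drainTo(batch);    // take whatever else is queued, without blocking
                for (int v : batch) {
                    sum += v;
                }
                received += batch.size();
                batch.clear();
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sum;
    }
}
```

The fuller the queue, the more messages each wake-up of the consumer processes, which is the effect the slide asks for.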
Green threads - Idea
Green threads (library threads, user-level threads):
◮ a user-level thread is maintained outside the OS, on the user level
  ◮ implemented by a library or VM
◮ 1 kernel-level (OS) thread per n user-level threads
  ◮ OS resources are allocated for 1 thread
  ◮ cheaper scheduling: no OS-level context switch needed
◮ m kernel threads per n user threads
Problems:
◮ need a way to suspend execution and save/restore the thread stack
  ◮ i.e. preempt the executing thread
  ◮ non-preemptable threads need to yield periodically
◮ IO may block the OS thread, which is needed by other green threads
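The "n user-level threads on 1 OS thread" idea and the need to yield can be illustrated with a toy cooperative scheduler. This is only a sketch under strong assumptions: every "green thread" is modelled as a step function that returns `true` while it has more work, so returning from a step plays the role of yielding, and the task keeps its state explicitly because no stack is saved. All names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;
import java.util.function.BooleanSupplier;

// Toy round-robin scheduler; an illustration, not a real green-thread implementation.
class GreenSchedulerSketch {

    /** Runs all tasks on the calling thread, one step at a time, until none has work left. */
    static void runAll(List<BooleanSupplier> tasks) {
        Queue<BooleanSupplier> runQueue = new ArrayDeque<>(tasks);
        while (!runQueue.isEmpty()) {
            BooleanSupplier task = runQueue.poll();
            if (task.getAsBoolean()) { // run one step; returning is the "yield"
                runQueue.add(task);    // still has work: go to the back of the queue
            }
        }
    }

    /** A task that logs its name `steps` times (steps >= 1), yielding after every step. */
    static BooleanSupplier counter(String name, int steps, List<String> log) {
        int[] done = {0}; // state is kept explicitly, since no stack is saved
        return () -> {
            log.add(name + done[0]);
            done[0]++;
            return done[0] < steps;
        };
    }

    static List<String> demo() {
        List<String> log = new ArrayList<>();
        runAll(Arrays.asList(counter("A", 2, log), counter("B", 3, log)));
        return log; // [A0, B0, A1, B1, B2]
    }
}
```

A step that never returns, for example because it blocks on IO, stalls every other task in the run queue, which is the blocking IO problem listed above.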
Green threads - Implementations
Languages/VMs:
◮ Java 1.1 had green threads as the main implementation
◮ Erlang VM uses green threads with no shared state
◮ Go, Smalltalk
Libraries/frameworks/engines:
◮ Akka (Java) uses the m:n model (specify a dispatcher for an actor)
◮ CPython: greenlet, eventlet, gevent
◮ Quasar (Java) modifies your code to save the stack (location and local variables)
See also:
◮ fibers, coroutines
Asynchronous IO - Blocking IO problem
◮ IO may block an OS thread that is used for many green threads
Solutions:
1. Use a dedicated thread pool for blocking IO (Clojure)
2. Use asynchronous IO (Erlang)
Some frameworks:
◮ Netty is a non-blocking I/O (NIO) client-server framework for the development of Java network applications
◮ Asynchronous servlets in Servlet 3.0
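Solution 1 can be sketched in plain Java with a separate executor that is used only for blocking calls. The method `blockingFetch` below is a hypothetical stand-in that sleeps instead of doing real IO, and the pool size of 4 is an arbitrary assumption.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Illustrative sketch; names and pool size are assumptions.
class BlockingIoPoolSketch {

    /** Hypothetical blocking call; real code would read from a socket or file here. */
    static String blockingFetch(String key) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "value-of-" + key;
    }

    /** Runs the blocking call on a dedicated pool and post-processes the result. */
    static int fetchLength(String key) {
        ExecutorService ioPool = Executors.newFixedThreadPool(4);
        try {
            return CompletableFuture
                    .supplyAsync(() -> blockingFetch(key), ioPool) // blocks an IO-pool thread only
                    .thenApply(String::length)                     // runs once the value is available
                    .join();                                       // the demo waits here for the result
        } finally {
            ioPool.shutdown();
        }
    }
}
```

In a real system the caller would attach further callbacks instead of calling `join()`, so that the threads running actors or green threads never wait on the IO pool.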
Asynchronous IO - Ideas
◮ Synchronous IO suspends the thread if no data is available yet
◮ Asynchronous IO uses callbacks that are executed when the IO is readable/writable
  ◮ does not block on IO operations
  ◮ a single thread may read multiple sockets (selectors)
Advantages:
◮ avoids context switches when reading from multiple sockets
◮ solves the blocking IO problem of green threads
Disadvantages:
◮ requires more code to handle IO
◮ code becomes more scattered
Asynchronous IO - Java NIO Buffers and channels
http://tutorials.jenkov.com/java-nio/index.html
◮ buffers are much like arrays
  ◮ provide the typical write-flip-read sequence
  ◮ used for Java NIO channels
  ◮ ByteBuffer.allocate(100)
◮ channels are much like streams, but
  ◮ both readable and writable
  ◮ support asynchronous operation; read in AsynchronousByteChannel:
    Future<Integer> read(ByteBuffer dst)
    <A> void read(ByteBuffer dst, A attachment, CompletionHandler<Integer,? super A> handler)
◮ write also supports these 2 forms: future and callback
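The write-flip-read sequence and the future form of an asynchronous read can be tried out with the sketch below. It uses `AsynchronousFileChannel` on a temporary file rather than an `AsynchronousByteChannel`, only because a file is easy to set up in a self-contained example; its `read` takes an extra file position argument. The class name, the 100-byte buffer, and the assumption of a short, non-empty text are all illustrative.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

// Illustrative sketch; assumes the text is non-empty and fits into 100 bytes.
class BufferSketch {

    /** Write-flip-read on a plain buffer. */
    static String writeFlipRead(String text) {
        ByteBuffer buf = ByteBuffer.allocate(100);
        buf.put(text.getBytes(StandardCharsets.UTF_8)); // write: position advances
        buf.flip();                                     // limit = position, position = 0
        byte[] out = new byte[buf.remaining()];
        buf.get(out);                                   // read back what was written
        return new String(out, StandardCharsets.UTF_8);
    }

    /** Future form of an asynchronous read, here from a temporary file. */
    static String asyncRead(String text) {
        try {
            Path file = Files.createTempFile("nio-sketch", ".txt");
            try {
                Files.write(file, text.getBytes(StandardCharsets.UTF_8));
                try (AsynchronousFileChannel ch =
                             AsynchronousFileChannel.open(file, StandardOpenOption.READ)) {
                    ByteBuffer dst = ByteBuffer.allocate(100);
                    Future<Integer> pending = ch.read(dst, 0); // returns immediately
                    int n = pending.get();                     // wait for the read to complete
                    dst.flip();
                    byte[] out = new byte[n];
                    dst.get(out);
                    return new String(out, StandardCharsets.UTF_8);
                }
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException | InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The callback form would pass a `CompletionHandler` instead of waiting on the `Future`, so that no thread is parked while the read is in flight.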
Asynchronous IO - Java NIO Selectors
◮ we may register a callback for each channel we are interested in
◮ an easier way is to use selectors
◮ register as many channels as we want, selecting the desired operation:
    channel.configureBlocking(false);
    SelectionKey key = channel.register(selector, SelectionKey.OP_READ);
◮ supported operations: OP_CONNECT, OP_ACCEPT, OP_READ, OP_WRITE
◮ selector.select() blocks until at least one channel is ready for the events you registered for
◮ selector.selectedKeys() returns the keys of the channels that are ready
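A complete select loop iteration can be sketched without any network setup by using a `java.nio.channels.Pipe`, whose source end is a selectable channel. The pipe stands in for a socket here, and the class name and buffer size are assumptions; a server would register many channels and call `select()` in a loop.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

// Illustrative sketch; a Pipe replaces the sockets a real server would use.
class SelectorSketch {

    /** Writes a short message into a pipe and reads it back via a selector. */
    static String selectOnce(String message) {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            try (Pipe.SinkChannel sink = pipe.sink();
                 Pipe.SourceChannel source = pipe.source()) {
                source.configureBlocking(false);                 // required before register()
                source.register(selector, SelectionKey.OP_READ);

                sink.write(ByteBuffer.wrap(message.getBytes(StandardCharsets.UTF_8)));

                StringBuilder received = new StringBuilder();
                selector.select(); // blocks until at least one registered channel is ready
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove(); // selected keys are not cleared automatically
                    if (key.isReadable()) {
                        ByteBuffer buf = ByteBuffer.allocate(100);
                        ((Pipe.SourceChannel) key.channel()).read(buf);
                        buf.flip();
                        received.append(StandardCharsets.UTF_8.decode(buf).toString());
                    }
                }
                return received.toString();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The sketch assumes the short message arrives in a single read; production code would keep reading until the expected amount of data has been received.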
Summary
◮ a context switch changes the execution mode, thread, or process
◮ a context switch is quite expensive at the OS (kernel) level
◮ green threads (user-level threads) may mitigate the cost
◮ green threads have problems with preemption, saving the stack, and blocking IO
◮ blocking IO may be solved by:
  ◮ using a dedicated thread pool
  ◮ using asynchronous IO
Declarative concurrency - Ideas
◮ Java before version 8 lacked functional style
◮ declarative = pure functional (see Erlang and Clojure later)
◮ single-assignment variables, lock-step execution
◮ deterministic, no side effects, no race conditions
◮ laziness, dataflow programming
◮ interest in performance (utilizing cores)
◮ structured declarative concurrency
◮ parallel map/filter/reduce
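As a minimal sketch of parallel map/filter/reduce without side effects, the following uses a Java 8 `LongStream`. The class and method names are hypothetical. Because the lambdas are pure, the sequential and the parallel run must produce the same result.

```java
import java.util.stream.LongStream;

// Illustrative sketch; names are assumptions.
class ParallelSketch {

    /** Sums the odd squares of 1..n, either sequentially or in parallel. */
    static long sumOfOddSquares(long n, boolean parallel) {
        LongStream numbers = LongStream.rangeClosed(1, n);
        if (parallel) {
            numbers = numbers.parallel(); // work is split across the common fork-join pool
        }
        return numbers
                .map(x -> x * x)          // map: pure function, no shared state
                .filter(x -> x % 2 == 1)  // filter: keep odd squares
                .sum();                   // reduce: deterministic aggregation
    }
}
```

For n = 3 the squares are 1, 4, 9, so both variants return 1 + 9 = 10.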
Java 8 streams - Java 8 streams
Like usual streams:
◮ a sequence of values
Unlike usual streams:
◮ do not have state, are used only for data transformation
◮ support map/filter/reduce transformations
◮ lazy: do not execute until data is needed
Create a stream:
    Stream<E> Collection<E>.stream()
    Arrays.stream(Object[])
    Stream.of(Object[])
    static <T> Stream<T> generate(Supplier<T> s)
    static <T> Stream<T> iterate(T seed, UnaryOperator<T> f)
◮ the last 2 produce infinite streams
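The creation methods and the laziness claim can be checked with a short sketch. The class and method names are hypothetical. The infinite stream from `iterate` is cut with `limit`, and a counter shows that the `map` function is not called until a terminal operation runs.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Illustrative sketch; names are assumptions.
class StreamCreationSketch {

    /** First n powers of two, taken from an infinite stream. */
    static List<Integer> powersOfTwo(int n) {
        return Stream.iterate(1, x -> x * 2) // infinite: 1, 2, 4, ...
                .limit(n)                    // only the first n values are ever computed
                .collect(Collectors.toList());
    }

    /** Stream from an array. */
    static List<String> fromArray(String[] values) {
        return Arrays.stream(values).collect(Collectors.toList());
    }

    /** Number of map calls before and after the terminal operation. */
    static int[] mapCallsBeforeAndAfter() {
        AtomicInteger calls = new AtomicInteger();
        Stream<Integer> pipeline = Stream.of(1, 2, 3)
                .map(x -> { calls.incrementAndGet(); return x * 10; });
        int before = calls.get();                // nothing has run yet
        pipeline.collect(Collectors.toList());   // terminal operation triggers execution
        int after = calls.get();
        return new int[] { before, after };      // {0, 3}
    }
}
```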
Java 8 streams - Collecting stream
◮ streams are not executed until their results are needed
◮ terminal operation: one that produces the result
Some terminal operations:
    long count()
    Optional<T> max(Comparator<? super T> comparator)
    Optional<T> reduce(BinaryOperator<T> accumulator)
    void forEach(Consumer<? super T> action)
    Object[] toArray()
    <R,A> R collect(Collector<? super T,A,R> collector)
◮ the Collector interface is very general
◮ the Collectors class contains a lot of standard implementations
  ◮ toList(), toSet(), ...
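A few of these terminal operations applied to a list of words, as a small sketch; the class and method names and the sample data in the usage note are made up for illustration.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

// Illustrative sketch; names are assumptions.
class TerminalOpsSketch {

    /** count(): number of elements in the stream. */
    static long howMany(List<String> words) {
        return words.stream().count();
    }

    /** max(): the longest word, or an empty Optional for an empty list. */
    static Optional<String> longest(List<String> words) {
        return words.stream().max(Comparator.comparingInt(String::length));
    }

    /** reduce(): concatenation of all words, or an empty Optional for an empty list. */
    static Optional<String> concatenated(List<String> words) {
        return words.stream().reduce((a, b) -> a + b);
    }

    /** collect() with a standard collector from the Collectors class. */
    static String joined(List<String> words) {
        return words.stream().collect(Collectors.joining(","));
    }
}
```

For the list `lock, monitor, actor, io`, `howMany` gives 4, `longest` gives `monitor`, and `joined` gives `lock,monitor,actor,io`.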
Java 8 streams - Transforming stream
◮ map: transform each element and return a new stream
    <R> Stream<R> map(Function<? super T,? extends R> mapper)
◮ filter: select only some elements from the stream
    Stream<T> filter(Predicate<? super T> predicate)
◮ reduce: aggregate the stream into the final result
    Optional<T> reduce(BinaryOperator<T> accumulator)
    T reduce(T identity, BinaryOperator<T> accumulator)
◮ flatMap: like map, but combining the resulting streams
    <R> Stream<R> flatMap(Function<? super T,? extends Stream<? extends R>> mapper)
  ◮ analogue of thenCompose in CompletableFuture
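The four transformations can be combined as in the sketch below. The class and method names are hypothetical, and splitting lines on a single space is an assumption chosen to keep the `flatMap` example short.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative sketch; names are assumptions.
class TransformSketch {

    /** map + filter + reduce: sum of the even squares. */
    static int sumOfEvenSquares(List<Integer> xs) {
        return xs.stream()
                .map(x -> x * x)          // transform each element
                .filter(x -> x % 2 == 0)  // keep only even squares
                .reduce(0, Integer::sum); // aggregate, with 0 as the identity
    }

    /** flatMap: each line becomes a stream of words, and the streams are concatenated. */
    static List<String> words(List<String> lines) {
        return lines.stream()
                .flatMap(line -> Arrays.stream(line.split(" ")))
                .collect(Collectors.toList());
    }
}
```

With the input `1, 2, 3, 4` the squares are 1, 4, 9, 16, so `sumOfEvenSquares` returns 4 + 16 = 20. With the lines `a b` and `c`, `words` returns `[a, b, c]`, whereas `map` would have produced a stream of two arrays.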