Many languages, such as Go, C#, Erlang, and Lua, have technologies similar to virtual threads, which they usually call coroutines. Whether they are called virtual threads or coroutines, they are all lightweight threads that aim to improve concurrency. This section details the Java platform’s virtual threads technology, JEP 425: Virtual Threads (Preview).
The Java platform plans to introduce virtual threads, which will significantly reduce the effort of writing, maintaining, and observing high-throughput concurrent applications. JEP 425: Virtual Threads (Preview) is a preview API.
Objectives
- Enable server applications written in a simple thread-per-request style to scale with near-optimal hardware utilization.
- Enable existing code that uses the java.lang.Thread API to adopt virtual threads with minimal changes.
- Easily troubleshoot, debug, and analyze virtual threads using existing JDK tools.
Non-Goals
- It is not a goal to remove the traditional implementation of threads or to silently migrate existing applications to use virtual threads.
- It is not a goal to change Java’s basic concurrency model.
- It is not a goal to offer a new data parallelism construct in the Java language or the Java libraries. The Stream API remains the preferred way to process large data sets in parallel.
Motivation
For nearly 30 years, Java developers have relied on threads as the building blocks of concurrent server applications. Every statement in every method is executed within a thread, and because Java is multi-threaded, multiple threads of execution run at once. A thread is Java’s unit of concurrency: a piece of sequential code that runs concurrently with, and largely independently of, other such units. Each thread provides a stack to store local variables and coordinate method calls, as well as context when something goes wrong: exceptions are thrown and caught by methods in the same thread, so developers can use a thread’s stack trace to find out what happened. Threads are also a central concept for tools: debuggers step through the statements in a thread’s methods, and profilers visualize the behavior of multiple threads to help understand their performance.
The thread-per-request style
Server applications typically handle concurrent user requests independently of each other, so it makes sense for an application to process a request by dedicating a thread to that request for its entire duration. This thread-per-request style is easy to understand, easy to program, and easy to debug and profile because it uses the platform’s unit of concurrency to represent the application’s unit of concurrency.
The scalability of server applications follows Little’s Law, which relates latency, concurrency, and throughput: for a given request processing duration (i.e., latency), the number of requests processed simultaneously by the application (i.e., concurrency) must grow proportionally to the arrival rate (i.e., throughput). For example, assume that an application with an average latency of 50ms achieves a throughput of 200 requests per second by processing 10 requests simultaneously. In order for the application to scale to a throughput of 2000 requests per second, it needs to process 100 requests simultaneously. If each request is processed in a thread for the duration of the request, then for the application to keep up, the number of threads must grow as the throughput grows.
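The arithmetic in the example above can be checked with a few lines of Java. This is only an illustrative sketch: the class and method names are made up for this example, and the calculation is simply Little’s Law (concurrency = throughput × latency) in integer form.

```java
public class LittlesLaw {
    // Little's Law: concurrency = throughput x latency.
    // Throughput is in requests per second and latency in milliseconds,
    // so the product is divided by 1000 ms per second.
    static long requiredConcurrency(long requestsPerSecond, long latencyMillis) {
        return requestsPerSecond * latencyMillis / 1000;
    }

    public static void main(String[] args) {
        System.out.println(requiredConcurrency(200, 50));   // prints 10
        System.out.println(requiredConcurrency(2000, 50));  // prints 100
    }
}
```

In the thread-per-request style, the returned number is also the number of threads that must exist at the same time.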
Unfortunately, the number of available threads is limited because the JDK implements threads as wrappers around operating system (OS) threads. OS threads are costly, so we cannot have too many of them, which makes the implementation ill-suited to the thread-per-request style. If each request consumes a thread, and thus an OS thread, for its duration, then the number of threads often becomes the limiting factor long before other resources, such as CPU or network connections, are exhausted. The JDK’s current implementation of threads caps the application’s throughput at a level well below what the hardware can support. This happens even when threads are pooled, since pooling helps avoid the high cost of starting a new thread but does not increase the total number of threads.
Using asynchronous styles to improve scalability
Some developers who want to take full advantage of the hardware have abandoned the thread-per-request style in favor of a thread-sharing style. Instead of handling a request on one thread from start to finish, request-handling code returns its thread to a pool while it waits for an I/O operation to complete, so that the thread can service other requests. This fine-grained sharing of threads, in which code holds on to a thread only while it performs a computation and not while it waits for I/O, allows a large number of concurrent operations without consuming a large number of threads. While it removes the throughput limit imposed by the scarcity of OS threads, it comes at a high price: it requires what is known as an asynchronous programming style, which uses a separate set of I/O methods that do not wait for I/O operations to complete but instead signal their completion to a callback later. Without a dedicated thread, developers must break their request-handling logic into small stages, typically written as lambda expressions, and then compose them into a sequential pipeline with an API (see CompletableFuture, for example, or so-called “reactive” frameworks). They thus give up the language’s basic sequential composition operators, such as loops and try/catch blocks.
In the asynchronous style, each stage of a request may execute on a different thread, and every thread runs stages belonging to different requests in an interleaved fashion. This has profound implications for understanding program behavior: stack traces provide no usable context, debuggers cannot step through request-handling logic, and profilers cannot associate the cost of an operation with its caller. Composing lambda expressions is manageable when using the Java Stream API to process data in a short pipeline, but it is problematic when all of the request-handling code in an application must be written this way. This programming style is at odds with the Java platform because the application’s unit of concurrency, the asynchronous pipeline, is no longer the platform’s unit of concurrency.
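To make the contrast concrete, the following sketch writes the same two-step request handler in both styles. The fetchUser and fetchOrders methods are hypothetical stand-ins for remote calls and are not part of any real API; only CompletableFuture is a real JDK class.

```java
import java.util.concurrent.CompletableFuture;

public class AsyncStyle {
    // Hypothetical stand-ins for remote calls; a real application would do I/O here.
    static String fetchUser(int id) { return "user-" + id; }
    static String fetchOrders(String user) { return user + ":orders"; }

    // Asynchronous style: each stage is a lambda or method reference, composed
    // into a pipeline. Error handling is a pipeline stage, not a try/catch block.
    static CompletableFuture<String> handleAsync(int id) {
        return CompletableFuture
                .supplyAsync(() -> fetchUser(id))
                .thenApply(AsyncStyle::fetchOrders)
                .exceptionally(e -> "failed: " + e.getMessage());
    }

    // Thread-per-request style: ordinary sequential code with ordinary try/catch.
    static String handleBlocking(int id) {
        try {
            String user = fetchUser(id);
            return fetchOrders(user);
        } catch (RuntimeException e) {
            return "failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(handleAsync(42).join());   // prints user-42:orders
        System.out.println(handleBlocking(42));       // prints user-42:orders
    }
}
```

Both handlers produce the same result, but only the blocking version keeps the whole request on one thread, so its stack trace and debugger view describe the request from start to finish.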
Preserving the thread-per-request style with virtual threads
To enable applications to scale while remaining in harmony with the platform, we should strive to preserve the thread-per-request style by implementing threads more efficiently, so that they can be more plentiful. Operating systems cannot implement OS threads more efficiently because different languages and runtimes use the thread stack in different ways. However, the Java runtime can implement Java threads in a way that severs their one-to-one correspondence with OS threads. Just as the operating system gives the illusion of abundant memory by mapping a large virtual address space to a limited amount of physical RAM, the Java runtime can give the illusion of abundant threads by mapping a large number of virtual threads to a small number of OS threads.
A virtual thread is an instance of java.lang.Thread that is not tied to a particular OS thread. A platform thread, by contrast, is an instance of java.lang.Thread implemented in the traditional way, as a thin wrapper around an OS thread.
Application code in the thread-per-request style can run in a virtual thread for the entire duration of a request, but the virtual thread consumes an OS thread only while it performs computation on the CPU. The result is the same scalability as the asynchronous style, except that it is achieved transparently: when code running in a virtual thread calls a blocking I/O operation in the java.* API, the runtime performs a non-blocking OS call and automatically suspends the virtual thread until it can be resumed later. To Java developers, virtual threads are simply threads that are cheap to create and almost infinitely plentiful. Hardware utilization is close to optimal, allowing a high level of concurrency and, as a result, high throughput, while the application remains in harmony with the multi-threaded design of the Java platform and its tooling.
Implications of virtual threads
Virtual threads are both cheap and abundant and therefore should never be pooled: a new virtual thread should be created for each application task. As a result, most virtual threads are ephemeral and have shallow call stacks, executing for as little as a single HTTP client call or a single JDBC query. In contrast, platform threads are heavyweight and expensive, and therefore usually must be pooled. They tend to be long-lived, have deep call stacks, and are shared across many tasks.
In summary, virtual threads preserve the reliable thread-per-request style that is harmonious with the design of the Java platform while making optimal use of the hardware. Using virtual threads does not require learning new concepts, though it may require unlearning habits developed to cope with today’s high cost of threads. Virtual threads will help not only application developers but also framework designers, who can provide easy-to-use APIs that are compatible with the platform’s design without compromising scalability.
Description
Today, every instance of java.lang.Thread in the JDK is a platform thread. A platform thread runs Java code on an underlying OS thread and captures that OS thread for the code’s entire lifetime. The number of platform threads is limited to the number of OS threads.
A virtual thread is an instance of java.lang.Thread that runs Java code on the underlying operating system thread, but does not capture operating system threads for the entire life cycle of the code. This means that many virtual threads can run their Java code on the same OS thread, effectively sharing it. While platform threads have a monopoly on valuable OS threads, virtual threads do not. The number of virtual threads can be much larger than the number of operating system threads.
Virtual threads are a lightweight implementation of threads provided by the JDK rather than by the operating system. They are a form of user-mode threads, which have been successful in other multi-threaded languages (e.g., goroutines in Go and processes in Erlang). User-mode threads even featured as so-called “green threads” in early versions of Java, when OS threads were not yet mature and widespread. However, Java’s green threads all shared a single OS thread (M:1 scheduling), whereas platform threads are implemented as wrappers for OS threads (1:1 scheduling). Virtual threads use M:N scheduling, in which a large number (M) of virtual threads is scheduled to run on a smaller number (N) of OS threads.
Using virtual threads vs. platform threads
Developers can choose whether to use virtual threads or platform threads. Here is an example program that creates a large number of virtual threads. The program first gets an ExecutorService which will create a new virtual thread for each task submitted. It then submits 10,000 tasks and waits for all of them to complete.
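A minimal sketch of such a program follows. It assumes a JDK on which Executors.newVirtualThreadPerTaskExecutor() is available (a preview API in the release described here, final from JDK 21); the class name, the runTasks helper, and the completion counter are additions made for illustration.

```java
import java.time.Duration;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

public class ManyVirtualThreads {
    // Submits taskCount one-second sleeps, each in its own virtual thread,
    // and returns how many of them ran to completion.
    static int runTasks(int taskCount) {
        AtomicInteger completed = new AtomicInteger();
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            IntStream.range(0, taskCount).forEach(i ->
                executor.submit(() -> {
                    Thread.sleep(Duration.ofSeconds(1));
                    return completed.incrementAndGet();
                }));
        } // executor.close() is called implicitly and waits for all tasks
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println(runTasks(10_000));
    }
}
```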
The task in this example is simple code - sleep for a second - and modern hardware can easily support 10,000 virtual threads running such code simultaneously. Behind the scenes, the JDK runs the code on a small number of OS threads, perhaps just one.
If this program instead uses an ExecutorService that creates a new platform thread for each task, such as Executors.newCachedThreadPool(), the situation is very different. The ExecutorService will try to create 10,000 platform threads, and thus 10,000 OS threads, and the program may crash, depending on the machine and operating system.
The situation is not much better if the program uses an ExecutorService that obtains platform threads from a pool, such as Executors.newFixedThreadPool(200). The ExecutorService will create 200 platform threads to be shared by all 10,000 tasks, so many of the tasks will run sequentially rather than concurrently, and the program will take a long time to complete. For this program, a pool with 200 platform threads can only achieve a throughput of 200 tasks per second, whereas virtual threads achieve a throughput of about 10,000 tasks per second (after sufficient warm-up). Moreover, if the 10_000 in the sample program is changed to 1_000_000, the program will submit 1,000,000 tasks, create 1,000,000 virtual threads that run concurrently, and (after sufficient warm-up) achieve a throughput of about 1,000,000 tasks per second.
If the tasks in this program performed a second of computation (e.g., sorting a huge array) rather than merely sleeping, then increasing the number of threads beyond the number of processor cores would not help, regardless of whether they are virtual or platform threads. Virtual threads are not faster threads: they do not run code any faster than platform threads. They exist to provide scale (higher throughput), not speed (lower latency). There can be many more of them than platform threads, so by Little’s Law they enable the higher concurrency needed for higher throughput.
Put another way, virtual threads can significantly increase application throughput when
- the number of concurrent tasks is high (more than a few thousand), and
- workloads are not CPU-bound, because in this case, a much larger number of threads than processor cores cannot improve throughput.
Virtual threads help improve the throughput of typical server applications precisely because such applications consist of a large number of concurrent tasks that spend a lot of time waiting.
Virtual threads can run any code that platform threads can run. In particular, virtual threads support thread-local variables and thread interrupts, just like platform threads. This means that existing Java code that handles requests will easily run in a virtual thread. Many server frameworks will choose to do this automatically, starting a new virtual thread for each incoming request and running the application’s business logic in it.
The following is an example of a server application that aggregates the results of two other services. The hypothetical server framework (not shown) creates a new virtual thread for each request and runs the application’s handle code in that virtual thread. The application code in turn creates two new virtual threads that obtain resources concurrently through the same ExecutorService as in the first example.
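A sketch of such a handler is shown below, under stated assumptions: the Response interface and the fetch method are hypothetical stand-ins (a real framework would supply its own request and response types, and fetch would perform real blocking I/O such as reading a URL), and a JDK with virtual thread support is required.

```java
import java.time.Duration;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;

public class AggregatingHandler {
    // Hypothetical response type that a server framework would supply.
    interface Response {
        void send(String body);
        void fail(Exception e);
    }

    // Hypothetical stand-in for a blocking call to another service.
    static String fetch(String service) throws InterruptedException {
        Thread.sleep(Duration.ofMillis(100)); // simulates waiting for I/O
        return "[" + service + "]";
    }

    // Runs in the virtual thread that the framework created for this request.
    static void handle(String serviceA, String serviceB, Response response) {
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            var future1 = executor.submit(() -> fetch(serviceA));
            var future2 = executor.submit(() -> fetch(serviceB));
            // get() blocks only this virtual thread, not an OS thread
            response.send(future1.get() + future2.get());
        } catch (ExecutionException | InterruptedException e) {
            response.fail(e);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Response console = new Response() {
            public void send(String body) { System.out.println(body); }
            public void fail(Exception e) { System.err.println("failed: " + e); }
        };
        // Plays the role of the framework: one new virtual thread per request.
        Thread request = Thread.startVirtualThread(() -> handle("a", "b", console));
        request.join();
    }
}
```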
A server application like this, with straightforward blocking code, scales well because it can employ a large number of virtual threads. Executors.newVirtualThreadPerTaskExecutor() is not the only way to create virtual threads. The new java.lang.Thread.Builder API, discussed below, can create and start virtual threads. In addition, structured concurrency offers a more powerful API for creating and managing virtual threads, particularly in code like this server example, whereby the relationships among threads are made known to the platform and its tools.
Virtual threads are a preview API, disabled by default
The programs above use the Executors.newVirtualThreadPerTaskExecutor() method, so to run them on JDK XX, the preview API must be enabled as follows:
- Compile the program with javac --release XX --enable-preview Main.java and run it with java --enable-preview Main; or
- When using the source launcher, run the program with java --source XX --enable-preview Main.java; or
- When using jshell, start it with jshell --enable-preview.
Do not pool virtual threads
Developers will typically migrate application code from a traditional ExecutorService based on thread pools to a virtual-thread-per-task ExecutorService. Thread pools, like all resource pools, are intended to share expensive resources, but virtual threads are not expensive, so there is never a need to pool them.
Developers sometimes use thread pools to limit concurrent access to a limited resource. For example, if a service cannot handle more than 20 concurrent requests, then performing all access to that service through tasks submitted to a pool of size 20 will ensure this. Because the high cost of platform threads has made thread pools ubiquitous, this idiom has become ubiquitous as well, but developers should not be tempted to pool virtual threads in order to limit concurrency. A construct designed specifically for that purpose, such as a semaphore, should be used to guard access to a limited resource. This is more efficient and convenient than a thread pool, and it is also safer, since there is no risk of thread-local data accidentally leaking from one task to another.
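As a sketch of the semaphore approach, the following program lets any number of virtual threads call a service while never allowing more than 20 calls in flight. The callService method and the peak-tracking counters are hypothetical and exist only to make the limit observable; java.util.concurrent.Semaphore is the real JDK class.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

public class LimitedService {
    // At most 20 callers may be inside callService() at any moment.
    private static final Semaphore PERMITS = new Semaphore(20);
    private static final AtomicInteger inFlight = new AtomicInteger();
    private static final AtomicInteger maxInFlight = new AtomicInteger();

    // Hypothetical call to a service that tolerates at most 20 concurrent requests.
    static void callService() throws InterruptedException {
        PERMITS.acquire(); // a virtual thread waiting here does not hold an OS thread
        try {
            int now = inFlight.incrementAndGet();
            maxInFlight.accumulateAndGet(now, Math::max); // record the observed peak
            Thread.sleep(10); // simulates the remote call
        } finally {
            inFlight.decrementAndGet();
            PERMITS.release();
        }
    }

    // Runs the given number of tasks, one virtual thread each, and returns the
    // highest number of calls that were ever in flight at once.
    static int runAndReportPeak(int tasks) {
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> { callService(); return null; });
            }
        } // close() waits for all tasks
        return maxInFlight.get();
    }

    public static void main(String[] args) {
        System.out.println("peak concurrency: " + runAndReportPeak(1_000));
    }
}
```

The threads themselves are not limited or reused; only access to the scarce resource is.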
Observing virtual threads
Writing clear code is not the whole story. A clear presentation of the state of a running program is also essential for troubleshooting, maintenance, and optimization, and the JDK has long offered mechanisms to debug, profile, and monitor threads. Such tools should do the same for virtual threads, perhaps with some accommodation for their large quantity, since they are, after all, instances of java.lang.Thread.
Java debuggers can step through virtual threads, show call stacks, and inspect variables in stack frames. JDK Flight Recorder (JFR), the JDK’s low-overhead profiling and monitoring mechanism, can associate events from application code (such as object allocation and I/O operations) with the correct virtual thread. These tools cannot do these things for applications written in the asynchronous style. In that style, tasks are not related to threads, so debuggers cannot display or manipulate the state of a task, and profilers cannot tell how much time a task spends waiting for I/O.
Thread dumps are another popular tool for troubleshooting applications written in the thread-per-request style. Unfortunately, the JDK’s traditional thread dump, obtained with jstack or jcmd, presents a flat list of threads. This is suitable for dozens or hundreds of platform threads, but not for thousands or millions of virtual threads. Accordingly, rather than extending traditional thread dumps to include virtual threads, we will introduce a new kind of thread dump in jcmd that presents virtual threads alongside platform threads, all grouped in a meaningful way. Richer relationships among threads can be shown when programs use structured concurrency.
Because visualizing and analyzing a large number of threads can benefit from tooling, jcmd can emit the new thread dump in JSON format in addition to plain text:
jcmd <pid> Thread.dump_to_file -format=json <file>
The new thread dump format lists virtual threads that are blocked in network I/O operations, as well as virtual threads created by the new-thread-per-task ExecutorService shown above. It does not include object addresses, locks, JNI statistics, heap statistics, and other information that appears in traditional thread dumps. Moreover, because it might need to list a great many threads, generating a new thread dump does not pause the application. The following is an example of such a thread dump, taken from an application similar to the second example above and rendered in a JSON viewer (see below).
Since virtual threads are implemented in the JDK and are not tied to any particular OS thread, they are invisible to, and unknown by, the operating system. OS-level monitoring will observe that a JDK process uses fewer OS threads than there are virtual threads.
Scheduling virtual threads
To do useful work, a thread needs to be scheduled, that is, assigned for execution on a processor core. For platform threads, which are implemented as OS threads, the JDK relies on the scheduler in the operating system. For virtual threads, by contrast, the JDK has its own scheduler. Rather than assigning virtual threads to processors directly, the JDK’s scheduler assigns virtual threads to platform threads (this is the M:N scheduling of virtual threads mentioned earlier). The operating system then schedules the platform threads as usual.
The JDK’s virtual thread scheduler is a work-stealing ForkJoinPool that operates in FIFO mode. The parallelism of the scheduler is the number of platform threads available for scheduling virtual threads. By default it equals the number of available processors, but it can be tuned with the system property jdk.virtualThreadScheduler.parallelism. Note that this ForkJoinPool is distinct from the common pool, which is used, for example, in the implementation of parallel streams and which operates in LIFO mode.
The platform thread to which the scheduler assigns a virtual thread is called the virtual thread’s carrier. A virtual thread can be scheduled on different carriers over the course of its lifetime; in other words, the scheduler does not maintain affinity between a virtual thread and any particular platform thread. From the perspective of Java code, however, a running virtual thread is logically independent of its current carrier:
- The identity of the carrier is unavailable to the virtual thread. The value returned by Thread.currentThread() is always the virtual thread itself.
- The stack traces of the carrier and the virtual thread are separate. An exception thrown in the virtual thread will not include the carrier’s stack frames. Thread dumps will not show the carrier’s stack frames in the virtual thread’s stack, and vice versa.
- Thread-local variables of carriers are not available to the virtual thread, and vice versa.
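The first point in the list above can be observed directly. In this small sketch (the class and method names are illustrative), code running in a named virtual thread asks for the current thread and always sees the virtual thread, never the platform thread that happens to carry it.

```java
public class CarrierIdentity {
    // Runs a task in a virtual thread named "worker" and reports what
    // Thread.currentThread() returned inside that task.
    static String observedIdentity() throws InterruptedException {
        String[] observed = new String[1];
        Thread vt = Thread.ofVirtual().name("worker").start(() -> {
            Thread self = Thread.currentThread();
            observed[0] = self.getName() + " virtual=" + self.isVirtual();
        });
        vt.join(); // join() makes the write to observed[0] visible here
        return observed[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(observedIdentity()); // prints worker virtual=true
    }
}
```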
Furthermore, from the perspective of Java code, the fact that the virtual thread and its carrier temporarily share the operating system thread is invisible. In contrast, from the perspective of native code, both the virtual thread and its carrier run on the same native thread. Therefore, native code that is called multiple times on the same virtual thread may observe a different OS thread identifier on each call.
The scheduler does not currently implement time sharing for virtual threads. Time sharing is the forceful preemption of a thread that has consumed an allotted quantity of CPU time. While time sharing can be effective with a few hundred platform threads, it is unclear whether it would be as effective with a million virtual threads.
Executing virtual threads
To take advantage of virtual threads, there is no need to rewrite your program. Virtual threads do not require or expect application code to explicitly hand control back to the scheduler; in other words, virtual threads are not cooperative. User code must not make assumptions about how or when virtual threads are assigned to platform threads, any more than it makes assumptions about how or when platform threads are assigned to processor cores.
To run code in a virtual thread, the JDK’s virtual thread scheduler assigns the virtual thread for execution on a platform thread by mounting the virtual thread on it. This makes the platform thread the carrier of the virtual thread. Later, after running some code, the virtual thread can unmount from its carrier. At that point the platform thread is free, so the scheduler can mount a different virtual thread on it, thereby making it a carrier again.
Typically, a virtual thread unmounts when it blocks on I/O or on some other blocking operation in the JDK, such as BlockingQueue.take(). When the blocking operation is ready to complete (e.g., bytes have been received on a socket), it submits the virtual thread back to the scheduler, which will mount the virtual thread on a carrier to resume execution.
The mounting and unmounting of virtual threads happens frequently and transparently, and without blocking any OS threads. For example, the server application shown earlier includes the following line of code, which contains calls to blocking operations:
response.send(future1.get() + future2.get());
These operations will cause the virtual thread to mount and unmount multiple times, typically once for each call to get() and possibly multiple times while performing I/O in send(…).
The vast majority of blocking operations in the JDK will unmount the virtual thread, freeing its carrier and the underlying OS thread to take on new work. However, some blocking operations in the JDK do not unmount the virtual thread, and thus block both its carrier and the underlying OS thread. This is because of limitations at either the OS level (e.g., many file system operations) or the JDK level (e.g., Object.wait()). The implementations of these blocking operations compensate for the capture of the OS thread by temporarily expanding the parallelism of the scheduler. Consequently, the number of platform threads in the scheduler’s ForkJoinPool may temporarily exceed the number of available processors. The maximum number of platform threads available to the scheduler can be tuned with the system property jdk.virtualThreadScheduler.maxPoolSize.
There are two scenarios in which a virtual thread cannot be unmounted during blocking operations because it is pinned to its carrier:
- when it executes code inside a synchronized block or method, or
- when it executes a native method or a foreign function.
Pinning does not make an application incorrect, but it might hinder its scalability. If a virtual thread performs a blocking operation such as I/O or BlockingQueue.take() while it is pinned, then its carrier and the underlying OS thread are blocked for the duration of the operation. Frequent pinning for long durations can harm the scalability of an application by capturing carriers.
The scheduler does not compensate for pinning by expanding its parallelism. Instead, avoid frequent and long-lived pinning by revising synchronized blocks or methods that run frequently and guard potentially long I/O operations so that they use java.util.concurrent.locks.ReentrantLock instead. There is no need to replace synchronized blocks and methods that are used infrequently (e.g., only at startup) or that guard in-memory operations. As always, strive to keep locking policies simple and clear.
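The kind of revision described above might look like the following sketch. The GuardedResource class and its read method are hypothetical, and the short sleep stands in for a potentially long I/O operation; the point is only that the guard is a ReentrantLock rather than the synchronized keyword.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

public class GuardedResource {
    private final ReentrantLock lock = new ReentrantLock();
    private int reads;

    // Previously this might have been declared "synchronized", which pins a
    // virtual thread to its carrier while it blocks inside the method.
    // With ReentrantLock, a blocked virtual thread can unmount instead.
    int read() throws InterruptedException {
        lock.lock();
        try {
            Thread.sleep(5); // hypothetical stand-in for a long I/O operation
            return ++reads;
        } finally {
            lock.unlock();
        }
    }

    // Starts threadCount virtual threads that each call read() once, waits
    // for them, and returns the total number of reads performed.
    static int readFromVirtualThreads(int threadCount) throws InterruptedException {
        GuardedResource resource = new GuardedResource();
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < threadCount; i++) {
            threads.add(Thread.startVirtualThread(() -> {
                try {
                    resource.read();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }));
        }
        for (Thread t : threads) {
            t.join();
        }
        return resource.reads;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(readFromVirtualThreads(50)); // prints 50
    }
}
```

The locking discipline is unchanged: lock() is followed immediately by try, and unlock() sits in finally, so the lock is released on every path just as a synchronized block would release its monitor.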
New diagnostics assist in migrating code to virtual threads and in assessing whether a particular use of synchronized should be replaced with a java.util.concurrent lock:
- A JDK Flight Recorder (JFR) event is emitted when a thread blocks while pinned.
- The system property jdk.tracePinnedThreads triggers a stack trace when a thread blocks while pinned. Running with -Djdk.tracePinnedThreads=full prints a complete stack trace, with the native frames and the frames holding monitors highlighted. Running with -Djdk.tracePinnedThreads=short limits the output to just the problematic frames.
We may be able to remove the first restriction above in a future release. The second restriction is necessary for proper interaction with native code.
Memory usage and interaction with garbage collection
The stacks of virtual threads are stored in Java’s garbage-collected heap as stack chunk objects. The stacks grow and shrink as the application runs, both to be memory-efficient and to accommodate stacks of arbitrary depth (up to the JVM’s configured platform thread stack size). This efficiency is what enables a large number of virtual threads, and thus the continued viability of the thread-per-request style in server applications.
In the second example above, recall that the hypothetical framework processes each request by creating a new virtual thread and calling the handle method; even if it calls handle at the end of a deep call stack (after authentication, transactions, etc.), handle itself spawns multiple virtual threads that only perform short-lived tasks. Thus, for each virtual thread with a deep call stack, there will be multiple virtual threads with shallow call stacks that consume little memory.
In general, the amount of heap space and garbage collector activity that virtual threads require is difficult to compare with that of asynchronous code. Application code that handles requests typically must maintain data across I/O operations; thread-per-request code can keep that data in local variables, which are stored on virtual thread stacks in the heap. Asynchronous code, on the other hand, must keep the same data in heap objects that are passed from one stage of the pipeline to the next. On the one hand, the stack frame layout needed by virtual threads is more wasteful than that of a compact object; on the other hand, virtual threads can mutate and reuse their stacks in many situations (depending on low-level GC interactions), whereas asynchronous pipelines always need to allocate new objects. Overall, the heap consumption and garbage collector activity of thread-per-request code and asynchronous code should be roughly similar. Over time, we expect to make the internal representation of virtual thread stacks significantly more compact.
Unlike platform thread stacks, virtual thread stacks are not GC roots, so the references they contain are not traversed in a stop-the-world pause by garbage collectors that scan the heap concurrently. This also means that if a virtual thread is blocked on, say, BlockingQueue.take(), and no other thread can obtain a reference to either the virtual thread or the queue, then the thread can be garbage collected, which is fine since the virtual thread can never be interrupted or unblocked. Of course, the virtual thread will not be garbage collected if it is running, or if it is blocked and could ever be unblocked.
A current limitation of virtual threads is that the G1 GC does not support humongous stack chunk objects. A StackOverflowError may be thrown if a virtual thread’s stack reaches half the region size, which can be as small as 512KB.
Detailed changes
The remaining subsections detail the changes proposed in the Java platform and its implementation.
- java.lang.Thread
- Thread-local variables
- java.util.concurrent
- Networking
- java.io
- Java Native Interface (JNI)
- Debugging (JVM TI, JDWP, and JDI)
- JDK Flight Recorder (JFR)
- Java Management Extensions (JMX)
- java.lang.ThreadGroup
java.lang.Thread
Update the java.lang.Thread API as follows.
- Thread.Builder, Thread.ofVirtual(), and Thread.ofPlatform() are new APIs for creating virtual and platform threads. For example,
Thread thread = Thread.ofVirtual().name("duke").unstarted(runnable);
creates a new unstarted virtual thread named “duke”.
- Thread.startVirtualThread(Runnable) is a convenient way to create and then start a virtual thread.
- Thread.Builder can create a single thread, or a ThreadFactory, which can then create multiple threads with the same properties.
- Thread.isVirtual() tests if a thread is virtual.
- Thread.join and Thread.sleep accept wait and sleep times as instances of java.time.Duration.
- The new final method Thread.threadId() returns the thread’s identifier. The existing non-final method Thread.getId() is now deprecated.
- Thread.getAllStackTraces() now returns a map of all platform threads rather than all threads.
The java.lang.Thread API is otherwise unchanged. The constructors defined by the Thread class create platform threads, as before. There are no new public constructors.
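The API additions listed above can be exercised together in a short program. This is a sketch assuming a JDK with virtual thread support; the class name, the helper methods, and the thread names other than “duke” are chosen for illustration.

```java
import java.time.Duration;
import java.util.concurrent.ThreadFactory;

public class ThreadApiTour {
    // Thread.Builder: create an unstarted virtual thread named "duke".
    static Thread newDuke(Runnable task) {
        return Thread.ofVirtual().name("duke").unstarted(task);
    }

    // Thread.Builder can also produce a ThreadFactory; threads it creates
    // share the builder's properties and are named worker-0, worker-1, ...
    static ThreadFactory workerFactory() {
        return Thread.ofVirtual().name("worker-", 0).factory();
    }

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () ->
            System.out.println("running in '" + Thread.currentThread().getName() + "'");

        Thread duke = newDuke(task);
        System.out.println("virtual? " + duke.isVirtual());
        duke.start();
        // join(Duration) returns true if the thread terminated within the timeout
        boolean finished = duke.join(Duration.ofSeconds(5));
        System.out.println("finished? " + finished);

        // Convenience method: create and start in one call (the thread is unnamed).
        Thread quick = Thread.startVirtualThread(task);
        quick.join();

        Thread worker = workerFactory().newThread(task);
        worker.start();
        worker.join();

        // The same builder style works for platform threads.
        Thread classic = Thread.ofPlatform().name("classic").start(task);
        classic.join();
        System.out.println("virtual? " + classic.isVirtual() + ", id=" + classic.threadId());
    }
}
```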
The main API differences between virtual and platform threads are:
- The public Thread constructors cannot create virtual threads.
- Virtual threads are always daemon threads. The Thread.setDaemon(boolean) method cannot change a virtual thread to be a non-daemon thread.
- Virtual threads have a fixed priority of Thread.NORM_PRIORITY. The Thread.setPriority(int) method has no effect on virtual threads. This limitation may be revisited in a future release.
- Virtual threads are not active members of thread groups. When invoked on a virtual thread, Thread.getThreadGroup() returns a placeholder thread group with the name “VirtualThreads”.
- Virtual threads have no permissions when running with a SecurityManager set.
- The stop(), suspend(), and resume() methods are not supported for virtual threads. These methods throw an exception when invoked on a virtual thread.
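Some of these differences are easy to observe on an unstarted virtual thread. The sketch below uses illustrative class and method names; it assumes that setDaemon(false) signals its refusal with an IllegalArgumentException, which is the behavior documented for the finalized API in JDK 21, so treat that detail as an assumption on other releases.

```java
public class VirtualThreadTraits {
    // Summarizes the daemon status and priority reported by a thread.
    static String describe(Thread t) {
        return "daemon=" + t.isDaemon() + ", priority=" + t.getPriority();
    }

    // Returns true if the thread refuses to become a non-daemon thread.
    // Assumption: the refusal is an IllegalArgumentException (as in JDK 21).
    static boolean rejectsNonDaemon(Thread t) {
        try {
            t.setDaemon(false);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Thread vt = Thread.ofVirtual().unstarted(() -> {});
        vt.setPriority(Thread.MAX_PRIORITY); // has no effect on a virtual thread
        System.out.println(describe(vt));
        System.out.println("rejects non-daemon: " + rejectsNonDaemon(vt));
        System.out.println("group: " + vt.getThreadGroup().getName());
    }
}
```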
Thread Local Variables
Virtual threads support thread-local variables (ThreadLocal) and inheritable thread-local variables (InheritableThreadLocal), just like platform threads, so they can run existing code that uses thread locals. However, because virtual threads can be very numerous, use thread locals only after careful consideration. In particular, do not use thread locals to pool costly resources among multiple tasks that share the same thread in a thread pool. Virtual threads should never be pooled, since each is intended to run only a single task over its lifetime. We have removed many uses of thread locals from the java.base module in preparation for virtual threads, to reduce memory usage when running with millions of threads.
In addition:
- The Thread.Builder API defines a method to opt out of thread locals when creating a thread. It also defines a method to opt out of inheriting the initial values of inheritable thread locals. When invoked from a thread that does not support thread locals, ThreadLocal.get() returns the initial value and ThreadLocal.set(T) throws an exception.
- The legacy context class loader is now specified to work like an inheritable thread local. If Thread.setContextClassLoader(ClassLoader) is invoked on a thread that does not support thread locals, an exception is thrown.
Scope-local variables may prove to be a better alternative to thread locals for some use cases.
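As a rough illustration of opting out of inheritance, the sketch below uses Thread.Builder.inheritInheritableThreadLocals(boolean), assuming a JDK in which virtual threads are final (JDK 21 or later). The class name, the USER variable, and the value "alice" are hypothetical. The separate opt-out of thread locals altogether described above belongs to the preview API and is deliberately not used here.

```java
class ThreadLocalDemo {

    // Hypothetical inheritable thread local used only for this example.
    private static final InheritableThreadLocal<String> USER = new InheritableThreadLocal<>();

    /** Runs a task in a new virtual thread and returns the value of USER seen by that thread. */
    static String valueSeenByChild(boolean inherit) {
        USER.set("alice"); // value set in the parent (calling) thread
        String[] seen = new String[1];
        try {
            Thread child = Thread.ofVirtual()
                    .inheritInheritableThreadLocals(inherit) // false: child starts from the initial value
                    .unstarted(() -> seen[0] = USER.get());
            child.start();
            child.join(); // join makes the child's write to seen[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            USER.remove();
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println("inherit=true  -> " + valueSeenByChild(true));
        System.out.println("inherit=false -> " + valueSeenByChild(false));
    }
}
```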
java.util.concurrent
LockSupport now supports virtual threads: parking a virtual thread releases the underlying carrier thread to do other work, and unparking a virtual thread schedules it to continue. This change to LockSupport enables all APIs that use it (locks, semaphores, blocking queues, etc.) to park gracefully when invoked in virtual threads.
In addition:
- Executors.newThreadPerTaskExecutor(ThreadFactory) and Executors.newVirtualThreadPerTaskExecutor() create an ExecutorService that creates a new thread for each task. These methods support migration and interoperability with existing code that uses thread pools and ExecutorService.
- ExecutorService now extends AutoCloseable, so this API can be used with the try-with-resources construct shown in the example above.
- Future now defines methods to obtain the result or exception of a completed task and to obtain the state of the task. Combined, these additions make it easy to use Future objects as elements of a stream: filter a stream of futures to find the completed tasks, then map it to obtain a stream of results. These methods will also be useful with the API additions proposed for structured concurrency.
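The filter-then-map pattern described in the last bullet might look roughly like this. It is a sketch under the assumption of JDK 21 or later; the class name and the three sample tasks are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class FutureStreamDemo {

    /** Submits three tasks (one fails) and returns the results of the successful ones. */
    static List<Integer> successfulResults() {
        List<Callable<Integer>> tasks = List.of(
                () -> 1,
                () -> { throw new IllegalStateException("boom"); },
                () -> 3);

        List<Future<Integer>> futures = new ArrayList<>();
        // One new virtual thread per task; close() waits for submitted tasks to finish.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (Callable<Integer> task : tasks) {
                futures.add(executor.submit(task));
            }
        }

        // All tasks are complete here, so state() and resultNow() never block.
        return futures.stream()
                .filter(f -> f.state() == Future.State.SUCCESS)
                .map(Future::resultNow)
                .toList();
    }

    public static void main(String[] args) {
        System.out.println(successfulResults());
    }
}
```

Because the executor is closed before the stream is built, every future is already in a terminal state, which is what makes resultNow() safe to call without handling checked exceptions.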
Networking
The implementations of the networking APIs in the java.net and java.nio.channels packages now work with virtual threads: an operation on a virtual thread that blocks, for example to establish a network connection or to read from a socket, releases the underlying carrier thread to do other work.
To allow for interruption and cancellation, the blocking I/O methods defined by java.net.Socket, ServerSocket, and DatagramSocket are now specified to be interruptible when invoked in a virtual thread: interrupting a virtual thread blocked on a socket will unpark the thread and close the socket. Blocking I/O operations on these types of sockets have always been interruptible when the socket is obtained from an InterruptibleChannel, so this change aligns the behavior of these APIs when created with their constructors with their behavior when obtained from a channel.
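The interruption behavior can be tried out with a server socket that never receives a connection. This is only a sketch, assuming JDK 21 or later and an environment that allows binding a loopback port; the class name and the outcome strings are hypothetical, and the exact exception message is not relied upon.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.SocketException;
import java.time.Duration;
import java.util.concurrent.atomic.AtomicReference;

class InterruptibleAcceptDemo {

    /** Blocks a virtual thread in accept(), interrupts it, and reports what the thread observed. */
    static String interruptBlockedAccept() {
        // Port 0 asks the system for any free port on the loopback interface.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            AtomicReference<String> outcome = new AtomicReference<>("not run");
            Thread acceptor = Thread.ofVirtual().start(() -> {
                try {
                    server.accept(); // blocks: no client ever connects
                    outcome.set("accepted");
                } catch (SocketException e) {
                    outcome.set("SocketException, closed=" + server.isClosed());
                } catch (IOException e) {
                    outcome.set("other IOException");
                }
            });
            Thread.sleep(Duration.ofMillis(200)); // give the virtual thread time to block
            acceptor.interrupt();
            acceptor.join();
            return outcome.get();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(interruptBlockedAccept());
    }
}
```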
java.io
The java.io package provides APIs for byte streams and character streams. The implementations of these APIs are heavily synchronized and require changes to avoid pinning when they are used in virtual threads.
As background, the byte-oriented input/output streams are not specified to be thread-safe, and they do not specify the expected behavior when close() is invoked while a thread is blocked in a read or write method. In most cases, it does not make sense to use a particular input or output stream from multiple concurrent threads. The character-oriented readers/writers are also not specified to be thread-safe, but they do expose a lock object for subclasses. Aside from pinning, the synchronization in these classes is problematic and inconsistent; for example, the stream decoders and encoders used by InputStreamReader and OutputStreamWriter synchronize on the stream object rather than on the lock object.
To prevent pinning, the implementations now work as follows:
- BufferedInputStream, BufferedOutputStream, BufferedReader, BufferedWriter, PrintStream, and PrintWriter now use an explicit lock rather than a monitor when used directly. These classes synchronize as before when they are subclassed.
- The stream decoders and encoders used by InputStreamReader and OutputStreamWriter now use the same lock as the enclosing InputStreamReader or OutputStreamWriter.
Going further and eliminating all of this often unnecessary locking is beyond the scope of this JEP.
In addition, the initial sizes of the buffers used by BufferedOutputStream, BufferedWriter, and the stream encoder for OutputStreamWriter are now smaller, to reduce memory usage when there are many streams or writers in the heap. This situation could arise if there are a million virtual threads, each with a buffered stream on a socket connection.
Java Native Interface (JNI)
JNI defines a new function, IsVirtualThread, to test whether an object is a virtual thread.
The JNI specification remains otherwise unchanged.
Debugging
The debugging architecture consists of three interfaces: the JVM Tool Interface (JVM TI), the Java Debug Wire Protocol (JDWP), and the Java Debug Interface (JDI). All three interfaces now support virtual threads.
Updates to the JVM TI are:
- Most functions that are called with a jthread (i.e., a JNI reference to a Thread object) can be called with a reference to a virtual thread. A small number of functions, namely PopFrame, ForceEarlyReturn, StopThread, AgentStartFunction, and GetThreadCpuTime, are not supported on virtual threads. The SetLocal* functions are limited to setting local variables in the topmost frames of virtual threads that are suspended at a breakpoint or single-step event.
- The GetAllThreads and GetAllStackTraces functions are now specified to return all platform threads rather than all threads.
- Event callbacks can be called in the context of a virtual thread for all events except those posted during early VM startup or during heap iteration.
- The suspend/resume implementation allows debuggers to suspend and resume virtual threads, and allows carrier threads to be suspended while a virtual thread is mounted.
- A new capability, can_support_virtual_threads, gives agents finer control over the thread start and end events of virtual threads.
- New functions support bulk suspension and resumption of virtual threads; these functions require the can_support_virtual_threads capability.
Existing JVM TI agents will mostly work as before, but they may encounter errors if they call functions that are not supported on virtual threads. These errors arise when agents that are unaware of virtual threads are used with applications that use virtual threads. The change to GetAllThreads, which now returns an array containing only platform threads, may be a problem for some agents. Existing agents that enable the ThreadStart and ThreadEnd events may encounter performance issues because they cannot restrict these events to platform threads.
Updates to JDWP are:
- A new command allows debuggers to test whether a thread is a virtual thread.
- A new modifier on the EventRequest command allows debuggers to restrict thread start and end events to platform threads.
Updates to JDI are:
- A new method in com.sun.jdi.ThreadReference tests whether a thread is a virtual thread.
- New methods in com.sun.jdi.request.ThreadStartRequest and com.sun.jdi.request.ThreadDeathRequest restrict the events generated for the request to platform threads.
As mentioned above, virtual threads are not considered to be active threads in a thread group. Therefore, the thread lists returned by the JVM TI function GetThreadGroupChildren, the JDWP command ThreadGroupReference/Children, and the JDI method com.sun.jdi.ThreadGroupReference.threads() include only platform threads.
JDK Flight Recorder (JFR)
JFR supports virtual threads with several new events:
- jdk.VirtualThreadStart and jdk.VirtualThreadEnd indicate the start and end of a virtual thread. By default, these events are disabled.
- jdk.VirtualThreadPinned indicates that a virtual thread was parked while pinned, i.e., without releasing its carrier thread (see Restrictions). This event is enabled by default, with a threshold of 20ms.
- jdk.VirtualThreadSubmitFailed indicates that starting or unparking a virtual thread failed, probably due to a resource issue. This event is enabled by default.
Java Management Extensions (JMX)
A new method in com.sun.management.HotSpotDiagnosticMXBean generates the new-style thread dump described above. The method can also be invoked indirectly via the platform MBeanServer from a local or remote JMX tool.
java.lang.management.ThreadMXBean supports monitoring and management of platform threads only.
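A thread dump in the new style can be requested programmatically. The sketch below assumes JDK 21 or later, where the method is dumpThreads(String, ThreadDumpFormat), and assumes that the jdk.management module is present. The class name and file name are hypothetical. The output path has to be absolute and must not already exist, hence the fresh temporary directory.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.HotSpotDiagnosticMXBean.ThreadDumpFormat;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

class ThreadDumpDemo {

    /** Writes a JSON thread dump into a new temporary directory and returns the file path. */
    static Path dumpThreadsAsJson() {
        try {
            Path dir = Files.createTempDirectory("thread-dump");
            Path file = dir.resolve("threads.json").toAbsolutePath();
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            bean.dumpThreads(file.toString(), ThreadDumpFormat.JSON);
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path file = dumpThreadsAsJson();
        System.out.println("thread dump written to " + file);
    }
}
```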
java.lang.ThreadGroup
java.lang.ThreadGroup is a legacy API for grouping threads that is rarely used in modern applications and is unsuitable for grouping virtual threads. We deprecate and degrade it now, and expect to introduce a new thread-organizing construct in the future as part of structured concurrency.
As background, the ThreadGroup API dates back to Java 1.0. It was originally intended to provide job-control operations such as stopping all threads in a group. Modern code is more likely to use the thread pool APIs of the java.util.concurrent package, introduced in Java 5. ThreadGroup supported the isolation of applets in early Java releases, but the Java security architecture evolved significantly in Java 1.2 and thread groups no longer played a significant role. ThreadGroup was also intended to be useful for diagnostic purposes, but that role was superseded by the monitoring and management features introduced in Java 5, including the java.lang.management API. Besides being largely irrelevant now, the ThreadGroup API and its implementation have a number of significant problems:
- The API and mechanism for destroying thread groups is flawed.
- The API requires the implementation to reference all active threads in the group. This adds synchronization and contention overhead to thread creation, thread startup, and thread termination.
- The API defines enumerate() methods, which are inherently racy.
- The API defines the suspend(), resume(), and stop() methods, which are inherently deadlock-prone and unsafe.
We now specify, deprecate, or degrade the ThreadGroup API as follows:
- The ability to explicitly destroy a thread group is removed: the terminally deprecated destroy() method does nothing.
- The notion of a daemon thread group is removed: the daemon status set and retrieved by the terminally deprecated setDaemon(boolean) and isDaemon() methods is ignored.
- The implementation no longer keeps strong references to subgroups. A thread group is now eligible for garbage collection when there are no live threads in the group and nothing else is keeping the thread group alive.
- The terminally deprecated suspend(), resume(), and stop() methods always throw an exception.