|
|
Created:
8 years, 6 months ago by Ryan Sleevi Modified:
7 years, 3 months ago CC:
chromium-reviews, cbentzel+watch_chromium.org, darin-cc_chromium.org Base URL:
svn://svn.chromium.org/chrome/trunk/src Visibility:
Public. |
DescriptionMove the core state machine of SSLClientSocketNSS into a thread-safe Core
NSS SSL functions may block on the underlying PKCS#11 modules or on
user input. On ChromeOS, which has a hardware TPM, calls may take upwards
of several seconds, preventing any IPC due to the I/O thread being
blocked.
To avoid blocking the I/O thread on ChromeOS, move the core SSL
implementation to a dedicated worker thread, so that only SSL sockets
are blocked.
BUG=122355
TEST=existing net_unittests + see bug.
Committed: http://src.chromium.org/viewvc/chrome?view=rev&revision=140697
Patch Set 1 #
Total comments: 63
Patch Set 2 : Fix thread-safety #
Total comments: 5
Patch Set 3 : Review feedback #
Total comments: 28
Patch Set 4 : Review feedback 2 #Patch Set 5 : Rebase to ToT #Patch Set 6 : Speculative remoting fixes #Patch Set 7 : Speculative remoting fixes #Patch Set 8 : Actually quit the loop #
Total comments: 5
Patch Set 9 : Cleanup #Patch Set 10 : Fix win_rel by not caching the current threads task runner. See added comment #
Messages
Total messages: 27 (0 generated)
wtc: This is a rather large review, but should be largely functionally identical. Unfortunately, because of the amount of code movement (moving the ::Core:: methods to match their declaration order and all before any SSLClientSocketNSS functions), the patch may be a bit hard to review. I wanted to begin the high level review process though, and will see about working on a "wrongly ordered but easier to compare methods" version as well. Suggested Review Order: - Core declaration (lines 542-833) - Core::DoPayloadWrite, Core::DoPayloadRead - Core::BufferRecv, Core::BufferSend (NSS thread) - Core::DoBufferRecv, Core::DoBufferSend (network thread) - Core::DomainBoundClientAuthHandler - Core::DoGetDomainBoundCert - Core::DoGetDomainBoundCertComplete - Core::DoHandshake/DoHandshakeLoop Some high level points to call out: - If nss_task_runner and network_task_runner both execute on the same thread, all functions are meant to complete synchronously (unless the underlying IO itself blocks). This should mirror the exact functionality that exists today. - When working with IO/network-thread-owned data, you'll typically see three functions: - BufferRecv (called on the NSS thread - does all the NSS work before an IO operation) - DoBufferRecv (called on the network thread - does the IO operation) - BufferRecvComplete (called on both threads - if called on network thread, switches itself over to the NSS thread - does all the NSS work after an IO operation) The same pattern applies for Sending and for getting domain bound certs. - PostOrRunCallback() is just a helper function that ensures certain base::Closures are always run on the network thread, and only executed if Detach() hasn't been called. It avoids the need of creating multiple helper functions for each operation that needs to execute on the IO thread (typically: net logging) I've tried to annotate within each function via DCHECK()s what the pre-conditions are, to make it easier to understand. Finally, review the change to SSL_ExportKeyingMaterial. Since the code currently using it (Chromoting & SPDY CREDENTIAL frames) are not trivially adapted to use Callbacks, because they're edge conditions unlikely to be hit, and because I'd like to merge this once we're comfortable to M20, I've instead corrected the locking on ExportKeyingMaterial. This should only ever block during a renegotiation handshake, which should be acceptable.
http://codereview.chromium.org/10454066/diff/1/net/socket/client_socket_facto... File net/socket/client_socket_factory.cc (right): http://codereview.chromium.org/10454066/diff/1/net/socket/client_socket_facto... net/socket/client_socket_factory.cc:42: #endif Design context: I debated very heavily on where this belongs. One option was to pass it through the SSLClientSocketContext, so that it goes all the way up from the HttpNetworkParams to the IO thread (chrome/browser/io_thread.h/.cc). The benefit of sticking it there is that we can reliably shut down the child thread when the IO thread goes out of scope - currently, it leaks (by way of the DefaultClientSocketFactory being leaky). However, that seems a wrong layering, since we really want to ensure all tests (as well as things like test_shell, content_shell, anything that depends on net/) has the same /tested/ and reliable behaviour. Another thought was to expose a global function to set the NSS thread. If it wasn't set, simply default to MessageLoop::current(). This is similar to what the OCSP/CRL/AIA code uses (net::SetMessgeLoopForNSSHttpIO). This as well is something only called from the io_thread, so it means that some of the multi-threaded behaviours are never tested. It also seems the wrong sort of encapsulation, because it leaks the NSS-specific knowledge from being within the DefaultClientSocketFactory all the way up to the Chrome layer. I'm not convinced that it's best to put it here - in fact, I have serious reservations. Think of this part as "in flux"
http://codereview.chromium.org/10454066/diff/1/net/socket/client_socket_facto... File net/socket/client_socket_factory.cc (right): http://codereview.chromium.org/10454066/diff/1/net/socket/client_socket_facto... net/socket/client_socket_factory.cc:102: nss_task_runner = nss_thread_->message_loop_proxy(); Further design context: Note that I'm not checking |ssl_config.send_client_cert| or |ssl_config.client_cert| here. Originally, we discussed only running in multi-threaded mode when client certs are involved. However, as implemented, both domain bound certs and /locating/ client certs may cause blocking to occur, due to the module locks during C_Login, and the session lock during C_SignFinal. This also applies to creating the peer cert, where it may try to scan the module list again. It's these latter points that had me move all I/O off, rather than just client-auth handshakes.
Patch set 1 LGTM. Thanks for the excellent work. High-level comments: 1. In the CL's description, you said: NSS SSL functions may block on the underlying PKCS#11 modules or on user input. It is confusing to mention the blocking on user input because that kind of blocking cannot be addressed by using a dedicated worker thread. (User input may block indefinitely.) 2. I highly recommend also running the multi-threaded mode on some other platforms, such as Linux. This will exercise the code more and make it easier to debug. 3. I suggest copying to the source file some of your comments that explain the code for me. http://codereview.chromium.org/10454066/diff/1/net/socket/client_socket_facto... File net/socket/client_socket_factory.cc (right): http://codereview.chromium.org/10454066/diff/1/net/socket/client_socket_facto... net/socket/client_socket_factory.cc:38: #if defined(OS_CHROMEOS) It may be a good idea to do this on more platforms (at least on Linux) as a way to exercise this code more. http://codereview.chromium.org/10454066/diff/1/net/socket/client_socket_facto... net/socket/client_socket_factory.cc:42: #endif I agree with the design decision of creating the NSS SSL thread here. http://codereview.chromium.org/10454066/diff/1/net/socket/client_socket_facto... net/socket/client_socket_factory.cc:101: if (g_use_dedicated_nss_thread && nss_thread_->message_loop_proxy()) If g_use_dedicated_nss_thread is true, nss_thread_->message_loop_proxy() cannot be NULL, right? http://codereview.chromium.org/10454066/diff/1/net/socket/client_socket_facto... net/socket/client_socket_factory.cc:102: nss_task_runner = nss_thread_->message_loop_proxy(); I agree with the design decision of moving all NSS functions to the NSS SSL thread, rather than just client-auth handshakes. http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... File net/socket/ssl_client_socket_nss.cc (right): http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... net/socket/ssl_client_socket_nss.cc:392: // enough to log the event from the network thread. I don't see anything in this helper function that acquires a reference to |buffer|, so I don't understand how this helper function keeps |buffer| alive. The reference is acquired by the callers of this helper function, right? http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... net/socket/ssl_client_socket_nss.cc:412: // If |nss_fd| is NULL, frees the current certificate chain. Nit: this comment seems to imply that the current certificate chain is freed only if nss_fd_ is NULL. http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... net/socket/ssl_client_socket_nss.cc:432: *this = other; Nit: indent by just two. http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... net/socket/ssl_client_socket_nss.cc:465: return; This should be a CHECK or DCHECK. NSS doesn't support anonymous servers. http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... net/socket/ssl_client_socket_nss.cc:478: for (unsigned i = 0; i < size(); i++) { Nit: use either size() or certs_.size() consistently on these two lines. http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... net/socket/ssl_client_socket_nss.cc:487: // ConnectionState is a helper struct used to pass handshake state between If this structure holds the handshake state, perhaps it should be named HandshakeState? http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... net/socket/ssl_client_socket_nss.cc:492: // written at any time on the network task runner. Copying avoids the need for Should the "network task runner" in this line be "NSS task runner"? http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... net/socket/ssl_client_socket_nss.cc:493: // lock contention, Did you mean "lock contention" or "lock protection"? "the need for lock contention" sounds strange. Perhaps you meant "Copying avoids lock contention"? http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... net/socket/ssl_client_socket_nss.cc:527: // net::X509Certificate object (|server_cert|). It's possible for some Nit: you can omit net:: in this file. http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... net/socket/ssl_client_socket_nss.cc:535: bool predicted_cert_chain_correct; Please keep the original comment, after updating it to reflect the new code: // True iff |ssl_host_info_| contained a predicted certificate chain and // that we found the prediction to be correct. bool predicted_cert_chain_correct_; http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... net/socket/ssl_client_socket_nss.cc:549: // continue executing tasks. This block comment is important to the understanding of this file. It should be moved to a more prominent place, such as the beginning of this file or even to the .h file. It's also important to point out that network_task_runner and nss_task_runner may be the same. This is important to understand why some functions first assert OnNSSTaskRunner() but then still test OnNetworkTaskRunner(). Some of your comment below can be copied to the source file: Some high level points to call out: - If nss_task_runner and network_task_runner both execute on the same thread, all functions are meant to complete synchronously (unless the underlying IO itself blocks). This should mirror the exact functionality that exists today. - When working with IO/network-thread-owned data, you'll typically see three functions: - BufferRecv (called on the NSS thread - does all the NSS work before an IO operation) - DoBufferRecv (called on the network thread - does the IO operation) - BufferRecvComplete (called on both threads - if called on network thread, switches itself over to the NSS thread - does all the NSS work after an IO operation) The same pattern applies for Sending and for getting domain bound certs. http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... net/socket/ssl_client_socket_nss.cc:573: SECStatus Init(PRFileDesc* socket, memio_Private* buffers); Why do you use SECStatus here? This is not an NSS function, so it seems better for it to return int or bool. Also, the comment should clarify whether this function returns an NSS error code or a network error code on failure. Nit: rename the |socket| parameter to |nss_fd|. http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... net/socket/ssl_client_socket_nss.cc:608: int Write(IOBuffer* buf, int buf_len, const CompletionCallback& callback); Nit: it may be good to summarize somewhere that all the public methods of Core are called on the network task runner. http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... net/socket/ssl_client_socket_nss.cc:633: // the handshake had completed. Nit: it's also caused by not having a working SSL_RestartHandshakeAfterCertReq function. (We only made that function work when we added the OBC code.) http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... net/socket/ssl_client_socket_nss.cc:743: // Methods that are accessed on both the network task runner and the NSS Nit: accessed => called. http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... net/socket/ssl_client_socket_nss.cc:749: void PostOrRunCallback(const base::Closure& callback); Document that the callback runs on the network task runner. It can be based on your comment: - PostOrRunCallback() is just a helper function that ensures certain base::Closures are always run on the network thread, and only executed if Detach() hasn't been called. It avoids the need of creating multiple helper functions for each operation that needs to execute on the IO thread (typically: net logging) http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... net/socket/ssl_client_socket_nss.cc:752: // Variables that are ONLY accessed on the network task runner: Nit: Variables => Members? Also on lines 771 and 814. http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... net/socket/ssl_client_socket_nss.cc:762: base::WeakPtrFactory<BoundNetLog> weak_factory_; weak_factory_ is a confusing name. I thought this was a weak ptr factory for Core. Please rename this member weak_net_log_factory_. http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... net/socket/ssl_client_socket_nss.cc:776: // NSS state machine Nit: it is a little confusing to call nss_fd_ the NSS state machine, because it doesn't expose a state mashine interface. I think it's better to call it "NSS SSL socket". http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... net/socket/ssl_client_socket_nss.cc:863: weak_net_log_(weak_factory_.GetWeakPtr()), Is this cheaper than calling weak_factory_.GetWeakPtr() every time we need to add a NetLog event? http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... net/socket/ssl_client_socket_nss.cc:1962: make_scoped_refptr(new SSLErrorParams(rv, 0)))); Nit: align these parameters with the opening parenthesis of base::Bind. http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... net/socket/ssl_client_socket_nss.cc:2186: void SSLClientSocketNSS::Core::DoWriteCallback(int rv) { Why does DoWriteCallback not use PostOrRunCallback? http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... net/socket/ssl_client_socket_nss.cc:2402: return ERR_UNEXPECTED; Nit: ERR_ABORTED may be a better error code. Same for line 2424. http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... net/socket/ssl_client_socket_nss.cc:2407: scoped_refptr<IOBuffer>(read_buffer))); Why do you need to use scoped_refptr here? It seems that read_buffer can be made a member of Core. http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... net/socket/ssl_client_socket_nss.cc:2447: return ERR_FAILED; ERR_ABORTED? http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... net/socket/ssl_client_socket_nss.cc:2449: weak_net_log_->BeginEvent(NetLog::TYPE_SSL_GET_DOMAIN_BOUND_CERT, NULL); Should we check weak_net_log_ != NULL? http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... net/socket/ssl_client_socket_nss.cc:2550: FROM_HERE, task); Nit: move this to the previous line. http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... File net/socket/ssl_client_socket_nss.h (right): http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... net/socket/ssl_client_socket_nss.h:58: base::SingleThreadTaskRunner* nss_task_runner, Please document the two new input arguments. http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... net/socket/ssl_client_socket_nss.h:105: class Core; It would be nice to document what goes into Core and what stays in this class. This will help in the future when someone needs to add a new member. It may not be clear whether the new member should be added to Core or this class. http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... net/socket/ssl_client_socket_nss.h:149: scoped_refptr<base::SingleThreadTaskRunner> nss_task_runner_; Document these members? http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... net/socket/ssl_client_socket_nss.h:176: bool completed_handshake_; Keep the original comment for this member? // True if the SSL handshake has been completed. bool completed_handshake_; http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_u... File net/socket/ssl_client_socket_unittest.cc (right): http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_u... net/socket/ssl_client_socket_unittest.cc:656: rv = callback.WaitForResult(); Nit: there is a convenience function for doing this in one line: rv = callback.GetResult(rv); http://codereview.chromium.org/10454066/diff/1/net/third_party/nss/ssl/sslinfo.c File net/third_party/nss/ssl/sslinfo.c (right): http://codereview.chromium.org/10454066/diff/1/net/third_party/nss/ssl/sslinf... net/third_party/nss/ssl/sslinfo.c:379: ssl_GetRecvBufLock(ss); The changes to this file should be moved to a separate CL. Are you sure this function needs to get RecvBufLock? It doesn't read data from the transport. http://codereview.chromium.org/10454066/diff/1/net/third_party/nss/ssl/sslinf... net/third_party/nss/ssl/sslinfo.c:406: ssl_ReleaseSSL3HandshakeLock(ss); BUG: this function releases SSL3HandshakeLock twice, here and on line 428.
http://codereview.chromium.org/10454066/diff/1/net/socket/client_socket_facto... File net/socket/client_socket_factory.cc (right): http://codereview.chromium.org/10454066/diff/1/net/socket/client_socket_facto... net/socket/client_socket_factory.cc:38: #if defined(OS_CHROMEOS) On 2012/05/30 22:54:29, wtc wrote: > > It may be a good idea to do this on more platforms (at least > on Linux) as a way to exercise this code more. I agree, but as a possible merge candidate, I wanted to keep the scope of the change somewhat limited, even though that means less testing via the channels. http://codereview.chromium.org/10454066/diff/1/net/socket/client_socket_facto... net/socket/client_socket_factory.cc:101: if (g_use_dedicated_nss_thread && nss_thread_->message_loop_proxy()) On 2012/05/30 22:54:29, wtc wrote: > > If g_use_dedicated_nss_thread is true, nss_thread_->message_loop_proxy() > cannot be NULL, right? If the thread fails to start, it'll be NULL. With this check, rather than crash, we'll block the IO thread, which at least offers reliable functionality, at the cost of user experience. http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... File net/socket/ssl_client_socket_nss.cc (right): http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... net/socket/ssl_client_socket_nss.cc:392: // enough to log the event from the network thread. On 2012/05/30 22:54:29, wtc wrote: > > I don't see anything in this helper function that acquires a > reference to |buffer|, so I don't understand how this helper > function keeps |buffer| alive. The reference is acquired by > the callers of this helper function, right? Yes. I could change |buffer| to be "const scoped_refptr<IOBuffer>& buffer", which makes it part of the signature, but base::Bind already requires that because |buffer| is a ref-counted type, callers MUST pass in a scoped_refptr<>, which is kept alive by the base::Closure until this runs. I'll see if I can word this better. http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... net/socket/ssl_client_socket_nss.cc:412: // If |nss_fd| is NULL, frees the current certificate chain. On 2012/05/30 22:54:29, wtc wrote: > > Nit: this comment seems to imply that the current certificate > chain is freed only if nss_fd_ is NULL. Will update. http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... net/socket/ssl_client_socket_nss.cc:465: return; On 2012/05/30 22:54:29, wtc wrote: > > This should be a CHECK or DCHECK. NSS doesn't support > anonymous servers. Really this was meant to handle when rv == SECFailure, which seems to be a condition handled within SSL_PeerCertificateChain. http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... net/socket/ssl_client_socket_nss.cc:492: // written at any time on the network task runner. Copying avoids the need for On 2012/05/30 22:54:29, wtc wrote: > > Should the "network task runner" in this line be "NSS task runner"? Yes, thanks, fixed. http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... net/socket/ssl_client_socket_nss.cc:493: // lock contention, On 2012/05/30 22:54:29, wtc wrote: > > Did you mean "lock contention" or "lock protection"? > "the need for lock contention" sounds strange. Perhaps > you meant "Copying avoids lock contention"? You're absolutely correct. It was meant to say "lock protection", but I like your wording better. http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... net/socket/ssl_client_socket_nss.cc:863: weak_net_log_(weak_factory_.GetWeakPtr()), On 2012/05/30 22:54:29, wtc wrote: > > Is this cheaper than calling weak_factory_.GetWeakPtr() > every time we need to add a NetLog event? It's more subtle than that, so I'll add appropriate comments. A WeakPtr can only be dereferenced on the thread it was created, but it can be copied/assigned on any thread (which is important for base::Callback's currying of arguments). This ensures the WeakPtr is created on the network task runner, so that it can be de-referenced on the network task runner. However, we can copy the member var around (or bind it via base::Bind) on the NSS task runner, for tasks destined for the network task runner. http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... net/socket/ssl_client_socket_nss.cc:2186: void SSLClientSocketNSS::Core::DoWriteCallback(int rv) { On 2012/05/30 22:54:29, wtc wrote: > > Why does DoWriteCallback not use PostOrRunCallback? BUG. http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... net/socket/ssl_client_socket_nss.cc:2407: scoped_refptr<IOBuffer>(read_buffer))); On 2012/05/30 22:54:29, wtc wrote: > > Why do you need to use scoped_refptr here? It seems that > read_buffer can be made a member of Core. This was to avoid any races on read_buffer_ between the read request coming in on the network task runner, being dispatched to the NSS task runner, going back to the network task runner (to do the transport IO), the callback being executed on the network task runner, which then reposts to the NSS task runner, which then processes the buffer. By currying the buffer within the Callback, we never have to worry about when or where to reset the state. http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... net/socket/ssl_client_socket_nss.cc:2449: weak_net_log_->BeginEvent(NetLog::TYPE_SSL_GET_DOMAIN_BOUND_CERT, NULL); On 2012/05/30 22:54:29, wtc wrote: > > Should we check weak_net_log_ != NULL? Do. By virtue of !detached_, we're guaranteed to have weak_net_log_. http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_u... File net/socket/ssl_client_socket_unittest.cc (right): http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_u... net/socket/ssl_client_socket_unittest.cc:656: rv = callback.WaitForResult(); On 2012/05/30 22:54:29, wtc wrote: > > Nit: there is a convenience function for doing this in one > line: > rv = callback.GetResult(rv); Right, I was following the pattern that existed within this file already (eg: 646/647) http://codereview.chromium.org/10454066/diff/1/net/third_party/nss/ssl/sslinfo.c File net/third_party/nss/ssl/sslinfo.c (right): http://codereview.chromium.org/10454066/diff/1/net/third_party/nss/ssl/sslinf... net/third_party/nss/ssl/sslinfo.c:379: ssl_GetRecvBufLock(ss); On 2012/05/30 22:54:29, wtc wrote: > > The changes to this file should be moved to a separate CL. No, this is important to avoiding the need to modify SSLClientSocketNSS::ExportKeyingMaterial. > > Are you sure this function needs to get RecvBufLock? > It doesn't read data from the transport. RecvBufLock is the only guard for ss->version, which is meant to be one of the 'atomic longs' but that is dependent on platform and thus not a safe assumption. SSL3HandshakeLock guards ss->ssl3 http://codereview.chromium.org/10454066/diff/1/net/third_party/nss/ssl/sslinf... net/third_party/nss/ssl/sslinfo.c:406: ssl_ReleaseSSL3HandshakeLock(ss); On 2012/05/30 22:54:29, wtc wrote: > > BUG: this function releases SSL3HandshakeLock twice, here > and on line 428. Well spotted.
http://codereview.chromium.org/10454066/diff/1/net/socket/client_socket_facto... File net/socket/client_socket_factory.cc (right): http://codereview.chromium.org/10454066/diff/1/net/socket/client_socket_facto... net/socket/client_socket_factory.cc:38: #if defined(OS_CHROMEOS) If you plan to merge this CL to the Chrome 20 branch, then testing is especially important. We can leave this part as is in this CL so that you can simply use drover to merge. On the trunk, after you check in this CL, please create another CL to enable this mode for Linux at least. (I would enable it for all platforms.) http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... File net/socket/ssl_client_socket_nss.cc (right): http://codereview.chromium.org/10454066/diff/1/net/socket/ssl_client_socket_n... net/socket/ssl_client_socket_nss.cc:465: return; I see. SSL_PeerCertificateChain has this code: if (ss->sec.peerCert == NULL) { *numCerts = 0; return SECSuccess; } I guess that can happen if SSL_PeerCertificateChain is called before the handshake is completed. http://codereview.chromium.org/10454066/diff/1/net/third_party/nss/ssl/sslinfo.c File net/third_party/nss/ssl/sslinfo.c (right): http://codereview.chromium.org/10454066/diff/1/net/third_party/nss/ssl/sslinf... net/third_party/nss/ssl/sslinfo.c:379: ssl_GetRecvBufLock(ss); On 2012/05/30 23:20:10, Ryan Sleevi wrote: > > RecvBufLock is the only guard for ss->version, which is meant to be one of the > 'atomic longs' but that is dependent on platform and thus not a safe assumption. I studied the functions in sslsecur.c as examples. They showed that ss->version seems to be protected by ssl_Get1stHandshakeLock rather than ssl_GetRecvBufLock. So I think we should use ssl_Get1stHandshakeLock instead. We need to add an NSS patch file for these changes.
http://codereview.chromium.org/10454066/diff/1/net/socket/client_socket_facto... File net/socket/client_socket_factory.cc (right): http://codereview.chromium.org/10454066/diff/1/net/socket/client_socket_facto... net/socket/client_socket_factory.cc:38: #if defined(OS_CHROMEOS) On 2012/05/31 01:23:42, wtc wrote: > > If you plan to merge this CL to the Chrome 20 branch, then > testing is especially important. > > We can leave this part as is in this CL so that you can > simply use drover to merge. On the trunk, after you > check in this CL, please create another CL to enable > this mode for Linux at least. (I would enable it for > all platforms.) Agreed, that's the goal :) http://codereview.chromium.org/10454066/diff/1/net/third_party/nss/ssl/sslinfo.c File net/third_party/nss/ssl/sslinfo.c (right): http://codereview.chromium.org/10454066/diff/1/net/third_party/nss/ssl/sslinf... net/third_party/nss/ssl/sslinfo.c:379: ssl_GetRecvBufLock(ss); On 2012/05/31 01:23:42, wtc wrote: > On 2012/05/30 23:20:10, Ryan Sleevi wrote: > > > > RecvBufLock is the only guard for ss->version, which is meant to be one of the > > 'atomic longs' but that is dependent on platform and thus not a safe > assumption. > > I studied the functions in sslsecur.c as examples. They > showed that ss->version seems to be protected by > ssl_Get1stHandshakeLock rather than ssl_GetRecvBufLock. > So I think we should use ssl_Get1stHandshakeLock instead. 1stHandshakeLock is itself guarded by RecvBufLock. I'll dig through the notes again, but there were writes to ss->version /before/ 1stHandshakeLock (in particular, renegotiation, IIRC), hence the broad lock. > > We need to add an NSS patch file for these changes. Agreed. Wanted to make the changes stick first.
wtc: Some BUG fixes now that I've banged around with it for a bit, with the TPM actually being sluggish for us. Just as an FYI, annotated where the BUGs were. I also changed PostOrRunCallback to take a tracked_object::Location, to make it easier to debug. Since these capture the instruction pointer (in release mode) or the file and line (in debug), if I did miss anything (and I don't think I did), it should be a lot easier to debug what the call sequence was. I'll address your further comments tomorrow. http://codereview.chromium.org/10454066/diff/3003/net/socket/ssl_client_socke... File net/socket/ssl_client_socket_nss.cc (right): http://codereview.chromium.org/10454066/diff/3003/net/socket/ssl_client_socke... net/socket/ssl_client_socket_nss.cc:1054: PostOrRunCallback(FROM_HERE, base::Bind(callback, rv)); BUG: Here, |callback| was getting invoked regardless if Detach() had been called (which, in practice, Detach() was getting called before this post, if a Write failed while a read was pending) http://codereview.chromium.org/10454066/diff/3003/net/socket/ssl_client_socke... net/socket/ssl_client_socket_nss.cc:1094: PostOrRunCallback(FROM_HERE, base::Bind(callback, rv)); BUG: Same as above - if a read failed, Detach() might get called before this callback ran. http://codereview.chromium.org/10454066/diff/3003/net/socket/ssl_client_socke... net/socket/ssl_client_socket_nss.cc:1364: core->nss_connection_state_)); BUG: |network_connection_state_.client_certs| wasn't getting updated with the detected client certs. Previously, it was only being set in HandshakeCallback(), which is only called after receiving the Finished message (or false starting) http://codereview.chromium.org/10454066/diff/3003/net/socket/ssl_client_socke... net/socket/ssl_client_socket_nss.cc:1712: core->nss_connection_state_)); BUG: If the handshake failed after NPN, we weren't recording it. I never saw this happen, but it seems like it could (even if the actual NPN exchange happens later) http://codereview.chromium.org/10454066/diff/3003/net/socket/ssl_client_socke... net/socket/ssl_client_socket_nss.cc:2617: location, task)); Turns out this was defeating the whole purpose of PostOrRunCallback :-) Whenever this was called from the NSS task runner, it was directly posting the callback, never checking detached_ - meaning callbacks would get executed regularly after their parents went away. The proper call chain is to re-enter ourselves. I didn't notice this in the testing, because the use of weak_net_log_ hid most of the egregious cases. It wasn't until SPDY (which really pushes the full duplex case) that it became a situation where a read callback would lead to teardown (and Detach()), while a write callback may still be posted and waiting in the message loop. Same reasons as the above changes to the direct calls to network_task_runner_->PostTask.
will: I added you as a reviewer since I saw you recently posted on chromium-dev talking about different strategies for objects that live and work on multiple threads. This approach seems in line with what you posed as option (3) [a RefCounted core], with a few hints of (2) [the weak pointers]. While not asking you to do a detailed review, I just was hoping you might take a high level glance, especially given your familiarity with base::Bind/base::Closure-et-all, and check if there's anything you think is anti-idiomatic with what your preferred approach is.
On 2012/05/31 06:30:21, Ryan Sleevi wrote: > will: I added you as a reviewer since I saw you recently posted on chromium-dev > talking about different strategies for objects that live and work on multiple > threads. > > This approach seems in line with what you posed as option (3) [a RefCounted > core], with a few hints of (2) [the weak pointers]. > > While not asking you to do a detailed review, I just was hoping you might take a > high level glance, especially given your familiarity with > base::Bind/base::Closure-et-all, and check if there's anything you think is > anti-idiomatic with what your preferred approach is. Oh, and definitely read comments 1 - 3 that I posted to Wan-Teh ( http://codereview.chromium.org/10454066/#msg1 / http://codereview.chromium.org/10454066/#msg2 / http://codereview.chromium.org/10454066/#msg3 ), they explain a bit more. I'll be putting in ASCII art threading diagrams and inline documentation in a future patchset, now that the overall approach (mostly) seems good, but if you just look at the diffs now, it may be a bit much without those three messages.
wtc: PTAL at Patchset 3. You originally reviewed Patchset 1, so it might be easier just to look at the diffs between the two.
Patch set 3 LGTM. http://codereview.chromium.org/10454066/diff/1/net/third_party/nss/ssl/sslinfo.c File net/third_party/nss/ssl/sslinfo.c (right): http://codereview.chromium.org/10454066/diff/1/net/third_party/nss/ssl/sslinf... net/third_party/nss/ssl/sslinfo.c:379: ssl_GetRecvBufLock(ss); On 2012/05/31 01:31:14, Ryan Sleevi wrote: > > 1stHandshakeLock is itself guarded by RecvBufLock. I'll dig through the notes > again, but there were writes to ss->version /before/ 1stHandshakeLock (in > particular, renegotiation, IIRC), hence the broad lock. 1stHandshakeLock is broader than RecvBufLock. I found places where NSS reads or writes ss->version while holding only 1stHandshakeLock. The place where NSS writes ss->version while holding only 1stHandshakeLock is in ssl3_SendClientHello, when it copies the version from the SID structure. Unless we are modifying ss->version based on info just received from the peer, we don't need to hold RecvBufLock. http://codereview.chromium.org/10454066/diff/7004/net/socket/client_socket_fa... File net/socket/client_socket_factory.cc (right): http://codereview.chromium.org/10454066/diff/7004/net/socket/client_socket_fa... net/socket/client_socket_factory.cc:51: nss_thread_->Start(); This creates a thread with MessageLoop::TYPE_DEFAULT, right? Just wanted to confirm it. http://codereview.chromium.org/10454066/diff/7004/net/socket/ssl_client_socke... File net/socket/ssl_client_socket_nss.cc (right): http://codereview.chromium.org/10454066/diff/7004/net/socket/ssl_client_socke... net/socket/ssl_client_socket_nss.cc:444: PeerCertificateChain& PeerCertificateChain::operator=( Because OnHandshakeStateUpdated calls this operator=, the NSS functions CERT_DestroyCertificate and CERT_DupCertificate will still be called on the network task runner. Hopefully that's OK. http://codereview.chromium.org/10454066/diff/7004/net/socket/ssl_client_socke... net/socket/ssl_client_socket_nss.cc:469: // The handshake on |nss_fd| may not have completed. I believe we never call PeerCertificateChain::Reset() with a non-null nss_fd before the handshake is fiished, so this probably can be a DCHECK. http://codereview.chromium.org/10454066/diff/7004/net/socket/ssl_client_socke... net/socket/ssl_client_socket_nss.cc:496: // It contains variables that may be read or written on the NSS task runner, Nit: variables => members http://codereview.chromium.org/10454066/diff/7004/net/socket/ssl_client_socke... net/socket/ssl_client_socket_nss.cc:544: // True if the peer predicted a certificate chain (via "the peer predicted a certificate chain" is wrong. It is us rather than the peer that predicted a certificate chain. I think this should say "True if we predicted a peer certificate chain (via ..". http://codereview.chromium.org/10454066/diff/7004/net/socket/ssl_client_socke... net/socket/ssl_client_socket_nss.cc:552: // The negotiated security parameters (TLS version, cipher, extensions) Nit: add "of the SSL connection" at the end. http://codereview.chromium.org/10454066/diff/7004/net/socket/ssl_client_socke... net/socket/ssl_client_socket_nss.cc:557: Nit: delete this blank line. http://codereview.chromium.org/10454066/diff/7004/net/socket/ssl_client_socke... net/socket/ssl_client_socket_nss.cc:666: base::SingleThreadTaskRunner* nss_task_runner, Why do we use different types of task runners? network_task_runner is always base::ThreadTaskRunnerHandle::Get(), which is a SingleThreadTaskRunner. It's not clear why you throw away that bit of info. I think our code requires that the network_task_runner be single-threaded. http://codereview.chromium.org/10454066/diff/7004/net/socket/ssl_client_socke... net/socket/ssl_client_socket_nss.cc:872: // The current connection state. Mirrors |nss_handshake_state_|. Nit: connection state => handshake state Please fix all occurrences in the file. http://codereview.chromium.org/10454066/diff/7004/net/socket/ssl_client_socke... net/socket/ssl_client_socket_nss.cc:1161: PostOrRunCallback(FROM_HERE, base::Bind(callback, rv)); Nit: it is slightly clearer to do: if (!OnNetworkTaskRunner()) { PostOrRunCallback(FROM_HERE, base::Bind(callback, rv)); return ERR_IO_PENDING; } I know this doesn't matter because the return value of this function is ignored in this case, but this is a subtle point and the current code seems as if it reported |rv| twice. This also applies to lines 1200-1201 for Core::Write. http://codereview.chromium.org/10454066/diff/7004/net/socket/ssl_client_socke... net/socket/ssl_client_socket_nss.cc:1212: return network_task_runner_->RunsTasksOnCurrentThread(); Is this test cheap? We do these tests a lot. http://codereview.chromium.org/10454066/diff/7004/net/socket/ssl_client_socke... net/socket/ssl_client_socket_nss.cc:2727: DCHECK(OnNetworkTaskRunner()); Remove this DCHECK. It will never fail because of the check on line 2719. http://codereview.chromium.org/10454066/diff/7004/net/socket/ssl_client_socke... net/socket/ssl_client_socket_nss.cc:2729: if (detached_ || task.is_null()) It seems that |task| should not be null, right? Do we ever call PostOrRunCallback with a null |task|? Perhaps we can DCHECK(!task.is_null()) at the beginning of this function?
http://codereview.chromium.org/10454066/diff/7004/net/socket/client_socket_fa... File net/socket/client_socket_factory.cc (right): http://codereview.chromium.org/10454066/diff/7004/net/socket/client_socket_fa... net/socket/client_socket_factory.cc:51: nss_thread_->Start(); On 2012/06/01 01:02:38, wtc wrote: > > This creates a thread with MessageLoop::TYPE_DEFAULT, right? > Just wanted to confirm it. Yes. This is why I need to test on the other platforms (will Windows require a MessageLoopForUI, because of the possible prompting, and will OS X require a MessageLoopForUI so that there is a CFRunLoop to receive Security.framework notifications/UI prompts) http://codereview.chromium.org/10454066/diff/7004/net/socket/ssl_client_socke... File net/socket/ssl_client_socket_nss.cc (right): http://codereview.chromium.org/10454066/diff/7004/net/socket/ssl_client_socke... net/socket/ssl_client_socket_nss.cc:444: PeerCertificateChain& PeerCertificateChain::operator=( On 2012/06/01 01:02:38, wtc wrote: > > Because OnHandshakeStateUpdated calls this operator=, > the NSS functions CERT_DestroyCertificate and > CERT_DupCertificate will still be called on the network > task runner. Hopefully that's OK. Yes. They are thread-safe atomic ops, with no locking/blocking. http://codereview.chromium.org/10454066/diff/7004/net/socket/ssl_client_socke... net/socket/ssl_client_socket_nss.cc:469: // The handshake on |nss_fd| may not have completed. On 2012/06/01 01:02:38, wtc wrote: > > I believe we never call PeerCertificateChain::Reset() > with a non-null nss_fd before the handshake is fiished, > so this probably can be a DCHECK. This would either be a CHECK() (because vector_as_array() is undefined behaviour when num_certs == 0, and this is a library failure, not a programmer failure) or a soft check. Preferring a soft-check seemed correct, and it matches our existing code (which explicitly checked that server_cert_nss != NULL before using it) http://codereview.chromium.org/10454066/diff/7004/net/socket/ssl_client_socke... net/socket/ssl_client_socket_nss.cc:666: base::SingleThreadTaskRunner* nss_task_runner, On 2012/06/01 01:02:38, wtc wrote: > > Why do we use different types of task runners? > network_task_runner is always base::ThreadTaskRunnerHandle::Get(), > which is a SingleThreadTaskRunner. It's not clear why > you throw away that bit of info. I think our code > requires that the network_task_runner be single-threaded. willchan can probably provide more, but the preference is to generally use a SequencedTaskRunner (of which different times exist, including worker pools) whenever possible, and use STTR sparingly. Because NSS and NSPR use thread-local-storage, the NSS task runner is kept as a single-threaded task runner to ensure that all tasks are invoked on the same thread (ergo, the same TLS slot). Will asked on IRC that I add comments to mention this, since it's subtle. http://codereview.chromium.org/10454066/diff/7004/net/socket/ssl_client_socke... net/socket/ssl_client_socket_nss.cc:1212: return network_task_runner_->RunsTasksOnCurrentThread(); On 2012/06/01 01:02:38, wtc wrote: > > Is this test cheap? We do these tests a lot. These are backed by MessageLoopProxyImpl::RunsTasksOnCurrentThread, which requires a lock acquisition for comparison. The same lock is acquired by PostTaskHelper. Once all platforms use a multi-threaded impl, we can remove most of the checks, only paying the cost when posting the tasks. http://codereview.chromium.org/10454066/diff/7004/net/socket/ssl_client_socke... net/socket/ssl_client_socket_nss.cc:2727: DCHECK(OnNetworkTaskRunner()); On 2012/06/01 01:02:38, wtc wrote: > > Remove this DCHECK. It will never fail because of the check > on line 2719. Note that I pretty liberally sprinkled DCHECKs as a way of documenting the behaviour (and in part to test I got the threading right). For that reason, I think it's a good indicator of the pre-conditions for the remainder of this function - mroe so than the if statement. http://codereview.chromium.org/10454066/diff/7004/net/socket/ssl_client_socke... net/socket/ssl_client_socket_nss.cc:2729: if (detached_ || task.is_null()) On 2012/06/01 01:02:38, wtc wrote: > > It seems that |task| should not be null, right? Do we > ever call PostOrRunCallback with a null |task|? Perhaps > we can DCHECK(!task.is_null()) at the beginning of this > function? task.is_null() if bound to a WeakPtr that has been invalidated. This was a hold-over from where I used WeakPtrs more liberally than the RCTS, but by adding the detached_ member, it's probably not necessary to check this at all, and let line 2731 crash if so.
Hm, looks like I had these draft comments leftover from last week. We chatted a lot offline and I think rsleevi understands my opinions well enough now. I'm satisfied with whatever structure he picks for the Core stuff. http://codereview.chromium.org/10454066/diff/7004/net/socket/client_socket_fa... File net/socket/client_socket_factory.cc (right): http://codereview.chromium.org/10454066/diff/7004/net/socket/client_socket_fa... net/socket/client_socket_factory.cc:96: base::ThreadTaskRunnerHandle::Get()); How about changing this to be acquired once in the constructor? I think this makes more sense, since I could foresee in the future actually passing this into the constructor. http://codereview.chromium.org/10454066/diff/7004/net/socket/ssl_client_socke... File net/socket/ssl_client_socket_nss.h (right): http://codereview.chromium.org/10454066/diff/7004/net/socket/ssl_client_socke... net/socket/ssl_client_socket_nss.h:61: // behaviour is desired, for performance or compatibility, the current task behavior http://codereview.chromium.org/10454066/diff/7004/net/socket/ssl_client_socke... net/socket/ssl_client_socket_nss.h:155: scoped_refptr<base::SingleThreadTaskRunner> nss_task_runner_; const? http://codereview.chromium.org/10454066/diff/7004/net/socket/ssl_client_socke... net/socket/ssl_client_socket_nss.h:160: scoped_refptr<Core> core_; const?
http://codereview.chromium.org/10454066/diff/1/net/third_party/nss/ssl/sslinfo.c File net/third_party/nss/ssl/sslinfo.c (right): http://codereview.chromium.org/10454066/diff/1/net/third_party/nss/ssl/sslinf... net/third_party/nss/ssl/sslinfo.c:379: ssl_GetRecvBufLock(ss); On 2012/06/01 01:02:38, wtc wrote: > > On 2012/05/31 01:31:14, Ryan Sleevi wrote: > > > > 1stHandshakeLock is itself guarded by RecvBufLock. I'll dig through the notes > > again, but there were writes to ss->version /before/ 1stHandshakeLock (in > > particular, renegotiation, IIRC), hence the broad lock. > > 1stHandshakeLock is broader than RecvBufLock. > > I found places where NSS reads or writes ss->version > while holding only 1stHandshakeLock. The place where > NSS writes ss->version while holding only 1stHandshakeLock > is in ssl3_SendClientHello, when it copies the version from the > SID structure. > > Unless we are modifying ss->version based on info just > received from the peer, we don't need to hold RecvBufLock. ssl3_SendClientHello expects HaveSSL3HandshakeLock. http://mxr.mozilla.org/security/source/security/nss/lib/ssl/ssl3con.c?mark=39... ssl3_HandleV2ClientHello (really only for servers) is the one I was thinking of that didn't have it locked appropriately, because it originates in the SSL2 layer, and thus only has RecvBuf. Thus, I think this is the correct locking sequence, even though SSLServerSocketNSS has (not yet?) been ported to use a ref-counted core. http://codereview.chromium.org/10454066/diff/7004/net/socket/ssl_client_socke... File net/socket/ssl_client_socket_nss.h (right): http://codereview.chromium.org/10454066/diff/7004/net/socket/ssl_client_socke... net/socket/ssl_client_socket_nss.h:155: scoped_refptr<base::SingleThreadTaskRunner> nss_task_runner_; On 2012/06/04 16:50:38, willchan wrote: > const? Is this a new thing, trying to document const pointers like this? It certainly seems abnormal in code reviews, and I seem to recall seeing other reviews going by that disagreed with 'const'antly making these 'const'. For now, left alone, to be consistent with the majority of existing code, although I'm certainly open to change if there is a new One True Way. http://codereview.chromium.org/10454066/diff/7004/net/socket/ssl_client_socke... net/socket/ssl_client_socket_nss.h:160: scoped_refptr<Core> core_; On 2012/06/04 16:50:38, willchan wrote: > const? This is not const - we do reset on Disconnect()
http://codereview.chromium.org/10454066/diff/7004/net/socket/ssl_client_socke... File net/socket/ssl_client_socket_nss.h (right): http://codereview.chromium.org/10454066/diff/7004/net/socket/ssl_client_socke... net/socket/ssl_client_socket_nss.h:155: scoped_refptr<base::SingleThreadTaskRunner> nss_task_runner_; On 2012/06/04 21:51:51, Ryan Sleevi wrote: > On 2012/06/04 16:50:38, willchan wrote: > > const? > > Is this a new thing, trying to document const pointers like this? It certainly > seems abnormal in code reviews, and I seem to recall seeing other reviews going > by that disagreed with 'const'antly making these 'const'. > > For now, left alone, to be consistent with the majority of existing code, > although I'm certainly open to change if there is a new One True Way. There's no One True Way. My stylistic preference is for more const though. Indeed, I'm a proud member of const-grouplet@. But my degree of preference is such that, if anyone disagreed in a code review, I would immediately drop it :) It's not worth insisting on.
http://codereview.chromium.org/10454066/diff/1/net/third_party/nss/ssl/sslinfo.c File net/third_party/nss/ssl/sslinfo.c (right): http://codereview.chromium.org/10454066/diff/1/net/third_party/nss/ssl/sslinf... net/third_party/nss/ssl/sslinfo.c:379: ssl_GetRecvBufLock(ss); I am not suggesting that we remove the ssl_GetSSL3HandshakeLock call. I am suggesting that we either remove the ssl_GetRecvBufLock call or replace the ssl_GetRecvBufLock call with a ssl_Get1stHandshakeLock call. The RecvBufLock has to be held during ssl3_HandleV2ClientHello because ssl3_HandleV2ClientHello is using data in ss->gs.buf to set ss->version. The RecvBufLock is used to protect ss->gs.buf, as documented in this comment: PZMonitor * recvBufLock; /* locks low level recv buffers. */ All the ssl3_Handle<HandshakeMessate> functions need to hold the RecvBufLock for this reason. I think it is enough for SSL_ExportKeyingMaterial to hold just the SSL3HandshakeLock.
http://codereview.chromium.org/10454066/diff/1/net/third_party/nss/ssl/sslinfo.c File net/third_party/nss/ssl/sslinfo.c (right): http://codereview.chromium.org/10454066/diff/1/net/third_party/nss/ssl/sslinf... net/third_party/nss/ssl/sslinfo.c:379: ssl_GetRecvBufLock(ss); On 2012/06/04 23:44:58, wtc wrote: > > I am not suggesting that we remove the ssl_GetSSL3HandshakeLock > call. I am suggesting that we either remove the > ssl_GetRecvBufLock call or replace the ssl_GetRecvBufLock > call with a ssl_Get1stHandshakeLock call. > > The RecvBufLock has to be held during ssl3_HandleV2ClientHello > because ssl3_HandleV2ClientHello is using data in > ss->gs.buf to set ss->version. The RecvBufLock is > used to protect ss->gs.buf, as documented in this > comment: > > PZMonitor * recvBufLock; /* locks low level recv buffers. */ > > All the ssl3_Handle<HandshakeMessate> functions need to > hold the RecvBufLock for this reason. > > I think it is enough for SSL_ExportKeyingMaterial to > hold just the SSL3HandshakeLock. Apologies, but I still don't understand why you're suggesting this. SSL3HandshakeLock is too narrow by itself (doesn't cover the server case) 1stHandshakeLook is too narrow by itself (doesn't cover renegotiation handshakes) The only one that covers the entire state machine is RecvBufLock. As you note, /all/ ssl[23]_Handle<HandshakeMessage> functions need to hold this lock. The remaining locking is to ensure that the lock ordering (eg: must hold HS lock, must hold SpecReadLock) is respected, even though RecvBufLock should cover all of this.
+cc sergeyu for the remoting changes. With this change, the remoting/ FakeSocket attempts to push a message to the ChromeOS NSS thread during message_loop_.RunAllPending(). This works fine, but the main thread (the test thread) ends up finishing executing .RunAllPending() before the NSS thread can post the response back. The solution is to use .Run(), except now there has to be a way to abort the task. An alternative (which I'm not convinced would be much more elegant), was discussed on https://groups.google.com/a/chromium.org/group/chromium-dev/browse_thread/thr... , would be to do class SomeHelper : public base::RefCounted<SomeHelper> { private: ScopedClosureRunner scoped_closure_; base::OneShotTimer<...> timer_; }; timer_ would be wired to run scoped_closure_ when it fires. Alternatively, scoped_closure will run once the last RefCount to SomeHelper goes out of scope (eg: after the last callback). scoped_closure_ would be bound to MessageLoopProxy::current()->QuitNow(), thus ensuring that the current message loop will be aborted iff both callbacks are run or the timer fires. Then there would be some updated method signature so that we could bind a RefCounted SomeHelper while still forwarding the args to the original callback. All of this seems much more complex and harder to read than the above solution, but I mentioned it as a consideration for will if it's believed to be a better approach or if I've missed something fundamental.
LGTM, but see my nits http://codereview.chromium.org/10454066/diff/20003/remoting/protocol/authenti... File remoting/protocol/authenticator_test_base.cc (right): http://codereview.chromium.org/10454066/diff/20003/remoting/protocol/authenti... remoting/protocol/authenticator_test_base.cc:30: class ChannelDoneHelper { In jingle_session_unittest.cc there is QuitThreadOnCounter gtest action, which implements something similar, but without timeouts. Maybe reuse it here? Timeout is easy to add by defining Timer directly in RunChannelAuth(). http://codereview.chromium.org/10454066/diff/20003/remoting/protocol/authenti... remoting/protocol/authenticator_test_base.cc:33: void(net::Error, scoped_ptr<net::StreamSocket>)> ConnectedCallback; Can we just reuse ChannelAuthenticator::DoneCallback? http://codereview.chromium.org/10454066/diff/20003/remoting/protocol/authenti... remoting/protocol/authenticator_test_base.cc:35: ChannelDoneHelper() : expected_calls_(1), actual_calls_(0) {} nit: why initialize expected_calls_ to 1? http://codereview.chromium.org/10454066/diff/20003/remoting/protocol/authenti... remoting/protocol/authenticator_test_base.cc:37: // Starts the shutdown timer, expecting |OnConnected()| to be called at nit: OnConnected(), || are redundant here. http://codereview.chromium.org/10454066/diff/20003/remoting/protocol/ssl_hmac... File remoting/protocol/ssl_hmac_channel_authenticator_unittest.cc (right): http://codereview.chromium.org/10454066/diff/20003/remoting/protocol/ssl_hmac... remoting/protocol/ssl_hmac_channel_authenticator_unittest.cc:41: class SslHmacChannelAuthenticatorTest : public AuthenticatorTestBase { AuthenticatorTestBase was supposed to be a base class for unittests for implementation of the remoting::Authenticator interface. SslHmacChannelAuthenticator implements remoting::ChannelAuthenticator, which is related, but not the same interface. So this is confusing now. Can you please add a comment in the AuthenticatorTestBase header to explain that it is used to test ChannelAuthenticator too? Ideally we should also split the channel-related parts to ChannelAuthenticatorTestBase, and then AuthenticatorTestBase could inherit from it, but there is no need to do it in this CL.
Thanks very much for the feedback Sergey. Rather than violate the design and force clean-up, I went with the ACTION_P duplication, given that it's incredibly minor code. I should get more familiarity with gmock predicated actions, as this definitely makes it simpler.
LGTM On 2012/06/05 18:30:48, Ryan Sleevi wrote: > Thanks very much for the feedback Sergey. > > Rather than violate the design and force clean-up, I went with the ACTION_P > duplication, given that it's incredibly minor code. I should get more > familiarity with gmock predicated actions, as this definitely makes it simpler.
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/rsleevi@chromium.org/10454066/38001
Try job failure for 10454066-38001 (retry) on win_rel for step "content_unittests". It's a second try, previously, step "content_unittests" failed. http://build.chromium.org/p/tryserver.chromium/buildstatus?builder=win_rel&nu...
http://codereview.chromium.org/10454066/diff/7004/net/socket/client_socket_fa... File net/socket/client_socket_factory.cc (right): http://codereview.chromium.org/10454066/diff/7004/net/socket/client_socket_fa... net/socket/client_socket_factory.cc:96: base::ThreadTaskRunnerHandle::Get()); On 2012/06/04 16:50:38, willchan wrote: > How about changing this to be acquired once in the constructor? I think this > makes more sense, since I could foresee in the future actually passing this into > the constructor. Undid this change. Unit tests may change the current thread's task runner. Turns out this happens in content_unittests, but not net_unittests. Go figure. I am caching the thread's MLP, and I agree, we may wish to pass this in the ctor in the future, although as my comments to wtc (comment #2/#3) indicate, I think the higher up we pass the buck, the more complicated the state becomes, and the whole "NSS on one thread or two" is really implementation-specific.
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/rsleevi@chromium.org/10454066/40002
Change committed as 140697 |