Why doesn't multithreading seem to speed up my web application?

Time: 2022-09-20 13:58:33
class ApplicationContext{
    private final NetworkObject networkObject = new NetworkObject();

    public ApplicationContext(){
        networkObject.setHost("host");
        networkObject.setParams("param");
    }

    public void searchObjects(ObjectType objType){
        networkObject.doSearch(buildQuery(objType));
    }
}

class NetworkObject{
    private final SearchObject searchObject = new SearchObject();

    public void doSearch(SearchQuery searchQuery){
        searchObject.search(searchQuery); //threadsafe, takes 15(s) to return
    }
}

Consider a web server running a web application that creates only one ApplicationContext instance (a singleton) and uses that same instance to call searchObjects, e.g.

 ApplicationContext appInstance = 
                  ApplicationContextFactory.Instance(); //singleton

Every new request to a web page, say 'search.jsp', makes the call

 appInstance.searchObjects(objectType);

I am making 1000 requests to the 'search.jsp' page. All the threads use the same ApplicationContext instance, and the searchObject.search() method takes 15 seconds to return. My question: while one thread is executing searchObject.search(), do all the other threads wait their turn (15 seconds each), or do all threads execute searchObject.search() concurrently? And why?

I hope I have made my question clear.

Update: Thanks, all, for clearing up my doubt. Here is my second question: what difference in performance should I observe between the following two versions?

public synchronized void doSearch(SearchQuery searchQuery){
    searchObject.search(searchQuery); //threadsafe, takes 15(s) to return
}

OR

public void doSearch(SearchQuery searchQuery){
    searchObject.search(searchQuery); //threadsafe, takes 15(s) to return
}

I believed that 'doSearch' without the synchronized keyword would perform better. But when I tested it today, the results came out the other way: performance was similar, and sometimes better, with the synchronized keyword.

Can anyone explain this behavior? How should I debug such cases?

Regards,

Perry

7 answers

#1


Well, you haven't specified any synchronization in the code, so without any other evidence I'd suspect that all the threads will run concurrently. If SearchObject.search contains some synchronization, though, that would obviously limit the concurrency.

Mind you, your JSP container is probably using a thread pool to service the 1000 requests, rather than creating 1000 threads.

EDIT: As for why it may be faster with synchronized: sometimes concurrency isn't actually helpful to throughput. Things like context switching, disk bottlenecks, cache misses etc have that effect. It's usually not a good idea to have more running threads than cores.

For a real-life example, suppose you have a thousand shoppers who all want to buy things from a fairly small shop. How would you go about it? Put all 1000 in the shop at the same time, or keep it down to a fairly small number in the shop at any one time, and a queue outside?

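The shop analogy maps onto a bounded thread pool: a small, fixed number of worker threads serve the requests while the rest wait in a queue. Below is a minimal sketch of that idea, not the JSP container's actual implementation. The class name BoundedPoolDemo, the method maxConcurrency, and the 5 ms sleep standing in for the slow search are all assumptions made for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo: many "shoppers" (tasks), few allowed in the "shop" (pool) at once.
class BoundedPoolDemo {

    // Runs `tasks` short jobs on a pool of `poolSize` threads and returns the
    // highest number of jobs that were ever running at the same time.
    static int maxConcurrency(int tasks, int poolSize) {
        AtomicInteger running = new AtomicInteger();
        AtomicInteger maxSeen = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        for (int i = 0; i < tasks; i++) {
            pool.submit(() -> {
                int now = running.incrementAndGet();
                maxSeen.accumulateAndGet(now, Math::max);
                try {
                    Thread.sleep(5); // assumed stand-in for the slow search
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    running.decrementAndGet();
                }
            });
        }
        pool.shutdown(); // accept no new tasks; queued ones still run
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return maxSeen.get();
    }

    public static void main(String[] args) {
        System.out.println("max concurrent jobs: " + maxConcurrency(100, 4));
    }
}
```

Whatever the number of submitted tasks, the reported maximum can never exceed the pool size, because the remaining tasks sit in the executor's queue until a worker is free.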

#2


It is wise to realise that performance is always relative to a given environment. In this case, it is probably the performance of the software on your laptop or test server. Check performance on something similar to the production environment before even considering optimizing the code, because the bottleneck there could be quite different from the one on the development machine.

As an example: when I test my software with a large database on my laptop, I always end up being hard-disk I/O bound. In production, however, the database server has plenty of memory and speedy disks, so it would not be wise to optimise my software for I/O.

Similarly with threading: the processor in my laptop can run one or two threads simultaneously, so having 8 threads does not speed things up. The production machine, however, might very well be able to handle 8 threads simultaneously.

What I think is more important than performance is semantics. A keyword like synchronized is instructive not only to the compiler, but also to the (next) developer.

By using synchronized on a method you share a lock with all other synchronized methods on ApplicationContext, including methods which might have nothing to do with searchObject. Personally, I doubt very much that you wish to synchronize on an object called ApplicationContext.

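To make the shared-lock point concrete, here is a small sketch using Thread.holdsLock to show which monitor each style of locking actually holds. The classes MethodLocked and PrivatelyLocked and their method names are hypothetical and only stand in for the question's classes.

```java
// Hypothetical classes (not from the question) showing which monitor is held.
class LockScopeDemo {

    static class MethodLocked {
        // A synchronized instance method locks the monitor of `this`.
        synchronized boolean searchHoldsThis() {
            return Thread.holdsLock(this);
        }

        // An unrelated synchronized method competes for that same monitor.
        synchronized boolean unrelatedHoldsThis() {
            return Thread.holdsLock(this);
        }
    }

    static class PrivatelyLocked {
        private final Object searchLock = new Object();

        // Only the private lock is taken, so the monitor of `this` stays free.
        boolean searchHoldsThis() {
            synchronized (searchLock) {
                return Thread.holdsLock(this);
            }
        }

        boolean searchHoldsPrivateLock() {
            synchronized (searchLock) {
                return Thread.holdsLock(searchLock);
            }
        }
    }

    public static void main(String[] args) {
        MethodLocked a = new MethodLocked();
        PrivatelyLocked b = new PrivatelyLocked();
        System.out.println(a.searchHoldsThis());        // true
        System.out.println(a.unrelatedHoldsThis());     // true
        System.out.println(b.searchHoldsThis());        // false
        System.out.println(b.searchHoldsPrivateLock()); // true
    }
}
```

Because both synchronized methods of MethodLocked hold the same monitor, a long-running call to one of them keeps every other thread out of the other one as well. The private lock object avoids that coupling.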

If searchObject were not thread-safe, I would probably recommend a locking object. This comes in two flavors:

public void doSearch(SearchQuery searchQuery){
    synchronized(searchObject) { // Only if searchObject is guaranteed to be non-null
        searchObject.search(searchQuery); //threadsafe, takes 15(s) to return
    }
}

or

public class ApplicationContext {
    private SearchObject searchObject = null; // may be null, so it cannot serve as the lock
    private final Object searchObjectLock = new Object(); // dedicated lock, never null

    public void doSearch(SearchQuery searchQuery){
        synchronized(searchObjectLock) {
            searchObject.search(searchQuery); //threadsafe, takes 15(s) to return
        }
    }
}

Don't forget to lock every use of searchObject to prevent threading trouble. With this fine-grained locking you at least keep the ApplicationContext available to classes which do not need searchObject-related functionality.

In your case I would not use any synchronisation, as it is not required, and I would check on production-like hardware before determining bottlenecks.

And if searchObject uses a database, make sure the database is properly indexed and that the query uses that index. If it needs to do 1000 full table scans, it won't be fast anyhow...

#3


If you've got no synchronisation, then each thread will run concurrently and not block on locks.

Note that

// threadsafe

(as commented) means it'll work properly with multiple threads accessing it - not that it'll block threads.

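The difference between "thread-safe" and "blocking" can be checked directly. In the sketch below, every caller waits inside search() until all the other callers have also entered it, which can only succeed if the method really is executed concurrently. SharedSearch is a hypothetical stand-in for the question's SearchObject, and the 5 second timeout is an arbitrary assumption.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ConcurrentEntryDemo {

    // Hypothetical stand-in for SearchObject: safe to share between threads,
    // but it takes no lock, so callers are never made to queue.
    static class SharedSearch {
        private final CountDownLatch allInside;

        SharedSearch(int callers) {
            allInside = new CountDownLatch(callers);
        }

        // Returns true only if every caller was inside search() at the same time.
        boolean search() throws InterruptedException {
            allInside.countDown();
            return allInside.await(5, TimeUnit.SECONDS);
        }
    }

    // Starts `callers` threads on one shared instance and reports whether
    // all of them overlapped inside search().
    static boolean allCallersOverlap(int callers) {
        SharedSearch shared = new SharedSearch(callers);
        AtomicInteger overlapped = new AtomicInteger();
        Thread[] threads = new Thread[callers];
        for (int i = 0; i < callers; i++) {
            threads[i] = new Thread(() -> {
                try {
                    if (shared.search()) {
                        overlapped.incrementAndGet();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return overlapped.get() == callers;
    }

    public static void main(String[] args) {
        System.out.println("all callers overlapped: " + allCallersOverlap(8));
    }
}
```

If search() were declared synchronized, the first caller would keep the monitor while waiting for the others, nobody else could enter, and each call would give up after the timeout and return false.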

#4


They can all execute it at the same time unless it is declared as synchronized, regardless of the fact that your class is a singleton, IIRC.

#5


If SearchObject.search is synchronized, then yes. Otherwise, just try it and see.

#6


Why does it take 15 seconds? If it's waiting for disk access and you only have one disk, then no matter how many threads you have you're limited by the disk seek speed. More threads may even be slower in this situation.

#7


In your case, they will all execute concurrently.

If you wanted to prevent this, you would need some kind of synchronization in place (e.g. declaring the method as synchronized or using locks).

ETA:

If you declare the doSearch() method as synchronized, only one thread at a time will be able to call it. Other threads will block until the first thread finishes and the waiting threads will be "let in" one at a time. As you can imagine, this will kill your performance if you have lots of threads calling that function.

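The "one at a time" behaviour can be observed by counting how many threads are inside the method at once. This is only a sketch: the nested Network class is a hypothetical stand-in for the question's NetworkObject, and a 20 ms sleep replaces the 15 second search.

```java
import java.util.concurrent.atomic.AtomicInteger;

class SynchronizedSerializesDemo {

    // Hypothetical stand-in for NetworkObject with a synchronized doSearch().
    static class Network {
        private final AtomicInteger inside = new AtomicInteger();
        private final AtomicInteger peakInside = new AtomicInteger();

        synchronized void doSearch() {
            int now = inside.incrementAndGet();
            peakInside.accumulateAndGet(now, Math::max);
            try {
                Thread.sleep(20); // assumed stand-in for the 15 second search
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                inside.decrementAndGet();
            }
        }

        int peak() {
            return peakInside.get();
        }
    }

    // Calls doSearch() from `callers` threads on one shared instance and returns
    // the largest number of threads that were inside the method at the same time.
    static int maxCallersInside(int callers) {
        Network network = new Network();
        Thread[] threads = new Thread[callers];
        for (int i = 0; i < callers; i++) {
            threads[i] = new Thread(network::doSearch);
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return network.peak();
    }

    public static void main(String[] args) {
        System.out.println("max callers inside doSearch: " + maxCallersInside(5)); // 1
    }
}
```

Scaled up to the question's numbers, and assuming each call still took 15 seconds, the last of 1000 queued callers would wait roughly 1000 × 15 s, which is a little over four hours.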
